I run local LLMs on consumer hardware and write up what the hardware does: tokens per second, prompt processing, speculative-decoding acceptance rates, time to first token, and what actually fits in memory. Real models, repeatable runs, numbers published as I measure them. This page is for vendors who want their hardware tested the same way.

What I run

  • Bosgame M5: Ryzen AI Max+ 395, gfx1151, 96GB allocated to the GPU, Fedora Server, llama.cpp on Vulkan and ROCm.

  • A second box: Ryzen 9 9950X3D with an RX 9070 XT (16GB), CachyOS.

I test across backends and across llama.cpp builds, because a single build or a single flag can change the result. I've published the cases where it did.
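As an illustration only, here is a minimal Python sketch of why results get grouped by backend and build rather than averaged together. It is not the actual harness: the `summarize` function, the record fields, and every number in it are hypothetical placeholders, not measurements.

```python
from collections import defaultdict
from statistics import median


def summarize(runs):
    """Group runs by (backend, build) and return median tokens/sec per group."""
    groups = defaultdict(list)
    for run in runs:
        # Throughput for one run: generated tokens divided by wall-clock seconds.
        tokens_per_second = run["tokens"] / run["seconds"]
        groups[(run["backend"], run["build"])].append(tokens_per_second)
    # The median is less sensitive to a single slow or fast outlier run.
    return {key: median(values) for key, values in groups.items()}


# Placeholder records (invented for the example, not real results).
runs = [
    {"backend": "vulkan", "build": "build-a", "tokens": 512, "seconds": 10.0},
    {"backend": "vulkan", "build": "build-a", "tokens": 512, "seconds": 8.0},
    {"backend": "rocm", "build": "build-a", "tokens": 512, "seconds": 16.0},
]

summary = summarize(runs)
for (backend, build), tps in sorted(summary.items()):
    print(f"{backend:8s} {build:10s} {tps:6.1f} tok/s")
```

Keeping the backend and the build in the grouping key means a change in either one shows up as its own row instead of disappearing into a combined average.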

How a test works

Send me the hardware and I run it through the same benchmarks I run on everything else, then publish the writeup. The numbers are whatever the hardware produces. I keep editorial control, and I don't do guaranteed positive coverage. If a box is on loan or a test is paid, that's stated plainly in the post, and affiliate links are marked where they exist.

If the hardware is good, the numbers show it.

What the writeups look like

The published writeups keep the same detail and tone: independent, specific, no marketing copy.

Contact

erik@thefrontierlab.ai. Tell me what the hardware is and what you want tested.