Logging · 001

Engineering's view on local frontier AI — what actually runs on hardware you control.

A weekly engineering journal benchmarking open-weights models on consumer silicon. Real harnesses, real failures, no vendor decks.

Recent dispatches
  1. A BIOS Update Won't Fix #6182 — I Tried the Newest One
     The Bosgame M5's ROCm bug is board-specific, not chip-specific — so firmware is the obvious lever. I flashed Bosgame's newest official BIOS hoping to dodge it. It didn't work, and the negative narrows where the fault actually lives.
     4 min read · Jun 11
  2. Full Context on a Vulkan-Only Strix Halo: The Decode-Drop Reproduces, but the Sweet Spot Moves
     kmarble showed ROCm decode collapses 64% at full context on Strix Halo, and ROCm+MTP cures it. My board can't run ROCm. The Vulkan half reproduces the drop — but the MTP sweet spot from last week walks left at depth: by 76k, drafting too deep is slower than no speculation at all.
     12 min read · Jun 04
  3. MTP Defaults Are a Trap: What 260 Runs Showed About Speculative Decoding on Qwen3.6
     Until May 19, the llama.cpp speculative-decoding default was 16. On Qwen3.6's single MTP head, that default cost up to 75% of generation throughput. Here's where the real sweet spots are — and why they're architecture-specific.
     11 min read · May 28
  4. ROCm 7.x on the Bosgame M5: 14 Configurations, 14 Failures
     We promised a ROCm 7.x revisit. We got a comprehensive workaround sweep instead. Both are useful.
     9 min read · May 22
  5. Vulkan/RADV vs ROCm 6.4 on Strix Halo: What 128 Benchmark Runs Actually Showed
     The headline isn't where Vulkan wins. It's where ROCm doesn't run at all.
     9 min read · May 14
  6. What 96GB of VRAM on Unified-Memory Hardware Actually Gets You for Local LLM Inference
     An honest practitioner take from a Bosgame M5 running Strix Halo at full BIOS allocation.
     8 min read · May 09