June 20, 2026
174 reboots and zero mercy
I restarted a 10 year old Xeon 174 times to delete 12 flags and gain 4 tps
He rebooted an ancient server 174 times, and the internet roasted both the setup and the writing
TLDR: A developer rebooted an old server 174 times to find out which AI settings were actually useful, and came away with a small but real speed boost. Commenters were split between impressed and savage, mocking the writing, questioning the reboots, and joking that he reinvented science the hard way.
A programmer went full mad scientist on a nearly 10-year-old computer, restarting it 174 times just to figure out which settings actually mattered when running a giant AI model on old hardware with no fancy graphics card. The payoff? He stripped out 12 useless settings and squeezed out a tiny speed boost: about 4 extra tokens per second of generated text, from a machine many people would call e-waste. On paper, it’s a patient, nerdy victory. In the comments, though, the real show was the collective eye-roll, nitpicking, and comedy.
Some readers were impressed by the sheer grind, but others immediately went for the throat. One of the loudest complaints wasn’t even about the experiment — it was about the writing style. A commenter called it almost unreadable, accusing it of sounding like AI-generated mush. Another declared they never wanted to read the phrase “heavy lifting” again, while someone else dragged the wording “worth sitting with” as painfully overused internet prose. Ouch.
Then came the classic tech-forum split: Why do this manually at all? One person argued the whole process screams for automation, while another snarked, essentially, congrats on discovering the scientific method. And perhaps the funniest jab of all: why reboot the whole machine every time? So yes, the post delivered hard-won results — but the community turned it into a live referendum on writing, workflow, and whether suffering for a tiny speed gain is genius or just glorified self-inflicted pain.
Key Points
- The article re-examines a previously published 25-flag Gemma 4 inference configuration to determine which flags actually improve performance on a 2016 Xeon system.
- The author used an ablation method, disabling one flag at a time and rerunning tests, to measure each flag’s contribution within the full configuration.
- The benchmarking campaign required 174 runs, with each run reloading roughly 25 GB of weights from disk on a CPU-only machine with 128 GB of DDR3 memory.
- The test environment used a Xeon E5-2620 v4, ik_llama.cpp on the feat/gemma-4-mtp branch, and gemma-4-26B-A4B-it with an MTP drafter, both at Q8_0.
- Partial results shown in the article indicate that some drafter settings outperformed the published autotuned configuration on certain workloads.
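For readers wondering what "disabling one flag at a time" looks like in practice, here is a minimal sketch of a leave-one-out ablation loop. It is not the author's script: the flag names, the `fake_bench` stand-in, and the 0.05 tokens-per-second threshold are all hypothetical, chosen only so the example runs on its own. A real harness would replace `fake_bench` with a function that launches the inference server with the given flags and returns the measured throughput.

```python
from typing import Callable, Dict, List


def ablate(flags: List[str], bench: Callable[[List[str]], float]) -> Dict[str, float]:
    """Leave-one-out ablation.

    Runs the full configuration once as a baseline, then reruns it with each
    flag removed in turn. The value stored for a flag is how many tokens per
    second are lost when that flag is dropped (positive = the flag helps).
    """
    baseline = bench(flags)
    contribution: Dict[str, float] = {}
    for flag in flags:
        reduced = [f for f in flags if f != flag]
        contribution[flag] = baseline - bench(reduced)
    return contribution


def removable(contribution: Dict[str, float], threshold: float = 0.05) -> List[str]:
    """Flags whose removal costs no more than `threshold` tokens per second."""
    return [flag for flag, delta in contribution.items() if delta <= threshold]


def fake_bench(active: List[str]) -> float:
    """Hypothetical stand-in for a real benchmark run (assumption, not real data)."""
    gains = {"--flag-a": 1.5, "--flag-b": 0.0, "--flag-c": -0.25}
    return 10.0 + sum(gains.get(f, 0.0) for f in active)


flags = ["--flag-a", "--flag-b", "--flag-c"]
contribution = ablate(flags, fake_bench)
print(contribution)
print(removable(contribution))
```

A loop like this needs one baseline run plus one run per flag, and real measurements are noisy, so each configuration would normally be repeated and averaged. That is how a 25-flag configuration grows into a campaign of 174 runs; at roughly 25 GB of weights reloaded per run, that comes to about 4,350 GB read from disk over the whole campaign.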