June 17, 2026
Rebooted Into the Comment Wars
I restarted a 10 year old Xeon 174 times to delete 12 flags and gain 4 TPS
He rebooted an ancient server 174 times, and the comments instantly turned into a nerd slap-fight
TLDR: A programmer tested an old computer 174 times to prove that many of the fancy AI settings people copied from his earlier post barely mattered. Commenters loved the obsession but also dragged the setup, with some accusing him of sharing options that were useless on his kind of machine.
A tech tinkerer went full lab-rat mode, restarting a nearly decade-old computer 174 times just to figure out which settings actually mattered when running a giant AI model on old hardware. The punchline? After all that pain, only a handful of changes really helped, and the reward was a measly but very real speed bump: about 4 extra tokens per second (the "TPS" in the headline). For normal people, that sounds absurd. For the internet, it was catnip.
The community reaction was split between awe, mockery, and backseat-driving. One camp treated the whole thing like heroic science: tedious, obsessive, and weirdly noble. The line “the count is the point” became instant comment-bait, with one user joking, “Thanks Claude,” as if the post had the suspiciously polished vibe of an AI confession. Another crowd was much less impressed and came in swinging with the classic internet accusation: this guy must not have read the instructions. One commenter bluntly claimed some of the settings were pointless on a machine with no graphics card at all, basically saying the original viral command was part magic spell, part cargo cult.
That’s where the real drama lives: not just in the experiment, but in the quiet panic that tons of people may have copied a giant command they didn’t understand from a Hacker News post and hoped for miracles. The article says, in essence, you’re holding it wrong, and the comments turned that into a mix of roast session, fact-check, and support group for people who love squeezing absurd life out of old machines.
Key Points
- The article revisits a previously shared 25-flag Gemma 4 inference command to determine which flags actually improve performance on a 2016 Xeon CPU-only system.
- The author used an ablation approach, changing one flag at a time and measuring the result across repeated runs.
- The benchmark required 174 fresh server runs, each reloading about 25 GB of weights from disk before inference.
- The test system used a Xeon E5-2620 v4, 128 GB of DDR3, no GPU, no swap, and ik_llama.cpp on the feat/gemma-4-mtp branch.
- Benchmarks covered three prompt types and were run through llama-server to capture speculative-decoding telemetry and verify whether the drafter activated.
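For readers unfamiliar with ablation testing, here is a minimal sketch of the general idea in Python. It is not the author's harness: the flag names are placeholders, `fake_measure` is a hypothetical stand-in for "restart the server and record tokens per second", and dropping one flag at a time is only one possible reading of "changing one flag at a time".

```python
from statistics import median
from typing import Callable, Dict, List, Sequence


def ablation_variants(flags: Sequence[str]) -> Dict[str, List[str]]:
    """Build one variant per flag, each with exactly that flag removed."""
    return {flag: [f for f in flags if f != flag] for flag in flags}


def run_ablation(
    flags: Sequence[str],
    measure: Callable[[List[str]], float],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return, per flag, the change in median TPS when that flag is dropped."""

    def score(config: List[str]) -> float:
        # Repeat each measurement and take the median to damp run-to-run noise.
        return median(measure(config) for _ in range(repeats))

    baseline = score(list(flags))
    return {
        flag: score(config) - baseline
        for flag, config in ablation_variants(flags).items()
    }


def removable(deltas: Dict[str, float], tolerance: float = 0.1) -> List[str]:
    """Flags whose removal costs no more than `tolerance` TPS."""
    return [flag for flag, delta in deltas.items() if delta >= -tolerance]


# Placeholder flags, NOT real ik_llama.cpp options.
FLAGS = ["--flag-a", "--flag-b", "--flag-c"]


def fake_measure(config: List[str]) -> float:
    """Hypothetical benchmark: only --flag-a affects throughput."""
    return 10.0 + (2.0 if "--flag-a" in config else 0.0)


deltas = run_ablation(FLAGS, fake_measure)
droppable = removable(deltas)
```

In this toy setup only `--flag-a` matters, so the other two land in `droppable`. In a real harness every call to `measure` would be a fresh server start, which is why the run count climbs so quickly: at about 25 GB per reload, 174 runs add up to roughly 4,350 GB of weight loading.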