May 14, 2026

Copy-paste vs clean code chaos

Notes from Optimizing CPU-Bound Go Hot Paths

Go fans are torn as one coder says speed means copying code by hand

TLDR: A Go developer says the fastest way to speed up a heavy workload was to duplicate code by hand instead of using cleaner reusable patterns. Commenters agreed the problem is real, then immediately turned it into a nerdy showdown about assembly tricks, alternative tools, and brag-worthy benchmarks.

A Go programmer went looking for raw speed while porting the Brotli compression tool to pure Go, and the result has the community doing what it does best: applauding, nitpicking, and instantly turning it into a competition. The big claim is simple enough for non-experts: the neat, tidy way of writing Go code (generics, interfaces, closures) often wasn’t the fastest in the hottest loops, and the quickest fix was the least glamorous one imaginable: copying the same function over and over with tiny changes, 16 near-identical variants in all. For fans of clean code, that’s basically a horror story. For performance diehards, it’s just Tuesday.

The strongest reaction was a mix of “yep, that’s real” and “wow, that’s ugly.” One commenter backed the complaint and pointed to Apache Arrow’s Go approach, which more or less says: if you want serious speed, go around the problem entirely. Another took the discussion straight into assembly-language wizardry, casually dropping branch prediction strategy like everyone keeps that in their back pocket. And then, in classic internet fashion, someone ignored the hand-wringing and turned the whole thing into a benchmark flex: how many simulated boids can one CPU run at 60 frames per second with no goroutines? Answer: apparently 8,192, because of course someone had receipts.

The funniest mini-drama is that this wasn’t really a comment section — it was a performance fight club. Some readers treated the post as proof that Go still makes speed freaks suffer, while others basically said, “fine, then write weirder code.” Even TinyGo got dragged into the gossip, because no optimization thread is complete without someone asking whether a smaller, scrappier version of the language might secretly do it better.

Key Points

  • The article says that while porting Brotli to pure Go, the author repeatedly found that specialized concrete code outperformed idiomatic abstractions in hot paths.
  • It states that generics, interface dispatch, and closures often prevented the Go compiler from producing code equivalent to a concrete implementation, largely because of missed inlining opportunities.
  • The article explains that Go generics use GC Shape Stenciling rather than full monomorphization, and that method calls on type parameters can still involve interface-like dispatch.
  • According to the article, the author resolved a hot-path performance issue by duplicating concrete functions instead of abstracting them, resulting in 16 near-identical function variants.
  • The article includes a deeper benchmark-oriented section built around a reduced real-world example comparing concrete, generic, polymorphic, and closure-based parameterization approaches.
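
To make the four parameterization styles in that last point concrete, here is a minimal Go sketch. It is not the article's code: the names (hasher, mulHasher, sumConcrete and friends), the multiplier constant, and the toy loop are all hypothetical, chosen only to show what "concrete", "generic", "polymorphic", and "closure-based" look like side by side. All four compute the same value; what differs is how much the compiler has to see through to inline the inner operation.

```go
package main

import "fmt"

// mul is an arbitrary odd multiplier, picked for illustration only.
const mul uint32 = 0x9E3779B1

// hasher is a hypothetical abstraction over the per-byte operation.
type hasher interface {
	hash(x uint32) uint32
}

// mulHasher is the single implementation used in this sketch.
type mulHasher struct{}

func (mulHasher) hash(x uint32) uint32 { return (x * mul) >> 16 }

// 1. Concrete: the operation is written directly in the loop body.
func sumConcrete(data []byte) uint32 {
	var acc uint32
	for _, b := range data {
		acc += (uint32(b) * mul) >> 16
	}
	return acc
}

// 2. Generic: the operation is a method called on a type parameter.
func sumGeneric[H hasher](h H, data []byte) uint32 {
	var acc uint32
	for _, b := range data {
		acc += h.hash(uint32(b))
	}
	return acc
}

// 3. Polymorphic: the operation is a method called on an interface value.
func sumInterface(h hasher, data []byte) uint32 {
	var acc uint32
	for _, b := range data {
		acc += h.hash(uint32(b))
	}
	return acc
}

// 4. Closure: the operation is passed in as a function value.
func sumClosure(f func(uint32) uint32, data []byte) uint32 {
	var acc uint32
	for _, b := range data {
		acc += f(uint32(b))
	}
	return acc
}

func main() {
	data := []byte("hot path example data")
	a := sumConcrete(data)
	b := sumGeneric(mulHasher{}, data)
	c := sumInterface(mulHasher{}, data)
	d := sumClosure(func(x uint32) uint32 { return (x * mul) >> 16 }, data)
	// Same result from every variant; only the call shape differs.
	fmt.Println(a == b && b == c && c == d)
}
```

Whether variants 2 through 4 end up as fast as variant 1 depends on the Go version and on what the compiler manages to inline, so treat this as a starting point for measurement rather than a verdict. Wrapping each function in a testing.B benchmark and building with go build -gcflags=-m to print inlining decisions is one way to check on a given toolchain.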

Hottest takes

"How many classic Reynolds boids can you run on 1 CPU at 60FPS" — coldstartops
"They effectively compile C code to Go assembly" — nasretdinov
"i wonder how this compares to tiny go" — K0IN