GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz

A maker showed off a home-built AI chip running a tiny text generator on an FPGA, a reprogrammable piece of hardware, and bragged about 56,000+ tokens per second at just 80 MHz. On paper, that sounds like sci-fi garage genius energy: no regular processor, no graphics card, just raw digital circuitry spelling out names. But the real action wasn’t in the demo. It was in the comment section, where the applause quickly turned into a courtroom drama.

The harshest reaction came from people saying the headline was doing a lot of heavy lifting. One commenter dropped a link and delivered the brutal counterpunch: a single MacBook CPU core was allegedly 71 times faster on this tiny model. Another went for the jugular by pointing out the system’s memory was only 16 characters, basically accusing the whole “tokens per second” flex of being flashy but not meaningful in the real world. Ouch.
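For a rough sense of scale, the headline numbers can be run through some back-of-envelope arithmetic. This is only a sketch: it assumes the throughput is exactly 56,000 tokens per second (the post says "56,000+"), and that the commenter's 71x figure measures the same thing on the same model. The function names are made up for illustration.

```python
# Back-of-envelope math on the figures quoted in the post and comments.
# Assumptions (not confirmed by the source): throughput is taken as exactly
# 56,000 tokens/s, and the "71 times faster" claim is a like-for-like comparison.

CLOCK_HZ = 80_000_000        # 80 MHz FPGA clock, from the headline
TOKENS_PER_SECOND = 56_000   # headline throughput, rounded down
CPU_SPEEDUP = 71             # commenter's claimed single-core advantage


def cycles_per_token(clock_hz: int, tokens_per_second: int) -> float:
    """Average number of clock cycles the chip spends on each token."""
    return clock_hz / tokens_per_second


def implied_cpu_tokens_per_second(fpga_tokens_per_second: int, speedup: int) -> int:
    """Throughput the CPU claim would imply if the speedup figure holds."""
    return fpga_tokens_per_second * speedup


if __name__ == "__main__":
    print(f"Cycles per token: {cycles_per_token(CLOCK_HZ, TOKENS_PER_SECOND):.0f}")
    print(f"Implied CPU tokens/s: {implied_cpu_tokens_per_second(TOKENS_PER_SECOND, CPU_SPEEDUP):,}")
```

Under those assumptions, the chip spends roughly 1,400 clock cycles per token, and the commenter's claim would put a single CPU core at close to 4 million tokens per second on the same tiny model.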

Still, not everyone was in roast mode. Some commenters played the “yes, but…” card, saying this is still genuinely impressive as a proof of concept, especially because bigger AI systems get much harder to run as they grow. That sparked the nerdy dream scenario: could future chips put memory and compute side by side and become monsters at this kind of work? So the mood was split between “cool hack” and “nice stunt, but come back when it matters.” In other words: classic internet tech drama, with a side of meme-worthy skepticism.

June 16, 2026

Chip happens: comment war edition

Homemade AI chip stuns at first glance, but the comments came for blood
