Our eighth generation TPUs: two chips for the agentic era

Google drops two AI mega‑chips; commenters split: ‘All hail’ vs ‘I’ll build my own’

TLDR: Google unveiled two specialized AI chips—TPU 8t for training and TPU 8i for super-fast answers—boasting huge scale and slick cooling. Commenters split between crowning Google the quiet frontrunner and cheering DIY researchers who’d rather build under-the-desk rigs than rent Big Cloud, making this a showdown of scale vs. scrappiness.

Google just rolled into Google Cloud Next with not one but two new AI chips—TPU 8t for training (teaching models) and TPU 8i for inference (answering questions fast). The flex is huge: claims of 3x more oomph per pod, a superpod of 9,600 chips, two petabytes of shared memory, and liquid cooling that looks straight out of a sci‑fi sauna. Designed with DeepMind for the agent era (think bots that plan and act), this is Google betting big on speed and scale.

But the comments? Spicier than a server rack at full tilt. The awe squad is gawking at the “unbelievable density” and whispering that Google’s quiet comeback is real—“like a tide, just growing all around.” The winner-takes-all crowd says it’s basically Google’s game to lose—maybe Apple at the edge—but the throne feels reserved. Meanwhile, the rebels are unimpressed: why rent a mega-factory when you can tinker under the desk? One skeptic even shrugs, “Seems impressive… maybe it’s not,” capturing that classic internet side-eye.

Memes flew about hot chips and colder coolant, with jokes about server soup and under‑desk space heaters. Translation: massive hardware drop, massive feelings—cloud power vs. garage power, and everyone’s picking a side.

Key Points

  • Google introduced its eighth-generation TPUs with two specialized chips: TPU 8t for training and TPU 8i for inference.
  • TPU 8t delivers nearly 3x compute performance per pod over the previous generation and targets reducing frontier model development cycles from months to weeks.
  • A TPU 8t superpod scales to 9,600 chips with 2 PB of shared HBM, double interchip bandwidth, and 121 exaflops of compute.
  • Near-linear scaling up to a million chips is enabled by the Virgo Network, integrated with JAX and Pathways; TPUDirect offers 10x faster storage access, with a goodput target above 97%.
  • TPU 8i is optimized with higher memory bandwidth for latency-sensitive inference, while both chips can run various workloads but gain efficiency through specialization.
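For readers curious what "integrated with JAX" means in practice, here is a minimal, hypothetical sketch of the JAX programming model those scaling claims build on: you write ordinary array code, `jax.jit` hands it to the XLA compiler, and the same program runs unchanged on CPU or a TPU pod. Nothing below is a TPU 8t/8i API; the model, shapes, and names are illustrative only.

```python
import jax
import jax.numpy as jnp

@jax.jit
def predict(params, x):
    # Tiny two-layer MLP forward pass. XLA compiles this once; the
    # identical code targets CPU, GPU, or TPU backends without changes.
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
params = {
    "w1": jax.random.normal(k1, (8, 16)) * 0.1,
    "b1": jnp.zeros(16),
    "w2": jax.random.normal(k2, (16, 4)) * 0.1,
    "b2": jnp.zeros(4),
}
x = jnp.ones((32, 8))   # batch of 32 example inputs
out = predict(params, x)
print(out.shape)  # (32, 4)
```

Scaling that batch across thousands of chips is then a matter of sharding annotations rather than rewriting the model, which is the pitch behind the Pathways integration.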

Hottest takes

“struggle to see how it doesn’t end with Google winning” — amazingamazing
“they’re like a tide.. just growing all around” — Keyframe
“scientists… want to burn hardware under their desks” — aliljet
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.