April 22, 2026
Silicon soap opera begins
The eighth-generation TPU: An architecture deep dive
Google’s “two-chip” move has Reddit yelling: It’s not about speed, it’s about memory
TLDR: Google unveiled two eighth‑gen AI chips—one for training and one for serving—designed to move data faster and waste less power. Commenters say the big story isn’t speed but memory and energy, joking about giant RAM hauls and complaining about missing low‑level docs, while debating who wins the efficiency race.
Google just split its new AI hardware into two lanes—TPU 8t for training and TPU 8i for serving—and the internet is treating it like a plot twist. The company says newer AI needs different muscle: 8t packs tricks like a “SparseCore” for messy lookup tasks, native FP4 (a tiny 4‑bit number format), and a beefed-up “Virgo” network promising up to 4x more data movement, plus Arm-based Axion CPUs to keep the chips fed. Translation: less waiting, more doing.
But the crowd has thoughts. One top take declares this split an admission that the real bottleneck isn’t raw math (FLOPs), it’s memory bandwidth and latency—how fast chips can fetch stuff. Another camp says the only metric that matters now is power: “no energy, no AI,” and Google’s efficiency might be its secret weapon. Then the meme machine kicked in: a viral quip joked that TPU 8i hoards “2.764 petabytes of RAM,” poking fun at memory arms-race vibes. Meanwhile, the “dupe police” rolled up with a link, and one frustrated reader begged for an instruction manual instead of glossy diagrams.
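The "real bottleneck is memory, not FLOPs" take is the classic roofline argument: a kernel that does too few floating-point operations per byte fetched stalls on memory no matter how many matrix units the chip has. A minimal sketch of that reasoning, using illustrative placeholder numbers rather than any published TPU specs:

```python
# Roofline sketch: is a workload compute-bound or memory-bound?
# peak_flops and peak_bw below are hypothetical placeholders, not TPU specs.

def roofline_bound(flops, bytes_moved, peak_flops, peak_bw):
    """Return which ceiling limits the kernel and its attainable FLOP/s."""
    intensity = flops / bytes_moved              # FLOPs per byte fetched
    attainable = min(peak_flops, intensity * peak_bw)
    bound = "compute" if intensity * peak_bw >= peak_flops else "memory"
    return bound, attainable

peak_flops = 900e12   # 900 TFLOP/s, hypothetical accelerator peak
peak_bw = 3e12        # 3 TB/s memory bandwidth, hypothetical

# Dense matmul: heavy data reuse -> high intensity -> compute-bound.
print(roofline_bound(flops=2e12, bytes_moved=4e9,
                     peak_flops=peak_flops, peak_bw=peak_bw))
# Embedding lookup: almost no reuse -> low intensity -> memory-bound.
print(roofline_bound(flops=1e6, bytes_moved=1e9,
                     peak_flops=peak_flops, peak_bw=peak_bw))
```

This is also why embedding-heavy workloads get their own hardware (SparseCore): no amount of extra math throughput helps a kernel that spends its time waiting on scattered memory fetches.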
In simple terms: Google built specialist chips for different AI jobs. The community is split between “memory is king,” “energy is destiny,” and “show me the docs” — with a side of dupe sirens and RAM jokes.
Key Points
- Google unveiled eighth-generation TPUs with two systems: TPU 8t for pre-training and TPU 8i for serving, both part of Google Cloud’s AI Hypercomputer.
- TPU 8t scales a 3D torus topology to 9,600 chips per superpod and targets embedding-heavy, massive pre-training workloads.
- SparseCore in TPU 8t accelerates embedding lookups and data-dependent collectives to avoid zero-op bottlenecks.
- TPU 8t introduces native FP4, doubling MXU throughput, reducing data movement, and enabling larger layers to fit in local buffers.
- A new Virgo Network topology provides up to 4x higher data center network bandwidth to support TPU 8t’s data demands.
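To see why a native 4-bit format doubles throughput and shrinks data movement, here is a hedged sketch of FP4 quantization and packing. It assumes the E2M1 layout from the OCP Microscaling (MX) spec (1 sign, 2 exponent, 1 mantissa bit); the article does not say which FP4 variant TPU 8t uses, and the rounding here is nearest-value, not the hardware's actual rounding mode:

```python
# FP4 (E2M1) sketch: 16 codes, 8 magnitudes, two values per byte.
# Assumed OCP MX-style E2M1 layout; not a confirmed TPU encoding.

FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def fp4_decode(code):
    """Decode a 4-bit E2M1 code (0..15) to its float value."""
    sign = -1.0 if code & 0b1000 else 1.0
    return sign * FP4_MAGNITUDES[code & 0b0111]

def fp4_quantize(x):
    """Round x to the nearest representable FP4 code (ties not modeled)."""
    return min(range(16), key=lambda c: abs(fp4_decode(c) - x))

def pack_pair(lo, hi):
    """Two FP4 codes per byte: half the bytes of FP8, a quarter of BF16."""
    return (hi << 4) | lo

a, b = fp4_quantize(1.3), fp4_quantize(-2.6)
byte = pack_pair(a, b)
print(fp4_decode(byte & 0xF), fp4_decode(byte >> 4))  # -> 1.5 -3.0
```

Halving the bytes per value is the whole point: for the same memory bandwidth, twice as many operands reach the MXUs per cycle, and larger layers fit in on-chip buffers.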