Flat Datacenter Networks at Scale at Amazon

Amazon says the best way to build giant server networks might be to just get a little chaotic

TLDR: Amazon says a giant server network can run better when connections are arranged more randomly instead of in neat layers. Commenters loved the twist, with some calling it obvious because the Internet already works that way, while others treated the paper like must-read geek drama.

Amazon just dropped a surprisingly spicy idea: after years of trying tidy, layered designs for the massive networks inside its data centers, the team found that more randomness might actually work better. The backstory is almost sitcom material. Researchers first flirted with a beautiful, artsy geometry idea, hit a wall, and then, by their own inside joke, landed on: “just be random!” That’s the kind of plot twist commenters live for.

And the comment section absolutely leaned in. One fan basically treated a James Hamilton post like a season premiere — “Oh man, James Hamilton blog posts, I love these things!” — then immediately arrived with receipts, dropping the paper and extra reading like the class overachiever everyone secretly appreciates. Another commenter shrugged and delivered the hottest low-key take in the thread: this isn’t even that weird, because the Internet itself is already kind of a mess held together by semi-random paths. That instantly reframed Amazon’s “bold new idea” as either genius or a very expensive version of “nature was already doing that.”

The vibe wasn’t outrage so much as nerdy delight. People compared the approach to their own systems, shared explainers, and posted obscure research links like proud music snobs recommending deep cuts. The only real drama was deliciously subtle: is Amazon unveiling a breakthrough, or just finally admitting that chaos, carefully managed, beats obsessively neat planning? Either way, the crowd seemed entertained by the same conclusion: sometimes the smartest plan is to stop forcing one.

Key Points

  • The article traces the theoretical basis for flat, high-connectivity networks from 1970s expander-graph research to later results showing random graphs can approach optimal expansion.
  • Networking practice instead largely adopted hierarchical fat-tree architectures based on Clos interconnects, with VL2 becoming a notable 2009 milestone in scalable data center networking.
  • The 2012 Jellyfish proposal showed how random graphs could be applied to data center networks, but routing, cabling, and operations remained unresolved for large-scale deployment.
  • In 2023 and 2024, Amazon researchers Giacomo Bernardi and Ratul Mahajan explored flat network designs, first through Penrose tiling and then through random graphs, after simulations showed the random designs performing better.
  • The article states that Amazon’s team brought in Seshadhri Comandur and developed Spraypoint to address routing in random graph-based flat networks.
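
To make the random-graph idea in the points above concrete, here is a minimal Python sketch. It is not Amazon's design, not Spraypoint, and not the exact Jellyfish procedure. It assumes a toy network of 64 switches with 4 inter-switch ports each, wires them at random with the textbook pairing model, and compares the average hop count against a tidy ring lattice of the same degree. The function names (random_regular_graph, ring_lattice, mean_hops) and the sizes are hypothetical, chosen only for illustration.

```python
import random
from collections import deque


def random_regular_graph(n, d, seed=0, max_tries=10000):
    """Pairing-model sketch: shuffle d port stubs per switch, pair them up,
    and retry whenever a pairing yields a self-loop or a duplicate link."""
    if (n * d) % 2 or d >= n:
        raise ValueError("need n*d even and d < n")
    rng = random.Random(seed)
    for _ in range(max_tries):
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        adj = {v: set() for v in range(n)}
        ok = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            if a == b or b in adj[a]:
                ok = False  # not a simple graph; start over
                break
            adj[a].add(b)
            adj[b].add(a)
        if ok:
            return adj
    raise RuntimeError("no simple graph found; raise max_tries")


def ring_lattice(n, k):
    """Orderly baseline: each switch links to its k nearest neighbours on each side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            adj[v].add((v + step) % n)
            adj[(v + step) % n].add(v)
    return adj


def mean_hops(adj):
    """Average shortest-path length over reachable ordered pairs, via BFS from every node."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    n, d = 64, 4  # assumed toy sizes, not taken from the article
    print("random graph mean hops:", mean_hops(random_regular_graph(n, d)))
    print("ring lattice mean hops:", mean_hops(ring_lattice(n, d // 2)))
```

Under these assumptions the random wiring should show a noticeably lower average hop count than the ring, which is the intuition behind the expander-graph results mentioned above. The sketch says nothing about routing, cabling, or operations, the parts the article describes as the hard problems.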

Hottest Takes

"Oh man, James Hamilton blog posts, I love these things!" — epistasis
"It’s not that dissimilar to how the Internet works" — kev009
"the best way I could find" — socketcluster