May 3, 2026

Bucket list? More like bucket mess

Buckets and objects are not enough

S3 turns 20 and the internet says your giant junk drawer bucket is the real problem

TLDR: The article argues that Amazon S3, despite being a beloved cloud storage tool, still lacks a simple way to treat related files as one clear group. Commenters split between saying that’s totally fine because S3 is meant to stay basic, and asking why teams created one giant mess of storage in the first place.

Amazon’s wildly popular online storage system, S3, just hit 20, and instead of a calm birthday toast, the community turned it into a full-on intervention. The article’s big complaint is simple: S3 is great at storing endless files, but terrible at understanding which files belong together as one meaningful group. In plain English, companies keep stuffing everything into giant storage bins, then act shocked when nobody knows what belongs to whom, what costs money, or what can be safely deleted. Classic tech hoarder behavior.

And the comments? Oh, they came in swinging. One camp basically yelled, “That’s not a bug, that’s the point!” with gberger reminding everyone that S3 literally stands for Simple Storage Service and was always meant to be a basic building block, not an all-knowing digital librarian. Another crowd pushed back on the article’s framing, with the commenter whose handle is "hilariously" pointing out that prefixes (the folder-like names people use to organize files) are not meaningless and even affect speed, which gives this debate a deliciously nerdy “read the docs!” energy.

Then came the cleanup crew. skybrian dropped the brutally practical question: why are teams dumping everything into one enormous bucket in the first place? Meanwhile dchess offered the cool-kid answer: other tools already solve this mess. The vibe is half serious architecture debate, half people staring into a cloud-storage closet and realizing it’s been chaos for years.

Key Points

  • The article says Amazon S3 is widely used as a general storage layer for many data types, including logs, ML datasets, media, backups, and tabular data.
  • It argues that large S3 buckets often contain multiple logical groups of related objects that function as datasets.
  • The article states that S3 has no first-class dataset abstraction, so teams rely on prefixes and naming conventions to group related objects.
  • It distinguishes between higher-level organizational prefixes, dataset-level prefixes, and lower-level implementation-detail prefixes such as partitions.
  • The article says that without dataset-aware storage semantics, tasks like listing datasets, tracking size and growth, and archiving or deleting them as a unit are harder than they should be.
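
To make the "prefixes as datasets" idea concrete, here is a minimal sketch of the do-it-yourself bookkeeping the key points describe: grouping object keys by prefix and totaling their size. It is not taken from the article. The function name `summarize_datasets`, the `depth` parameter, and the sample keys are all hypothetical, and it assumes you already have a listing of (key, size) pairs from somewhere, such as a bucket listing or an inventory report.

```python
from collections import defaultdict


def summarize_datasets(objects, depth=2):
    """Group (key, size_bytes) pairs by their first `depth` prefix segments.

    Assumption: the "dataset" boundary is always `depth` segments deep.
    Real buckets rarely follow one convention this neatly.
    """
    totals = defaultdict(lambda: {"objects": 0, "bytes": 0})
    for key, size in objects:
        parts = key.split("/")
        if len(parts) > depth:
            dataset = "/".join(parts[:depth]) + "/"
        else:
            # Keys too shallow to sit under a dataset prefix go to a catch-all.
            dataset = "(ungrouped)"
        totals[dataset]["objects"] += 1
        totals[dataset]["bytes"] += size
    return dict(totals)


# Hypothetical listing; keys and sizes are made up for illustration.
listing = [
    ("analytics/clickstream/dt=2026-05-01/part-0.parquet", 1200),
    ("analytics/clickstream/dt=2026-05-02/part-0.parquet", 1300),
    ("ml/training-images/cat/0001.jpg", 500),
    ("README.txt", 10),
]

summary = summarize_datasets(listing)
for dataset, stats in sorted(summary.items()):
    print(dataset, stats["objects"], "objects,", stats["bytes"], "bytes")
```

Notice that the grouping rule lives entirely in the caller's head (here, a fixed `depth`). That is the article's complaint in miniature: the storage layer sees only keys, so anything dataset-shaped has to be reconstructed from naming conventions.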

Hottest takes

"performance boundaries as well" — hilariously
"It’s a building block" — gberger
"Why do they put everything into one huge bucket?" — skybrian