Transformers Are Inherently Succinct

AI nerds are freaking out over a paper saying transformers cram in way more than rivals

TLDR: Researchers say the transformer design behind modern chatbots can describe some tasks much more compactly than older AI systems, and the paper was honored at a top conference. Commenters split between hype that this looks like a huge win and skepticism about what transformers still do badly — with one joking that maybe this explains ultra-terse chatbot replies.

A big new AI paper just walked into ICLR 2026 — one of the field’s biggest conferences — and basically said: transformers, the engine behind today’s chatbots, can pack the same behavior into far smaller systems than older approaches. Translation for normal humans: the architecture behind modern AI may be weirdly good at doing a lot with very little. That flex was strong enough that commenters immediately pointed out it wasn’t just accepted — it was picked as one of three outstanding papers, which in research-land is basically a red-carpet moment.

But the real fun started in the replies. One camp was openly dazzled: if transformers can be exponentially more compact than recurrent neural networks (older sequence-reading AIs), are we inching toward some kind of design jackpot? Another camp instantly went full detective mode: okay, but what about the reverse? If older systems can still express some things that transformers can only handle with huge bloat, then this isn’t a clean coronation; it’s a rivalry. That tiny question injected classic tech-forum tension into the thread: victory lap or “not so fast”?

And then came the comedy. One commenter confessed the paper was over their head and wondered if “succinctness” might explain why Claude has been writing in maddeningly compressed, commit-message goblin mode lately. It’s not what the paper means, but honestly? It was the line everyone could understand. The result: a deeply theoretical paper somehow turned into a debate about whether AI is becoming elegant… or just increasingly impossible to read.

Key Points

• The paper analyzes transformers using succinctness, measuring how compactly different formalisms can describe the same language.
• It claims fixed-precision transformers are exponentially more succinct than linear temporal logic and recurrent neural networks, and doubly exponentially more succinct than finite automata.
• The authors present language families with polynomial-size transformer representations whose equivalent LTL, RNN, or automaton representations require much larger size.
• The paper gives an upper bound showing any fixed-precision transformer can be translated to LTL with at most exponential blow-up, improving a previously known doubly exponential translation.
• It concludes that transformer verification problems such as emptiness and equivalence are EXPSPACE-complete.
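
For anyone wondering what a "succinctness gap" even looks like, here is a minimal sketch using a classic textbook example, not the paper's construction and with no transformers involved. The language "binary strings whose k-th symbol from the end is 1" has a nondeterministic automaton with k + 1 states, but converting it to a deterministic one via the subset construction reaches 2^k states (and it is a standard result that no smaller deterministic automaton exists, though the code below only counts states and does not prove that). The function names are made up for illustration.

```python
def nfa_kth_from_end(k):
    """Build an NFA (illustrative helper) for binary strings whose
    k-th symbol from the end is '1'. States are 0..k; state k accepts."""
    delta = {
        (0, "0"): {0},      # keep waiting
        (0, "1"): {0, 1},   # keep waiting, or guess this '1' is the k-th from the end
    }
    for q in range(1, k):
        # After the guess, count off the remaining k - 1 symbols.
        delta[(q, "0")] = {q + 1}
        delta[(q, "1")] = {q + 1}
    return delta, 0, k


def reachable_dfa_states(delta, start):
    """Subset construction: return every set of NFA states that the
    equivalent deterministic automaton can reach from the start."""
    start_set = frozenset({start})
    seen = {start_set}
    frontier = [start_set]
    while frontier:
        current = frontier.pop()
        for symbol in "01":
            nxt = frozenset(t for q in current for t in delta.get((q, symbol), ()))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


if __name__ == "__main__":
    for k in range(1, 9):
        delta, start, _accept = nfa_kth_from_end(k)
        dfa_size = len(reachable_dfa_states(delta, start))
        print(f"k={k}: NFA states={k + 1}, reachable DFA states={dfa_size}")
```

The same idea, scaled up, is what the paper's claims are about: one formalism describes a language in a small package while an equivalent description in another formalism has to be exponentially (or doubly exponentially) bigger.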

Hottest takes

"doesn’t that mean we’re approaching optimality?" — lkm0

"What about the other direction?" — measurablefunc

"using increasingly terse language with very short, overloaded words" — parasti

June 5, 2026

Big brain, tiny package
