December 19, 2025
GPT vs Chomsky: Fight Club, but nerdier
Where Is GPT in the Chomsky Hierarchy?
Nerds argue over whether GPT can be ‘infinite’ while commenters roast Chomsky and nitpick code grammar
TLDR: The post argues GPTs always eventually stop, so they aren’t truly “infinite” computers. Comments explode: some say Chomsky’s framework is outdated, others agree the argument is fine but trivial, and the rest dive into nitpicks and nerdy acronyms — showing how divided AI watchers really are.
The internet is having a full-on philosophy brawl over a post from The Fiefdom of Files asking where GPT sits in the Chomsky hierarchy (a ranking of formal languages by how powerful a machine you need to recognize them). The author’s spicy claim: GPTs can’t be truly “infinite” because transformers do a fixed amount of computation per token and always have a non-zero chance of emitting the “end” token, so sooner or later they stop. Translation: cool poems, yes; endless computation, no. They even dunk on “thinking tokens” (those dramatic “But wait…” lines) as crutches.
Cue comment chaos. One camp fires back that the whole Chomsky framing is outdated: PaulHoule goes nuclear, saying the paradigm “held back progress for 70 years.” Another camp shrugs: Maxatar agrees they terminate and so aren’t Turing-complete, but calls the observation “not too insightful.” The pedants arrive with precision strikes: “Most programming languages aren’t context-free,” nitpicks fly, and pohl labels GPT “regular… albeit astronomically large.” Meanwhile, the math crowd flexes acronyms like AC^0 and TC^0 (think: very shallow logic circuits) to argue over what transformers can and can’t compute.
For comic relief, someone drops a wild conspiracy-style one-liner about Chomsky, and the thread turns into a meme cage match: Team Infinite Brain vs Team Press Stop vs Team Alphabet Soup. It’s academia meets Reddit — and everybody brought receipts.
Key Points
- The article evaluates GPTs’ position in the Chomsky hierarchy and whether they are Turing complete.
- A prior result (Pérez, Barceló, Marinkovic 2021) shows transformers can be Turing complete if allowed to add layers with input length.
- The author argues standard GPTs cannot be Turing complete: finite vocabularies and bounded embeddings yield compact inputs, bounded outputs, and a nonzero EOS probability, implying certain termination.
- Transformers perform a bounded number of continuous operations, limiting unbounded computation; RNNs can, in principle, perform unbounded inference.
- Practical techniques like thinking tokens and chain-of-thought approximate algorithmic reasoning, but results (Hahn 2020) show limitations for tasks like parity without chain-of-thought.
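The termination argument above boils down to a geometric distribution: if every sampling step carries some nonzero probability p of emitting the end-of-sequence (EOS) token, generation halts with probability 1, and the expected length is (1 − p) / p. A toy simulation sketches this; the fixed `eos_prob` is an illustrative assumption, not a real GPT’s actual softmax output:

```python
import random

def generate(eos_prob=0.01):
    """Count tokens emitted before EOS, assuming a fixed nonzero
    per-step EOS probability (a toy stand-in for a GPT's softmax)."""
    steps = 0
    while random.random() >= eos_prob:
        steps += 1
    return steps

random.seed(0)
lengths = [generate(0.01) for _ in range(10_000)]
avg = sum(lengths) / len(lengths)
# Geometric distribution: every run halts, and the mean length
# is (1 - p) / p = 99 for p = 0.01.
print(f"average length: {avg:.1f}")
```

Any per-step EOS floor, however tiny, forces the halting probability to 1; only a hard zero would allow a run to continue forever, which is exactly the “not truly infinite” point.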