December 19, 2025

GPT vs Chomsky: Fight Club, but nerdier

Where Is GPT in the Chomsky Hierarchy?

Nerds argue over whether GPT can be ‘infinite’ while commenters roast Chomsky and nitpick programming-language grammars

TLDR: The post argues GPTs always eventually stop, so they aren’t truly “infinite” computers. Comments explode: some say Chomsky’s framework is outdated, others agree the argument is fine but trivial, and the rest dive into nitpicks and nerdy acronyms — showing how divided AI watchers really are.

The internet is having a full-on philosophy brawl over a post from The Fiefdom of Files asking where GPT sits in the Chomsky hierarchy (a way to rank how complex language systems can be). The author’s spicy claim: GPTs can’t be truly “infinite” because transformers do a fixed amount of computation per token and always have a non-zero chance of emitting the “end” token, so they stop. Translation: cool poems, yes; endless computation, no. They even dunk on “thinking tokens” (those dramatic “But wait…” lines) as crutches.
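
That “non-zero chance of stopping” claim is just a geometric-distribution bound, and a minimal Python sketch makes it concrete. EPS below is a made-up floor, not a number from the post; the argument only needs some ε > 0, which a softmax over a finite vocabulary with bounded logits always provides.

```python
import random

# Hypothetical lower bound on the per-step EOS probability. The post's
# argument needs only SOME epsilon > 0: a softmax over a finite vocabulary
# with bounded logits can never assign EOS exactly zero probability.
EPS = 0.01

def run_length(rng: random.Random) -> int:
    """Simulate one generation run: each step emits EOS with probability >= EPS."""
    steps = 0
    while rng.random() >= EPS:  # survive this step with probability <= 1 - EPS
        steps += 1              # P(still running after n steps) <= (1 - EPS)**n
    return steps

rng = random.Random(0)
runs = [run_length(rng) for _ in range(100_000)]
print(max(runs))              # long, but always finite
print(sum(runs) / len(runs))  # about (1 - EPS) / EPS = 99 tokens on average
```

Since (1 - ε)^n goes to zero as n grows, every run ends with probability 1: that is the whole “they always terminate” argument in three lines of math.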

Cue comment chaos. One camp fires back that the whole Chomsky framing is outdated: PaulHoule goes nuclear, saying the paradigm “held back progress in language technology” for 70 years. Another camp shrugs: Maxatar says, yeah, they terminate, so not Turing-complete, but calls it “not too insightful.” The pedants arrive with precision strikes: “Most programming languages aren’t context-free,” nitpicks fly, and pohl labels GPT “regular… albeit astronomically large.” Meanwhile, the math crowd flexes acronyms like AC^0 and TC^0 (think: very shallow logic circuits) to argue over how much these models can actually compute.

For comic relief, someone drops a wild conspiracy-style one-liner about Chomsky, and the thread turns into a meme cage match: Team Infinite Brain vs Team Press Stop vs Team Alphabet Soup. It’s academia meets Reddit — and everybody brought receipts.

Key Points

  • The article evaluates GPTs’ position in the Chomsky hierarchy and whether they are Turing complete.
  • A prior result (Pérez, Barceló, Marinkovic 2021) shows transformers can be Turing complete if their computation is allowed to grow with the input, via arbitrary-precision arithmetic and an unbounded number of decoding steps.
  • The author argues standard GPTs cannot be Turing complete: finite vocabularies and bounded embeddings keep inputs and outputs in a compact range, so the EOS probability is bounded below by some ε > 0, forcing termination with probability 1.
  • Transformers perform a bounded number of continuous operations per generated token, which caps the computation they can do; RNNs can, in principle, run inference for unboundedly many steps.
  • Practical techniques like thinking tokens and chain-of-thought approximate algorithmic reasoning step by step, but results (Hahn 2020) show fixed-depth transformers fail at tasks like parity without chain-of-thought (see the sketch after this list).
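
For the Hahn (2020) point, here is the parity task itself, in a hedged Python sketch. It is trivial for a sequential scan, which is roughly what chain-of-thought lets a model imitate (one step per bit), but the cited result shows a fixed-depth transformer cannot do it robustly in a single forward pass.

```python
# PARITY, the task from Hahn (2020): does the input have an odd number of 1s?

def parity(bits: str) -> int:
    """Return 1 if the bitstring contains an odd number of '1's, else 0."""
    acc = 0
    for b in bits:
        acc ^= (b == "1")  # one cheap sequential update per symbol
    return acc

assert parity("1011") == 1  # three 1s: odd
assert parity("1001") == 0  # two 1s: even
```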

Hottest takes

"the Chomsky paradigm probably held back progress in language technology over the past 70 years" — PaulHoule
"GPTs inherently can not be Turing complete because they always terminate" — Maxatar
"Regular...albeit astronomically large (unless we're granting idealizations like infinite context, etc.)" — pohl
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.