December 19, 2025
GPT vs Chomsky: Fight Club, but nerdier
Where Is GPT in the Chomsky Hierarchy?
Nerds argue over whether GPT can be ‘infinite’ while commenters roast Chomsky and nitpick code grammar
TLDR: The post argues GPTs always eventually stop, so they aren’t truly “infinite” computers. Comments explode: some say Chomsky’s framework is outdated, others agree the argument is fine but trivial, and the rest dive into nitpicks and nerdy acronyms — showing how divided AI watchers really are.
The internet is having a full-on philosophy brawl over a post from The Fiefdom of Files asking where GPT sits in the Chomsky hierarchy (a ranking of formal languages by how powerful a machine you need to recognize them). The author’s spicy claim: GPTs can’t be truly “infinite” because transformers do a fixed amount of computation per token and always have a non-zero chance of emitting the “end” token, so sooner or later they stop. Translation: cool poems, yes; endless computation, no. They even dunk on “thinking tokens” (those dramatic “But wait…” lines) as crutches.
Cue comment chaos. One camp fires back that the whole Chomsky framing is outdated: PaulHoule goes nuclear, saying the paradigm “held back progress for 70 years.” Another camp shrugs: Maxatar agrees they terminate and so aren’t Turing-complete, but calls the observation “not too insightful.” The pedants arrive with precision strikes: “Most programming languages aren’t context-free,” nitpicks fly, and pohl labels GPT “regular… albeit astronomically large.” Meanwhile, the math crowd flexes acronyms like AC^0 and TC^0 (think: very shallow logic circuits) to argue over what transformers can and can’t compute.
For comic relief, someone drops a wild conspiracy-style one-liner about Chomsky, and the thread turns into a meme cage match: Team Infinite Brain vs Team Press Stop vs Team Alphabet Soup. It’s academia meets Reddit — and everybody brought receipts.
Key Points
- The article evaluates GPTs’ position in the Chomsky hierarchy and whether they are Turing complete.
- A prior result (Pérez, Barceló, Marinkovic 2021) shows transformers can be Turing complete if allowed to add layers with input length.
- The author argues standard GPTs cannot be Turing complete: finite vocabularies and bounded embeddings yield compact inputs, bounded outputs, and a nonzero EOS probability, implying certain termination.
- Transformers perform a bounded number of continuous operations, limiting unbounded computation; RNNs can, in principle, perform unbounded inference.
- Practical techniques like thinking tokens and chain-of-thought approximate algorithmic reasoning, but results (Hahn 2020) show limitations for tasks like parity without chain-of-thought.
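The termination argument above boils down to a geometric distribution: if every sampling step carries some nonzero probability p of emitting the end-of-sequence (EOS) token, generation halts with probability 1, and the expected length is (1 − p) / p. A toy simulation sketches this; the fixed `eos_prob` is an illustrative assumption, not a real GPT’s actual softmax output:

```python
import random

def generate(eos_prob=0.01):
    """Count tokens emitted before EOS, assuming a fixed nonzero
    per-step EOS probability (a toy stand-in for a GPT's softmax)."""
    steps = 0
    while random.random() >= eos_prob:
        steps += 1
    return steps

random.seed(0)
lengths = [generate(0.01) for _ in range(10_000)]
avg = sum(lengths) / len(lengths)
# Geometric distribution: every run halts, and the mean length
# is (1 - p) / p = 99 for p = 0.01.
print(f"average length: {avg:.1f}")
```

Any per-step EOS floor, however tiny, forces the halting probability to 1; only a hard zero would allow a run to continue forever, which is exactly the “not truly infinite” point.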