Reflections on AI at the End of 2025

From ‘parrots’ to partners: commenters roast doom and debate AI’s new coding powers

TLDR: An end-of-2025 take argues that AI moved from “parrot” to practical partner through step-by-step thinking and reward training. Commenters applauded the coding wins but slammed the “extinction” line as hype, warned of Goodhart trade-offs, and argued over new model ideas and how much autonomy coding agents should have.

2025’s AI confession: even the ‘stochastic parrot’ crowd has largely stopped calling large language models (LLMs) mindless mimics. The piece argues that step-by-step chain of thought (CoT) and reward-based training (think: points for good results) turned chatbots into stronger problem-solvers and code companions. The comments? Spicy. One camp cheered the coding gains but warned about Goodhart’s law: chase a single score and you get weird outcomes. As danielfalbo put it, optimize for speed and you might ship code that is faster but unreadable spaghetti. Another camp fixated on the mic-drop finale, “avoiding extinction,” and called it doom-bait. ur-whale asked what that even means; fleebee said it felt like the fearmongering Big Tech CEOs use to drive up AI stocks.

Others played hype police on whether CoT actually changed the nature of these models, while fans shot back: same architecture, smarter use. The ARC reasoning benchmarks, once the anti-LLM badge, now look more like an LLM victory lap, which drew both eye-rolling and applause. agumonkey dropped a tease about “Diffusion LLMs” ditching one-token-at-a-time generation, and alexgotoi went full meme with a Don’t Look Up callback. The vibe: coders are split between using AI as a helpful coworker and unleashing fully autonomous agents. The audience wants receipts, fewer apocalyptic endings, and practical guardrails: show the gains, show the trade-offs, skip the scare PSAs.

Key Points

  • The article states that by 2025, most researchers moved away from describing LLMs as “stochastic parrots.”
  • Chain-of-thought prompting is presented as fundamental, combining internal sampling and reinforcement learning to improve outputs.
  • The author claims reinforcement learning with verifiable rewards can extend progress beyond token-based scaling limits.
  • Programmers’ resistance to AI-assisted coding has decreased, with use split between collaborative chat interfaces and autonomous coding agents.
  • ARC benchmarks are described as shifting from anti-LLM tests to validating LLMs, with strong performance on ARC-AGI-1 and ARC-AGI-2 using CoT.
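
For readers wondering what a “verifiable reward” might look like in practice, here is a minimal Python sketch. It is an illustration only, not the article’s method or any lab’s training setup: the function name `verifiable_reward`, the toy sorting task, and the test cases are all assumptions made up for this example. The idea is simply that the score comes from an automatic check (here, passing tests) rather than from a human opinion.

```python
from typing import Callable, List, Tuple

# Hypothetical (input, expected output) pairs for a toy sorting task.
TestCase = Tuple[List[int], List[int]]


def verifiable_reward(
    candidate: Callable[[List[int]], List[int]],
    tests: List[TestCase],
) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    if not tests:
        return 0.0
    passed = 0
    for given, expected in tests:
        try:
            # Pass a copy so a candidate that mutates its input cannot
            # corrupt the test data.
            if candidate(list(given)) == expected:
                passed += 1
        except Exception:
            # A crashing candidate earns no credit for this case.
            pass
    return passed / len(tests)


# Illustrative test data (an assumption, not taken from the article).
TESTS: List[TestCase] = [
    ([3, 1, 2], [1, 2, 3]),
    ([], []),
    ([5, 5, 1], [1, 5, 5]),
]


def good_sort(xs: List[int]) -> List[int]:
    return sorted(xs)


def bad_sort(xs: List[int]) -> List[int]:
    return xs  # returns the input unchanged


if __name__ == "__main__":
    print(verifiable_reward(good_sort, TESTS))
    print(verifiable_reward(bad_sort, TESTS))
```

Note how this connects to the Goodhart worry raised in the comments: `bad_sort` still earns partial credit because the empty list happens to pass, so a score like this only measures what the tests actually check.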

Hottest takes

"optimizing for speed may produce code that is faster but harder to understand and extend" — danielfalbo
"it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks" — fleebee
"Diffusion LLMs too, apparently getting rid of the linear token generation" — agumonkey