Don't trust large context windows

That giant memory promise may be fake, and commenters are absolutely spiraling

TLDR: The article says AI tools may look like they can remember everything, but often get fuzzy long before their advertised limit. Commenters split between people treating chats like disposable scratchpads and one confident rebel saying the problem is wildly overstated.

Tech people have found a new thing to be mad about: AI chatbots that brag they can remember huge amounts of text, then allegedly start acting clueless once the conversation gets too long. The article’s big claim is brutally simple: there’s a “smart zone” where the bot is focused, and a “dumb zone” where it starts forgetting important details. In plain English, the flashy memory number on the label may be more like a gym membership than actual fitness.

And the comments? Instant survival-guide mode. One camp basically said, “Yeah, obviously,” with users bragging that they clear chats constantly and treat every new task like a fresh start. One commenter said they’ve become the AI’s Product Manager, forcing it to write mini plans for every feature so it doesn’t wander off like a distracted intern. Another pitched a more hardcore fix: break work into many short sessions and pass along only the important notes, like a relay race for robots.

But not everyone bought the doom. One holdout came in hot saying this has not been their experience at all, claiming they push one chatbot to 500k, 800k, even near 900k tokens without the meltdown the author describes. So yes, the thread turned into the classic internet showdown: “the benchmark says it’s broken” versus “works on my machine.” The vibe was half lab report, half group therapy, with a side of “maybe the real skill is learning how to babysit the bot.”

Key Points

  • The article claims practical LLM performance declines well before the full advertised context window is reached, and puts the point where quality meaningfully drops at around 100,000 tokens.
  • It says coding agents can enter degraded-context conditions quickly because file reads, debugging, and test runs consume tokens rapidly.
  • The article cites RULER and Chroma's report on context rot as evidence that effective context is smaller than advertised and degrades gradually.
  • It describes automatic context compaction in tools like Claude Code as helpful but limited because it occurs after degradation begins.
  • The author recommends shorter sessions with manual written handoff artifacts such as specs, PRDs, plans, and sub-agent handoffs to preserve quality.
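
For anyone who wants the "short sessions plus written handoff" advice in concrete form, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Session` class, the 4-characters-per-token estimate, and the 80% warning threshold are hypothetical and do not come from the article or from any real tool. The only number borrowed from the article is the rough 100,000-token cutoff.

```python
from dataclasses import dataclass, field

# Rough cutoff cited by the article; treated here as a soft limit, not a measured constant.
SOFT_LIMIT_TOKENS = 100_000
# Crude assumption for illustration; real tokenizers vary by model and content.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


@dataclass
class Session:
    """Hypothetical tracker for one chat or agent session."""
    goal: str
    used_tokens: int = 0
    decisions: list[str] = field(default_factory=list)
    open_items: list[str] = field(default_factory=list)

    def record(self, text: str) -> None:
        """Count text that entered the context (file reads, test output, replies)."""
        self.used_tokens += estimate_tokens(text)

    def should_hand_off(self, threshold: float = 0.8) -> bool:
        """True once usage reaches the given fraction of the soft limit."""
        return self.used_tokens >= SOFT_LIMIT_TOKENS * threshold

    def handoff_note(self) -> str:
        """Build the short written artifact to paste into a fresh session."""
        lines = [f"Goal: {self.goal}", "Decisions:"]
        lines += [f"- {d}" for d in self.decisions]
        lines.append("Open items:")
        lines += [f"- {o}" for o in self.open_items]
        return "\n".join(lines)
```

The idea is to call `record()` on whatever lands in the conversation, check `should_hand_off()` before starting the next task, and start a new session seeded only with `handoff_note()` instead of the full history. Swapping the character heuristic for a real tokenizer would make the count more accurate.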

Hottest takes

"I /clear all the time out of habit" — mcapodici
"acting like the AI's Product Manager" — kristianc
"This has not been my experience... I routinely push past 500k tokens" — kelnos