Do LLMs pass the mirror test?

AI looked in the mirror, and the comments instantly started a fight

TLDR: The post says chatbots should be tested by secretly changing their earlier replies and seeing whether they notice the mismatch. Commenters split fast: some found that spooky and interesting, while others said it overstates what these systems really understand.

A philosophy-heavy blog post asked a deceptively simple question: can a chatbot recognize when its own previous words have been secretly changed? The writer argues that this is a better version of the classic “mirror test,” which checks whether an animal can recognize itself. Instead of a red dot on a forehead, the chatbot gets its earlier answer quietly edited and then has to react. In other words: if its “own” text comes back wrong, does it notice the glitch? That idea fascinated readers — but, naturally, the comment section immediately turned into the real show.

One camp was intrigued by the weird implications. A commenter wondered if giving the system the power to edit its own chat history would make it start “fixing” reality like a digital neat freak. Another said their own AI helper already notices when files or code have been manually changed — and, hilariously, blames itself every time. That sparked the most human reaction of all: annoyance. Meanwhile, skeptics slammed the whole framing, arguing this kind of test risks making chatbots sound deeper or more self-aware than they really are. One critic pushed a totally different test: ask what kind of material the system learned from, not whether it can detect tampering.

And then there was the comic relief. One user ignored the philosophy entirely to roast the site design, saying it made their phone feel like “a cylinder.” In a thread about machine self-recognition, the most relatable moment may have been humans instantly deciding to argue about definitions, aesthetics, and whether the whole thing is overhyped.

Key Points

  • The article argues that most existing LLM mirror tests are text-based versions of a visual test and therefore measure the wrong thing.
  • It uses Alexandra Horowitz’s scent-based dog experiment as an analogy for modality-appropriate self-recognition testing.
  • The article reports that in Horowitz’s experiment, dogs investigated modified versions of their own scent more than unmodified samples.
  • The author interprets this as anomaly detection against an internal baseline rather than definitive proof of philosophical self-awareness.
  • For LLMs, the article proposes editing a model’s previous chat response in conversation history and observing whether the model notices the discrepancy, citing Google AI Studio and Gemma 4 31B as the setup example.

Hottest takes

"makes me feel like my phone is a cylinder" — FromTheFirstIn
"Would it try to fix the \"glitches\"?" — wcoenen
"it always assumes it must have made a mistake. It's sort of annoying actually" — impure
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.
Do LLMs pass the mirror test? - Weaving News | Weaving News