March 17, 2026
Padded rooms and robot tantrums
Why AI systems don't learn – On autonomous learning from cognitive science
AI needs street smarts, not padded rooms, says crowd—skeptics cry “old news”
TLDR: Top researchers propose AI that learns by watching and doing, with a brain-like switch to choose between the two. Commenters love the escape from language-only training but debate whether it’s fresh thinking or recycled cybernetics/JEPA, while engineers worry about how to switch modes without collapsing into one.
A heavyweight trio—Emmanuel Dupoux, Yann LeCun, and Jitendra Malik—dropped a brain-inspired plan for how AI should actually learn: watch the world, poke the world, and switch between the two like a smart internal coach. In simple terms: System A watches, System B does, and System M decides when to swap gears. The community? Absolutely buzzing.
The loudest cheers come from folks sick of language-only training. One commenter roasted today’s models as a “padded room” setup, fed by human-curated data and shocked when reality moves. Another praised the paper’s takedown of the “data wall” and assembly-line training. Fans say this is the jailbreak AI needs to handle messy, changing environments like the real world.
But the eye-rolls were loud too. One user said this is basically LeCun’s old tune—JEPA (a prediction approach)—with fresh packaging, even linking receipts: ai.meta.com/blog/yann-lecun-ai-model-i-jepa. Another delivered the zinger of the thread: “We are rediscovering Cybernetics.” Translation: cool idea, but grandpa had this in the ’50s.
Meanwhile, the builders want receipts: how do you design the meta-switch so it knows when to observe and when to act without collapsing into one mode? The debate’s delicious: bold new brain vibes vs. déjà vu science fair, with a side of memes about padded rooms and training wheels coming off.
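For the builders asking what a meta-switch could even look like: the paper doesn’t ship an implementation, so here’s a purely illustrative toy sketch. It arbitrates between an observe mode (System A) and an act mode (System B) using an internal signal (a running estimate of prediction error) with two thresholds, so the controller has hysteresis and doesn’t collapse into one mode. All names and thresholds here are invented for illustration, not from the paper.

```python
class MetaController:
    """Toy sketch of a 'System M' arbiter (hypothetical; the paper gives
    no implementation). Routes between observation-driven learning
    (System A) and action-driven learning (System B) based on an
    internal signal: a smoothed prediction-error estimate, with
    hysteresis so the agent doesn't flap or get stuck in one mode."""

    def __init__(self, high=0.6, low=0.3, decay=0.9):
        self.high = high      # switch to "observe" when surprise rises above this
        self.low = low        # switch back to "act" when surprise falls below this
        self.decay = decay    # smoothing factor for the running surprise estimate
        self.surprise = 0.0   # exponential moving average of prediction error
        self.mode = "act"     # start by poking the world

    def update(self, prediction_error: float) -> str:
        # Internal signal: exponentially smoothed prediction error.
        self.surprise = self.decay * self.surprise + (1 - self.decay) * prediction_error
        # Hysteresis band: two thresholds prevent rapid mode-flapping,
        # which is one cheap answer to the "collapse into one mode" worry.
        if self.mode == "act" and self.surprise > self.high:
            self.mode = "observe"   # world got surprising: watch before acting
        elif self.mode == "observe" and self.surprise < self.low:
            self.mode = "act"       # predictions are good again: go poke things
        return self.mode


# Usage: sustained high error pushes the agent into observing;
# sustained low error lets it resume acting.
m = MetaController()
for _ in range(50):
    mode = m.update(1.0)   # big surprises
print(mode)                # "observe"
for _ in range(100):
    mode = m.update(0.0)   # world is predictable again
print(mode)                # "act"
```

The hysteresis gap between `high` and `low` is the design choice doing the work: a single threshold would let noise bounce the agent between modes every step.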
Key Points
- The article critiques current AI systems for falling short of autonomous learning capabilities.
- It proposes a cognition-inspired framework integrating observation-based (System A) and action-based (System B) learning.
- A meta-control component (System M) directs switching between learning modes via internal signals.
- The approach is informed by how organisms adapt across evolutionary and developmental timescales.
- The goal is improved adaptation to dynamic, real-world, non-stationary environments.