March 13, 2026
When chatbots eat CPUs
Executing programs inside transformers with exponentially faster inference
AI swallows a computer? Commenters argue “genius” vs “just a new .exe”
TLDR: Researchers claim an AI can run code inside itself at high speed, skipping external tools. The crowd is split between awe at a “chatbot-as-computer” and doubts about comprehension, cost, and whether it’s just a flashy new way to package programs—yet everyone agrees it could change how AIs actually get things done.
An eye-popping claim just dropped: researchers say they made an AI run code inside itself, cranking through millions of steps and streaming results blazing fast—no outside tools. Think “chatbot becomes a computer.” The crowd went wild, but not quietly.
The hype camp is loud. One fan sighed, “Truly, attention is all you need,” treating the model’s built‑in execution like sci‑fi made real. Another was thrilled the system could basically “execute assembly code,” calling the memory tricks fascinating. People love that this could fix the classic AI fail—messing up simple math and puzzles—by letting the model actually do the work instead of emailing it to a calculator.
But the skeptics brought receipts. A thoughtful dissenter poked at the manifesto tone, quoting the paper's claim that "without execution the system has no comprehension" and not buying the absolutism. Practical folks fretted about cost and bloat: "a lot of tokens," i.e., a very wordy, potentially pricey way to think. And the class clown nailed the mood: is this wild new intelligence, or just a shiny new executable format stitched into a chatbot?
Meanwhile, dreamers want the sequel: mash this with reinforcement learning so the AI can imagine ideas and test them mid‑thought. If it works, your future laptop might talk and compute like a pro—no training wheels needed.
Key Points
- Authors convert arbitrary C code into tokens that a transformer executes internally, producing execution traces without external tools.
- A new execution-trace decoding path enables logarithmic-time attention lookups, supporting millions of steps in one run.
- Demonstration task: solving min-cost perfect matching using the Hungarian algorithm within the transformer.
- Reported throughput exceeds 30,000 tokens per second on a CPU while streaming results.
- The work positions internal execution as a solution to LLMs' difficulty with long, exact computations, contrasting with tool-use and agentic orchestration.
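For readers unfamiliar with the demo task: min-cost perfect matching pairs every row with a distinct column of a cost matrix so the total cost is minimal. The paper runs the Hungarian algorithm inside the transformer; the sketch below is just a brute-force reference (exhaustive search over permutations, not the Hungarian algorithm, and the function name is ours) to make the problem itself concrete.

```python
from itertools import permutations

def min_cost_perfect_matching(cost):
    """Brute-force min-cost perfect matching on a square cost matrix.

    Tries every assignment of rows to columns; fine for tiny inputs,
    whereas the Hungarian algorithm solves the same problem in O(n^3).
    """
    n = len(cost)
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
# Row 0 -> col 1, row 1 -> col 0, row 2 -> col 2, total cost 1 + 2 + 2 = 5.
print(min_cost_perfect_matching(cost))  # (5, (1, 0, 2))
```

The brute force grows factorially, which is exactly why an exact polynomial-time algorithm (and, per the paper, the ability to execute it step by step) matters.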