March 31, 2026
13 params, infinite drama
TinyLoRA – Learning to Reason in 13 Parameters
TinyLoRA claims “reasoning” with 13 knobs — commenters say the elephant is already in there
TLDR: Researchers say they boosted a big AI’s math reasoning by tweaking just 13 parameters, sparking claims that “the smarts were inside all along.” The community is split between unlock-vs-train, RL-vs-finetune, and jokes about fitting elephants—while pragmatists tout small models plus great data as the real win.
The internet just spit out its coffee: a new paper, Learning to Reason in 13 Parameters, says a giant AI can hit high math scores by tweaking just 13 tiny settings — roughly 26 bytes. Cue chaos.

One camp is yelling, “Reasoning was inside the model all along!” If a teeny change boosts logic, they argue, maybe these models already had the brainpower; TinyLoRA just flips a hidden switch. Another camp fires back that it only works with reinforcement learning (teaching by trial-and-reward), not simple fine-tuning, so the skill still has to be earned, not unlocked. The nerd humor came fast: one commenter riffed, “with four parameters I can fit an elephant… with five I can make him wiggle his trunk,” turning the 13-parameter flex into a full-blown meme.

Meanwhile, the practical crowd is like, keep calm and curate datasets — claiming small models (3–7B) with good reasoning data are already scary good, name-dropping cartesien.io and Salesforce’s WebscaleRL. The spiciest debate? Whether “reasoning” is real or just clever pattern tweaks. Fans say these results prove efficiency is king; skeptics say it’s cosmetic — impressive scores, but the same old parlor tricks. Either way, 13 parameters just dragged the whole field into a 🔥 fight over how much intelligence is learned versus revealed.
Key Points
- TinyLoRA is proposed to scale low-rank adapters down to as few as one parameter, addressing limits of conventional LoRA.
- Using TinyLoRA with RL, Qwen2.5-8B reaches 91% accuracy on GSM8K with only 13 trained parameters in bf16 (26 bytes).
- Across harder benchmarks (AIME, AMC, MATH500), TinyLoRA recovers about 90% of performance gains while training 1000× fewer parameters.
- Strong results are achieved only with reinforcement learning; SFT needs 100–1000× larger updates to match performance.
- The study questions the necessity of even rank=1 LoRA for learning reasoning, introducing an alternative parameterization.
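To see why the paper needs an alternative parameterization at all, here is a minimal NumPy sketch of the standard LoRA update (W_eff = W + B·A) with illustrative, made-up layer dimensions — not the paper's actual setup. Even at rank 1, a single adapted layer trains far more than 13 parameters:

```python
import numpy as np

# Hypothetical layer width for illustration (not from the paper).
d_in, d_out, r = 1024, 1024, 1

W = np.zeros((d_out, d_in))  # frozen base weight (stand-in)

# Standard LoRA: only the low-rank factors A and B are trained.
A = np.random.randn(r, d_in) * 0.01
B = np.zeros((d_out, r))  # zero-init so W_eff starts equal to W
W_eff = W + B @ A

# Rank-1 LoRA on this one layer already trains r*(d_in + d_out) parameters.
lora_params = A.size + B.size
print(lora_params)  # 2048 — ~150x more than TinyLoRA's 13 total
```

So conventional LoRA bottoms out in the thousands of parameters per layer; getting to 13 total requires a different, much more tightly shared parameterization, which is the gap TinyLoRA claims to fill.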