Towards Autonomous Mathematics Research

AI writes math papers; commenters say the last 4% is chaos

TLDR: Aletheia says AI can help do real math research, even writing papers and solving four open problems. Commenters split: some cheer collaboration, others argue success is rare and the missing 4% hides the hard parts, demanding proofs that humans can verify and trust.

Aletheia, a new math research agent, claims it can go beyond contest puzzles and help write real proofs. The team says it powered an AI-written paper on “eigenweights,” a human–AI collaboration on particle bounds, and even solved four open questions from Bloom’s Erdős Conjectures database, with transparency tools and shared prompts. Ambitious? Absolutely. The comments: volcanic.

Skeptics mock the victory lap, pointing to benchmarks. One voice groans that hitting 96% still leaves “the last 4%” where all the real pain lives. Another quotes the paper’s own caution that “success cases are rare,” spiking the hype balloon. The spiciest thread torches talk of “proof space,” sneering that some outputs are “grammatically coherent gibberish.” Meanwhile, defenders note that human peer review misses mistakes too, so perfect rigor isn’t exactly a human-only club.

The vibe is half awe, half side‑eye. Memes fly: “96% genius, 4% chaos,” “LLM = Looks Like Math,” and “AI wrote a paper—can it pass office hours?” A link to the arXiv keeps receipts handy, while optimists cheer the four open problems and call this a dawn of human‑AI collaboration. The rest? They want fewer demos, more bulletproof proofs, and a lot less swagger. For now, the math crowd watches.

Key Points

  • Aletheia is introduced as an AI math research agent performing iterative generation, verification, and revision of solutions in natural language.
  • The system leverages an advanced version of Gemini Deep Think, a novel inference-time scaling law, and intensive tool use to tackle complex research tasks.
  • An AI-generated paper (Feng26) calculates eigenweights in arithmetic geometry without human intervention.
  • A human-AI collaborative paper (LeeSeo26) establishes bounds related to independent sets in interacting particle systems.
  • A semi-autonomous evaluation across 700 open problems in Bloom’s Erdős Conjectures database produced four autonomous solutions, and the authors propose transparency tools such as human-AI interaction cards.
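
The generate, verify, revise cycle from the first key point can be pictured with a toy loop. This is only a minimal sketch, not Aletheia's actual implementation: the function name `generate_verify_revise`, the three callables, and the `max_rounds` budget are all hypothetical, and the stand-in "problem" is trivial arithmetic rather than research mathematics.

```python
from typing import Callable, Optional


def generate_verify_revise(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Return a candidate that passed verification, or None if the budget runs out.

    All names here are illustrative assumptions, not the paper's API.
    """
    candidate = generate(problem)
    for _ in range(max_rounds):
        critique = verify(problem, candidate)  # None means no flaw was found
        if critique is None:
            return candidate
        candidate = revise(problem, candidate, critique)
    # The last revision was never verified, so it is not returned.
    return None


# Toy stand-ins: "solve" x * x == 49 by guessing and incrementing.
def toy_generate(problem: str) -> str:
    return "5"


def toy_verify(problem: str, candidate: str) -> Optional[str]:
    x = int(candidate)
    return None if x * x == 49 else "x * x does not equal 49"


def toy_revise(problem: str, candidate: str, critique: str) -> str:
    return str(int(candidate) + 1)


result = generate_verify_revise(
    "find x with x * x == 49", toy_generate, toy_verify, toy_revise
)
too_short = generate_verify_revise(
    "find x with x * x == 49", toy_generate, toy_verify, toy_revise, max_rounds=2
)
```

The loop only returns a candidate that the verifier accepted. When the round budget is too small it returns nothing rather than an unchecked answer, which mirrors the commenters' demand for proofs that can actually be verified.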

Hottest takes

"that last 4% is somehow still out of reach" — measurablefunc
"success cases are rare" — u1hcw9nx
"grammatically coherent gibberish" — measurablefunc