Evolution: Training neural networks with genetic selection achieves 81% on MNIST

Survival-of-the-fittest AI scores 81% on handwriting; commenters riot

TLDR: A dev’s “evolve the AI” project scored 81% on a popular handwriting test while skipping traditional gradient-based training. Comments split between cheering the fresh approach and mocking the low score, while a fiery accusation of ChatGPT‑made code fueled calls for proper benchmarks and transparency.

An indie dev dropped GENREG, an “evolve-your-AI” experiment where the best models reproduce and the worst get cut, with no gradients or backpropagation in sight. It hit 81% on the classic MNIST handwriting test in about 40 minutes, and the crowd immediately split. Fans cheered the throwback-to-Darwin approach and praised that training uses a graphics card while inference runs on low-end CPUs. Skeptics rolled their eyes: modern methods blast past 99% on MNIST, so 81% feels… beginner tier.
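
For the curious, here is a minimal sketch of what “the best models reproduce and the worst get cut” can look like in code: a population of flat weight vectors for a small MLP, scored by batch accuracy, with the top performers kept and mutated copies filling the rest. Everything here (population size, mutation scale, the `init_genome`/`evolve` helpers) is an illustrative assumption, not GENREG's actual implementation or its trust-based selection.

```python
import numpy as np

rng = np.random.default_rng(0)

IN, HID, OUT = 784, 64, 10        # layer sizes reported in the post
POP, KEEP, SIGMA = 64, 16, 0.02   # illustrative, not GENREG's settings

def init_genome():
    # One flat parameter vector per individual: a 784 -> 64 -> 10 MLP.
    n = IN * HID + HID + HID * OUT + OUT   # 50,890 parameters
    return rng.normal(0.0, 0.1, size=n)

def forward(genome, x):
    # Unpack the flat genome into weights/biases and run the network.
    i = 0
    W1 = genome[i:i + IN * HID].reshape(IN, HID); i += IN * HID
    b1 = genome[i:i + HID]; i += HID
    W2 = genome[i:i + HID * OUT].reshape(HID, OUT); i += HID * OUT
    b2 = genome[i:i + OUT]
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return h @ W2 + b2                 # class logits

def fitness(genome, x, y):
    # Fitness = classification accuracy on a batch; no gradients anywhere.
    return (forward(genome, x).argmax(axis=1) == y).mean()

def evolve(x, y, generations=600):
    population = [init_genome() for _ in range(POP)]
    for _ in range(generations):
        # Score everyone, keep the fittest, cut the rest.
        scores = np.array([fitness(g, x, y) for g in population])
        elite = [population[i] for i in np.argsort(scores)[-KEEP:]]
        # Children are mutated copies of the surviving parents.
        parents = rng.integers(0, KEEP, size=POP - KEEP)
        children = [elite[j] + rng.normal(0.0, SIGMA, size=elite[j].shape)
                    for j in parents]
        population = elite + children
    return max(population, key=lambda g: fitness(g, x, y))
```

Feeding this MNIST batches (for example via scikit-learn's `fetch_openml('mnist_784')`) reproduces the general flavor of the experiment, though not the specific results, trust-based selection, or ~40-minute timing reported in the post.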

Then came the flamethrower: one commenter accused the dev of using ChatGPT to write the code and leaving a placeholder username in the repo, demanding proper explanations and citations. Others rushed in to defend the dev: “It’s open-source, let them iterate,” while the hardliners insisted, “Show comparisons against standard training and tougher, real-world data.” The dev’s notes—like “child mutation” being crucial and averaging more samples to stabilize results—sparked memes. Cue jokes: “Swipe right on high-trust genomes,” “No grads, just chads,” and “Darwin meets digits.” The 100% on the rendered alphabet? Dismissed as too easy.

Under the drama, curiosity survived: people are grabbing checkpoints, asking for head‑to‑head benchmarks, and pushing for 95% or bust. Evolution may be slow, but the comment section evolved into a full‑blown ecosystem.

Key Points

  • GENREG trains neural networks via evolutionary trust-based selection without gradients or backpropagation.
  • On MNIST, a 784→64→10 MLP (50,890 params) achieved 81.47% test accuracy after ~600 generations (~40 minutes).
  • Per-digit MNIST accuracy ranges from 70.9% (digit 5) to 94.5% (digit 1), with detailed results provided.
  • An alphabet task (10,000→128→26) reached 100% test accuracy in ~1,800 generations on rendered letters A–Z.
  • Key findings: stabilizing fitness signals via more samples, child mutation driving exploration, and capacity constraints enabling efficient solutions (a quick sanity check of the numbers follows this list).
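
Two of those bullets are easy to sanity-check. The parameter count of a 784→64→10 MLP works out to (784·64 + 64) + (64·10 + 10) = 50,890, matching the figure above, and “more samples stabilize the fitness signal” is just variance reduction on a noisy accuracy estimate. The snippet below is a hypothetical illustration of both (the `stable_fitness` name and `fitness_fn` argument are my own, not GENREG's API):

```python
import numpy as np

def stable_fitness(genome, batches, fitness_fn):
    # Each batch gives a noisy accuracy estimate; averaging over several
    # batches steadies the signal that selection acts on.
    return float(np.mean([fitness_fn(genome, x, y) for x, y in batches]))

# Parameter-count check for the 784 -> 64 -> 10 network:
# (784*64 + 64) + (64*10 + 10) = 50,176 + 64 + 640 + 10 = 50,890
assert (784 * 64 + 64) + (64 * 10 + 10) == 50_890
```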

Hottest takes

"Did you even come up with the idea yourself or just ask chatgpt…" — dfajgljsldkjag
"Training neural networks without gradients or backpropagation" — AsyncVibes