We Stopped Using the Mathematics That Works

Deep learning won big; math purists cry foul as “Bitter Lesson” brawl erupts

TLDR: An essay argues deep learning took over because it’s convenient, pushing careful math aside after a 2012 breakthrough. Commenters split: some praise the warning, others chant the “Bitter Lesson” that data and compute win, and a few dunk on the piece as bot bait. Underneath it all is an argument over how we build AI next and whom we should trust.

An op-ed claims we ditched “the math that works” for the convenience of deep learning—and the comments instantly turned into a street fight. Some readers hailed it as “a voice of reason in the maelstrom,” cheering the author for saying the quiet part out loud: after a 2012 image-recognition blowout, money and hype made neural networks king. Others fired back with the classic Bitter Lesson take: more data and compute beat hand-crafted theory, whether we like it or not.

In simple terms, the piece says: older tools that make you spell out exactly what you want (like decision theory and Bayesian stats) got sidelined because deep learning is easier—just feed it data and let it learn. One commenter called today’s AI the “age of alchemy,” predicting real “chemistry and physics” later. Another said the essay was confusing and self-contradictory: if the old methods “worked,” why did they lose so hard in that 2012 ImageNet contest? And then came the meme cannons: one zinger dubbed it an “LLM-garbage article, ironically,” roasting it as bot-written while arguing about bots. Translation: it’s a culture war—bring back the math vs trust the machines—with stakes that feel less like homework and more like the future of intelligence.
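
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the decision-theoretic route forces you to write down a prior, a likelihood, an action space, and a utility before you can act, while the “just feed it data” route is played here by a one-parameter least-squares fit standing in for a neural network.

```python
# Hypothetical toy problem, invented for illustration: decide whether to
# ship a feature that works with unknown probability theta.
from math import comb

# --- Decision-theoretic route: every ingredient is spelled out ---

# 1. Prior beliefs over theta (a coarse discrete prior).
prior = {0.2: 0.3, 0.5: 0.4, 0.8: 0.3}          # P(theta)

# 2. A binomial likelihood for the observed data: 7 successes in 10 trials.
def likelihood(theta, successes=7, trials=10):
    return comb(trials, successes) * theta**successes * (1 - theta)**(trials - successes)

# 3. Posterior via Bayes' rule: P(theta | data) is proportional to
#    P(data | theta) * P(theta).
unnorm = {t: p * likelihood(t) for t, p in prior.items()}
z = sum(unnorm.values())
posterior = {t: w / z for t, w in unnorm.items()}

# 4. Explicit action space and utility function (payoffs are made up).
actions = ["ship", "hold"]
def utility(action, theta):
    return 10 * theta - 4 if action == "ship" else 0.0

# 5. Pick the action with maximum posterior expected utility.
best = max(actions, key=lambda a: sum(p * utility(a, t) for t, p in posterior.items()))
print("decision-theoretic choice:", best)

# --- "Just learn it" route: no prior, no utility, only a loss and data ---
# Gradient descent on squared error for a one-parameter model y = w * x.
data = [(1.0, 0.7), (2.0, 1.4), (3.0, 2.0)]     # invented (x, y) pairs
w, lr = 0.0, 0.05
for _ in range(200):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad
print("fitted weight:", round(w, 3))
```

The asymmetry is the essay’s point: the first half needs five explicit modeling commitments, the second needs only a loss, a dataset, and an optimizer.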

Key Points

  • A 2012 deep convolutional neural network won the ImageNet challenge by 9.8 percentage points of top-5 error, catalyzing deep learning’s rapid dominance.
  • Major tech firms hired leading neural network researchers, drawing funding, talent, and publications into deep learning within five years.
  • Decision theory, Bayesian statistics, operations research, and reinforcement learning are dispersed across separate academic departments, which keeps them from cohering into a unified toolkit.
  • Decision-theoretic methods require explicit utilities, priors, and action spaces, while deep learning offers convenience via standard losses and large datasets.
  • Historical parallels show that convenience drives adoption: frequentist methods dominated because they were tractable, and MCMC enabled a Bayesian revival in the 1990s (see the sketch after this list).
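
On the MCMC point: a sampler like Metropolis-Hastings needs the posterior only up to a normalizing constant, which is what made previously intractable Bayesian models practical once compute caught up. A minimal sketch, with a target density invented purely for illustration:

```python
import math
import random

# Minimal Metropolis-Hastings sampler. The target is an unnormalized
# posterior (prior times likelihood) with no closed-form integral;
# its exact shape here is invented for illustration.
def unnorm_posterior(theta):
    return math.exp(-((theta - 1.0) ** 2) / 0.5) * (1.0 + 0.3 * math.sin(3.0 * theta))

def metropolis_hastings(n_samples, step=0.5, seed=0):
    rng = random.Random(seed)
    theta = 0.0                                   # arbitrary starting point
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)   # symmetric random-walk proposal
        ratio = unnorm_posterior(proposal) / unnorm_posterior(theta)
        if rng.random() < ratio:                  # accept with prob min(1, ratio)
            theta = proposal
        samples.append(theta)
    return samples

draws = metropolis_hastings(10_000)[2_000:]       # discard burn-in
print("posterior mean is approximately", sum(draws) / len(draws))
```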

Hottest takes

“Tldr: the author is annoyed at the Bitter Lesson” — jeffrallen
“We are at the age of alchemy… wait for chemistry and physics” — ontouchstart
“LLM-garbage article, ironically” — furyofantares