February 17, 2026
Checkmates and hot takes collide
Chess engines do weird stuff
AI chess learns by peeking ahead — and the comments are screaming about NSFW links
TLDR: Chess engines are ditching endless self-play to copy what their own search finds, even adjusting on the fly and using a wild random-tweak method (SPSA) to win more. Comments erupted over a "chess is solved" claim, over why SPSA beats fancier-sounding methods, and over surprise NSFW warnings, proving the drama is half the fun.
The nerds say chess engines just got weirder, and the crowd is loving the chaos. Instead of grinding out millions of self-play games (reinforcement learning), devs are copying what their own search finds: the engine looks ahead, sees better moves, and the model learns from those. One commenter dropped a response from the Viridithas author, and suddenly the thread turned into a watch party. Bonus twist: some engines now adjust mid-game, correcting their own evaluation bias on the fly. And there's a wild "shake the weights at random and keep what wins" method (SPSA) that can add about +50 Elo, basically a free upgrade.
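To make "learning by peeking" concrete, here is a minimal sketch in Python. Everything in it is a stand-in assumed for illustration: positions are random feature vectors, the "model" is a linear eval, and `search_value` is a toy oracle playing the role of look-ahead search. The point is the training signal: instead of RL self-play, the model simply regresses toward whatever the search reports.

```python
# Minimal distillation-from-search sketch (illustrative, not lc0 or
# Stockfish code): the "teacher" is the engine's own search, which
# sees further than the raw eval; the "student" is the cheap eval.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8
TRUE_W = rng.normal(size=N_FEATURES)   # value function the search approximates
w = np.zeros(N_FEATURES)               # weak starting model

def shallow_eval(pos, w):
    """Toy static evaluation: a linear model over position features."""
    return float(w @ pos)

def search_value(pos):
    """Stand-in for look-ahead search: it returns a noisy-but-close
    estimate of the true value, better than the raw eval can manage."""
    return float(TRUE_W @ pos) + rng.normal(0.0, 0.05)

lr = 0.05
for step in range(2000):
    pos = rng.normal(size=N_FEATURES)  # a random "position"
    target = search_value(pos)         # teacher signal comes from search
    pred = shallow_eval(pos, w)
    w += lr * (target - pred) * pos    # least-squares step toward the target

print("gap to the search's implicit eval:", float(np.linalg.norm(w - TRUE_W)))
```

In a real engine the teacher would be the engine's own minimax or MCTS value per position and the student a full evaluation network, but the loop has the same shape: search labels positions, supervised learning absorbs the labels.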
But the real show? Drama in the comments. One user declared that "chess is solved," claiming modern Stockfish is unbeatable, which sparked instant pushback and a thousand eye-rolls. Another wondered why top engines still use SPSA instead of fancier-sounding tools like Bayesian optimization or evolutionary algorithms; cue old-school devs replying: if it wins, it stays. Meanwhile, multiple users shouted "NSFW alert" about the linked homepage, turning a chess thread into a workplace hazard zone.
The vibe: amazed that "learning by peeking" beats expensive training, curious about "live-learning" mid-match, and giggling at the "defiantly NSFW" typo while debating whether chess is actually solved (spoiler: it isn't). It's engine wizardry meets internet circus: check and drama.
Key Points
- Search contributes far more Elo (~1200) than differences between model qualities (~200), enabling effective distillation from weak-model-plus-search into a strong model without extensive RL self-play (see the sketch above).
- lc0's BT4 was trained via distillation and reportedly performed worse when reintroduced into an RL loop, suggesting RL may be unnecessary once an initial strong model exists.
- Stockfish implements a runtime calibration technique (PR #4950) that adjusts neural evaluations based on discrepancies with search, adapting outputs to the current position (a hedged sketch follows this list).
- To directly optimize for winning, lc0 employs SPSA, perturbing weights and selecting the better-performing direction, achieving about +50 Elo on small models at significant computational cost (see the SPSA sketch after this list).
- SPSA also tunes engine heuristics in C++, such as setting a checkmate-detection depth backoff to ~1.09, yielding ~5 Elo gains by optimizing numeric constants.
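The runtime-calibration bullet is easiest to picture in code. What follows is a guess at the flavor of the technique, not the actual logic of Stockfish's PR #4950: keep a running estimate of how the raw network eval disagrees with what search eventually finds, and nudge future evals by that bias.

```python
# Hedged sketch of runtime eval calibration (NOT Stockfish's actual
# PR #4950 implementation): track the running gap between the raw
# network eval and the deeper search value, and correct future evals.
class CalibratedEval:
    def __init__(self, decay: float = 0.99):
        self.bias = 0.0      # running estimate of (search - eval) error
        self.decay = decay   # exponential forgetting of stale positions

    def observe(self, raw_eval: float, search_value: float) -> None:
        """Called after a search finishes on some position."""
        err = search_value - raw_eval
        self.bias = self.decay * self.bias + (1.0 - self.decay) * err

    def evaluate(self, raw_eval: float) -> float:
        """Raw network output, corrected by the in-game bias estimate."""
        return raw_eval + self.bias

# Toy usage: a network that systematically underrates positions by 30
# centipawns; the calibrator learns that offset as the game proceeds.
cal = CalibratedEval()
for _ in range(500):
    cal.observe(raw_eval=90.0, search_value=120.0)  # search reveals the gap
print(round(cal.evaluate(90.0), 1))                 # ~120 once bias is learned
```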
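And here is the SPSA trick from the last two bullets as a self-contained toy. `play_match` is a hypothetical stand-in (a smooth score with a known optimum) for the real, noisy objective of playing thousands of engine games; the parameter names and constants are illustrative, not lc0's.

```python
# Minimal SPSA sketch: perturb ALL parameters at once with random
# +/-delta signs, see which direction scores better, and step that way.
import numpy as np

rng = np.random.default_rng(1)

def play_match(params):
    """Hypothetical stand-in for measuring Elo: higher is better.
    In a real tune this would be a win rate from actual games, so it
    would be noisy; a smooth peak keeps the demo self-contained."""
    target = np.array([1.09, 50.0, -3.0])   # pretend-optimal constants
    return -np.sum((params - target) ** 2)

params = np.array([1.0, 40.0, 0.0])         # e.g. backoff, margins, bonuses
step, delta = 0.02, 0.1

for it in range(500):
    direction = rng.choice([-1.0, 1.0], size=params.shape)  # random signs
    plus = play_match(params + delta * direction)
    minus = play_match(params - delta * direction)
    # Two evaluations estimate the gradient along `direction`;
    # move toward whichever perturbation scored better.
    params += step * (plus - minus) / (2.0 * delta) * direction

print("tuned params:", np.round(params, 3))  # converges near [1.09, 50, -3]
```

That two-evaluations-per-step economy is a big part of why engine devs keep choosing SPSA over Bayesian optimization or evolutionary methods: it scales to many parameters at once, tolerates a noisy objective, and, as the thread put it, if it wins, it stays.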