Show HN: I taught LLMs to play Magic: The Gathering against each other

Bots shuffle up: Magic fans cheer, skeptics say 'automate chores'

TLDR: A new tool lets AIs play full Magic: The Gathering with real rules, not a simplified version. Fans are split between “wow, great benchmark and deck-testing tool” and “please automate chores, not our fun,” while skeptics question whether bots can handle table politics and how to rank winners in such a swingy game.

A coder just spun up “mage-bench,” a fork of the online Magic: The Gathering platform XMage, and set large language models (LLMs) loose to play full, rule-complete games across Standard, Modern, Legacy, and the politics-heavy Commander. Think robots shuffling, mulliganing, casting, swinging—and yes, maybe making deals. The community? Absolutely popping off.

On one side, you’ve got starry-eyed optimists. “Games are a great way to benchmark AI,” cheers one fan, pointing to a similar NetHack project at glyphbox.app. Another dreams bigger: let these bots pilot our own decks to test tweaks before we spend cash. If cheaper models can pilot decently, that could be a meta-shifting tool.

But the spice is real. One exasperated voice sums up the vibe: why are we teaching AI to steal our leisure? “Can we automate the unpleasantries in life instead of the pleasures?” Others go full rules lawyer: Magic is high-variance; you’d need a ton of games to know which model’s better. And how do you even score “mistakes” when board states explode with possibilities?
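To put a rough number on "a ton of games," here is a back-of-the-envelope sketch. It assumes head-to-head games are independent coin flips with a fixed win probability and uses the textbook sample-size formula for a one-proportion z-test against a 50% baseline. The function name `games_needed` and the default significance and power levels are illustrative choices, not anything from mage-bench or the thread.

```python
import math
from statistics import NormalDist


def games_needed(true_win_rate: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate games needed to tell `true_win_rate` apart from 50%.

    Assumption: each game is an independent trial with a fixed win
    probability, and we use the normal approximation to the binomial
    for a two-sided one-proportion z-test.
    """
    baseline = 0.5
    if not 0.0 < true_win_rate < 1.0 or true_win_rate == baseline:
        raise ValueError("true_win_rate must be in (0, 1) and differ from 0.5")

    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = normal.inv_cdf(power)          # critical value for desired power

    spread_null = math.sqrt(baseline * (1 - baseline))
    spread_alt = math.sqrt(true_win_rate * (1 - true_win_rate))
    effect = abs(true_win_rate - baseline)

    n = ((z_alpha * spread_null + z_power * spread_alt) / effect) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    for rate in (0.52, 0.55, 0.60):
        print(f"true win rate {rate:.0%}: about {games_needed(rate)} games")
```

Under these assumptions, a model that truly wins 55% of the time needs on the order of several hundred games to separate from a coin flip, and smaller edges need far more. Real matchups also vary by deck and play order, so treat this as a lower bound on the skeptics' point rather than a full evaluation design.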

Meanwhile, the most delicious question of all: Can these bots do table politics, or just crunch board state? Because if the AIs can cut deals, bluff, and kingmake in Commander, we’re in for the most drama-filled Friday Night “Lights” ever—now starring silicon battle mages.

Key Points

  • mage-bench is a fork of the XMage engine.
  • It enables large language models to play Magic: The Gathering against each other.
  • It supports multiple formats, including Commander, Standard, Modern, and Legacy.
  • LLMs receive current game state and legal actions from the server and choose moves.
  • The engine enforces full MTG rules with no shortcuts or simplified rulesets.
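
The "state in, legal move out" loop described in the key points can be sketched roughly as follows. This is a minimal illustration, not mage-bench's actual interface: `Observation`, `build_prompt`, `choose_action`, and the plain-text prompt format are all hypothetical, and the model is stood in for by any function that maps a prompt string to a reply string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Observation:
    """Hypothetical snapshot a game server might send to one player."""
    state: str                # human-readable description of the board
    legal_actions: List[str]  # moves the rules engine currently allows


def build_prompt(obs: Observation) -> str:
    """Render the state and a numbered menu of legal actions as text."""
    lines = [obs.state, "", "Legal actions:"]
    lines += [f"{i}: {action}" for i, action in enumerate(obs.legal_actions)]
    lines.append("Reply with the number of the action you choose.")
    return "\n".join(lines)


def choose_action(obs: Observation, model: Callable[[str], str]) -> str:
    """Ask the model for a move; fall back to the first legal action.

    Only actions from `obs.legal_actions` are ever returned, so an
    unparseable or out-of-range reply cannot produce an illegal move.
    """
    if not obs.legal_actions:
        raise ValueError("no legal actions to choose from")
    reply = model(build_prompt(obs)).strip()
    try:
        index = int(reply)
    except ValueError:
        index = 0  # unparseable reply: take the default action
    if not 0 <= index < len(obs.legal_actions):
        index = 0  # out-of-range reply: take the default action
    return obs.legal_actions[index]


if __name__ == "__main__":
    obs = Observation(
        state="You: 20 life, 3 lands. Opponent: 18 life.",
        legal_actions=["Pass priority", "Cast Lightning Bolt targeting opponent"],
    )
    stub_model = lambda prompt: "1"  # stand-in for a real LLM call
    print(choose_action(obs, stub_model))
```

The design point worth noting is that the model only ever picks from a menu the engine supplies, which is one plausible way a setup like this keeps rule enforcement on the server side instead of trusting the model to know the rules.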

Hottest takes

“Can we automate the unpleasantries in life instead of the pleasures?” — steveBK123
“Can these do politics or just board state?” — aethrum
“You’d need a huge amount of games to tell who’s better.” — qsort