February 25, 2026
Bots, bets, and roombas
Show HN: A real-time strategy game that AI agents can play
AI war game lets bots code—crowd cheers, roasts the “roombas”
TLDR: LLM Skirmish makes AIs write code to battle in a real-time strategy game, with Claude dominating. Fans love the spectacle but roast the confusing “roomba” visuals, sparking debates about speed vs. smarts and wild ideas like AI betting—showing how code-first AI competition is fun and revealing.
The internet has a new arena: LLM Skirmish, where large language models (AIs that write text) literally code their battle plans and fight in a 1v1 real-time strategy game. The leaderboard screams drama—Claude Opus 4.5 is stomping with an 85% win rate, while GPT 5.2 trails and Gemini 3 Pro gets dunked late-game. But the real show is the comments. One viewer cackled, “This is actually fun to watch :D,” and yes, you can watch matches. Others turned gladiator critics: the visuals got roasted for fancy terrain paired with units that look like “unnamed roombas,” leaving people squinting at cryptic status dots and begging for overlays and tooltips to know who’s winning.
Then came the chaos merchants. Someone pitched AIs betting on AIs—a meta-league where bots place wagers on bot players. Another spark: the speed vs. smarts debate, with a commenter itching to test if “fast > smart over time with Mercury 2.” Meanwhile, a dev dropped their own AI-versus-AI RTS link, flexing that this robo-war genre is heating up beyond one project. Verdict from the crowd: keep the battles, fix the UI, add more drama. The community wants clarity, ladders, and yes—more roomba memes and robot smack-talk.
Key Points
- LLM Skirmish is a benchmark where LLMs write code to compete in 1v1 RTS matches, emphasizing in‑context learning.
- Tournaments have five rounds; after round one, models review prior results and update their strategy scripts.
- The environment is based on a version of the Screeps open-source API; execution is via OpenCode in Docker containers.
- Match objective is to eliminate the opponent’s spawn or win on score after 2,000 frames; each player has up to one second of compute per frame.
- Standings show Claude Opus 4.5 leading (85% win rate), with notes on GPT 5.2’s reasoning level and Gemini 3 Pro’s later-round underperformance.
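For readers who want the match rules in concrete form, here is a rough Python sketch of the win condition described in the key points: knock out the opponent's spawn, or lead on score once 2,000 frames have elapsed. It is not LLM Skirmish's actual code. The names `PlayerState` and `decide_winner` are made up for illustration, and the tie handling (equal scores, or both spawns lost on the same frame) is an assumption, since the source does not say how those cases are resolved. The one-second compute budget appears only as a constant because the source does not say what happens when a player exceeds it.

```python
from dataclasses import dataclass
from typing import Optional

MAX_FRAMES = 2000          # from the key points: score decides after 2,000 frames
COMPUTE_BUDGET_S = 1.0     # from the key points: up to one second of compute per frame


@dataclass
class PlayerState:
    """Hypothetical per-player snapshot; field names are illustrative."""
    name: str
    spawn_alive: bool = True
    score: int = 0


def decide_winner(a: PlayerState, b: PlayerState, frame: int) -> Optional[str]:
    """Return the winner's name, "draw", or None if the match should continue."""
    # Assumption: losing both spawns on the same frame counts as a draw.
    if not a.spawn_alive and not b.spawn_alive:
        return "draw"
    # Eliminating the opponent's spawn ends the match immediately.
    if not a.spawn_alive:
        return b.name
    if not b.spawn_alive:
        return a.name
    # Once the frame limit is reached, the higher score wins.
    if frame >= MAX_FRAMES:
        if a.score == b.score:
            return "draw"  # assumption: the source does not define a tiebreak
        return a.name if a.score > b.score else b.name
    return None
```

Calling `decide_winner(PlayerState("A", score=5), PlayerState("B", score=3), 2000)` returns `"A"`, while the same call at frame 10 returns `None` because neither spawn has fallen and the frame limit has not been reached.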