Show HN: Only 1 LLM can fly a drone

Internet loses it: one AI can fly, others face-plant

TLDR: Seven AIs tried a drone challenge; only Gemini 3 Flash descended and identified animals successfully. Comments split between “use proper tools,” warnings about weaponized drones, and curiosity about tiny models—underscoring that pricey AI isn’t always best for navigation or spatial reasoning.

A Pokémon Snap–style drone test just set off a comment war. In this voxel safari, seven chatty AIs tried to pilot a virtual drone to spot three animals. Only Gemini 3 Flash figured out the key move: descend to ground level and actually identify the creature. GPT hugged the horizon and refused to dip. Claude spammed “identify” 160+ times from the wrong angle, prompting the meme of the day: “Why can’t Claude look down?” Bonus drama: higher-contrast animals (gray sheep, pink pigs) were easier to spot, which lit up debates about vision and training.

The crowd split hard. One camp waved off the whole thing: “wrong tool for the job,” arguing classic controllers beat chatbots for drones. The other camp went full dystopia—“this is how weaponized AI begins”—imagining LLMs (chatty AIs that read images and text) strapped to real quadcopters. A calmer middle cheered the experiment as proof that “embodied AI” is tough: models weren’t trained to drive random gadgets with weird controls. Nerds chimed in with energy-efficient alternatives, name-dropping tiny vision models and sharing the Qwen3-VL collection. And the plot twist everyone loved? The cheapest model beat the pricey ones. As one dev summarized: expensive brains don’t mean better spatial instincts—at least not yet.

Key Points

  • SnapBench simulates a drone in a 3D voxel world; a vision-language model (VLM) has to pilot it to locate and identify creatures.
  • Architecture: Zig/raylib simulation, Rust controller, VLM accessed via OpenRouter, with communication over UDP port 9999.
  • Every model got the same prompt, the same seeds, and the same 50-iteration limit; only Gemini 3 Flash completed the task.
  • Altitude control was the key differentiator; Gemini 3 Flash descended to ground level to identify targets.
  • Other observations: a two-creature anomaly on seed 72, and cheaper models outperforming more expensive ones.
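
For the curious, here is a rough Python sketch of the loop those points describe: a controller asks a model for the next move, sends it over UDP, and gives up after 50 iterations. Only the port (9999) and the iteration cap come from the summary. The host, the command strings ("descend", "identify", "forward"), and the names run_episode, udp_sender, and descend_then_identify are made up for illustration, and the real controller is written in Rust, so treat this as a guess at the shape rather than SnapBench's actual protocol.

```python
import socket
from typing import Callable, List, Optional

SIM_ADDR = ("127.0.0.1", 9999)  # port is from the summary; the host is an assumption
MAX_ITERATIONS = 50             # iteration cap is from the summary


def udp_sender(addr=SIM_ADDR) -> Callable[[str], None]:
    """Return a function that sends one text command per UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(command: str) -> None:
        # The plain-text wire format is hypothetical.
        sock.sendto(command.encode("utf-8"), addr)

    return send


def run_episode(
    choose_action: Callable[[List[str]], Optional[str]],
    send: Callable[[str], None],
    max_iterations: int = MAX_ITERATIONS,
) -> List[str]:
    """Ask the policy for actions until it returns None or the cap is hit."""
    history: List[str] = []
    for _ in range(max_iterations):
        action = choose_action(history)
        if action is None:
            break
        send(action)
        history.append(action)
    return history


def descend_then_identify(history: List[str]) -> Optional[str]:
    """Stand-in for the VLM call: drop three times, identify once, then stop."""
    if len(history) < 3:
        return "descend"
    if len(history) == 3:
        return "identify"
    return None
```

To try it against a listener on that port, something like run_episode(descend_then_identify, udp_sender()) would do; swapping in a policy that never descends reproduces the "hugging the horizon" failure, with the loop simply running out its 50 iterations.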

Hottest takes

"Only one power drill can pound roofing nails... just get a hammer" — bigfishrunning
"LLMs flying weaponized drones is exactly how it starts." — antisthenes
"VLLMs cannot reliably tell if a character is facing left or right" — avaer