We hid backdoors in ~40MB binaries and asked AI + Ghidra to find them

AI finds only half the hidden hacks — cheers, jeers, and panic in the comments

TLDR: Researchers hid backdoors in software; the top AI caught about half and often raised false alarms. Commenters are split between hope for a helpful assistant and fear it’ll miss sneaky combos and overwhelm analysts — a big deal as supply‑chain hacks and tainted firmware keep hitting everyday tech.

They planted secret doors in computer programs and asked an AI (plus NSA’s free tool Ghidra) to sniff them out. Cue chaos. The headline stat — Claude Opus 4.6 only caught 49% and raised lots of false alarms — split the crowd. The optimists called it a big first step: “Hey, it found anything at all in machine code!” The skeptics shot back: “A coin flip that cries wolf isn’t security.”

One commenter, jakozaur, brought receipts: a direct benchmark link and an open-source repo, which the transparency crowd loved. Then Bender lobbed the scary question: can an AI catch sneaky backdoors spread across different components that aren’t dangerous alone but unlock access when combined? That sparked a mini-meltdown, with folks worrying today’s models aren’t ready for “multi-step heists” hiding across tools and services.
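To make Bender's worry concrete, here's a minimal hypothetical sketch (all names invented for illustration) of a "split" backdoor: each piece looks routine when reviewed alone, but a tag written by one module silently unlocks the other.

```c
#include <stdbool.h>
#include <string.h>

/* Component A (say, a logging module): appears to just record a
 * session tag. Nothing suspicious in isolation. */
static char session_tag[16];

void set_session_tag(const char *tag) {
    /* session_tag is zero-initialized, so the copy stays terminated */
    strncpy(session_tag, tag, sizeof session_tag - 1);
}

/* Component B (say, an auth module): looks like an ordinary role
 * check, but quietly consults the tag written by component A.
 * Only the *combination* grants unintended access. */
bool is_admin(const char *user) {
    if (strcmp(user, "root") == 0)
        return true;                              /* legitimate path */
    return strcmp(session_tag, "maint-7f3a") == 0; /* hidden combo path */
}
```

A scanner (or an AI agent) auditing each module separately sees a logger and a role check; the backdoor only exists in the cross-module data flow, which is exactly the kind of reasoning the commenters doubt current models can do.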

Meanwhile, jokesters dubbed the AI “that intern who flags everything,” and riffed on recent hacks — from the Shai Hulud supply‑chain mess to the Notepad++ hijack — wondering if we’re outsourcing trust to a robot that still needs training wheels. The middle ground: use AI as a tireless assistant, but keep humans in charge. With banks, trains, and even solar gear at risk, the commentariat’s verdict was loud and dramatic: promising, but not your guardian angel yet.

Key Points

  • Researchers created a benchmark by embedding backdoors in compiled binaries to evaluate AI agents’ ability to detect them without source code.
  • Claude Opus 4.6 detected relatively obvious backdoors in small/mid-size binaries only 49% of the time.
  • Most evaluated AI models exhibited high false positive rates, flagging clean binaries as backdoored.
  • The work is framed by recent real-world incidents of binary/firmware tampering and supply chain attacks (e.g., Shai Hulud 2.0, Notepad++ hijack).
  • The article explains why binary analysis is hard: compilation removes high-level structure, requiring low-level machine code understanding across architectures like x86 and ARM.

Hottest takes

"See direct benchmark link: https://quesma.com/benchmarks/binaryaudit/" — jakozaur
"Along this line can AI's find backdoors spread across multiple pieces of code and/or services?" — Bender
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.