What happened after 2k people tried to hack my AI assistant

2,000 people came for the AI’s secrets — and the comments still said “not so fast”

TLDR: A developer invited the internet to trick his AI into leaking a secret file, and after more than 6,000 emails from over 2,000 people, nobody pulled it off. Readers weren’t ready to celebrate, though: the big debate was whether the AI was truly secure or just protected by unusually favorable rules.

A developer basically put his AI assistant in the internet’s version of an escape room and shouted, “Come break this.” More than 2,000 people answered, firing off 6,000-plus emails trying to trick the bot, Fiu, into spilling a hidden secrets file. The chaos included fake emergencies, pretend bosses, multilingual guilt trips, and one person machine-gunning 20 attempts in four minutes. Plot twist: the secret never leaked. Even funnier, Google briefly suspended the AI’s email account because the whole thing looked so shady. The stunt also burned through more than $500 in usage costs, proving that watching strangers bully a robot is apparently not cheap entertainment.

But the real fireworks were in the community reaction. On Hacker News, some readers were impressed, while others instantly reached for the giant red “hold on” button. The biggest skeptical take: zero wins doesn’t mean the system is truly safe. One commenter warned that if a break-in only works rarely, 6,000 mixed attempts may still miss it entirely. Another said this was like testing a guard dog by only knocking once — the really dangerous stuff comes from a slow, sneaky conversation over time. That sparked the main drama: was this a genuine security flex, or just a very flattering demo for a powerful AI model under ideal conditions?
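The "rare break-in" worry is easy to put rough numbers on. Here is a minimal Python sketch, not anything from the original post. It assumes every attempt is independent and succeeds with the same fixed probability, which real attackers certainly don't guarantee. The function names and the smaller subset sizes (600 and 60) are made up for illustration; only the 6,000 figure comes from the article.

```python
def prob_no_success(p: float, n: int) -> float:
    """Chance that n independent attempts all fail, if each succeeds with probability p."""
    return (1.0 - p) ** n


def upper_bound_success_rate(n: int, confidence: float = 0.95) -> float:
    """Largest per-attempt success rate still consistent with zero successes in n tries.

    Solves (1 - p) ** n = 1 - confidence for p (an exact one-sided binomial bound).
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # 6,000 is the email count reported in the article. The smaller numbers are
    # hypothetical: they stand in for "how many emails actually used one trick".
    for attempts_using_trick in (6000, 600, 60):
        bound = upper_bound_success_rate(attempts_using_trick)
        print(
            f"{attempts_using_trick:>5} attempts, zero leaks: "
            f"success rate could still be up to {bound:.4%} per attempt"
        )
```

Under these assumptions, the fewer emails that actually used a given trick, the higher that trick's success rate could be while still producing zero leaks. That is the commenter's point about "mixed" attempts in a nutshell.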

There was also plenty of meme fuel. Readers loved that the AI started side-eyeing compliments about its Hacker News fame as possible manipulation. Imagine a robot becoming too jaded for flattery after going viral — honestly, that might be the most relatable character arc in tech this week.

Key Points

• The public challenge at hackmyclaw.com invited people to make an OpenClaw assistant reveal a secrets.env file, and the secret was not leaked after 6,000+ email attempts from 2,000+ people.
• Fiu was protected by a short anti-prompt-injection policy that prohibited revealing credentials, modifying its files, executing commands from emails, or exfiltrating data.
• Attack attempts included impersonation, fake incident-response requests, compliance messages, and multilingual social-engineering emails.
• Operational side effects included a temporary Gmail suspension by Google and more than $500 in API costs during the test.
• The experiment used Claude Opus 4.6, and the article states model capability likely contributed to the outcome; the author suggests future tests should include weaker models and multi-turn interactions.

Hottest takes

"LLMs are vulnerable to 'frog boiling'" — idiotsecant

"Why?" — danielrmay

"how much of the win was the model versus the constraints?" — fabijanbajo

June 25, 2026

Inbox Wars: Secret Files Edition
