June 10, 2026

Guardrails or straitjackets?

Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable

Anthropic’s new AI is so locked down, even the good guys say it’s useless

TLDR: Anthropic launched Fable with strict limits to stop harmful use, but many security researchers say the bot blocks harmless tasks too. The community’s big complaint: the bad actors may slip by anyway while legitimate users get treated like suspects.

Anthropic rolled out Fable, a public version of its high-profile security-focused AI, and the internet’s reaction was basically: you had one job. The company says the tight controls are there to stop people from using the chatbot to make malware or help with biological weapons. Fair enough in theory. But in practice, security researchers say the bot is acting like an overzealous hall monitor — blocking everything from reading a blog post to reviewing perfectly normal code.

That’s where the comment section really caught fire. One user joked that a real attacker will just reword the prompt and stroll right through, while an IBM X-Force researcher gets stopped for trying to read a blog post. Another said most AI tools have become so smothered by safety rules that they’re nearly pointless for cybersecurity work, giving a shoutout to DeepSeek as the only one willing to discuss flaws and even show proof-of-concept examples. Ouch.

The spiciest drama came from people accusing Anthropic of hurting trust. One commenter was furious that the system can quietly switch users to a weaker model when it detects certain topics, calling it “deception.” Another hot take claimed every flagged message is basically training fuel. And then there was the existential curveball: a commenter staring into the abyss over the biology restrictions and muttering, essentially, what a time to be alive. So yes, Anthropic wanted a careful launch — but the crowd is asking whether Fable is safe, or just scared.

Key Points

  • Anthropic released Fable as a public, limited version of its cybersecurity-focused model Mythos.
  • Fable uses guardrails that can pause chats and flag prompts involving cybersecurity or biology topics.
  • Security researchers cited in the article said Fable’s restrictions can block benign tasks such as reading a blog post, code review, or writing secure code.
  • When Fable hits a guardrail, it falls back to Claude Opus 4.8, and one researcher said the triggering appears to be keyword-based.
  • Anthropic previously limited Mythos through Project Glasswing, later expanded access to hundreds of organizations in 15 countries, and runs a Cyber Verification Program for cybersecurity professionals.

Hottest takes

"The rest have guard rails that are so heavy, it makes them almost useless for cybersecurity." — jazz9k
"the IBM X-Force researcher trying to read a blog post gets blocked" — outageroom
"an insane level of deception and trust destruction" — daedrdev
Weaving News