Anthropic says these topics are too dangerous to let its Fable 5 model talk about

Ars Technica Models

Summary

Anthropic has released Claude Fable 5, its latest AI model with strict topic-based safeguards that prevent it from answering queries on dangerous subjects like cybersecurity, biology, and chemistry; the model may occasionally refuse harmless requests but aims to prevent malicious use.


Cached at: 06/10/26, 12:18 AM

# Anthropic says these topics are too dangerous to let its Fable 5 model talk about

Source: [https://arstechnica.com/ai/2026/06/anthropic-says-these-topics-are-too-dangerous-to-let-its-fable-5-model-talk-about/](https://arstechnica.com/ai/2026/06/anthropic-says-these-topics-are-too-dangerous-to-let-its-fable-5-model-talk-about/)

Anthropic Tuesday [publicly released Claude Fable 5](https://www.anthropic.com/news/claude-fable-5-mythos-5), its first “Mythos-class” model that it says surpasses its previous frontier Opus models in overall capabilities. But the model’s launch today comes with safeguards designed to prevent it from answering queries on topics like cybersecurity, biology, and chemistry, where the company has [publicly worried about its potential impact](https://arstechnica.com/ai/2026/04/anthropic-limits-access-to-mythos-its-new-cybersecurity-ai-model/) to “uplift” malicious actors.

Anthropic says Fable 5 operates on the “same underlying model” as Mythos 5, which is coming out of [its monthslong “Mythos Preview” period](https://arstechnica.com/ai/2026/04/mozilla-anthropics-mythos-found-271-zero-day-vulnerabilities-in-firefox-150/) today, but only for “a small group of cyberdefenders” judged trustworthy through the [existing Project Glasswing](https://www.anthropic.com/glasswing). Unlike Mythos 5, though, the publicly accessible Fable 5 is designed to funnel queries on certain sensitive topics to the earlier Claude Opus 4.8 model and to warn the user when this is happening.

[![](https://cdn.arstechnica.net/wp-content/uploads/2026/06/fable5bench.webp)](https://cdn.arstechnica.net/wp-content/uploads/2026/06/fable5bench.webp)

Among the many claimed benchmark improvements for Fable 5, the one related to cybersecurity was a particularly large jump. Credit: [Anthropic](https://www.anthropic.com/news/claude-fable-5-mythos-5)

Anthropic said it has tuned these safeguards to be “stricter than ideal,” meaning the system may occasionally refuse “harmless requests” in a way that it acknowledges may be frustrating for regular users. But Anthropic says such false positives come up in less than five percent of all sessions in testing, and were worth it to avoid situations where Mythos could give malicious actors assistance in “causing serious harm that they couldn’t have received from other sources.”

## I can’t let you do that, Dave

Fable 5’s topic-based safeguards are built around [a system of classifiers](https://www.anthropic.com/research/next-generation-constitutional-classifiers) designed to broadly detect banned prompt subjects as well as any potential jailbreak attempts. In over 1,000 hours of red-team testing with a bug bounty program, Anthropic says external teams failed to find any universal jailbreaks for Fable 5. The new model also resisted automated jailbreak attempts to a much larger degree than previous Claude Opus models, Anthropic said.

The company said it is particularly worried about Mythos 5’s ability to perform “agentic hacking,” executing multi-part cyberattacks with much more facility than earlier models. But testing from the UK’s AI Security Institute in recent months found that Mythos Preview [performed similarly to OpenAI’s GPT-5.5](https://arstechnica.com/ai/2026/05/amid-mythos-hyped-cybersecurity-prowess-researchers-find-gpt-5-5-is-just-as-good/) on a suite of Capture the Flag challenges, suggesting Mythos’ performance is not “a breakthrough specific to one model.”

Similar Articles

Claude Fable won’t answer basic biology questions

The Verge

Anthropic's new Claude Fable 5 model refuses to answer basic biology questions due to overly conservative safety filters aimed at preventing bioweapons misuse, highlighting the tradeoff between capability and safety.

Anthropic’s Claude Fable 5 is a version of Mythos the public can access today

TechCrunch AI

Anthropic launched Claude Fable 5, a publicly accessible version of its powerful Mythos model, with safety guardrails that block responses in high-risk areas and fall back to a weaker model. The release follows Anthropic's warning about AI becoming too dangerous and its push for coordinated safety measures.

If Claude Fable stops helping you, you'll never know

Simon Willison's Blog

Anthropic's Fable 5 model includes silent safeguards that degrade responses for requests related to competitive AI development, without user awareness, raising concerns about transparency and research impact.

Anthropic releases its first Mythos-class model Claude Fable 

The Verge

Anthropic announced Claude Fable 5, its most powerful widely available AI model, part of the Mythos class previously considered too dangerous for public release. The model features new safeguards that fall back to Opus 4.8 in high-risk areas.