Anthropic disputes the Claude Fable 5 jailbreak after a researcher posted its 120,000-character system prompt

Reddit r/ArtificialInteligence 06/15/26, 12:27 PM News

jailbreak system-prompt anthropic claude security ai-safety bug-bounty

Summary

Anthropic disputes claims that its Claude Fable 5 model was jailbroken within a day of launch, arguing the researcher's method was coaxing rather than a true breach of core safeguards, and points to extensive bug-bounty testing.

Anthropic is pushing back on claims that its new Claude Fable 5 model was jailbroken within a day of its June 9 launch. A researcher known as Pliny the Liberator says he bypassed the safety layer and extracted the model's roughly 120,000-character system prompt, which was then posted to a public GitHub repository.

The company disputes that a real jailbreak happened. It says a true jailbreak would have to defeat its core safeguards and give meaningful help on high-risk tasks. Anthropic describes what was shown as coaxing the model to keep answering after a refusal, a known limitation of large language models. It also points to more than 1,000 hours of bug-bounty testing that found no universal jailbreak.

A separate complaint hit the model the same week. Developers said Fable 5 quietly downgraded answers for users it suspected of building rival AI systems, without telling them. Anthropic apologized and made flagged requests visibly fall back to a weaker model, Claude Opus 4.8.

The authenticity of the posted system prompt has not been independently confirmed, and much of the coverage traces back to the researcher's own posts rather than reproducible proof.

Source: [https://www.securityweek.com/anthropic-disputes-fable-5-ai-jailbreak/](https://www.securityweek.com/anthropic-disputes-fable-5-ai-jailbreak/)


Similar Articles

Feds freaked over Fable 5 after simple 'fix this code' prompt, not jailbreak

For those bashing Anthropic, please read this to understand the current situation

Claude Fable 5: mid-tier results on coding tasks

Anthropic Is Still at Odds With the White House Over Claude Fable 5
