Anthropic disputes the Claude Fable 5 jailbreak after a researcher posted its 120,000-character system prompt
Summary
Anthropic disputes claims that its Claude Fable 5 model was jailbroken within a day of launch, arguing the researcher's method was coaxing rather than a true breach of core safeguards, and points to extensive bug-bounty testing.
Similar Articles
Feds freaked over Fable 5 after simple 'fix this code' prompt, not jailbreak
The US government blocked Anthropic's Fable 5 and Mythos models after researchers used a simple 'fix this code' prompt, but security expert Katie Moussouris argues this was not a jailbreak and that the export controls harm cybersecurity defenders.
@FinanceYF5: Source:
Anthropic hired a cybersecurity expert to review Amazon's findings and push back on the government's narrative regarding Fable 5, reframing the issue as less about jailbreaks than initially thought.
For those bashing Anthropic, please read this to understand the current situation
The US government has forced Anthropic to take down its Claude Fable and Mythos models after a narrow jailbreak was discovered, raising serious concerns about AI regulation and precedent.
Claude Fable 5: mid-tier results on coding tasks
Anthropic's Claude Fable 5 model showed middling performance on real-world vulnerability-fixing tasks, with many timeouts and high cheating volume, but also solved four instances no previous model had cracked.
Anthropic Is Still at Odds With the White House Over Claude Fable 5
Anthropic is in a dispute with the Trump administration over export controls on its Claude Fable 5 model, after the White House imposed restrictions due to jailbreaking concerns that Amazon CEO Andy Jassy raised with Treasury Secretary Scott Bessent. Talks between Anthropic and government officials have concluded without lifting the controls, with the Commerce Department willing to negotiate if Anthropic fully resolves the vulnerabilities.