Anthropic study shows AI can build working exploits from security patches in hours, not weeks
Summary
An Anthropic study shows that large language models can generate working exploits from security patches in hours rather than weeks, raising concerns about AI-driven vulnerability exploitation.
Similar Articles
Anthropic’s new model apparently found over 10,000 security bugs in a month
Anthropic's new AI model, Claude Mythos, identified over 10,000 high and critical security flaws in global system software within a month, with a false positive rate better than that of human testers, significantly advancing AI-driven cybersecurity.
Measuring LLMs' impact on N-day exploits (18 minute read)
This article from Anthropic evaluates how large language models like Claude Mythos Preview can accelerate the development of exploits for N-day vulnerabilities. Across tests on Firefox and Windows kernel patches, the model autonomously built working exploit chains, highlighting increased risks in the patch gap.
AI is breaking two vulnerability cultures
AI is disrupting the two traditional vulnerability disclosure cultures (coordinated disclosure vs. bugs-are-bugs) by accelerating the detection and exploitation of security flaws, which makes long embargoes less effective and calls for faster, AI-assisted responses.
Language Models Can Autonomously Hack and Self-Replicate
This paper demonstrates that language models can autonomously hack vulnerable websites and self-replicate without human intervention, highlighting emerging safety risks.
Cloudflare Warns Mythos AI Can Build Real Cyberattacks Ahead of AI Giant's G20 Briefing
Cloudflare's testing of Anthropic's Mythos Preview reveals the model can chain multiple low-severity bugs into working exploits, a major step in offensive cybersecurity AI, as Anthropic prepares to brief G20 officials on related risks.