@AnthropicAI: Correction: Claude Opus 4's ~3x average speedup dates to May 2025, not May 2024. This evaluation has only existed since…
Summary
Anthropic issued a correction clarifying that Claude Opus 4's ~3x average speedup dates to May 2025, not May 2024, and that earlier models from May 2024 showed no speedup on the backtested evaluation.
Similar Articles
@AnthropicAI: Each time we release a model, we run the same test: give it code that trains a small AI model, ask the new model to spe…
Anthropic shares internal benchmark results showing dramatic improvement in AI coding: Claude Opus 4 averaged a ~3x speedup on an ML code optimization task in May 2025, while the new Mythos Preview model achieved a ~52x speedup this April. For comparison, a skilled human takes 4-8 hours to reach 4x.
@rohanpaul_ai: Fast mode for Claude Opus 4.8 is roughly 2.5x the speed while being 3X cheaper than before. AI/ML API (@aimlapi) alread…
Claude Opus 4.8 now has a fast mode that runs at roughly 2.5x the speed and is 3x cheaper than before. It is integrated on AI/ML API, with free access for selected users.
Claude Opus 4.8 says it's the only model that finished every case on the Super-Agent benchmark. Anyone run it on real agents yet?
Anthropic released Claude Opus 4.8, claiming it is the only model to complete every case on the Super-Agent benchmark and that it outperforms GPT-5.5 on browser/computer use tasks with better tool efficiency and fewer uncorrected code flaws.
Introducing Claude Opus 4.7
Anthropic has released Claude Opus 4.7, a new AI model featuring significant improvements in advanced software engineering, vision capabilities, and self-verification. The release includes specific cybersecurity safeguards and is available via API and major cloud providers.
Claude Opus 4.8 scores over 1% on ARC-AGI 3 !!
Claude Opus 4.8 achieves a score of over 1% on the ARC-AGI 3 benchmark, demonstrating slight progress on a difficult AI reasoning test.