@AnthropicAI: Correction: Claude Opus 4's ~3x average speedup dates to May 2025, not May 2024. This evaluation has only existed since…

X AI KOLs 06/04/26, 08:34 PM News

Summary

Anthropic issued a correction clarifying that Claude Opus 4's ~3x average speedup dates to May 2025, not May 2024, and that earlier models from May 2024 showed no speedup on the backtested evaluation.

Correction: Claude Opus 4's ~3x average speedup dates to May 2025, not May 2024. This evaluation has only existed since September 2024, but we backtested it on earlier models: those from May 2024 showed no speedup whatsoever.

Similar Articles

@AnthropicAI: Each time we release a model, we run the same test: give it code that trains a small AI model, ask the new model to spe…

X AI KOLs

Anthropic shares internal benchmark results showing dramatic improvement in AI coding: Claude Opus 4 averaged a ~3x speedup on an ML code optimization task in May 2025 (the date given in Anthropic's later correction; the post originally said May 2024), while the new Mythos Preview model achieved a ~52x speedup this April. For comparison, a skilled human takes 4-8 hours to reach a 4x speedup.

@rohanpaul_ai: Fast mode for Claude Opus 4.8 is roughly 2.5x the speed while being 3X cheaper than before. AI/ML API (@aimlapi) alread…

X AI KOLs

Claude Opus 4.8 now has a fast mode that runs at roughly 2.5x the speed and is 3x cheaper than before. It is integrated on AI/ML API, with free access for selected users.

Claude Opus 4.8 says it's the only model that finished every case on the Super-Agent benchmark. Anyone run it on real agents yet?

Reddit r/AI_Agents

Anthropic released Claude Opus 4.8, claiming it is the only model to complete every case on the Super-Agent benchmark and that it outperforms GPT-5.5 on browser/computer use tasks with better tool efficiency and fewer uncorrected code flaws.

Introducing Claude Opus 4.7

Anthropic News

Anthropic has released Claude Opus 4.7, a new AI model featuring significant improvements in advanced software engineering, vision capabilities, and self-verification. The release includes specific cybersecurity safeguards and is available via API and major cloud providers.

Claude Opus 4.8 scores over 1% on ARC-AGI 3 !!

Reddit r/singularity

Claude Opus 4.8 achieves a score of over 1% on the ARC-AGI 3 benchmark, demonstrating slight progress on a difficult AI reasoning test.