The author describes adopting a new coding workflow built around a single Codex agent in /goal mode, which they find superior to multi-agent setups with newer models like GPT-5.5 and Opus 4.8.
Opus 4.8 shows improved build quality and lower cost compared to Opus 4.7 on the MineBench 3D block-structure benchmark, though with some inconsistencies. The model demonstrates streamlined thinking and more efficient inference.
Opus 4.8 is now available on DeepSWE, scoring 6% higher than Opus 4.7 with reduced average cost per task.
Anthropic reaches a $96.5 billion valuation, surpassing OpenAI to become the most valuable AI startup, and launches its Opus 4.8 model on the same day. Both companies are preparing for IPOs and vying for market share in coding tools.
A user criticizes Anthropic's Opus 4.8 model, comparing it unfavorably to Opus 4.6 and calling it weak; another user echoes the sentiment, calling the model 'stupid'.