Models Are Hitting Diminishing Returns Within Software Engineering

Reddit r/ArtificialInteligence News

Summary

A distinguished engineer at a hyperscaler argues that AI models are hitting diminishing returns on software engineering tasks: the author finds little practical difference between Claude's Fable 5 and the previous Opus models, and predicts that local models will soon provide comparable value.

For the creds: I'm a distinguished engineer at a hyperscaler and I work in this space.

We've seen Claude's Fable 5 release recently, and I've been having a go at it. Thus far, I couldn't tell you in a blind test which model I was using. If you put Opus 4.6, 4.7, 4.8 and Fable in my Claude Code setup, then based on the work I do and how I work, I wouldn't be able to tell which is which.

The reasoning is pretty straightforward: I never 'one-shot' a project. Since I need to understand every component inside and out, I work in small chunks, and I'm not alone. Moreover, models have had access to the Internet's wide range of information, such as API docs and best practices, for a while, which added 'intelligence' of a certain flavour to the models' outputs. So when you look at how software engineers in industry work, we build a single abstraction, test it and move on. I can almost do this today with local Gemma 4 models.

This is also true for system architecture asks, where understanding every component is pretty crucial. And Fable still hallucinates on these. Example: Fable got the AWS ALB/ECS draining behaviour completely wrong, and confidently so. The only reason I was able to catch it is that I was already familiar with how those two pieces work together.

So, in short, we're hitting an asymptotic limit here. I'm not getting more value from each new model release anymore, and the way I work isn't changing. Having spoken to colleagues who are heavy AI enjoyers, I've found that my views track with their experiences.

My prediction: by this time next year, there will be local models you can run on a 128GB MacBook Pro that provide 90% of the value Claude adds to my software engineering work today. I can already see this with the current suite of open source models.
