@Mnilax: Karpathy threw a grenade at every senior engineer who still treats LLMs as a toy.


Summary

The article discusses Andrej Karpathy's advice on leveraging LLMs despite their cognitive deficits, highlighting a case study where custom configuration (CLAUDE.md) significantly reduced error rates.

Karpathy threw a grenade at every senior engineer who still treats LLMs as a toy. His actual words: the worst thing an expert can do right now is reject them. Most experts read it as a threat, but it's advice. His framing:

> the gap between "AI tools are bad" and "AI tools are useful when used right" is professional discipline, not capability
> agents have cognitive deficits. they fail in ways nothing in the training set anticipated
> the experts who reject LLMs lose to experts who learn to wrangle them
> "models have so many cognitive deficits. but you can route around them"

Routing around the deficits is what CLAUDE.md was invented for. Karpathy himself wrote 4 rules; across 30 codebases they took my Claude error rate from 41% down to 11%. A solid drop, but his rules pre-date the slop era going public. I bolted on 8 more, tuned to the failure modes that surfaced after January, and got it down to 3%.

A CLAUDE.md does not raise Claude's IQ. It lowers his slop floor. That is the entire game. Open the article underneath. The model is not the bottleneck. Your config is.
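The post never lists the 12 rules themselves. For readers who have never seen one, a "route around the deficits" CLAUDE.md is a plain markdown file of standing instructions the agent reads before every task. The sketch below is purely illustrative of the genre; none of these rules are from Karpathy's or @Mnilax's actual config:

```markdown
# CLAUDE.md — hypothetical example rules, not the author's actual config

- Never invent APIs: if a function, flag, or import is not in this repo or
  its lockfile, stop and ask instead of guessing.
- Prefer the smallest diff that solves the task; do not refactor adjacent
  code you were not asked to touch.
- Run the test suite before declaring a task done; paste the failing output
  verbatim if it does not pass.
- When a request has two plausible interpretations, state both and ask
  before writing code.
```

The common thread is constraining known failure modes (hallucinated APIs, scope creep, premature "done" claims) rather than trying to make the model smarter.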

Similar Articles

Quoting Bryan Cantrill

Simon Willison's Blog

Bryan Cantrill critiques LLMs for lacking the optimization constraint of human laziness, arguing that LLMs will unnecessarily complicate systems rather than improve them, and highlighting how human time limitations drive the development of efficient abstractions.

LLMs Corrupt Your Documents When You Delegate

arXiv cs.CL

DELEGATE-52 is a new benchmark revealing that current LLMs, including frontier models like GPT-5.4 and Claude 4.6 Opus, corrupt an average of 25% of document content during long delegated workflows across 52 professional domains. The research demonstrates that LLMs introduce sparse but severe errors that compound over interactions, raising concerns about their reliability for delegated work paradigms.
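The compounding claim is easy to see with a toy model: if each delegated step independently corrupts any given chunk of a document with some small probability, the expected corrupted fraction grows with every handoff. This is only an illustration of the mechanism, not the DELEGATE-52 benchmark's actual methodology, and the 2% rate below is an assumed figure:

```python
# Toy model: each document chunk survives one delegated step with
# probability (1 - p), independently. After n steps, the expected
# fraction of corrupted chunks is 1 - (1 - p)^n.
# Illustrative only; not the DELEGATE-52 methodology.

def expected_corruption(p_per_step: float, n_steps: int) -> float:
    """Expected corrupted fraction after n_steps delegated handoffs."""
    return 1.0 - (1.0 - p_per_step) ** n_steps

# An assumed 2% per-step corruption rate compounds to roughly a quarter
# of the document after 14 handoffs.
for n in (1, 5, 14):
    print(n, round(expected_corruption(0.02, n), 3))
```

The point of the model: sparse per-step errors look harmless in isolation, but the surviving fraction shrinks geometrically, which is why long delegated workflows end up with large aggregate corruption.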