@DeRonin_: Andrej Karpathy: "90% of Claude's mistakes come from missing context, not a weak model." 41% mistake rate without a CLA…
Summary
The post attributes to Andrej Karpathy the claim that 90% of Claude's mistakes stem from missing context rather than model weakness, and lists 12 CLAUDE.md rules that it says cut the mistake rate from 41% to 3%.
Cached at: 05/18/26, 10:30 AM
Andrej Karpathy: “90% of Claude’s mistakes come from missing context, not a weak model.”
41% mistake rate without a CLAUDE.md. 11% with the 4-rule baseline. 3% with the 12-rule version below
here are the 12 rules senior engineers settled on:
- think before coding: state assumptions, don’t guess. the model can’t read your mind, stop hoping it will
- simplicity first: minimum code, no speculative abstractions. the moment you let Claude add “for future flexibility,” you’ve added 200 lines you’ll delete next quarter
- surgical changes: touch only what you must. don’t let it improve adjacent code, that’s how PRs blow up
- goal-driven execution: define success criteria upfront, loop until verified. without them Claude either loops forever or stops too early
- use the model only for judgment calls: classification, drafting, summarization, extraction. NOT routing, retries, status-code handling, deterministic transforms. if code can answer, code answers
- token budgets are not advisory: per-task 4000, per-session 30000. by message 40 of a long debug, Claude is re-suggesting fixes you rejected at message 5
- surface conflicts, don’t average them: two patterns in the codebase? pick one. Claude blending them is how errors get swallowed twice
- read before you write: read exports, callers, shared utilities. Claude will happily add a duplicate function next to an identical one it never read
- tests verify intent, not just behavior: a test that can’t fail when business logic changes is wrong. all 12 of Claude’s tests can pass while the function returns a constant
- checkpoint every significant step: Claude finished steps 5 and 6 on top of a broken state from step 4. nobody noticed for an hour
- match the codebase conventions: class components? don’t fork to hooks silently. testing patterns assumed componentDidMount, hooks broke them without surfacing
- fail loud: “completed successfully” with 14% of records silently skipped is the worst class of bug. surface uncertainty, don’t hide it
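a rough sketch of the "tests verify intent" rule, not something from the original post: `apply_discount`, `broken_discount`, and both test helpers are hypothetical names made up for illustration. the weak test only checks type and sign, so a function that returns a constant still passes it. the intent test pins the business rule at several inputs, so the constant cannot satisfy all of them.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical business rule: take `percent` off `price`."""
    return round(price * (1 - percent / 100), 2)


def broken_discount(price: float, percent: float) -> float:
    """Broken stand-in that ignores its inputs and returns a constant."""
    return 90.0


def weak_test(fn) -> bool:
    # Checks only type and sign, so the constant version passes too.
    result = fn(100.0, 10.0)
    return isinstance(result, float) and result > 0


def intent_test(fn) -> bool:
    # Pins the rule at several inputs; a constant cannot match all of them.
    cases = [
        ((100.0, 10.0), 90.0),
        ((200.0, 25.0), 150.0),
        ((80.0, 0.0), 80.0),
    ]
    return all(fn(*args) == expected for args, expected in cases)
```

running both helpers against both functions shows the gap: the weak test accepts the broken function, the intent test rejects it.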
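one possible shape for the "fail loud" rule, as a minimal sketch under stated assumptions: the record format (a dict with an `amount` key), `import_records`, `BatchResult`, and `SkippedRecordsError` are all hypothetical and not from the original post. the idea is that a batch cannot report success while records were skipped unless the caller explicitly allowed a number of skips.

```python
from dataclasses import dataclass, field


class SkippedRecordsError(RuntimeError):
    """Raised when a batch ends with more skipped records than allowed."""


@dataclass
class BatchResult:
    processed: int = 0
    skipped: list = field(default_factory=list)  # (index, reason) pairs


def import_records(records: list, max_skipped: int = 0) -> BatchResult:
    """Validate each record; raise instead of silently dropping bad ones."""
    result = BatchResult()
    for index, record in enumerate(records):
        try:
            float(record["amount"])  # assumed validation step
        except (KeyError, TypeError, ValueError) as exc:
            # Keep the reason so the failure can be reported, not hidden.
            result.skipped.append((index, repr(exc)))
            continue
        result.processed += 1
    if len(result.skipped) > max_skipped:
        raise SkippedRecordsError(
            f"{len(result.skipped)} of {len(records)} records skipped: "
            f"{result.skipped}"
        )
    return result
```

the default of `max_skipped=0` is a design choice for this sketch: tolerating skips has to be an explicit decision by the caller, and the error message carries the count and the reasons.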
what actually compounds instead of the next framework:
- the CLAUDE.md file as institutional memory across sessions
- eval-driven changes, not vibe-driven
- checkpoints over speed
- explicit conflicts over silent blending
- discipline over framework, every time
- one repo, one rules file, no exceptions
be a few rules ahead of AI twitter before this becomes mass-opinion
study this
Ronin (@DeRonin_): anybody who uses or learns agentic systems, SHOULD READ THIS
the install order I run before any new agentic project:
- PRIVACY: direnv + a real secrets manager
install direnv, then plug it into your team’s password manager (1Password CLI via op run, doppler, infisical, vault,