@neil_xbt: Someone turned 383 scattered files and 100 meeting transcripts into a compact wiki and cut their Claude token usage by …

X AI KOLs Timeline 06/25/26, 02:38 AM Tools

token-efficiency knowledge-management wiki-pattern claude ai-engineering llm-workflow

Summary

A technique called the LLM Wiki pattern compiles raw documents into a structured wiki with an index. In the reported case this cut Claude token usage by 95%, because the cost of understanding document structure is paid once during compilation instead of on every query.

Original Article

Cached at: 06/25/26, 09:15 AM

Someone turned 383 scattered files and 100 meeting transcripts into a compact wiki and cut their Claude token usage by 95%!

The folder structure that did it got Andrej Karpathy 16 million views.

Every AI knowledge tool that processes raw documents on every query burns tokens on structural understanding already paid for the last time you asked.

Roughly 75% of tokens in a typical query go toward understanding structure, not generating the answer. The LLM Wiki pattern pays that cost once, during compilation, and never again.
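The "pay once, then never again" argument is an amortization claim, so it can be checked with simple arithmetic. The sketch below is only an illustration: the function names and every token count in it are hypothetical and do not come from the post. It compares re-reading the raw documents on every query against a one-time compilation followed by cheaper wiki queries.

```python
def tokens_reprocessing(queries: int, raw_tokens_per_query: int) -> int:
    """Total tokens when every query re-reads the raw documents."""
    return queries * raw_tokens_per_query


def tokens_compiled(queries: int, compile_tokens: int, wiki_tokens_per_query: int) -> int:
    """Total tokens when the wiki is compiled once and then queried."""
    return compile_tokens + queries * wiki_tokens_per_query


def break_even_queries(compile_tokens: int, raw_tokens_per_query: int,
                       wiki_tokens_per_query: int) -> int:
    """Smallest number of queries at which the compiled wiki is strictly cheaper."""
    saving_per_query = raw_tokens_per_query - wiki_tokens_per_query
    if saving_per_query <= 0:
        raise ValueError("the wiki must be cheaper per query to ever break even")
    return compile_tokens // saving_per_query + 1


# Hypothetical numbers: a wiki query costs 95% less than a raw query,
# and the one-time compilation costs as much as four raw queries.
RAW, WIKI, COMPILE = 100_000, 5_000, 400_000
print(break_even_queries(COMPILE, RAW, WIKI))  # 5
print(tokens_reprocessing(5, RAW), tokens_compiled(5, COMPILE, WIKI))  # 500000 425000
```

Under these assumed numbers the compilation pays for itself on the fifth query. The real break-even point depends on how expensive the first build is and how often the wiki has to be rebuilt.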

→ raw folder: immutable source documents the LLM reads but never modifies

→ wiki folder: structured markdown pages with cross-references the LLM builds and maintains

→ index file: a lightweight map the model loads first, pulling only the relevant pages

→ the 75% finding: LLMs spend 75% of tokens on structural understanding, not answers; the wiki eliminates that cost after the first build
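The index-first lookup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the layout from the post: the file name `index.md`, the `filename: title` line format, and the helper names are all hypothetical, and the keyword match only stands in for the model choosing pages from the index. The compilation step, where the LLM turns the raw folder into wiki pages, is not shown.

```python
from pathlib import Path


def write_index(wiki_dir: Path) -> Path:
    """Write index.md with one 'filename: title' line per wiki page."""
    entries = []
    for page in sorted(wiki_dir.glob("*.md")):
        if page.name == "index.md":
            continue  # the index does not list itself
        text = page.read_text(encoding="utf-8")
        first_line = text.splitlines()[0] if text.strip() else ""
        entries.append(f"{page.name}: {first_line.lstrip('#').strip()}")
    index_path = wiki_dir / "index.md"
    index_path.write_text("\n".join(entries) + "\n", encoding="utf-8")
    return index_path


def select_pages(index_path: Path, query: str) -> list[str]:
    """Return filenames whose index title shares a word with the query."""
    query_words = set(query.lower().split())
    selected = []
    for line in index_path.read_text(encoding="utf-8").splitlines():
        name, _, title = line.partition(": ")
        if query_words & set(title.lower().split()):
            selected.append(name)
    return selected


def build_context(wiki_dir: Path, query: str) -> str:
    """Load the index first, then read only the pages it points to."""
    pages = select_pages(wiki_dir / "index.md", query)
    return "\n\n".join((wiki_dir / name).read_text(encoding="utf-8") for name in pages)
```

With a `wiki/` folder holding `pricing.md` and `hiring.md`, calling `write_index(wiki)` and then `build_context(wiki, "pricing")` would return only the pricing page, so the prompt carries one page plus the small index instead of every source document. The `raw/` folder is never touched at query time.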

The gap between people re-processing their own documents on every query and people querying a compiled wiki at 95% lower cost is not technical complexity.

It is two folders and thirty minutes of setup.

Bookmark so you do not lose it!

Follow @neil_xbt for more AI engineering intelligence that shows you what the 16-million-view folder structure actually requires to build.

Similar Articles

@PawelHuryn: 187,000 people have starred one CLAUDE.md file on GitHub. A CLAUDE.md loads on every turn. Every line is rent you pay o…

@Asteri_eth: Karpathy found a way to reduce token consumption by 90% The problem is that the LLM re-reads the same files over and ov…

@DataChaz: STOP BURNING YOUR TOKENS! If you use Claude Code, you are probably wasting 80% of your context window. I found 10 ace t…

@tom_doerr: Reduces Claude Code and Cursor token costs by 60-95% https://github.com/yvgude/lean-ctx

@_avichawla: Claude Code used 3x fewer tokens with one change: - Before: 10.4M tokens · 10 errors · $9.21 - After: 3.7M tokens · 0 e…
