@LufzzLiz: https://x.com/LufzzLiz/status/2058542686551028006

X AI KOLs Timeline 05/24/26, 01:36 PM Tools

llm-wiki knowledge-management open-source tutorial wiki code-analysis claude-code

Summary

This article describes the LLM Wiki paradigm, in which an LLM incrementally maintains a persistent markdown wiki, and open-sources a Claude Code skill along with several examples, including x-algorithm-wiki, to help developers automatically build a trusted architecture wiki from their project's source code.

https://t.co/0dM0v8PDbF

Build a Trusted Architecture Wiki for Any Open Source Project: LLM Wiki Paradigm Practical Tutorial

Introduction

A few days ago, some friends on Twitter saw my recent wiki projects and asked me to share how I built them. While organizing the materials, I realized that a systematic, reusable article would serve everyone better than explaining it verbally again and again.

This article aims to answer three questions:

Why did I spend time making these wikis? Aren’t traditional README, Notion, or Feishu documents enough?
How can you turn the source code of an open source project into a readable, searchable, and trustworthy wiki?
Can it be reused? After reading this, can you copy the workflow and build one for the project you care about?

Let me state the conclusion upfront:

My purpose in building wikis is simple: to enable LLMs to provide trustworthy analysis grounded in the source code. Code is also constantly updated and iterated, and llm-wiki lets you keep every chain of relationships between modules up to date as it changes.

LLMs have loosened the fundamental contradiction of knowledge management.

This article will share and open source: the skill based on llm-wiki, and the various wikis and wiki-web we have built.

For decades, all “personal knowledge base” movements have failed on the same thing: maintenance costs grow superlinearly with scale, and eventually people give up.

RAG seemed to solve it, but it retrieves and synthesizes from scratch each time — knowledge is not accumulated, there’s no compounding. In early 2026, Karpathy proposed a new paradigm: let an LLM incrementally maintain a persistent markdown wiki. Cross-references are pre-built, contradictions are pre-annotated, and synthesis is pre-completed. Humans are responsible for curation and questions, while the LLM handles all summaries, links, and archiving.

After reading Karpathy’s gist, I immediately started experimenting. Four days later I built the first wiki (Hermes-Wiki (https://github.com/cclank/Hermes-Wiki)), and six days later the second (OpenClaw-wiki). So far I have built seven wiki projects:

| Project | Target | Scale | GitHub |
| --- | --- | --- | --- |
| lanshu-wiki-skill | Abstracts the entire paradigm into a Claude Code skill; a newbie can get started in 5 minutes | 1 skill + 2 schema (personal KB / code repo wiki) | cclank/lanshu-wiki-skill |
| x-algorithm-wiki | xAI’s open source X/Twitter For You recommendation algorithm | 34 pages / 6,800+ lines | cclank/x-algorithm-wiki |
| Hermes-Wiki | Nous Research’s hermes-agent (AI Agent framework) | 45 pages / 26 changelogs | cclank/Hermes-Wiki |
| OpenClaw-wiki | OpenClaw multi-channel AI gateway | 41 pages / 21,800+ lines | (internal) |
| lanshu-wiki-web | Online rendering of the first three wikis via a Next.js site | 1 Web app | cclank/lanshu-wiki-web |
| llm-video-wiki | Migrating the wiki paradigm to the video domain (experimental) | scaffold | - |
| personal wiki | My own reading/thinking sedimentation | private | - |

They are already useful in my daily development:

Looking up any module of Hermes Agent: reading the wiki is faster than reading the source code;
Explaining the X algorithm to colleagues: sending a wiki link is better than talking for an hour;
Even subtle details like “is today’s understanding of the source code contradictory to yesterday’s?” — the wiki can tell me automatically.

The rest of the article is divided into two parts: first the principles, then a complete walkthrough using x-algorithm-wiki as a case study.

Principles

1. Three-Layer Architecture: raw / wiki / schema

All wikis follow the same abstraction:

raw/    # Original content (source code, PDF, web snapshots), immutable, LLM reads only
wiki/   # Structured notes maintained by LLM (concept pages, entity pages, synthesis pages)
schema/ # Human-machine co-created rules (CLAUDE.md / SCHEMA.md), defining page format, tags, workflow

Each layer has its role:

raw is the anchor of truth — any time there’s doubt, you can go back to the original text
wiki is “compiled” knowledge — cross-references, comparisons, contradictions are all pre-annotated
schema is the “syllabus” that humans give to the LLM, telling it how to name, tag, and when to create new pages
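The “immutable” promise of raw/ can be checked mechanically. The following is a minimal Python sketch and not part of lanshu-wiki-skill: the function names `snapshot` and `changed_files` are hypothetical, and using SHA-256 digests to detect edits is an assumption made for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw/ to the SHA-256 digest of its contents."""
    return {
        p.relative_to(raw_dir).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return files that were modified or deleted since the earlier snapshot.

    Newly added files are not reported: ingest is allowed to add originals,
    it is only forbidden to touch existing ones.
    """
    return sorted(name for name, digest in before.items() if after.get(name) != digest)
```

Taking a snapshot before an ingest and comparing it afterwards would flag any original that was edited in place.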

2. Two Paradigms

| Paradigm | Applicable | Directory Structure |
| --- | --- | --- |
| Personal Knowledge Base (Karpathy original) | Reading notes, cross-domain sedimentation | 5 layers: raw / sources / entities / concepts / syntheses |
| Code Repository Wiki (our simplified version) | Tracking open source project architecture | 3 layers: concepts / entities / changelog |

Common skeleton:

index.md: A flat navigation for LLM, one page per line with a summary
log.md: Append-only log of all operations, key for cross-session continuity
SCHEMA.md or CLAUDE.md: Page format, tag system, operation rules
[[wikilinks]]: Obsidian-compatible bidirectional links

3. Three Core Operations

3.1 Ingest: Convert one piece of material into 5-15 wiki pages:

Read the original thoroughly (not just the title)
Discuss key points with people (to confirm correct understanding)
Search index for duplicates to avoid recreating pages
Save the original in raw/ (never modify)
Create a sources/ summary page, mark reliability (peer-reviewed / official / expert / social / unknown)
Must read the full text of related pages before deciding how to update (forbidden to change based only on index summaries)
Extract entities, concepts; create syntheses if necessary
Synchronize index.md + log.md
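The “search index for duplicates” step can be approximated with fuzzy string matching. This is a rough sketch under stated assumptions: it assumes index.md lists pages as `[[page-name]]` wikilinks, and the helper names `index_pages` and `similar_pages` plus the 0.6 similarity cutoff are illustrative choices, not rules taken from the skill.

```python
import difflib
import re

# Captures the page name of a [[wikilink]], ignoring any |alias or #heading part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def index_pages(index_text: str) -> list[str]:
    """Extract every page name referenced in index.md."""
    return [m.group(1).strip() for m in WIKILINK.finditer(index_text)]


def similar_pages(candidate: str, index_text: str, cutoff: float = 0.6) -> list[str]:
    """Return existing page names that look like the proposed new page name."""
    existing = [name.lower() for name in index_pages(index_text)]
    return difflib.get_close_matches(candidate.lower(), existing, n=5, cutoff=cutoff)
```

A non-empty result is a hint to read the existing page in full before creating a new one. String similarity will not catch true synonyms with unrelated spellings, so it complements human review rather than replacing it.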

3.2 Query: Answer questions by synthesizing multiple pages; if “heavy” enough, archive:

Answer synthesizes ≥3 sources, cross-domain comparison, reveals new connections → automatically archive as synthesis

3.3 Lint: Automatic scan for 9 types of problems:

Contradictions (different conclusions on the same topic) / outdated pages / orphan pages / dangling links / near-synonym tags / summary accuracy / unsolved problem summary…
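Two of the lint categories above, dangling links and orphan pages, are purely structural and easy to sketch. The Python below is an illustration only, not the skill's actual lint: it assumes wikilinks target a page by its file name without the `.md` extension (the Obsidian convention), and the `lint_links` name and `SKELETON` set are hypothetical.

```python
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|#]+)")
# Navigation and rule files: never reported as orphans, and their links
# do not count as "real" inbound links (index.md links to everything).
SKELETON = {"index", "log", "SCHEMA", "CLAUDE"}


def lint_links(wiki_dir: Path) -> tuple[set[tuple[str, str]], set[str]]:
    """Return (dangling links as (page, target) pairs, orphan page names)."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    dangling: set[tuple[str, str]] = set()
    linked: set[str] = set()
    for name, path in pages.items():
        for match in WIKILINK.finditer(path.read_text(encoding="utf-8")):
            target = match.group(1).strip()
            if target not in pages:
                dangling.add((name, target))
            elif name not in SKELETON and target != name:
                linked.add(target)
    orphans = set(pages) - linked - SKELETON
    return dangling, orphans
```

Semantic checks such as contradictions or summary accuracy still need the LLM; this kind of script only covers what can be verified without reading for meaning.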

Hard rule throughout the three operations: All conclusions on every page must be traceable to source code file:line or original URL; if you can’t do it, don’t write it. This rule is the core that distinguishes wiki from “AI free summary”.
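The traceability rule is also partly machine-checkable: an anchor that points to a missing file or a line past the end of the file is certainly stale. Here is a minimal sketch, assuming anchors are written as `relative/path.ext:LINE` relative to the locked source checkout. The regular expression and the `broken_anchors` name are assumptions for illustration, and the pattern would misread a URL with a port number as an anchor.

```python
import re
from pathlib import Path

# Assumed anchor shape: a relative path with an extension, a colon, a line number.
ANCHOR = re.compile(r"([\w./-]+\.[A-Za-z0-9]+):(\d+)")


def broken_anchors(page_text: str, source_root: Path) -> list[str]:
    """List anchors whose file is missing or whose line number is out of range."""
    broken = []
    for match in ANCHOR.finditer(page_text):
        rel_path, line_no = match.group(1), int(match.group(2))
        target = source_root / rel_path
        if not target.is_file():
            broken.append(f"{rel_path}:{line_no} (file not found)")
            continue
        with target.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        if not 1 <= line_no <= line_count:
            broken.append(f"{rel_path}:{line_no} (file has {line_count} lines)")
    return broken
```

A passing check only shows that the anchor exists, not that the cited line supports the conclusion, so manual cross-verification of sampled pages is still required.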

4. Implementation with Claude Code Skill

After turning the above rules into a Claude Code skill, triggering becomes very light:

~/.claude/skills/wiki/
  SKILL.md              # Thin routing layer (trigger words + command decision tree + new wiki guide)
~/wiki/
  CLAUDE.md             # Thick schema (authoritative rules for page format, tags, ingest workflow)

Command system: /wiki add, /wiki ingest, /wiki query, /wiki lint, /wiki status, /wiki deprecate, /wiki retract, /wiki merge. It also supports natural language: say “add this to the wiki” and it understands.

Cross-session continuity is maintained by each new session reading three files: CLAUDE.md → index.md → last 10 lines of log.md.
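To make the three-file bootstrap concrete, here is a small Python sketch of what assembling that context might look like. In practice the skill instructs Claude Code to read the files itself; the `session_context` function is hypothetical and only illustrates the reading order and the 10-line log tail.

```python
from pathlib import Path


def session_context(wiki_dir: Path, log_tail: int = 10) -> str:
    """Concatenate CLAUDE.md, index.md and the last `log_tail` lines of log.md."""
    schema = (wiki_dir / "CLAUDE.md").read_text(encoding="utf-8")
    index = (wiki_dir / "index.md").read_text(encoding="utf-8")
    log_lines = (wiki_dir / "log.md").read_text(encoding="utf-8").splitlines()
    recent_log = "\n".join(log_lines[-log_tail:])
    # Order matters: rules first, then the page map, then the latest operations.
    return "\n\n".join([schema, index, recent_log])
```

Reading only the tail of log.md keeps the startup cost flat no matter how long the append-only log grows.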

Practice: Five-Step Tutorial for Building a Wiki for an Open Source Project

Using x-algorithm-wiki (https://github.com/cclank/x-algorithm-wiki) as the case study (online preview (https://lanshu-wiki-web.lank.workers.dev/wiki/cclank/x-algorithm-wiki#index)).

The wiki has 34 pages and about 6,800 lines, and its complete backbone was built in five steps. The same steps apply to any open source project.

Step 1 · Project Initiation + Lock Source Code

# Clone source, lock commit or tag (to avoid subsequent ingest drifting from source)
git clone https://github.com/xai-org/x-algorithm /tmp/x-algorithm
cd /tmp/x-algorithm && git checkout 0bfc279

# Create wiki repo + copy schema template
mkdir -p ~/code/x-algorithm-wiki && cd "$_"
git init && mkdir -p concepts entities changelog
cp ~/code/lanshu-wiki-skill/schema/wiki-code-repo-SCHEMA.md ./SCHEMA.md
echo "# Wiki Index" > index.md && echo "# Wiki Log" > log.md

Open SCHEMA.md, fill in two ⚠️ fields: Domain description (project + commit + core subsystems covered) + Tag Taxonomy (project-specific tag system, e.g., recsys / candidate-pipeline / ranking, etc.).

Step 2 · Let LLM Read Through Source Code and Generate First Batch of Pages

Open Claude Code, one prompt to launch (replace the content in angle brackets with your project):

Read SCHEMA.md. Following its rules, create an architecture wiki for <absolute path to source code>. Start with the 5 most important modules: <list your project's 5 most important directories or subsystems>. Hard rule: every conclusion must carry a file:line anchor; if you can’t anchor a claim, don’t write it. Finally, where something is obscure, you may also add a plain-language explanation, based only on confirmed facts.

For x-algorithm-wiki, we filled in home-mixer / candidate-pipeline / phoenix / thunder / grox; just replace with yours.

A single run outputs 10-20 concept and entity pages, 2,000-5,000 lines in total. This step establishes the core discipline of traceable conclusions, and all subsequent pages follow it.

Step 3 · Lint Full Verification

After each batch, run /wiki lint:

Dangling [[wikilink]] / missing frontmatter fields / near-synonym tags → auto scan
Randomly pick 2-3 pages and cross-verify against the full source code → the only way to find hidden errors like “LLM mixes up details of module A and module B”
If wiki conclusion conflicts with official documentation → write it in changelog instead of quietly editing (preserve audit trail)

In x-algorithm-wiki, we checked 29 pages with 482 anchors, found 3 discrepancies (mini model size, scorer count, candidate isolation mask) — all went into changelog.

Step 4 · Two-Layer Navigation ⭐ Key Upgrade

Up to Step 3, the wiki only serves engineers. When shared with product colleagues, the feedback was often “can’t understand” — technical pages are full of source code terminology.

Solution: Add a guide/ directory, pairing “technical page + plain language page”:

| Technical page (concepts/) | Plain language page (guide/) |
| --- | --- |
| system-architecture.md (source-code level) | how-it-works.md (explain the five major components using analogies, no code) |
| candidate-selection.md (Selector trait + TopKScore) | how-posts-are-picked.md (draft pick analogy) |
| Others | glossary.md / faq.md / operating-myths.md, etc. |

Plain language pages must also follow the traceability rule: each plain language page ends with a “Sources” table that maps every core conclusion to the corresponding technical page and source code anchor. Otherwise, plain pages degrade into unconstrained AI improvisation.
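A quick way to catch guide pages that skipped the “Sources” table is a scan like the one below. It is a sketch under an assumption the article does not confirm: that the table sits under a markdown heading containing the word “Sources”. The `guides_missing_sources` name is hypothetical.

```python
from pathlib import Path


def guides_missing_sources(wiki_dir: Path, marker: str = "Sources") -> list[str]:
    """Return guide/ pages that have no heading mentioning the Sources marker."""
    missing = []
    for page in sorted((wiki_dir / "guide").glob("*.md")):
        lines = page.read_text(encoding="utf-8").splitlines()
        has_sources = any(
            line.lstrip().startswith("#") and marker.lower() in line.lower()
            for line in lines
        )
        if not has_sources:
            missing.append(page.name)
    return missing
```

The scan only confirms that the section exists; whether each row really points to the right technical page still has to be verified by reading.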

This step upgrades the wiki from “engineer internal document” to “external promotion material”: the same wiki serves two audiences.

Step 5 · Continuous Polish

The backbone is now done. Long-term maintenance involves just two things:

Regular lint: Fix broken anchors as the project evolves, add term explanations
Reader feedback loop: Good questions from users should be immediately added to faq; new topics get a new page

Optional Extension: Web Rendering

At this point, the wiki is a pure markdown repository, viewable in GitHub or Obsidian. If you want to share it with non-technical people, feed the repository into lanshu-wiki-web (https://github.com/cclank/lanshu-wiki-web) — it instantly becomes a reading site with D3 knowledge graph, Cmd+K full-text search, Mermaid rendering, and [[wikilink]] internal navigation.

See the online version of x-algorithm-wiki: lanshu-wiki-web.lank.workers.dev/wiki/cclank/x-algorithm-wiki (https://lanshu-wiki-web.lank.workers.dev/wiki/cclank/x-algorithm-wiki#index)

Pitfalls / Tips for Newcomers

Multiple names for the same concept is the #1 killer: Hermes-Wiki at one point had 8 names for kanban, 6 names for ralph-loop (different daily-syncs each invented their own). Before creating a new page, grep index.md for near-synonym pages first
Not reading the full text before editing is the #2 killer: editing a page based only on the index summary loses information. Schema enforces “read full text before updating”
Lint tags monthly: If you don’t deduplicate, after a year 10 out of 30 tags will have overlapping semantics
changelog + log.md is the audit trail: Each operation is appended; tracking down issues is not stressful. This is where wiki beats Notion / Feishu documents
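The audit trail in the last point depends on log.md being strictly append-only. As an illustration, an operation could be recorded as follows. The entry format (date, operation in brackets, free-text detail) and the `append_log` helper are assumptions, since the actual format is defined by each wiki's schema.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_log(log_path: Path, operation: str, detail: str,
               today: Optional[date] = None) -> str:
    """Append one entry to log.md without rewriting earlier entries."""
    entry = f"- {(today or date.today()).isoformat()} [{operation}] {detail}"
    # Mode "a" only ever adds to the end of the file, preserving history.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry + "\n")
    return entry
```

Because nothing is ever rewritten, `git diff` on log.md should show additions only, which makes any accidental edit to history easy to spot.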

When NOT to Use LLM Wiki

To be objective, here are the boundaries:

Scale upper limit ~2000 pages: index.md must fit all page summaries for the LLM to read in one go. Beyond ~1000 pages, consider splitting into sub-wikis by domain; beyond ~2000 pages, this model starts to break
Concurrent collaboration is still rough: Multiple people ingesting the same source simultaneously will create multiple source pages, requiring /wiki merge afterwards; the schema must also be agreed upon in advance. The current skill focuses on single person / small team scenarios
Token cost is not negligible: Having the LLM read through 50,000 lines of source code for the first ingest costs approximately 100k-500k input tokens (at Claude Sonnet 4.5 pricing, about $0.3-$1.5). Adding multiple rounds of lint and subsequent ingests, the lifecycle of a medium project wiki costs on the order of $5-$30. Cheaper than engineer hours, but not free
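The first-ingest estimate above can be reproduced with simple arithmetic. The sketch below assumes an input price of $3 per million tokens, which is the rate implied by the article's own figures (100k tokens for about $0.3); output tokens are ignored, so treat the result as a lower bound and check current pricing before relying on it.

```python
# Assumed input price, implied by the article's numbers; verify against current pricing.
USD_PER_MILLION_INPUT_TOKENS = 3.00


def input_cost_usd(tokens: int,
                   usd_per_million: float = USD_PER_MILLION_INPUT_TOKENS) -> float:
    """Estimate the cost of sending `tokens` input tokens to the model."""
    return tokens / 1_000_000 * usd_per_million


low = input_cost_usd(100_000)
high = input_cost_usd(500_000)
print(f"first ingest: ${low:.2f} to ${high:.2f}")  # first ingest: $0.30 to $1.50
```

The lifecycle figure is higher than a single pass because every lint round and follow-up ingest re-reads pages and source files.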

Skill Open Source & Practical Advice

The most practical step for you is just one: build your own wiki. The simplest entry path:

# 1. Install the skill
git clone https://github.com/cclank/lanshu-wiki-skill.git ~/code/lanshu-wiki-skill
mkdir -p ~/.claude/skills/wiki
ln -sf ~/code/lanshu-wiki-skill/SKILL.md ~/.claude/skills/wiki/SKILL.md

# 2. Personal knowledge base starts like this
mkdir -p ~/wiki/{raw,sources,entities,concepts,syntheses,reports,assets}
cp ~/code/lanshu-wiki-skill/schema/wiki-personal-CLAUDE.md ~/wiki/CLAUDE.md
cd ~/wiki && echo "# Wiki Index" > index.md && echo "# Wiki Log" > log.md && echo "# Wiki Inbox" > inbox.md

# 3. Architecture wiki for open source projects starts like this
PROJECT="<project>"   # replace <project> with your project's name
mkdir -p ~/code/"$PROJECT"-wiki && cd ~/code/"$PROJECT"-wiki
mkdir -p concepts entities changelog
cp ~/code/lanshu-wiki-skill/schema/wiki-code-repo-SCHEMA.md ./SCHEMA.md
echo "# Wiki Index" > index.md && echo "# Wiki Log" > log.md

Then open Claude Code and use natural language: “add to wiki” or “according to SCHEMA, create architecture wiki for <your project>”.

Detailed instructions, FAQ, comparison tables, two-layer navigation guide, five-step practical guide are all in the repository README: github.com/cclank/lanshu-wiki-skill (https://github.com/cclank/lanshu-wiki-skill)

To see the effect online, just feed any wiki repo to lanshu-wiki-web.lank.workers.dev (https://lanshu-wiki-web.lank.workers.dev).

Acknowledgments and Legacy

The core idea of this paradigm is not my invention; this article just industrializes it.

Vannevar Bush proposed the Memex (https://en.wikipedia.org/wiki/Memex) in his 1945 article As We May Think (https://en.wikipedia.org/wiki/As_We_May_Think): a personal, human-curated desktop knowledge base full of “associative trails” between documents.

Bush’s vision was closer to the essence of knowledge management than the later direction of the World Wide Web: private, deep, continuously curated by a human. He left an unresolved question for posterity: “Who will maintain these associations?” That question crushed all personal knowledge base attempts for the next 80 years (card boxes, Zettelkasten, Notion, Obsidian… humans eventually give up maintaining).

Andrej Karpathy in his April 2026 LLM Wiki gist (https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) (only 75 lines) gave the answer: let the LLM do the maintenance.

The three-layer architecture (raw / wiki / schema), the three operations (Ingest / Query / Lint), the index + log design, the argument that “knowledge should be compiled rather than interpreted each time” — all come from those 75 lines. This article and all the wiki tools I’ve built are engineering implementations of that abstract idea.

My highest respect. I suggest reading the two together: to understand the idea, read Karpathy’s original (https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) (only 10 minutes); to build a working wiki, read this article + use lanshu-wiki-skill (https://github.com/cclank/lanshu-wiki-skill). The former gives you why, the latter gives you how.

You are welcome to start a discussion in the repository's issues, or share your wiki link with me. The more people do this, the more fun it becomes.

A better reading experience can be found at: https://mp.weixin.qq.com/s/N2HP_sxYnUqESiOiKPPamg