@khemaridh: https://x.com/khemaridh/status/2070294198469968237
Summary
The author describes building a self-improving 'company brain' using Claude Code and a wiki of markdown files, inspired by Andrej Karpathy, to organize and query training materials efficiently.
Cached at: 06/26/26, 10:10 AM
I built a self-improving “company brain” in Claude Code
When a new model gets released, I immediately point it to a giant folder filled with a ton of context and start asking really hard questions.
And once I started using Claude Cowork I noticed that (unlike Claude Projects) there was no longer a file limit.
Hallelujah!
But unfortunately that didn’t mean I could just use bloated file and folder structures — LLMs still struggle with large numbers of files due to their limited context window.
And this problem is exacerbated by knowledge work because you’re constantly incorporating new information.
I’ve learned that LLMs and agents use a “brute force” approach to navigating large amounts of context. This can work, but it costs a lot of tokens and can miss critical information.
Thankfully there’s a solution, courtesy of AI researcher Andrej Karpathy. Karpathy came up with a simple yet elegant approach to creating a self-improving “company brain” using folders, markdown files and Claude skills.
Structuring my Wiki LLM
The entire documentation for this project (officially dubbed “Wiki LLM”) can be accessed via GitHub. Here’s Karpathy’s description:
Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn’t just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.
To test out this approach, I created this wiki to organize and query all of my training materials from the past two years. Here’s the schema I used:
Let’s first go through the ingestion process:
- **Sources:** These are all the documents that are regularly created by me either directly (e.g. a tutorial on my blog) or indirectly (e.g. a Granola transcript of a training session).
- **Raw files:** Saved as immutable objects in their original format. This gives me the ability to always revert.
- **File ingestion:** Converts the file to markdown for LLM-readability (using a simple Python script).
- **Human-in-the-loop:** I use the “interview method” to layer in my judgement of the underlying material.
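The raw-file and conversion steps above can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: the `raw/` and `markdown/` folder names and the `ingest` function are hypothetical, and it assumes the source is already plain text (a PDF or slide deck would need a real converter).

```python
import hashlib
import shutil
from pathlib import Path


def ingest(source: Path, wiki_root: Path) -> Path:
    """Copy a source into raw/ unchanged, then write a markdown version of it."""
    raw_dir = wiki_root / "raw"            # hypothetical folder name
    md_dir = wiki_root / "markdown"        # hypothetical folder name
    raw_dir.mkdir(parents=True, exist_ok=True)
    md_dir.mkdir(parents=True, exist_ok=True)

    # Keep the original byte-for-byte and never overwrite an existing raw file,
    # so it is always possible to revert to the source.
    raw_copy = raw_dir / source.name
    if not raw_copy.exists():
        shutil.copy2(source, raw_copy)

    # Record a checksum so later edits to the raw copy would be detectable.
    digest = hashlib.sha256(raw_copy.read_bytes()).hexdigest()

    # Assumption: the source is plain text. Other formats need a converter here.
    text = raw_copy.read_text(encoding="utf-8", errors="replace")
    md_path = md_dir / (source.stem + ".md")
    md_path.write_text(
        f"# {source.stem}\n\n"
        f"- raw: raw/{source.name}\n"
        f"- sha256: {digest}\n\n"
        f"{text.strip()}\n",
        encoding="utf-8",
    )
    return md_path
```

The human-in-the-loop step is deliberately left out of the sketch: it happens after conversion, when judgement about the material gets layered in.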
Next, there’s the self-improvement process (all in markdown):
- **Document page:** This is the most important step and adds metadata, a structured summary, recurring concepts and new ideas.
- **Index page:** A running catalog that appends one line per source so that every page is discoverable in a single place.
- **Link concepts:** This step connects the new page to the rest of the wiki using hyperlinks (to help the LLM navigate).
- **Log page:** Chronological diary of all actions taken by the Wiki.
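To make the index and log pages concrete, here is a small sketch of the bookkeeping they imply. The file names `index.md` and `log.md`, the `pages/` folder and the `register_page` function are assumptions for illustration; in the workflow described here the LLM writes these entries itself.

```python
from datetime import date
from pathlib import Path


def register_page(wiki_root: Path, page: str, summary: str, today: date) -> None:
    """Add one index line for a page (once) and append a dated log entry."""
    index = wiki_root / "index.md"   # hypothetical file name
    log = wiki_root / "log.md"       # hypothetical file name

    # One line per source, with a relative link so the page is discoverable.
    entry = f"- [{page}](pages/{page}.md): {summary}\n"
    existing = index.read_text(encoding="utf-8") if index.exists() else "# Index\n\n"
    if entry not in existing:
        index.write_text(existing + entry, encoding="utf-8")

    # The log is append-only: every action gets a new line, even a repeat.
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"- {today.isoformat()}: ingested {page}\n")
```

Keeping the index idempotent and the log append-only means a re-run of ingestion shows up in the diary without duplicating the catalog.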
The filesystem is the database
One of the strengths of LLMs is that they like to ingest information to improve their outputs and decision-making capabilities.
The process above gives an LLM a very clear map to help it find the right context in an efficient manner. In his post File systems are the new primitive for AI Agents, Carter Rabasa explains the strength of this approach:
A filesystem isn’t a complete memory architecture, and I’m not going to pretend it is. But it’s an unusually good substrate for the part of memory agents struggle with most: durable, inspectable, revisable working context. Files give the model names, paths, hierarchy, timestamps, permissions, and conventions it already knows how to reason about.
The hyperlinks help LLMs navigate the ideas
You’ll see that all of these pages are interconnected using a hyperlink system. TBH, I need to better understand how this works in practice, but my assumption is that the LLM will first crawl the index page, then the summary pages and finally the raw page.
Here’s how ChatGPT explained it back to me:
A hyperlink does not magically load the linked page into the model’s brain. It gives the agent a clear instruction path: “this page exists, it is related, and you may want to read it next.”
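One way to picture that "instruction path" is a breadth-first walk over the links, starting from the index page. The sketch below is only a model of the assumed reading order, not how Claude actually decides what to open; `reading_order` and the link pattern are hypothetical, and only relative links to `.md` files are followed.

```python
import re
from collections import deque
from pathlib import Path

# Matches markdown links such as [Token costs](pages/token-costs.md).
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)\)")


def reading_order(start: Path, max_pages: int = 20) -> list[Path]:
    """Breadth-first walk over relative markdown links, starting at `start`."""
    first = start.resolve()
    seen = {first}
    queue = deque([first])
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        if not page.exists():
            continue  # dangling or external link: nothing to read
        order.append(page)
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            linked = (page.parent / target).resolve()
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return order
```

The `max_pages` cap stands in for the context budget: the walk stops once enough pages have been read, which is the efficiency the index and links are meant to buy.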
A skill puts this all together
Ingestion is a repeatable process that runs as a Claude Skill. The skill also handles the conversion of files into markdown.
I can see a world where this step is fully automated — but until I understand how this works I will keep myself as the human in the loop.
A clean-up process (called “linting”)
This is the final maintenance step. It goes through the wiki, looks for duplicates, orphaned concepts and ideas that can be combined.
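A mechanical first pass at this kind of lint could look like the sketch below. It is an assumption-laden stand-in: `lint`, the 0.8 threshold and the `index.md`/`log.md` file names are hypothetical, and comparing file names with `difflib` is a crude proxy for the semantic merging an LLM would do.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import Path

# Captures the target of markdown links that point at .md files.
LINK = re.compile(r"\]\(([^)#\s]+\.md)\)")


def lint(wiki_root: Path, threshold: float = 0.8):
    """Return (orphan page names, pairs of pages with near-identical names)."""
    pages = sorted(wiki_root.rglob("*.md"))

    # Collect every page that some other page links to.
    linked = set()
    for page in pages:
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            linked.add((page.parent / target).resolve())

    # Orphans: pages nothing links to (index and log are entry points, so skip them).
    skip = {"index.md", "log.md"}
    orphans = [p.name for p in pages if p.resolve() not in linked and p.name not in skip]

    # Merge candidates: page names that are almost the same string.
    merges = [
        (a.name, b.name)
        for a, b in combinations(pages, 2)
        if SequenceMatcher(None, a.stem.lower(), b.stem.lower()).ratio() >= threshold
    ]
    return orphans, merges
```

The output is a list of suggestions only; deciding whether two concepts really should be combined stays with the human or the LLM reviewing them.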
Ultimately I’ll turn this into its own skill and then run it via a scheduled task.
Here’s an example of some related concepts that Claude suggested I merge.
Querying the Wiki
Ok, now it’s time for the real talk.
What’s the real point of all this? Is it performative? Or does it actually move the needle for LaTour AI and our clients?
I’ll write a longer post on “evals” — methods for consistently “evaluating” the outputs of LLMs.
But for now, I’m using it to create customized curriculums, presentations and marketing materials.
Here’s an example of a pretty complicated task that generated a very good result using the wiki:
Take the detailed syllabus and update it to incorporate the latest ideas, particularly in the most recent powerpoint and some of the latter trainings. Keep the 3 sections (Foundations, Cowork, Agentic Coding) and have 7-10 headers as modules, then 2-5 bullet points as descriptions. Then one bullet point for examples, outcomes,. This will be a draft that becomes the “latest” version of the syllabus. Some top of mind changes. More emphasis on skills. A section on tools at the end. Build out hallucination and sycophancy. Cover token costs. Once again, the master deck is the closest to the source of truth and then expand accordinlgy
Future improvements for the wiki
It’s been a week of ingesting materials and here’s the upcoming roadmap to assess how well the wiki is working:
• More comprehensive “evals” against just dumping the raw materials into Claude Cowork
• Automating the collection of documents into the “raw” folder
• Finalizing the “linting” skill
• Creating a separate sub-table for concepts
• Creating a skill that assesses “knowledge gaps” in the training
I’m excited because I see glimmers of a process that takes proprietary knowledge and improves itself in an automatic way.
Time will tell if it really makes a difference.