@khemaridh: https://x.com/khemaridh/status/2070294198469968237


Summary

The author describes building a self-improving 'company brain' using Claude Code and a wiki of markdown files, inspired by Andrej Karpathy, to organize and query training materials efficiently.

https://t.co/OoTmbqs5Ay

Cached at: 06/26/26, 10:10 AM

I built a self-improving “company brain” in Claude Code

When a new model gets released, I immediately point it to a giant folder filled with a ton of context and start asking really hard questions.

And once I started using Claude Cowork I noticed that (unlike Claude Projects) there was no longer a file limit.

Hallelujah!

But unfortunately that didn’t mean I could just use bloated file and folder structures — LLMs still struggle with large numbers of files due to their limited context window.

And this problem is exacerbated by knowledge work because you’re constantly incorporating new information.

I’ve learned that LLMs and agents take a “brute force” approach to navigating large amounts of context. That can work, but it costs a lot of tokens and can still miss critical information.

Thankfully there’s a solution, courtesy of AI researcher Andrej Karpathy. He came up with a simple yet elegant approach to creating a self-improving “company brain” using folders, markdown files and Claude skills.


Structuring my Wiki LLM

The entire documentation for this project (officially dubbed “Wiki LLM”) is available on GitHub. Here’s Karpathy’s description:

Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn’t just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

To test out this approach, I created this wiki to organize and query all of my training materials from the past two years. Here’s the schema I used:

Let’s first go through the ingestion process:

  • **Sources:** These are all the documents that are regularly created by me either directly (e.g. a tutorial on my blog) or indirectly (e.g. a Granola transcript of a training session).

  • **Raw files:** Saved as immutable objects in their original format. This gives me the ability to always revert.

  • **File ingestion:** Converts the file to markdown for LLM-readability (using a simple python script).

  • **Human-in-the-loop:** I use the “interview method” to layer in my judgement of the underlying material.
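The raw-file and ingestion steps above could look something like the sketch below. It is not the author's script: the function names, the folder names and the hash-prefixed file naming are all assumptions made for illustration, and the conversion only handles plain-text sources (a PDF or slide deck would need a real converter).

```python
import hashlib
import shutil
from pathlib import Path


def store_raw(source: Path, raw_dir: Path) -> Path:
    """Copy a source file into raw_dir under a content-hash prefix and mark it read-only."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: the first 12 hex characters of the SHA-256 digest.
    digest = hashlib.sha256(source.read_bytes()).hexdigest()[:12]
    target = raw_dir / f"{digest}-{source.name}"
    if not target.exists():  # identical content is never written twice
        shutil.copy2(source, target)
        target.chmod(0o444)  # read-only, as a reminder that raw files are immutable
    return target


def to_markdown(raw_file: Path, pages_dir: Path) -> Path:
    """Write a markdown copy of a plain-text raw file, headed by a title and a source line."""
    pages_dir.mkdir(parents=True, exist_ok=True)
    text = raw_file.read_text(encoding="utf-8")
    page = pages_dir / f"{raw_file.stem}.md"
    page.write_text(
        f"# {raw_file.stem}\n\nSource: {raw_file.name}\n\n{text.strip()}\n",
        encoding="utf-8",
    )
    return page
```

Keeping the raw copy separate from the markdown page is what makes reverting possible: the page can be regenerated at any time, while the original is never edited.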

Next, there’s the self-improvement process (all in markdown):

  • **Document page:** This is the most important step. It adds metadata, a structured summary, recurring concepts and new ideas.

  • **Index page:** A running catalog that appends one line per source so that every page is discoverable in a single place.

  • **Link concepts:** This step connects the new page to the rest of the wiki using hyperlinks (to help the LLM navigate).

  • **Log page:** Chronological diary of all actions taken by the Wiki.
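As a rough illustration of the index and log pages described above, here is a minimal sketch of the two append operations. The file layout, the line format and both function names are assumptions, not the author's actual schema.

```python
from datetime import date
from pathlib import Path


def append_index_entry(index: Path, page_link: str, title: str, summary: str) -> bool:
    """Append one line for a source to the index; return False if it is already listed."""
    if not index.exists():
        index.write_text("# Index\n\n", encoding="utf-8")
    if f"]({page_link})" in index.read_text(encoding="utf-8"):
        return False  # each page appears in the index only once
    with index.open("a", encoding="utf-8") as f:
        # Hypothetical line format: a markdown link followed by a one-line summary.
        f.write(f"- [{title}]({page_link}): {summary}\n")
    return True


def append_log_entry(log: Path, action: str, when: date) -> None:
    """Append a dated line to the log so entries stay in chronological order."""
    with log.open("a", encoding="utf-8") as f:
        f.write(f"- {when.isoformat()}: {action}\n")
```

Both files are append-only here, which keeps the index a running catalog and the log a diary rather than something the model rewrites on each run.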

The filesystem is the database

One of the strengths of LLMs is that they like to ingest information to improve their outputs and decision-making capabilities.

The process above gives an LLM a very clear map to help it find the right context in an efficient manner. In his post File systems are the new primitive for AI Agents, Carter Rabasa explains the strength of this approach:

A filesystem isn’t a complete memory architecture, and I’m not going to pretend it is. But it’s an unusually good substrate for the part of memory agents struggle with most: durable, inspectable, revisable working context. Files give the model names, paths, hierarchy, timestamps, permissions, and conventions it already knows how to reason about.

The hyperlinks help LLMs navigate the ideas

You’ll see that all of these pages are interconnected using a hyperlink system. TBH, I need to better understand how this works in practice, but my assumption is that the LLM will first crawl the index page, then the summary pages and finally the raw page.

Here’s how ChatGPT explained it back to me:

A hyperlink does not magically load the linked page into the model’s brain. It gives the agent a clear instruction path: “this page exists, it is related, and you may want to read it next.”
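To make the "instruction path" idea concrete, here is a small sketch that extracts markdown links from a page and walks outward from the index, breadth-first. This is only an approximation of the index-then-summaries order I assumed above: a real agent decides which links to follow, while this code simply follows all of them up to a limit. The regex and function names are my own assumptions.

```python
import re
from collections import deque
from pathlib import Path

# Matches markdown links that point at .md files, e.g. [Skills](concepts/skills.md)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def linked_pages(page: Path) -> list[Path]:
    """Return the .md targets a page links to, resolved relative to that page."""
    text = page.read_text(encoding="utf-8")
    return [(page.parent / target).resolve() for target in LINK_PATTERN.findall(text)]


def reading_order(index: Path, max_pages: int = 10) -> list[Path]:
    """Breadth-first walk: the index first, then the pages it links to, and so on."""
    start = index.resolve()
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_pages:
        current = queue.popleft()
        order.append(current)
        for target in linked_pages(current):
            # Skip pages already queued and links that do not resolve to a local file.
            if target not in seen and target.exists():
                seen.add(target)
                queue.append(target)
    return order
```

The `max_pages` cap stands in for the context window: the links say what is related, but something still has to decide how much gets read.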

A skill puts this all together

Ingestion is a repeatable process that runs as a Claude Skill. The skill also handles the conversion of files into markdown.

I can see a world where this step is fully automated — but until I understand how this works I will keep myself as the human in the loop.

A clean-up process (called “linting”)

This is the final maintenance step. It goes through the wiki, looks for duplicates, orphaned concepts and ideas that can be combined.

Ultimately I’ll turn this into its own skill and then run it via a scheduled task.
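Two of the lint checks described above can be approximated without an LLM at all. The sketch below assumes a flat folder of markdown pages and a plain list of concept names, and it uses spelling similarity as a crude stand-in for "ideas that can be combined". The function names and the 0.8 threshold are assumptions; in my setup Claude makes the merge suggestions, not a string matcher.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import Path

LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def find_orphans(wiki_dir: Path, index_name: str = "index.md") -> list[str]:
    """Return pages that no other page links to (the index itself is exempt)."""
    pages = sorted(wiki_dir.glob("*.md"))
    linked = set()
    for page in pages:
        for target in LINK_PATTERN.findall(page.read_text(encoding="utf-8")):
            if Path(target).name != page.name:  # a self-link does not count as inbound
                linked.add(Path(target).name)
    return [p.name for p in pages if p.name not in linked and p.name != index_name]


def merge_candidates(concepts: list[str], threshold: float = 0.8) -> list[tuple[str, str]]:
    """Pair up concept names whose spelling is similar enough to review for merging."""
    pairs = []
    for a, b in combinations(sorted(concepts), 2):
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
            pairs.append((a, b))
    return pairs
```

A check like this only catches near-identical names; concepts that mean the same thing under different wording still need the model, or me, to spot them.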

Here’s an example of some related concepts that Claude suggested I merge.

Querying the Wiki

Ok, now it’s time for the real talk.

What’s the real point of all this? Is it performative? Or does it actually move the needle for LaTour AI and our clients?

I’ll write a longer post on “evals” — methods for consistently “evaluating” the outputs of LLMs.

But for now, I’m using it to create customized curriculums, presentations and marketing materials.

Here’s an example of a pretty complicated task that generated a very good result using the wiki.

Take the detailed syllabus and update it to incorporate the latest ideas, particularly in the most recent powerpoint and some of the latter trainings. Keep the 3 sections (Foundations, Cowork, Agentic Coding) and have 7-10 headers as modules, then 2-5 bullet points as descriptions. Then one bullet point for examples, outcomes. This will be a draft that becomes the “latest” version of the syllabus. Some top of mind changes. More emphasis on skills. A section on tools at the end. Build out hallucination and sycophancy. Cover token costs. Once again, the master deck is the closest to the source of truth and then expand accordingly

Future improvements for the wiki

It’s been a week of ingesting materials and here’s the upcoming roadmap to assess how well the wiki is working:

• More comprehensive “evals” against just dumping the raw materials into Claude Cowork

• Automating the collection of documents into the “raw” folder

• Finalizing the “linting” skill

• Creating a separate sub-table for concepts

• Creating a skill that assesses “knowledge gaps” in the training

I’m excited because I see glimmers of a process that takes proprietary knowledge and improves itself in an automatic way.

Time will tell if it really makes a difference.

