@DerekNee: everyone is talking about agent loops, harnesses, and self-evolving agents. but almost no one is talking about the actu…
Summary
The author argues that a single giant agent is ineffective for running a company, and describes their approach with Matrix, an operating system for autonomous work that organizes agents into a workspace brain, department leads, and scoped workers with proof loops.
Cached at: 06/25/26, 11:18 AM
everyone is talking about agent loops, harnesses, and self-evolving agents.
but almost no one is talking about the actual hard part:
you cannot run a company on one giant agent with every tool, every file, and no accountability. that’s not autonomy. that’s a fog machine.
here’s how we’re building an agent company OS inside Matrix.
—
the stack:
Workspace Brain → Matrix Runtime Orchestrator → Department Verticals → Department Lead Agents → Worker Agent Pool → Proof / Check-in Loop
Matrix is not a chatbot. it’s an operating system for autonomous work.
—
the workspace brain is the company boundary.
it gets loaded with the things a real company actually runs on:
→ product docs
→ codebase context
→ chats, files, goals
→ operating rules
→ prior runs + examples of good work
→ approvals, memory, skills
this isn’t “context.” it’s the shared operating layer. it knows what the company knows, what it’s trying to do, who owns what, what good looks like, and what must be proven before work counts as done.
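a minimal sketch of what "shared operating layer" could mean as a data structure. everything here is an assumption for illustration: `WorkspaceBrain`, its fields, and the two lookup methods are hypothetical names, not Matrix's actual schema. the point it shows is that ownership and required proof are things you can query, not just text sitting in a prompt.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WorkspaceBrain:
    """hypothetical shared operating layer for one workspace (not Matrix's real schema)."""
    goals: List[str] = field(default_factory=list)
    operating_rules: List[str] = field(default_factory=list)
    # who owns what: surface name -> department name
    owners: Dict[str, str] = field(default_factory=dict)
    # what must be proven before work counts as done: task kind -> checks
    required_proof: Dict[str, List[str]] = field(default_factory=dict)

    def owner_of(self, surface: str) -> Optional[str]:
        """return the owning department, or None if nobody owns the surface."""
        return self.owners.get(surface)

    def proof_for(self, task_kind: str) -> List[str]:
        """return a copy of the checks required for this kind of task."""
        return list(self.required_proof.get(task_kind, []))


# illustrative values only
brain = WorkspaceBrain(
    goals=["ship v2 onboarding"],
    operating_rules=["no deploys without approval"],
    owners={"onboarding flow": "Product Engineering"},
    required_proof={"release": ["tests pass", "approval recorded"]},
)
```

a lead agent would ask the brain `owner_of(...)` and `proof_for(...)` before dispatching anything, instead of guessing from a pile of pasted context.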
—
on top sits the Matrix Runtime. it coordinates wake, cron, department messages, OKR state, permissions, worker dispatch, proof ledger, memory updates.
under the runtime, work is organized into departments.
a department is not a chat thread. it’s a long-running agent with identity, memory, skills, goals, history, tool boundaries, taste, and accountability.
Founder Strategy. Product Engineering. Growth. Ops. Research.
each one has a lead agent that decides what happens, reads the relevant Memory Skill, breaks work into scoped tasks, and picks the right execution seat.
—
sometimes that seat is a native Matrix worker. sometimes Codex. sometimes Claude Code. sometimes a browser / computer automation worker.
the point is not “one model does everything.” the point is:
→ the right agent
→ with the right context
→ inside the right boundary
→ using the right tools
→ with a clear definition of done
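the lead's routing step could be sketched roughly like this. it is an illustration under assumptions: the `Seat` names mirror the seats mentioned above, but `ScopedTask`, the task kinds, and the `ROUTES` table are hypothetical and say nothing about how Matrix actually routes work.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Seat(Enum):
    MATRIX_WORKER = "native Matrix worker"
    CODEX = "Codex"
    CLAUDE_CODE = "Claude Code"
    BROWSER = "browser / computer automation worker"


@dataclass(frozen=True)
class ScopedTask:
    goal: str
    kind: str                 # hypothetical task kind, e.g. "patch"
    context: Tuple[str, ...]  # the only context the worker receives
    tools: Tuple[str, ...]    # the only tools the worker may use
    done_when: str            # definition of done


# hypothetical routing table: task kind -> execution seat
ROUTES = {
    "patch": Seat.CODEX,
    "repo_analysis": Seat.CLAUDE_CODE,
    "web_flow": Seat.BROWSER,
}


def pick_seat(task: ScopedTask) -> Seat:
    """refuse tasks with no definition of done, then route by kind."""
    if not task.done_when.strip():
        raise ValueError("task has no definition of done")
    # anything without a specific route falls back to a native worker
    return ROUTES.get(task.kind, Seat.MATRIX_WORKER)


patch = ScopedTask(
    goal="fix flaky login test",
    kind="patch",
    context=("tests/test_login.py",),
    tools=("git", "pytest"),
    done_when="test_login passes 20 runs in a row",
)
seat = pick_seat(patch)
```

the design choice worth noticing: a task with an empty `done_when` never reaches a worker at all.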
—
this is why scoped workers matter.
a “do everything” agent is too vague. but:
→ a release worker with repo context, tests, and approval gates → very good
→ a Codex worker scoped to one patch and one validation path → very good
→ a Claude Code worker doing deep repo analysis → very good
→ a browser worker with a specific flow and proof requirement → very good
narrow scope reduces drift. Memory Skill keeps narrow agents from going blind. proof prevents fast output from pretending to be progress.
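one way to picture "proof prevents fast output from pretending to be progress" is a gate that ignores the output and only looks at evidence. a rough sketch, where `Artifact`, `counts_as_done`, and the check names are all made up for illustration:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Artifact:
    task: str
    output: str
    # evidence attached by the worker: check name -> did it pass
    proofs: Dict[str, bool] = field(default_factory=dict)


def counts_as_done(artifact: Artifact, required: List[str]) -> Tuple[bool, List[str]]:
    """work counts only if every required check is present and passed.

    returns (done, problems), where problems lists missing or failed checks.
    """
    problems = []
    for check in required:
        if check not in artifact.proofs:
            problems.append(f"missing: {check}")
        elif not artifact.proofs[check]:
            problems.append(f"failed: {check}")
    return (not problems, problems)


required = ["tests pass", "approval recorded"]

fast_but_unproven = Artifact("release 1.4", "shipped!", proofs={"tests pass": True})
proven = Artifact(
    "release 1.4",
    "shipped!",
    proofs={"tests pass": True, "approval recorded": True},
)

unproven_result = counts_as_done(fast_but_unproven, required)
proven_result = counts_as_done(proven, required)
```

both artifacts have the same confident output string. only the second one passes the gate.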
—
that is the loop:
Workspace Brain → Department Lead → Worker → Artifact → Proof → Check-in → Memory Skill update
every cycle, the company gets smarter. that’s the real self-evolution. not a single agent rewriting its own prompt in a void — but a whole org compounding through proof.
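a toy version of one cycle, to make the compounding concrete. assumptions, stated loudly: `run_cycle`, the ledger shape, and the rule "memory only updates when proof passes" are my reading of the loop, not Matrix's implementation. the worker and verifier are stand-in functions.

```python
from typing import Callable, Dict, List


def run_cycle(
    task: str,
    memory: List[str],
    ledger: List[Dict],
    worker: Callable[[str, List[str]], str],
    verify: Callable[[str], bool],
) -> bool:
    """worker -> artifact -> proof -> check-in -> memory update (only if proven)."""
    artifact = worker(task, memory)   # worker sees the task plus current memory
    proven = verify(artifact)         # proof step
    # check-in: every attempt is recorded, proven or not
    ledger.append({"task": task, "artifact": artifact, "proven": proven})
    if proven:
        # assumption: only proven work is allowed to teach the system
        memory.append(f"worked for '{task}': {artifact}")
    return proven


# stand-in worker and verifier, for illustration only
def toy_worker(task: str, memory: List[str]) -> str:
    return f"{task} v{len(memory) + 1}"


def toy_verify(artifact: str) -> bool:
    return artifact.endswith("v1")


memory: List[str] = []
ledger: List[Dict] = []
first = run_cycle("ship changelog", memory, ledger, toy_worker, toy_verify)
second = run_cycle("ship changelog", memory, ledger, toy_worker, toy_verify)
```

after two cycles the ledger has two entries but memory has one: the unproven attempt is on record, and it did not get to change what the org "knows".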
—
each workspace is an isolated agent company. its own brain, departments, memory, workers, proof ledger.
workspaces can talk when needed. but context should not bleed by default.
isolation is not a limitation. it’s what makes the system usable.
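"can talk when needed, no bleed by default" maps naturally onto deny-by-default reads with explicit grants. a small sketch under that assumption. the `Workspace` class and its `share` / `read` methods are hypothetical and only illustrate the access rule:

```python
from typing import Dict, Set


class Workspace:
    """hypothetical isolated workspace: private memory plus explicit grants."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._memory: Dict[str, str] = {}
        self._grants: Dict[str, Set[str]] = {}  # key -> workspace names allowed to read

    def remember(self, key: str, value: str) -> None:
        self._memory[key] = value

    def share(self, key: str, with_workspace: str) -> None:
        """grant one other workspace read access to one key."""
        if key not in self._memory:
            raise KeyError(key)
        self._grants.setdefault(key, set()).add(with_workspace)

    def read(self, key: str, requester: str) -> str:
        """deny by default: only the owner or an explicitly granted workspace may read."""
        allowed = requester == self.name or requester in self._grants.get(key, set())
        if not allowed:
            raise PermissionError(f"{requester} may not read '{key}' from {self.name}")
        return self._memory[key]


growth = Workspace("growth-co")
growth.remember("launch plan", "ship on tuesday")
growth.remember("pricing notes", "draft, internal only")
growth.share("launch plan", with_workspace="research-co")

shared_value = growth.read("launch plan", requester="research-co")
```

sharing is per key and per recipient, so granting the launch plan does not expose the pricing notes.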
—
once a department pattern works, you fork the pattern — not the raw context. you still customize memory, examples, approval gates, tools, voice, definition of done.
but you’re not starting from zero. you might already have 70% of the OS for that kind of work.
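"fork the pattern, not the raw context" can be sketched by keeping structure and memory in separate objects, so a fork copies one and leaves the other behind. all names here (`DepartmentPattern`, `Department`, `fork`) are hypothetical, and the split between "pattern" and "context" is my assumption about where the line sits.

```python
from dataclasses import dataclass, field, replace
from typing import List, Tuple


@dataclass(frozen=True)
class DepartmentPattern:
    """the reusable part: how this kind of work is done."""
    role: str
    workflow: Tuple[str, ...]
    approval_gates: Tuple[str, ...]
    definition_of_done: Tuple[str, ...]


@dataclass
class Department:
    pattern: DepartmentPattern
    # the raw context: accumulated memory specific to one workspace
    memory: List[str] = field(default_factory=list)


def fork(dept: Department, **overrides) -> Department:
    """copy the pattern (with optional customizations); start with empty memory."""
    return Department(pattern=replace(dept.pattern, **overrides))


growth = Department(
    pattern=DepartmentPattern(
        role="Growth",
        workflow=("draft", "review", "publish", "measure"),
        approval_gates=("founder sign-off",),
        definition_of_done=("post is live", "metrics captured"),
    ),
    memory=["client A hates exclamation marks"],
)

# same workflow, different approval gate, none of client A's memory
forked = fork(growth, approval_gates=("marketing lead sign-off",))
```

the fork inherits the workflow and definition of done, takes a customized approval gate, and starts with no memory from the original workspace.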
—
what this actually changes:
a small team of strong operators can now run surfaces that used to require entire departments.
but only if the agents are actually good. and good agents don’t come from connecting more tools. they come from source material, taste, iteration, narrow scope, workflow design, proof, memory, and human judgment.
vague agents just create vague output faster.
Matrix is our attempt to build the opposite:
an agent company OS where autonomous work has structure, memory, ownership, and proof.
the loop is the product.
agent attention and context are way too precious. gotta protect them