Do coding agents need an OS-like control plane? I built a prototype and want critique.

Reddit r/AI_Agents 05/13/26, 09:37 AM Tools

coding-agents agent-orchestration developer-tooling prototyping mcp software-architecture

Summary

The author introduces 'KnowledgeOS', a prototype control plane designed to govern local coding agents by managing task lifecycles, preventing state drift, and requiring command evidence for claimed results. They are seeking architectural critique on whether this OS-like abstraction is necessary or whether it is over-engineering for agent workflows.

I’ve been experimenting with a local control-plane for coding agents, and I’d love serious critique from people building real agent workflows.

The problem I kept running into:

- agents forget the original project intent after long sessions
- “done” is often claimed without eval/test/postflight evidence
- MCP/tool/subagent calls are invisible unless you manually inspect logs
- old projects accumulate stale generated files, broken hooks, and mismatched state
- multi-agent work gets messy because there is no durable task/spec/lifecycle record

So I built a prototype called KnowledgeOS. The idea is not to replace the operating system. It is more like a project-local governance layer for agents.

Current pieces:

- `.agent-os/` control plane per project
- `create-task` for formal task intake
- `create-spec` / `align-spec` so runs bind to durable user intent
- `route-task` and `check-route-write` to prevent uncontrolled file mutation
- `context-pack` and `plan-task` before execution
- mandatory lifecycle phases: route, plan, review, dispatch, execute, report
- visible `CHECKPOINT_OK`, `CAPABILITY_OK`, and `TRACE_OK` markers
- `capability-event` for MCP / skill / subagent / shell / script visibility
- `eval-task`, `verify-context`, `verify-lifecycle`, `complete-task`
- postflight hook that must return `[SYNC_OK]`
- local tool registry for MCPs, skills, orchestrators, and subagents
- recently integrated Maestro Orchestrate as a local specialist-agent catalog via MCP

The design philosophy is:

- small kernel
- pluggable modules
- optional apps/workbench
- each project decides strictness
- every important agent claim needs command evidence

What I’m unsure about:

1. Is “OS-like control plane for agents” the right abstraction, or is this just workflow tooling with a fancy name?
2. Should lifecycle gates be strict by default, or opt-in per project?
3. Is spec-first / checkpoint-first work too much friction for everyday coding?
4. How should subagent registries be represented without turning into prompt soup?
5. Are there existing systems that solve this more cleanly?

I’m not looking for stars as much as architecture feedback. If this is over-engineered, I’d love to hear where. If the abstraction is useful, I’d love suggestions on what should be kernel vs plugin/module.
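
To make the lifecycle-gate idea concrete, here is a minimal Python sketch of how ordered phases with required command evidence could be enforced. It is an illustration only, not KnowledgeOS's actual implementation. The phase names and the `[SYNC_OK]` marker come from the post; `Evidence`, `Task`, `advance`, and `can_complete` are hypothetical names, and treating exit code 0 as "passing evidence" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Phase names are taken from the post; everything else here is hypothetical.
PHASES = ["route", "plan", "review", "dispatch", "execute", "report"]


@dataclass
class Evidence:
    """Record of a command that backs up an agent's claim."""
    command: str    # the command that was run (placeholder strings below)
    exit_code: int  # assumption: 0 means the command succeeded


@dataclass
class Task:
    task_id: str
    completed: Dict[str, Evidence] = field(default_factory=dict)

    def next_phase(self) -> Optional[str]:
        """Return the first phase without recorded evidence, or None."""
        for phase in PHASES:
            if phase not in self.completed:
                return phase
        return None

    def advance(self, phase: str, evidence: Evidence) -> None:
        """Record a phase only if it is next in order and has passing evidence."""
        expected = self.next_phase()
        if expected is None:
            raise ValueError("all phases are already complete")
        if phase != expected:
            raise ValueError(f"expected phase {expected!r}, got {phase!r}")
        if evidence.exit_code != 0:
            raise ValueError(f"phase {phase!r} has no passing evidence")
        self.completed[phase] = evidence

    def can_complete(self, postflight_output: str) -> bool:
        """Allow completion only after every phase and a [SYNC_OK] postflight."""
        return self.next_phase() is None and "[SYNC_OK]" in postflight_output


if __name__ == "__main__":
    task = Task("demo")
    for phase in PHASES:
        # Placeholder command strings; a real run would record actual commands.
        task.advance(phase, Evidence(command=f"<command for {phase}>", exit_code=0))
    print(task.can_complete("[SYNC_OK]"))  # True
```

A per-project strictness setting could plug in at `advance`, for example by downgrading the out-of-order error to a warning, which is one way to frame the "strict by default vs opt-in" question above.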

Similar Articles

AI coding agents need a “plan first, edit later” workflow? Looking for feedback

I built a local control system for agent failures, fixes, evals, and gates to make autoresearch-style self-improvement loops work in real agent codebases

@omarsar0: As we target more complex use of coding agents (e.g., dynamic workflows and /goals) on long-horizon tasks, you will sta…

Coding Agents Won’t Be Won by Prompts, but by Runtime Infrastructure

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents
