Do coding agents need an OS-like control plane? I built a prototype and want critique.
Summary
The author introduces 'KnowledgeOS', a prototype control plane that governs local coding agents by managing task lifecycles, preventing state drift, and requiring evidence of execution. They are seeking architectural critique on whether this OS-like abstraction is necessary or whether it is over-engineering for agent workflows.
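To make the three responsibilities named in the summary concrete, here is a minimal sketch of what a lifecycle gate of this kind could look like. It is not KnowledgeOS code: the `Task` and `TaskState` names, the four states, and the rule that a task needs at least one evidence entry before it can be marked done are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    # Hypothetical lifecycle states; the real prototype may use different ones.
    PLANNED = "planned"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


# Allowed lifecycle transitions. Any other move is rejected, which is one
# simple way to keep an agent from drifting into an inconsistent state.
ALLOWED = {
    TaskState.PLANNED: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.DONE, TaskState.FAILED},
    TaskState.DONE: set(),
    TaskState.FAILED: set(),
}


@dataclass
class Task:
    name: str
    state: TaskState = TaskState.PLANNED
    evidence: list[str] = field(default_factory=list)

    def record_evidence(self, item: str) -> None:
        """Attach a piece of execution evidence, e.g. a test log summary."""
        self.evidence.append(item)

    def transition(self, new_state: TaskState) -> None:
        """Move to new_state, enforcing the lifecycle and the evidence rule."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        # Assumed rule: a task cannot be closed as DONE without evidence.
        if new_state is TaskState.DONE and not self.evidence:
            raise ValueError("cannot mark task DONE without execution evidence")
        self.state = new_state


# Usage: the agent must start the task and attach evidence before closing it.
task = Task("fix-login-bug")
task.transition(TaskState.RUNNING)
task.record_evidence("test run: all checks passed")
task.transition(TaskState.DONE)
```

A gate this small already raises the question the post asks: whether a state table and an evidence check justify a separate OS-like layer, or whether they belong inside the agent harness itself.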
Similar Articles
AI coding agents need a “plan first, edit later” workflow? Looking for feedback
Proposes a workflow for AI coding agents that emphasizes brainstorming and boundary enforcement before any code is edited, and asks for community feedback on its utility.
I built a local control system for agent failures, fixes, evals, and gates to make autoresearch-style self-improvement loops work in real agent codebases
The author built a local control system that manages agent improvement loops: it captures traces, finds recurring failures, drafts fixes with Codex/Claude Code, and applies changes only after they pass checks and evals.
@omarsar0: As we target more complex use of coding agents (e.g., dynamic workflows and /goals) on long-horizon tasks, you will sta…
Discusses challenges with coding agents in complex long-horizon tasks, highlighting bizarre user experience issues and inefficient agent interactions, and advocates for more control over the agent harness.
Coding Agents Won’t Be Won by Prompts, but by Runtime Infrastructure
As coding agents become more capable, the bottleneck shifts from model quality to the infrastructure that supports long-running tasks, including durable state, permissions, checkpoints, observability, and cost controls. The author argues that the best agent products resemble runtime and workflow systems rather than just improved prompt interfaces.
Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents
Agent libOS introduces a library-OS-inspired runtime substrate for LLM agents, treating agents as schedulable processes with explicit capabilities, lifecycle management, audit records, and human approval queues. The design shifts the trust boundary from tool dispatch to runtime primitives, enabling long-running agents to be scheduled, authorized, resumed, and audited safely.