@yunwei37: Several recent papers, including https://arxiv.org/html/2606.25189v1…, give the feeling that I am no longer designing and writing a system or paper, but rather "training" a system: empirical research collects a large amount of real-world scenario data as a training set, and based on this data, let AI analyze what properties the system should have and how to design and implement it; then I write a set of test cases to verify whether it works...
Summary
This paper presents ActPlane, a policy engine that enforces safety and effectiveness policies for AI agents at the OS kernel level using eBPF, bridging the semantic gap between natural language policy intent and concrete system actions.
ActPlane: Programmable OS-Level Policy Enforcement for Agent Harnesses
Source: https://arxiv.org/html/2606.25189v1
Yusheng Zheng1,4, Tianyuan Wu3, Quanzhi Fu2, Tong Yu4, Wenan Mao5, Wei Wang3, Dan Williams2, Andi Quinn1
Abstract.
AI agents increasingly run in production through harnesses, the software around the LLM, including an engine that enforces safety and effectiveness policies, e.g., “run tests before committing.” Enforcing these policies requires bridging a semantic gap: policy intent is expressed in underspecified natural language, while enforcement must act on concrete system actions, e.g., which test to run. Many policies also define event ordering or data flow actions. Yet existing approaches fall short. Tool-call guardrails miss system actions that bypass the tool layer, while OS sandboxes control resource access instead of actions, returning opaque errors that confuse the agent. Our key insight is that policy context lives within the agent closest to the task, while enforcement must happen at the OS to cover all execution paths. We introduce ActPlane, a policy engine that lets agents declare policies and enforces them in the OS kernel with semantic feedback and isolation. ActPlane uses a simple information-flow control (IFC) DSL to support cross-event policies. We implement ActPlane with eBPF and evaluate it on policies from the empirical study, coding-task benchmarks, and safety benchmarks. ActPlane improves policy compliance, including on indirect execution paths that tool-call interception cannot observe, with 1.9%–8.4% overhead. ActPlane is at https://github.com/eunomia-bpf/ActPlane.
AI agents, eBPF, information-flow control
1. Introduction
AI agents are widely used for coding, DevOps, and enterprise workflows (Yang et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib55); Wang et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib50); Debenedetti et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib16); Zhan et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib58); Zheng et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib61)). Besides the LLM, agents require harnesses, software layers around the model that improve agent performance while enforcing instructions and constraints for safety and compliance (Böckeler, 2026 (https://arxiv.org/html/2606.25189v1#bib.bib6)).
A major component of the harness is a policy engine, which observes and enforces instructions and constraints (e.g., run tests before commit) over the agent’s concrete actions. Projects encode many such policies in instruction files (e.g., CLAUDE.md, AGENTS.md) (Chatlatanagulchai et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib9), 2025 (https://arxiv.org/html/2606.25189v1#bib.bib8); Santos et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib44); Lulla et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib31)), so harnesses can provide them to the model directly. Because LLMs comply with instructions probabilistically (Liu et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib30); Jiang et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib27); Qi et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib42)) and may violate them due to planning errors, prompt-injection drift, and tool or script side-effects (Greshake et al., 2023 (https://arxiv.org/html/2606.25189v1#bib.bib24); Zhan et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib58)), harnesses require deterministic policy engines to enforce compliance (Rebedea et al., 2023 (https://arxiv.org/html/2606.25189v1#bib.bib43); Wang et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib49); Debenedetti et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib15)).
Figure 1. ActPlane enables the agent closest to the task to write concrete policy DSLs according to its intent or the higher authorities’ instructions. The DSL is then compiled by ActPlane and enforced inside the OS kernel.
However, existing policy engines leave a semantic gap: policy intent is expressed in underspecified natural language, but enforcement must act on concrete system actions based on project or task context. For example, as illustrated in Figure 1 (https://arxiv.org/html/2606.25189v1#S1.F1), “run tests before committing” requires knowing which test command to run, and “the worker sub-agent should not delete data files” needs the sub-agent to locate where the data directories are. Our empirical study of CLAUDE.md and AGENTS.md instruction files in 64 popular projects (§2.2 (https://arxiv.org/html/2606.25189v1#S2.SS2)) shows that enforcement must handle context-dependent system actions, with 64% of statements being policies, 83% involving system actions, and 74% depending on context that cannot be pre-defined statically.
Existing approaches fall short. Tool-call guardrails (Xiang et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib53); Wang et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib49); Shi et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib45); Costa et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib14); Debenedetti et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib15)) miss indirect system actions that bypass the tool layer, such as a git commit inside a script the agent wrote earlier. OS-level enforcement systems (e.g., sandboxes (Edge, 2015 (https://arxiv.org/html/2606.25189v1#bib.bib19); Canonical Ltd., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib7); The Linux Kernel Documentation, 2025 (https://arxiv.org/html/2606.25189v1#bib.bib46); Cilium Project, 2026 (https://arxiv.org/html/2606.25189v1#bib.bib12); Google, 2018 (https://arxiv.org/html/2606.25189v1#bib.bib23))) expect static pre-defined policy, control resource access instead of actions, and return opaque denials (e.g., EPERM) without explaining which policy was violated or how to comply.
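The tool-layer blind spot can be illustrated with a minimal Python sketch. This is not code from the paper: the regex filter `tool_guardrail_allows`, the script name `release.sh`, and its contents are all hypothetical, chosen only to show how a filter that inspects the tool-call string misses a commit issued from inside a script.

```python
import re

# Hypothetical tool-call guardrail: it sees only the command string the
# agent hands to its shell tool, not what that command goes on to execute.
BLOCKED = re.compile(r"\bgit\s+commit\b")

def tool_guardrail_allows(command: str) -> bool:
    """Return True if the tool-layer regex filter lets the command through."""
    return BLOCKED.search(command) is None

# Invented contents of a script the agent wrote earlier in the session.
SCRIPT_BODY = "#!/bin/sh\ngit add -A\ngit commit -m 'release'\n"

direct = "git commit -m 'wip'"   # commit issued directly through the tool
indirect = "sh release.sh"       # would run SCRIPT_BODY in a subprocess

direct_allowed = tool_guardrail_allows(direct)      # filter catches this
indirect_allowed = tool_guardrail_allows(indirect)  # filter lets this pass
script_commits = BLOCKED.search(SCRIPT_BODY) is not None
```

Under these assumptions the direct call is rejected while the indirect one passes, even though the script performs the same commit. An observer at the exec boundary would see both.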
To bridge the gap, we argue that an agent-harness policy engine should let agents define policies and enforce them at the OS level, where all execution paths are visible, including subprocesses and shell-outs that bypass tool-call interception. While safety constraints such as “never expose API keys” come from a higher authority such as the user, platform operator, or parent agent, the context needed to resolve policies resides with the agent closest to the task, which already reads the repository, interprets the current task, and resolves abstract references such as “run tests” into concrete commands. This makes agents the natural producer of concrete policy, also reflecting the fact that instruction files are increasingly maintained by agents (Anthropic, 2025 (https://arxiv.org/html/2606.25189v1#bib.bib2); Galster et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib21)).
Agent-authored, OS-enforced policy has two design requirements. First, policy expressions must be high-level enough for agents to generate reliably, yet concrete enough to compile to deterministic kernel checks. Since 81% of projects contain policies that define event ordering or data flow, they must also track state across operations. Second, agents must not weaken safety constraints from higher authority or affect other agents’ policies.
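One way to read the second requirement is as a monotonicity property: a domain may add denials, but it can neither drop the ones it inherits nor impose its own on a sibling. The sketch below models that reading in Python. The `Domain` class, the action strings, and the domain names are hypothetical and are not ActPlane's data structures.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Domain:
    """Hypothetical policy domain arranged in a tree of authority."""
    name: str
    parent: Optional["Domain"] = None
    denies: set = field(default_factory=set)  # actions this domain forbids

    def effective_denies(self) -> set:
        """Own denials plus everything inherited from ancestors."""
        inherited = self.parent.effective_denies() if self.parent else set()
        return inherited | self.denies

    def allows(self, action: str) -> bool:
        return action not in self.effective_denies()

# Illustrative hierarchy: the user sets a constraint, two agents sit below.
user = Domain("user", denies={"read:.env"})
agent_a = Domain("agent_a", parent=user, denies={"exec:git push"})
agent_b = Domain("agent_b", parent=user)

a_reads_env = agent_a.allows("read:.env")     # inherited denial still holds
b_pushes = agent_b.allows("exec:git push")    # agent_a's rule does not leak
b_reads_env = agent_b.allows("read:.env")     # agent_b inherits it too
```

Because the effective set is a union over ancestors only, nothing a child declares can shrink what its parent forbids, and siblings never see each other's rules.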
ActPlane is a programmable OS-level policy enforcement system for AI agent harnesses. ActPlane provides a simple DSL for agents to express policies as deterministic kernel checks, such as kill exec “git” “commit” unless after exec “go” “test” exits 0 (Figure 1 (https://arxiv.org/html/2606.25189v1#S1.F1)). To enforce policies with event ordering or data flow, ActPlane compiles DSL rules into an eBPF engine that uses information-flow control (IFC), attaching labels that mark which sources have influenced each object and propagating them across process, file, and network operations. Policies that express constraints use block or kill to deny violations with semantic feedback, while policies that express instructions use notify to guide the agent, e.g., “blocked: commit without tests; run npm test first.” ActPlane uses policy domains bound to process subtrees to isolate agents from each other and prevent them from weakening higher-authority constraints.
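The ordering rule quoted above can be approximated by a small user-space state machine. The following Python sketch is only an illustration of the semantics, not ActPlane's eBPF engine: `OrderingRule`, `Monitor`, the label name `prerequisite_ok`, and the feedback string are hypothetical, and matching on bare argv prefixes is a simplification of what a kernel hook would observe.

```python
from dataclasses import dataclass, field

@dataclass
class OrderingRule:
    """Deny `guarded` unless `prerequisite` has already exited with status 0."""
    guarded: tuple        # argv prefix of the action to guard
    prerequisite: tuple   # argv prefix that must succeed first
    feedback: str         # message returned to the agent on denial

def _matches(argv: list, prefix: tuple) -> bool:
    return tuple(argv[:len(prefix)]) == prefix

@dataclass
class Monitor:
    rule: OrderingRule
    labels: set = field(default_factory=set)  # state carried across events

    def on_exit(self, argv: list, status: int) -> None:
        # Attach a label once the prerequisite command succeeds.
        if _matches(argv, self.rule.prerequisite) and status == 0:
            self.labels.add("prerequisite_ok")

    def on_exec(self, argv: list) -> str:
        # Decide on each exec before it is allowed to run.
        if _matches(argv, self.rule.guarded) and "prerequisite_ok" not in self.labels:
            return "kill: " + self.rule.feedback
        return "allow"

rule = OrderingRule(("git", "commit"), ("go", "test"),
                    "commit without tests; run go test first")
monitor = Monitor(rule)

first = monitor.on_exec(["git", "commit", "-m", "x"])   # no tests yet
monitor.on_exit(["go", "test", "./..."], 1)             # tests failed
second = monitor.on_exec(["git", "commit", "-m", "x"])  # still denied
monitor.on_exit(["go", "test", "./..."], 0)             # tests passed
third = monitor.on_exec(["git", "commit", "-m", "x"])   # now permitted
```

The denial carries a message that names the missing step, which is the difference between semantic feedback and a bare error code. The sketch never clears the label, so whether edits made after a passing test run should invalidate it is left open here.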
On a decision-compliance benchmark, ActPlane resolves 2.0–3.2× more policy violations than prompt-filter, tool-regex, FIDES (Costa et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib14)) (tool-level IFC), and feedback-free kernel IFC by covering indirect execution paths that tool-call interception cannot observe, while adding 1.9% end-to-end overhead on agent workloads and up to 8.4% on kernel builds. On a safety benchmark of 361 personal-assistant tasks, ActPlane prevents 74% of baseline-unsafe behaviors by loading agent-generated safety policies as higher-authority rules before task execution. ActPlane is open-sourced at https://github.com/eunomia-bpf/ActPlane.
To summarize, we make three contributions:
- (1) An empirical study of 64 projects that characterizes these gaps (§2.2 (https://arxiv.org/html/2606.25189v1#S2.SS2)) and motivates ActPlane.
- (2) ActPlane, a programmable OS-level policy enforcement system that addresses them (§3 (https://arxiv.org/html/2606.25189v1#S3)–4 (https://arxiv.org/html/2606.25189v1#S4)).
- (3) An evaluation on a decision-compliance benchmark built on the empirical study, together with external coding-task and safety benchmarks covering workplace and personal-assistant tasks (§5 (https://arxiv.org/html/2606.25189v1#S5)).
2. Motivation
This section presents an empirical study of 64 projects to characterize the gap between policy intent and enforcement and motivate the design of ActPlane.
2.1. Agent Harnesses and Policies
AI agents combine model reasoning with external tools, memory, and long-running environments. Claude Code (Anthropic, 2025 (https://arxiv.org/html/2606.25189v1#bib.bib2)) and Codex (OpenAI, 2025 (https://arxiv.org/html/2606.25189v1#bib.bib37)) are prominent examples. Agents operate through an AI agent harness: software around the model that maintains the agent loop and session state, routes tool calls, mediates shell, file, and network access, and returns results or feedback to the model, improving agent performance while enforcing instructions and constraints (Trivedy, 2026 (https://arxiv.org/html/2606.25189v1#bib.bib47); Böckeler, 2026 (https://arxiv.org/html/2606.25189v1#bib.bib6)). A single tool invocation can run arbitrary scripts, browse untrusted content, call APIs, and touch many files (Debenedetti et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib16); Zhan et al., 2024 (https://arxiv.org/html/2606.25189v1#bib.bib58); Chennabasappa et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib11)). To improve task performance, safety, and compliance, projects often encode intent-level policy in natural-language files (CLAUDE.md, AGENTS.md) (Chatlatanagulchai et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib9); Santos et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib44); Lulla et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib31)) or deterministic policy configurations. A policy specifies what the agent should do (instructions) or should not do (constraints); we term its semantic meaning the agent’s policy intent.
2.2. Empirical Study
To characterize how policies are specified in production agent projects and what enforcement requirements they impose, we conduct an empirical study of 64 popular repositories that contain CLAUDE.md or AGENTS.md. Unlike prior studies that analyze instruction files at file- or section-heading granularity (Chatlatanagulchai et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib9), 2025 (https://arxiv.org/html/2606.25189v1#bib.bib8); Santos et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib44); Lulla et al., 2026 (https://arxiv.org/html/2606.25189v1#bib.bib31)), our study focuses on statement-level analysis to understand the enforcement requirements of individual policies, where a statement is a coherent unit expressing one claim or constraint. Specifically, we aim to answer three questions: (1) Are instruction files primarily behavioral policies or descriptive context? (2) Which policies require OS-level enforcement, and what kinds of OS-level checks do they need? (3) What context is needed to instantiate these policies into concrete, enforceable rules?
Dataset. We collected public GitHub repositories containing CLAUDE.md or AGENTS.md, prioritizing AI-agent projects created after 2025 and excluding non-code, inactive, fake-star, and stub repositories. The snapshot, taken 2026-05-23 UTC, contains 64 repositories with a median of 20K GitHub stars, 84 instruction files, and 2,116 extracted statements.
We extract and validate statements from raw instruction files in three steps. (1) A two-pass, LLM-agent-assisted pipeline extracted statements with source line ranges and four labels: content type, topic, enforcement level, and context requirement. (2) A validation script verified full source coverage and verbatim span matching and was cross-checked by independent Claude and Codex agents. (3) A stratified sample of 100 statements was independently reviewed by human annotators, who verified the labels were correct. Table 1 (https://arxiv.org/html/2606.25189v1#S2.T1) collects representative statements that illustrate all four labels, and S1–S8 are referenced throughout.
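The two checks in step (2) can be sketched as follows. This is a guess at what such a validation script might look like, not the authors' script: the `Statement` record, the `validate` function, and the reading of "full source coverage" as "every non-blank line belongs to some statement" are assumptions, and the sample instruction file is invented.

```python
from dataclasses import dataclass

@dataclass
class Statement:
    start: int   # 1-indexed first source line, inclusive
    end: int     # 1-indexed last source line, inclusive
    text: str    # text the extractor claims to have copied

def validate(source: str, statements: list) -> list:
    """Return a list of problems; an empty list means the extraction passed."""
    lines = source.splitlines()
    problems = []
    covered = set()
    for i, st in enumerate(statements):
        if not (1 <= st.start <= st.end <= len(lines)):
            problems.append(f"statement {i}: range {st.start}-{st.end} out of bounds")
            continue
        # Verbatim check: the claimed text must appear inside its line range.
        span = "\n".join(lines[st.start - 1:st.end])
        if st.text not in span:
            problems.append(f"statement {i}: text is not verbatim within its span")
        covered.update(range(st.start, st.end + 1))
    # Coverage check (assumed definition): no non-blank line is left out.
    for n, line in enumerate(lines, start=1):
        if line.strip() and n not in covered:
            problems.append(f"line {n}: not covered by any statement")
    return problems

# Invented instruction file used only to exercise the checks.
source = "# Rules\nRun tests before committing.\n\nNever push to main.\n"
good = [
    Statement(1, 1, "# Rules"),
    Statement(2, 2, "Run tests before committing."),
    Statement(4, 4, "Never push to main."),
]
bad = [Statement(2, 2, "Always run tests before committing.")]

good_problems = validate(source, good)
bad_problems = validate(source, bad)
```

The second extraction fails on both counts: its text was paraphrased, not copied, and two non-blank lines are left without a statement.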
Table 1. Representative statements illustrating all four labels. Enforcement level and context requirement apply only to policies (“—” indicates not applicable). [Table body not recovered: eight representative statements, each classified by type, topic, enforcement level, and context requirement.]
Q1: Are Instruction Files Behavioral Policies? We categorize a statement as a policy if it requires, forbids, or conditions an agent action; otherwise, we label it as descriptive context. Policies dominate at both statement and repository granularity. Across 2,116 statements, 64% are policies and 36% are descriptions (Figure 2). Across repositories, 70.1% have more policy statements than descriptive statements. However, the mix varies widely: one repository contains no policies, while another contains 97% policies.
Figure 2. Policy fraction per repository by statement count. Most repositories contain a majority of policy statements. [Sorted bar chart of policy fraction per repository.]
To understand how policies distribute across topics, we assign each statement to one of 12 topic categories adapted from prior instruction-file studies (Chatlatanagulchai et al., 2025 (https://arxiv.org/html/2606.25189v1#bib.bib8)), applied at statement granularity rather than file granularity (Figure 3 (https://arxiv.org/html/2606.25189v1#S2.F3)). We find that Development Process and Implementation Details are policy-heavy at 87% and 85% respectively, while Architecture is mostly descriptive (23% policies) because these sections are dominated by directory layouts and design summaries.
Takeaway #1: Instruction files are primarily behavioral policies (64% of statements), but their policy density varies across repositories and topics.
Figure 3. Policy ratio by topic. [Bar chart showing policy ratio for each topic.]
Q2: Which Policies Require OS-Level Enforcement? Intuitively, some policies in instruction files are semantic-only requirements (e.g., “please write comments for each function”), while others require OS