@dongxi_nlp: https://x.com/dongxi_nlp/status/2066290950352081336
Summary
This article discusses the design concept of how Markdown files (such as AGENTS.md and SKILL.md) in Coding Agents effectively influence agent behavior through the Harness mechanism, emphasizing the importance of loading different contexts at the right time.
Cached at: 06/15/26, 07:09 PM
Markdown Is A Context Interface
Harness Series — Part 5. A tiny markdown file can change the world state of a Coding Agent.
A .md file works when the harness knows what kind of context it is.
A fascinating phenomenon in modern coding agents: an ordinary markdown file can alter agent behavior.
Add an AGENTS.md — the Agent starts following repo-specific rules.
Add a SKILL.md — the Agent becomes more stable on a certain class of tasks.
The markdown file is only the visible part.
What’s truly interesting is the Harness design behind this mechanism.
The Naive Markdown Dump
By habit, a tiny agent treats markdown as plain text: it reads every .md file it finds and appends the contents to the prompt.
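A minimal sketch of that naive approach, assuming the markdown files are already available as a mapping from path to text. The function name `naive_markdown_dump` and the sample files are hypothetical, chosen only for illustration.

```python
def naive_markdown_dump(markdown_files: dict[str, str], user_request: str) -> str:
    """Paste every markdown file into the prompt, with no notion of role or scope."""
    sections = [f"## {path}\n{text}" for path, text in sorted(markdown_files.items())]
    # The actual request ends up last, behind everything else.
    sections.append(f"## Request\n{user_request}")
    return "\n\n".join(sections)


# Hypothetical workspace: a rule file, a stale draft, and a skill.
files = {
    "AGENTS.md": "Run the tests before committing.",
    "notes/old-plan.md": "Draft: maybe migrate to a new build system.",
    "skills/release/SKILL.md": "Steps for cutting a release.",
}
prompt = naive_markdown_dump(files, "Fix the failing login test.")
```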
This seems reasonable, because rules, docs, plans, and procedures are often written in markdown.
But a raw markdown dump comes with costs:
- Old notes compete with current instructions
- Long docs squeeze out recent tool results
- Drafts might be mistaken for rules
- General project guidance can overshadow narrow task procedures
The model gets more text, but the task gets fewer useful instructions.
Therefore,
.md should enter context through a Harness route, not be dumped directly into the prompt.
Two Files, Two Jobs
Let’s look at two common markdown files: AGENTS.md and SKILL.md.
Both are markdown, but they enter the prompt differently.
AGENTS.md: In this workspace, behave like this.
SKILL.md: For this kind of task, use this procedure.
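One way to picture the routing is a small classifier that decides which kind of context a markdown file is before anything touches the prompt. This is only a sketch: `ContextKind` and `classify_markdown` are hypothetical names, and matching on the file name alone is an assumption, not a description of any particular harness.

```python
from enum import Enum
from pathlib import PurePosixPath


class ContextKind(Enum):
    WORKSPACE_INSTRUCTION = "workspace_instruction"  # always-on project rules
    SKILL = "skill"                                  # task procedure, loaded on demand
    UNROUTED = "unrouted"                            # kept out of the prompt by default


def classify_markdown(path: str) -> ContextKind:
    """Pick a route from the file name (assumed convention for this sketch)."""
    name = PurePosixPath(path).name
    if name == "AGENTS.md":
        return ContextKind.WORKSPACE_INSTRUCTION
    if name == "SKILL.md":
        return ContextKind.SKILL
    # Notes, drafts, and long docs get no automatic route into the prompt.
    return ContextKind.UNROUTED
```

Under this sketch, a stale draft such as `notes/old-plan.md` never competes with current instructions, because it is never routed in unless something asks for it.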
Why AGENTS.md Works
AGENTS.md works because the Harness treats it as a workspace instruction context.
It is project guidance that should be visible to the model before it starts operating on the repo.
For example, my own DongXi Agent’s main task is to summarize learning experiences from daily interactions with the coding agent.
So, in the DongXi workspace, AGENTS.md requires: When working on the DongXi Agent or coding-agent harness learning, keep the learning artifact current.
This single file changes the behavior of the entire session.
The model is still responsible for writing and reasoning.
But the Harness injects a stable project rule before the turn begins.
That’s the power of AGENTS.md:
- It is discovered from the workspace
- It is scoped by project or directory
- It enters the stable instruction layer
- It persists across multiple tasks
- It saves the user from repeating local conventions in every prompt
How AGENTS.md Works
- The Harness should know which directory each rule applies to.
- Nested workspaces may have narrower rules.
- These rules should be treated as instruction context, while still being subordinate to system and developer policy.
- The file content should not be blindly copied into every visible transcript message.
- It belongs in the stable context layer.
This keeps the prompt consistent without turning project rules into noisy chat history.
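The directory scoping described above can be sketched as a pure path computation. The function `applicable_agents_files` is hypothetical, and ordering the results from broadest to narrowest scope is an assumption of this sketch; the text only says that nested workspaces may carry narrower rules.

```python
from pathlib import PurePosixPath


def applicable_agents_files(agents_files: list[str], target_file: str) -> list[str]:
    """Return the AGENTS.md paths whose directory contains target_file, broadest first."""
    target = PurePosixPath(target_file)
    matches = []
    for candidate in agents_files:
        scope = PurePosixPath(candidate).parent
        # A rule file applies when its directory is an ancestor of the target.
        if scope in target.parents:
            matches.append(candidate)
    # Fewer path components means a broader scope, so narrower rules come last.
    return sorted(matches, key=lambda p: len(PurePosixPath(p).parent.parts))


# Hypothetical repo layout with one root rule file and two nested ones.
known = ["services/api/AGENTS.md", "docs/AGENTS.md", "AGENTS.md"]
api_rules = applicable_agents_files(known, "services/api/handlers.py")
docs_rules = applicable_agents_files(known, "docs/guide.md")
```

Here `api_rules` contains the root file followed by `services/api/AGENTS.md`, and the `docs` rules do not apply to the API handler at all.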
Where AGENTS.md Enters the Agent Turn
AGENTS.md is loaded before the model turn is assembled.
It enters the stable workspace instruction context, alongside project rules and local operating constraints. The transcript records what happened; AGENTS.md dictates what to do next.
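A rough sketch of that assembly order, assuming a harness that builds each turn from separate layers. `TurnContext`, `assemble_turn`, and the `layer` labels are hypothetical; real harnesses will structure their messages differently.

```python
from dataclasses import dataclass, field


@dataclass
class TurnContext:
    system_policy: str
    developer_policy: str
    # Stable layer, e.g. the contents of the applicable AGENTS.md files.
    workspace_instructions: list[str] = field(default_factory=list)
    # What happened so far; workspace rules are never copied in here.
    transcript: list[str] = field(default_factory=list)


def assemble_turn(ctx: TurnContext, user_message: str) -> list[dict[str, str]]:
    """Build the layered input for one model turn (hypothetical layout)."""
    messages = [
        {"layer": "system", "text": ctx.system_policy},
        {"layer": "developer", "text": ctx.developer_policy},
    ]
    # Workspace rules sit below system and developer policy, above the transcript.
    for rule in ctx.workspace_instructions:
        messages.append({"layer": "workspace", "text": rule})
    for entry in ctx.transcript:
        messages.append({"layer": "transcript", "text": entry})
    messages.append({"layer": "user", "text": user_message})
    # Only the exchange is recorded; the rules are re-injected each turn instead.
    ctx.transcript.append(user_message)
    return messages
```

Because the rules live in their own layer, they stay present on every turn without being repeated in the recorded history.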
Why SKILL.md Works
SKILL.md works for a different reason.
It improves performance on a specific class of tasks because it gives the model a focused procedure at the right moment.
Without a skill, the model may understand the general domain but miss the local workflow.
With a skill, the Harness can provide:
- When to use this workflow
- Which files or artifacts are important
- Which tools are allowed or expected
- What steps should be followed
- What checks prove the work is done
- What output format the user expects
This is far more powerful than a vague prompt:
“Do this better.”
This file boosts performance because it turns a fuzzy ability into a repeatable procedure.
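To make the idea of a repeatable procedure concrete, the six items listed above can be modeled as a small record and rendered as text. This is a sketch only: `SkillProcedure`, its field names, and the rendered layout are hypothetical and not a required SKILL.md format.

```python
from dataclasses import dataclass


@dataclass
class SkillProcedure:
    when_to_use: str
    important_files: list[str]
    expected_tools: list[str]
    steps: list[str]
    done_checks: list[str]
    output_format: str


def _bullets(items: list[str]) -> str:
    return "\n".join(f"- {item}" for item in items)


def render_skill(skill: SkillProcedure) -> str:
    """Turn the structured procedure into the text the model would read."""
    steps = "\n".join(f"{n}. {step}" for n, step in enumerate(skill.steps, start=1))
    sections = [
        f"When to use: {skill.when_to_use}",
        "Important files:\n" + _bullets(skill.important_files),
        "Expected tools:\n" + _bullets(skill.expected_tools),
        "Steps:\n" + steps,
        "Done when:\n" + _bullets(skill.done_checks),
        f"Output format: {skill.output_format}",
    ]
    return "\n\n".join(sections)
```

Compared with "Do this better.", every section here gives the model something it can act on or check.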
How SKILL.md Works
A good Harness does not cram all skills into the default prompt.
It uses progressive disclosure.
At startup or capability refresh, the Harness scans skill folders and registers each skill as lightweight capability metadata.
The model can see that a skill exists.
The user can also see it via a local command like /skills.
The full body enters the context only when the task requires it.
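Progressive disclosure can be sketched with the standard library alone. The layout assumed here is one folder per skill containing a `SKILL.md` whose first non-empty line serves as the description; that layout, and the names `SkillEntry`, `scan_skills`, `skill_index`, and `load_skill_body`, are assumptions made for this example.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillEntry:
    name: str         # taken from the folder name
    description: str  # first non-empty line of SKILL.md (assumed convention)
    path: Path        # where the full body lives


def scan_skills(skills_root: Path) -> list[SkillEntry]:
    """Collect lightweight metadata for each <skills_root>/<name>/SKILL.md."""
    entries = []
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        lines = skill_file.read_text(encoding="utf-8").splitlines()
        first = next((ln for ln in lines if ln.strip()), "")
        description = first.strip().lstrip("#").strip()
        entries.append(SkillEntry(skill_file.parent.name, description, skill_file))
    return entries


def skill_index(entries: list[SkillEntry]) -> str:
    """The short listing shown by default, for example behind a /skills command."""
    return "\n".join(f"- {e.name}: {e.description}" for e in entries)


def load_skill_body(entries: list[SkillEntry], name: str) -> str:
    """Read the full procedure only when a task calls for this skill."""
    for entry in entries:
        if entry.name == name:
            return entry.path.read_text(encoding="utf-8")
    raise KeyError(f"unknown skill: {name}")
```

The default prompt pays only for one line per skill; the cost of a full procedure is incurred only for the skill a task actually uses.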
Where SKILL.md Enters the Agent Turn
SKILL.md first appears as capability metadata; the body is loaded only once a task invokes the skill.
It enters the task procedure context: steps, tool expectations, checks, output shape. This context serves only the current workflow — it does not bleed into an always-on workspace policy.
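The difference in lifetime between the two layers can be sketched as session state. `Session` and its methods are hypothetical names; the point is only that the procedure is attached for one task and dropped afterwards, while the workspace rules stay.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    # Always-on rules, e.g. loaded from AGENTS.md when the session starts.
    workspace_rules: list[str] = field(default_factory=list)
    # Task procedure, e.g. a SKILL.md body; set only while a task uses it.
    active_procedure: Optional[str] = None

    def start_task(self, skill_body: Optional[str] = None) -> None:
        self.active_procedure = skill_body

    def finish_task(self) -> None:
        # Drop the procedure so it cannot leak into unrelated later tasks.
        self.active_procedure = None

    def context_layers(self) -> list[str]:
        layers = list(self.workspace_rules)
        if self.active_procedure is not None:
            layers.append(self.active_procedure)
        return layers
```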
A tiny markdown file. It doesn’t change the model weights.
But it changes the harness-managed context, tools, and workflow for that task.
Using these markdown files appropriately is the central concern for coding agents.
And:
Loading different markdown files into context at the right time is what the harness should do.