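@QingQ77: Turn any GitHub repository into its own AI agent, complete with a dedicated CLI, an MCP service, memory, and signature authentication, ready to publish straight to npm. https://github.com/ruvnet/agent-harness-generator… You…
Summary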
MetaHarness converts any GitHub repository into a custom AI agent harness with CLI, MCP service, memory, and signing, allowing deployment on multiple agent platforms.
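Cached: 2026/06/20 16:20
Turn any GitHub repository into its own AI agent, complete with a dedicated CLI, an MCP service, memory, and signature authentication, ready to publish straight to npm.
https://github.com/ruvnet/agent-harness-generator…
You tell it what you want through the browser or the command line, and it generates a complete npm package containing a dedicated npx command, an MCP service, project memory, a permission policy, and Ed25519 signature authentication. The framework runs on 8 agent platforms, including Claude Code, Codex, pi, and Hermes, and ships with 19 built-in vertical templates (development, research, trading, legal, and more).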
ruvnet/agent-harness-generator
Source: https://github.com/ruvnet/agent-harness-generator
MetaHarness
Mint a custom AI agent harness from any repo.
npx metaharness · open the Studio →
(Repo: ruvnet/agent-harness-generator · CLI: metaharness · Library: @ruvnet/agent-harness-generator)
What this is
Every serious repo deserves its own agent. A repo-aware CLI, a repo-aware coding agent, a local MCP server, memory scoped to the project, skills generated from the actual file layout, governance policy, release verification, witness-signed provenance.
metaharness mints those, on demand, from a GitHub URL or a blank slate. It is not another agent framework. It is a factory for agent frameworks.
The model is replaceable. The harness is the product.
What it gives you
In under 60 seconds, in your browser, with nothing leaving your machine:
- A custom AI agent harness for your repo (or any repo)
- Recommended agents, skills, slash commands, MCP tools
- A scoped memory namespace + governance policy
- Witness-signed provenance + release gates
- Drops into Claude Code, OpenAI Codex, pi.dev, Hermes, OpenClaw, or RVM — pick one or all
Output is an npm-publishable .zip with your name on it, your branding, your npx <your-name> CLI.
New
- Score any repo before you build it. npx metaharness score <repo> reads the repo (never runs it) and prints a one-screen report card: how well a harness fits, how likely it is to build, how safe the tools are, and the rough cost per run. You know what you’ll get before scaffolding.
- Pick the cheapest model that’s good enough. @metaharness/router routes each request to the right model from your own results: same quality, far less spend. Works out of the box with zero native deps; train it on your data for a sharper fit (npm i @metaharness/router). Add the optional @ruvector/tiny-dancer to train a fast native model instead: same training data, no API change.
- Let your harness improve itself. Every scaffold now ships with Darwin Mode (@metaharness/darwin) wired in: run npm run evolve and the harness mutates its own config, tests each change in a sandbox, and keeps only what measurably improves. The model stays frozen; the harness evolves. Safe by default (no network, no API key; pure refactor/tuning behind a safety gate). Validated on real SWE-bench Lite bug-fixing. Pass --no-darwin to skip.
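The mutate, test, keep cycle described for Darwin Mode can be pictured as a greedy search over config values. The following is a minimal sketch under stated assumptions, not the @metaharness/darwin implementation: the HarnessConfig fields, the nudge mutation, and the sandboxScore function are all hypothetical stand-ins for whatever the real harness mutates and measures.

```typescript
// Hypothetical config shape; the real harness config fields are not shown in the source.
interface HarnessConfig {
  maxCallsPerTurn: number;
  timeoutSeconds: number;
}

type Evaluate = (config: HarnessConfig) => number; // higher is better
type Mutate = (config: HarnessConfig, step: number) => HarnessConfig;

// Greedy loop: try one mutation per step and keep it only if the score strictly improves.
function evolve(
  start: HarnessConfig,
  mutate: Mutate,
  evaluate: Evaluate,
  steps: number,
): { config: HarnessConfig; score: number; accepted: number } {
  let best = start;
  let bestScore = evaluate(start);
  let accepted = 0;
  for (let step = 0; step < steps; step++) {
    const candidate = mutate(best, step);
    const score = evaluate(candidate);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
      accepted++;
    }
  }
  return { config: best, score: bestScore, accepted };
}

// Stand-in mutation: alternate between raising each of the two fields by one.
const nudge: Mutate = (config, step) =>
  step % 2 === 0
    ? { ...config, maxCallsPerTurn: config.maxCallsPerTurn + 1 }
    : { ...config, timeoutSeconds: config.timeoutSeconds + 1 };

// Stand-in evaluation: pretend the sandboxed tests prefer 10 calls/turn and a 30s timeout.
const sandboxScore: Evaluate = (config) =>
  -Math.abs(config.maxCallsPerTurn - 10) - Math.abs(config.timeoutSeconds - 30);

const result = evolve({ maxCallsPerTurn: 8, timeoutSeconds: 30 }, nudge, sandboxScore, 6);
console.log(result.config, result.accepted);
```

Rejecting any change that does not strictly improve the score is what makes the loop safe to leave running: a mutation that is neutral or harmful never replaces the current config.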
Tune it to your project — then ship it as your own npm
A generated harness is a starting point you own, not a fixed framework. Open it and make it yours:
- Keep only what your repo needs. Delete the agents, skills, slash commands, and MCP servers you won’t use. The scaffold ships a recommended set, but a payments service and a docs site want very different harnesses. A smaller, targeted harness is faster, cheaper, and easier to reason about. harness doctor / harness validate keep it healthy as you trim.
- Optimize the model routing for your work. Swap the per-task model tiers, tighten the governance policy, point the memory namespace at your domain. The harness is config you control, not a black box.
- Publish it as your own package for the whole org. Rename it, set your scope, and npm publish. Now anyone on your team runs npx @your-org/your-harness and gets the same repo-tuned agent. One command, org-wide, versioned like any other dependency. (The 19 @metaharness/* examples are exactly this pattern, published live.)
Make older, cheaper models punch like frontier ones. The right harness isn’t a pile of extra steps bolted onto an expensive model — it’s putting the right model on each task and getting out of the way. Our DRACO benchmark proves it: a small, cheap model delivers frontier-quality research at roughly one-tenth the cost, and a smart router squeezes out the rest. Stop paying frontier prices for work a $0.10 model does just as well.
That router ships as @metaharness/router
— route(query) returns the cheapest model predicted to clear your quality bar,
learned from your own eval logs. npm i @metaharness/router.
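The idea of returning the cheapest model predicted to clear a quality bar can be sketched with plain averages over eval logs. This is an illustration only: the EvalRecord shape, the cheapestAboveBar function, and the numbers are assumptions, and the real @metaharness/router API and its learned predictor are not shown in the source.

```typescript
// Hypothetical eval-log row: one model's observed quality and cost for a task category.
interface EvalRecord {
  model: string;
  category: string;
  quality: number; // 0..1, taken from your own evals
  costUsd: number; // cost per run
}

// Return the cheapest model whose mean quality for the category meets the bar,
// or undefined when no model qualifies.
function cheapestAboveBar(
  logs: EvalRecord[],
  category: string,
  qualityBar: number,
): string | undefined {
  const stats: Record<string, { quality: number; cost: number; n: number }> = {};
  for (const row of logs) {
    if (row.category !== category) continue;
    const s = stats[row.model] ?? { quality: 0, cost: 0, n: 0 };
    s.quality += row.quality;
    s.cost += row.costUsd;
    s.n += 1;
    stats[row.model] = s;
  }
  let bestModel: string | undefined;
  let bestCost = Infinity;
  for (const model of Object.keys(stats)) {
    const s = stats[model];
    const meanQuality = s.quality / s.n;
    const meanCost = s.cost / s.n;
    if (meanQuality >= qualityBar && meanCost < bestCost) {
      bestModel = model;
      bestCost = meanCost;
    }
  }
  return bestModel;
}

// Made-up numbers for illustration only.
const evalLogs: EvalRecord[] = [
  { model: "small", category: "research", quality: 0.85, costUsd: 0.1 },
  { model: "small", category: "research", quality: 0.9, costUsd: 0.1 },
  { model: "frontier", category: "research", quality: 0.95, costUsd: 1.0 },
  { model: "small", category: "coding", quality: 0.5, costUsd: 0.1 },
  { model: "frontier", category: "coding", quality: 0.9, costUsd: 1.0 },
];

console.log(cheapestAboveBar(evalLogs, "research", 0.8)); // "small"
console.log(cheapestAboveBar(evalLogs, "coding", 0.8)); // "frontier"
```

With these sample logs the small model clears the bar for research but not for coding, so only the coding request is sent to the expensive model.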
Try it in 30 seconds
# In the browser — zero install, nothing leaves the page
open https://ruvnet.github.io/agent-harness-generator/
# Or in the terminal — the same harness (behaviourally equivalent output)
npx metaharness my-bot --template vertical:coding --host claude-code
cd my-bot && npx . --help
Don’t know what to pick? Run the wizard:
npx metaharness --wizard
Already have a repo you want a harness for?
harness analyze-repo . # local — deterministic analysis only
harness analyze-repo . --scaffold my-bot # materialise the recommended harness
No repository code is executed. Inferred build/test commands are emitted as trust: inferred · execution: disabled.
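One way to picture "inferred, never executed" is a pure lookup from manifest file names to conventional commands, where every result carries the trust and execution labels quoted above. This is a sketch under assumptions: the InferredCommand shape and the CONVENTIONS table are hypothetical, and the tool's actual inference rules are not described in the source.

```typescript
interface InferredCommand {
  kind: "build" | "test";
  command: string;
  trust: "inferred";
  execution: "disabled";
}

// Assumed conventions, keyed by manifest file name (not the tool's real rule set).
const CONVENTIONS: Record<string, { build: string; test: string }> = {
  "Cargo.toml": { build: "cargo build", test: "cargo test" },
  "package.json": { build: "npm run build", test: "npm test" },
  "go.mod": { build: "go build ./...", test: "go test ./..." },
};

// Pure function over file names: nothing is read from disk and nothing is run.
function inferCommands(fileNames: string[]): InferredCommand[] {
  const out: InferredCommand[] = [];
  for (const name of fileNames) {
    if (!Object.prototype.hasOwnProperty.call(CONVENTIONS, name)) continue;
    const hit = CONVENTIONS[name];
    out.push({ kind: "build", command: hit.build, trust: "inferred", execution: "disabled" });
    out.push({ kind: "test", command: hit.test, trust: "inferred", execution: "disabled" });
  }
  return out;
}

const inferred = inferCommands(["README.md", "Cargo.toml", "package.json"]);
for (const c of inferred) {
  console.log(`${c.kind}: ${c.command} (trust: ${c.trust}, execution: ${c.execution})`);
}
```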
📖 Read the plain-language user guide →
Hosts
The same harness output runs on nine agent hosts — eight interactive, plus GitHub Actions (CI/CD):
| Host | What ships | Notes |
|---|---|---|
| Claude Code | MCP server + hooks + 3-scope settings | Richest surface; Ruflo-native |
| OpenAI Codex | MCP via ~/.codex/config.toml | TOML, no hooks |
| pi.dev | Pi extension via pi.registerTool() | No MCP by design |
| Hermes | MCP runtime, <think> scrubbing | Per Hermes issue #741 |
| OpenClaw | ~/.openclaw/openclaw.json + workspace skills | Personal-AI gateway |
| RVM | Bare-metal microhypervisor + capability tokens | Hardware isolation for untrusted peers |
| GitHub Copilot | MCP via .vscode/mcp.json | VSCode 1.99+ (ADR-032) |
| OpenCode | MCP via .opencode/opencode.json | sst/opencode TUI (ADR-036) |
| GitHub Actions | .github/workflows/ + composite action.yml | Non-interactive CI/CD; default-deny via permissions: (ADR-033) |
See ADR-004 — Host integration model and ADR-033 — GitHub Actions host.
MCP — modular, default-deny
MCP is included as a first-class adapter surface, not the identity. It is gated and default-deny (ADR-022):
- Modes: off · local (stdio) · remote (HTTPS + auth)
- Emits src/mcp/{server,tools,resources,prompts,policy,audit}.ts + a scannable .harness/mcp-policy.json
- Safe defaults: no network, no shell, no file-write, approve-dangerous, 30s timeout, 8 calls/turn, audit on
- harness mcp-scan <path>: “npm audit for agent tools”. A static-only scan flagging shell/network grants, missing audit/timeouts, wildcard permissions, unguarded secrets, unpinned deps. Exits 1 on any HIGH.
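A static scan over a default-deny policy can be sketched as a list of checks that each emit a finding, with the exit code derived from the highest severity. Everything below is hypothetical: the McpPolicy fields are not the real .harness/mcp-policy.json schema, and the severity assigned to each check is an assumption, since the source only says that any HIGH finding exits 1.

```typescript
// Hypothetical policy shape; the real mcp-policy.json schema is not shown in the source.
interface McpPolicy {
  network: boolean;
  shell: boolean;
  fileWrite: boolean;
  audit: boolean;
  timeoutSeconds?: number;
  allowedTools: string[];
}

interface Finding {
  severity: "HIGH" | "MEDIUM";
  message: string;
}

// Mirrors the safe defaults listed above: everything denied, audit on, 30s timeout.
const SAFE_DEFAULTS: McpPolicy = {
  network: false,
  shell: false,
  fileWrite: false,
  audit: true,
  timeoutSeconds: 30,
  allowedTools: [],
};

// Static checks only: the policy object is inspected, no tool is ever invoked.
function scanPolicy(policy: McpPolicy): Finding[] {
  const findings: Finding[] = [];
  if (policy.shell) findings.push({ severity: "HIGH", message: "shell access granted" });
  if (policy.network) findings.push({ severity: "HIGH", message: "network access granted" });
  if (policy.allowedTools.indexOf("*") !== -1) {
    findings.push({ severity: "HIGH", message: "wildcard tool permission" });
  }
  if (!policy.audit) findings.push({ severity: "MEDIUM", message: "audit logging is off" });
  if (policy.timeoutSeconds === undefined) {
    findings.push({ severity: "MEDIUM", message: "no timeout set" });
  }
  return findings;
}

// Exit 1 when any finding is HIGH, otherwise 0.
function exitCode(findings: Finding[]): number {
  return findings.some((f) => f.severity === "HIGH") ? 1 : 0;
}

const risky: McpPolicy = { ...SAFE_DEFAULTS, shell: true, allowedTools: ["*"], audit: false };
console.log(scanPolicy(risky), exitCode(scanPolicy(risky)));
```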
Verticals (19 quick-start templates)
npx metaharness --list
npx metaharness my-bot --template vertical:coding
| Category | Templates |
|---|---|
| Starter / Operations | minimal, vertical:devops |
| Engineering | vertical:coding, vertical:ai, vertical:repo-maintainer (iter 113) |
| Knowledge | vertical:research, vertical:ruview, vertical:education |
| Finance / Pro | vertical:trading, vertical:legal, vertical:health |
| Customer / Growth | vertical:support, vertical:crm, vertical:marketing, vertical:advertising, vertical:sales |
| Business / Frontier | vertical:business, vertical:agentics, vertical:gaming, vertical:exotic |
Each ships bespoke domain agents (with system prompts), skills, commands, and per-host settings — all default-deny.
One-command examples
Don’t want to pick flags? Each host and vertical has a dedicated
@metaharness/* wrapper — published, one npx away, no template/host
flags to remember. A scaffold from a wrapper is byte-identical to the
equivalent metaharness invocation.
Host integrations
Vertical workflows (ready-made multi-agent pods)
All 18 are live on npm under @metaharness. Source + per-package README:
examples-packages/ · plain-language deep-dive gists:
examples-packages/GISTS.md.
Day-to-day commands
After scaffolding, every harness has a harness CLI:
| You’re trying to … | Subcommand |
|---|---|
| Smoke-check the scaffold | harness doctor |
| Run every release gate | harness validate |
| Check kernel ↔ harness compatibility | harness diag |
| Score the harness 0-100 with badges | harness score |
| Pre-scaffold: is this REPO ready for an agent? | harness genome <repo> |
| Pre-scaffold: fit/cost/safety report card for a repo | metaharness score <repo> |
| MCP threat-model artifact for a PR review | harness threat-model |
| Declare OIA v0.1 layer alignment | harness oia-manifest |
| File a useful support ticket | harness diag --bundle > bundle.json |
| Diff two harnesses | harness compare a/ b/ |
| Share MCP + Bash + claims config for review | harness export-config |
| Run npm-audit per-harness | harness audit --bundle > audit.json |
| Emit SPDX-2.3 SBOM | harness sbom |
| Drift-detect against the latest template | harness upgrade |
| Sign / verify the witness | harness sign · harness verify |
| Pin the manifest to IPFS | harness publish --confirm |
| Recommend a harness from a repo | harness analyze-repo |
21 subcommands total. Every one respects --help / -h. Shell completion: harness completions bash | zsh | fish.
📖 Full reference: docs/USAGE.md
Status
v0.1.x beta: published and usable, with the credibility/doc reconciliation in issue #4 / ADR-042 in progress. The release pipeline is mature: CI matrix green across Rust × 3 OS + WASM × 3 OS + Node 20+22 × 3 OS + Bench + pack+install × 3 OS + a CI-passed aggregator; single-command releases (node scripts/release.mjs <bump> --push) atomically bump 15 sources, run all gates, and tag.
| Layer | Status |
|---|---|
| Rust kernel (WASM + NAPI-RS) | Shipped — 7 subsystems |
| 6 host adapters | claude-code · codex · pi-dev · hermes · openclaw · rvm |
| 17 harness subcommands | Shipped |
| 7 Codex skills | Shipped |
| Claude marketplace plugin | Shipped + schema-validated |
| Witness signing (Ed25519) | Shipped + tamper-tested |
| MCP tool dispatch | 11 end-to-end cases |
| Test suite | 568/568 across 67 files |
| CI matrix | 16 jobs green |
| Security pipeline | cargo-audit · cargo-deny · npm-audit · CodeQL · SBOM (SPDX-2.3) |
| Publish pipeline | GCP WIF + 2 gates + 11 packages + IPFS pin |
| Agent Harness Studio | Live at https://ruvnet.github.io/agent-harness-generator/ |
Architecture in 30 seconds
You (harness author)
└→ agent-harness-generator ← the factory
└→ Your harness (.zip) ← what you ship
├ npx <your-name> ← your identity
├ <your agents> ← your content
└ @metaharness/kernel ← shared primitives (Rust + WASM + NAPI-RS)
└→ Host adapter (Claude Code / Codex / pi.dev / Hermes / OpenClaw / RVM)
└→ LLM providers
You operate the factory. The factory produces your harness. Your users never see the factory — only the brand and CLI you ship. The kernel ships as @metaharness/kernel (Rust → wasm-pack + NAPI-RS); your content stays yours.
📖 Deeper: docs/ARCHITECTURE.md · docs/adrs/INDEX.md (31 ADRs)
Quality gates
| Concern | Where |
|---|---|
| CI | ci.yml — Rust 3-platform × fmt/clippy/test/doc + WASM build + size budget + Node 20/22 × 3-platform |
| Publish | publish.yml — GCP WIF → Secret Manager → smoke → npm publish --provenance (SLSA L2) |
| Security | security.yml — cargo-audit + cargo-deny + npm-audit + CodeQL + weekly cron |
| Provenance | ADR-011 — Ed25519-signed witness manifest, byte-deterministic across runners |
| Studio liveness | pages-monitor.yml — daily HTTP probe of live Studio |
| Research quality (DRACO) | draco.yml — cross-domain deep-research benchmark (ADR-037). Deterministic subset gates the scorer/runner machinery on every push (offline); a weekly judged cadence runs the real OpenRouter-fusion score. 5 dimensions (grounding/coverage/balance/cleanliness/faithfulness); the verifier + judge are different model families than the synthesizer (fusion). See packages/bench/draco/. |
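The Provenance row calls the witness manifest byte-deterministic across runners. One common way to get that property is to serialize with sorted keys and no incidental whitespace before signing. The sketch below shows only that canonicalization step; it is an assumption about how such determinism can be achieved, not the ADR-011 format, and the Ed25519 signing step is omitted.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted so the same data always yields the same bytes,
// regardless of the order in which properties were inserted.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const obj: { [key: string]: Json } = value;
  const keys = Object.keys(obj).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k])).join(",") + "}";
}

// Two hypothetical manifests with identical content but different key order.
const manifestA: Json = { name: "my-bot", files: ["a.ts", "b.ts"], version: "0.1.0" };
const manifestB: Json = { version: "0.1.0", files: ["a.ts", "b.ts"], name: "my-bot" };

console.log(canonicalize(manifestA) === canonicalize(manifestB)); // true
```

A signature computed over the canonical string would then verify on any runner that reconstructs the same manifest data. Array order is preserved on purpose, since reordering a file list changes its meaning.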
Developer quick-start
git clone https://github.com/ruvnet/agent-harness-generator
cd agent-harness-generator
cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings
npm install
npm run build:wasm
npm test
node scripts/healthcheck.mjs
See CONTRIBUTING.md.
Related
- ruflo — the meta-harness this generator factors apart
- ruvector — vector + agentic database (memory backend)
- @ruvector/emergent-time — memory-decay clock the kernel uses
License
MIT — see LICENSE.
FAQ
What is MetaHarness?
MetaHarness is a CLI and browser Studio that turns any GitHub repo (or a
blank slate) into a custom AI agent harness. The output is a branded,
npm-publishable package with its own npx <name> CLI, MCP server, memory,
governance policy, and Ed25519 witness-signed releases. Runs on Claude
Code, OpenAI Codex, pi.dev, Hermes, OpenClaw, and RVM.
How is MetaHarness different from an agent framework?
Frameworks help developers build agents. MetaHarness helps repositories ship agents. The model is replaceable; the harness is the product.
Do I need to run a server?
No. The Studio is 100% client-side (GitHub Pages). The CLI runs locally. There is no MetaHarness account, no hosted backend, no telemetry.
Does it execute my code during analysis?
No. metaharness analyze and metaharness genome are deterministic
static-analysis only. Inferred build/test commands are marked
trust: inferred · execution: disabled.
Which agent runtimes does it support?
Six today: Claude Code, OpenAI Codex, pi.dev, Hermes (Nous Research), OpenClaw, and RVM. GitHub Copilot and GitHub Actions are proposed in ADR-032 and ADR-033.
What languages does it understand?
Rust, TypeScript / JavaScript, Python, and Go are detected deterministically via lockfile and manifest probing. Lexical scoring is the default; optional in-browser MiniLM embeddings via Transformers.js boost recall for unusual repos.
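Deterministic detection by lockfile and manifest probing can be sketched as a fixed table of marker files checked in a fixed order. The marker list below is an assumption based on common ecosystem conventions, not the tool's actual probe set, and detectLanguages is a hypothetical name.

```typescript
type Language = "Rust" | "TypeScript / JavaScript" | "Python" | "Go";

// Assumed marker files per language; checked in this fixed order.
const MARKERS: Array<{ file: string; language: Language }> = [
  { file: "Cargo.toml", language: "Rust" },
  { file: "Cargo.lock", language: "Rust" },
  { file: "package.json", language: "TypeScript / JavaScript" },
  { file: "package-lock.json", language: "TypeScript / JavaScript" },
  { file: "pyproject.toml", language: "Python" },
  { file: "requirements.txt", language: "Python" },
  { file: "go.mod", language: "Go" },
  { file: "go.sum", language: "Go" },
];

// Last path segment, so nested manifests such as web/package.json also count.
function baseName(path: string): string {
  const parts = path.split("/");
  return parts[parts.length - 1];
}

// The same path list always yields the same result in the same order; no file is opened.
function detectLanguages(paths: string[]): Language[] {
  const names = paths.map(baseName);
  const found: Language[] = [];
  for (const marker of MARKERS) {
    if (names.indexOf(marker.file) !== -1 && found.indexOf(marker.language) === -1) {
      found.push(marker.language);
    }
  }
  return found;
}

const detected = detectLanguages(["src/main.rs", "Cargo.toml", "Cargo.lock", "web/package.json"]);
console.log(detected);
```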
Is the output really npm-publishable?
Yes — the generated harness ships with package.json, bin, a working
CLI, and harness validate to gate releases. harness sign adds the
Ed25519 witness; harness sbom emits SPDX-2.3.
