@geekbb: An MCP tool that offloads Codex's low-risk work to DeepSeek, keeping the expensive model for judgment only. Across a five-task test it saved 48% on average, with about 6 seconds of latency. CodexSaver is an MCP tool that delegates low-risk tasks in a Codex coding session (writing tests, writing docs, code explanation…
Summary
CodexSaver is an MCP tool that offloads low-risk coding tasks (tests, docs, lint fixes) from Codex to a cheaper model like DeepSeek, achieving ~48% cost savings with ~6s latency.
Cached: 2026/05/08 15:36
An MCP tool that offloads Codex's low-risk work to DeepSeek, keeping the expensive model for judgment only. Across a five-task test it saved 48% on average, with about 6 seconds of latency. CodexSaver is an MCP tool that delegates low-risk tasks in a Codex coding session (writing tests, writing docs, code explanation, lint fixes, and so on) to DeepSeek, while high-risk tasks (architecture decisions, security logic, production deployment, final review) stay in Codex. https://github.com/fendouai/CodexSaver…
fendouai/CodexSaver
Source: https://github.com/fendouai/CodexSaver
CodexSaver
Make Codex cheaper without making it dumber.

CodexSaver is an MCP tool that turns Codex into a cost-aware router. It pushes low-risk development work to a cheaper worker LLM, keeps high-risk judgment in Codex, and returns enough interaction detail that you can feel when the tool is active.
- Lower-cost execution for tests, docs, search, and explanation work
- Codex stays responsible for architecture, security, protected domains, and final review
- Global-by-default Codex install, so every workspace can use the same MCP tool
- DeepSeek by default, with presets for OpenAI, Anthropic, Gemini, Qwen, Ollama, LM Studio, and more
- One-time local provider setup in `~/.codexsaver/config.json`
- Verified with tests, real DeepSeek calls, and end-to-end MCP launcher checks
Why This Exists
Most coding sessions contain two very different kinds of work:
- expensive thinking
- cheap execution
Codex is excellent at the first one. It is overqualified for much of the second.
CodexSaver splits the flow on purpose:
- Codex handles reasoning, ambiguity, protected domains, and approval
- a configured worker provider handles low-risk throughput work
That gives you a practical pattern:
Use the expensive model for judgment.
Use the cheaper model for volume.
Never confuse the two.
What It Feels Like
When CodexSaver is active, tool responses are not silent blobs of JSON.
They include an interaction block that makes the routing decision visible:
```json
{
  "interaction": {
    "tool": "codexsaver.delegate_task",
    "mode": "delegated_execution",
    "headline": "CodexSaver delegated this task to the configured worker provider.",
    "route_label": "[CodexSaver] route=deepseek task_type=write_tests risk=low",
    "next_step": "Review the worker result and apply it only if the patch looks safe."
  }
}
```
Three states matter:
- `preview`: routing preview only, no external model call
- `delegated_execution`: delegated run completed
- `codex_takeover`: task stayed with Codex because risk was too high or the task was ambiguous
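As an illustration of how a tool response could surface these states, here is a minimal sketch. The `delegated_execution` headline matches the example above; the other headlines and the `build_interaction` helper are assumptions, not CodexSaver's actual internals.

```python
def build_interaction(state: str, route: str, task_type: str, risk: str) -> dict:
    """Return the visible interaction block for a tool response (illustrative)."""
    headlines = {
        "preview": "CodexSaver previewed the route; no external model was called.",
        "delegated_execution": "CodexSaver delegated this task to the configured worker provider.",
        "codex_takeover": "CodexSaver kept this task in Codex due to risk or ambiguity.",
    }
    if state not in headlines:
        raise ValueError(f"unknown state: {state}")
    return {
        "interaction": {
            "tool": "codexsaver.delegate_task",
            "mode": state,
            "headline": headlines[state],
            "route_label": f"[CodexSaver] route={route} task_type={task_type} risk={risk}",
        }
    }

block = build_interaction("delegated_execution", "deepseek", "write_tests", "low")
assert block["interaction"]["route_label"] == "[CodexSaver] route=deepseek task_type=write_tests risk=low"
```

The point of the explicit `mode` field is that a reviewer can tell at a glance whether an external model was actually called.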
Quick Start
Recommended Global Install
```shell
git clone https://github.com/fendouai/CodexSaver
cd CodexSaver
python cli.py auth set --provider deepseek --api-key YOUR_API_KEY
python cli.py install
python cli.py doctor
```
That is it. `python cli.py install` writes a global Codex MCP entry to `~/.codex/config.toml` and points it at a stable launcher: `~/.codexsaver/codexsaver_mcp.py`.
After that, every Codex workspace can call `codexsaver.delegate_task`.
Use `--project` only when you want a repository-local `.codex/config.toml`:

```shell
python cli.py install --project
```
Provider Setup
DeepSeek is the default because it is inexpensive and exposes an OpenAI-compatible API. Switching providers is just one flag:
```shell
python cli.py auth set --provider openai --api-key YOUR_API_KEY --model gpt-4o-mini
python cli.py auth set --provider anthropic --api-key YOUR_API_KEY --model claude-3-5-haiku-latest
python cli.py auth set --provider gemini --api-key YOUR_API_KEY --model gemini-2.0-flash
python cli.py auth set --provider qwen --api-key YOUR_API_KEY --model qwen-plus
```
For local models:
```shell
python cli.py auth set --provider ollama --model llama3.1
python cli.py auth set --provider lmstudio --model local-model
```
For any custom OpenAI-compatible endpoint:
```shell
python cli.py auth set \
  --provider custom \
  --api-key YOUR_API_KEY \
  --base-url https://example.com/v1/chat/completions \
  --model your-model
```
See built-in presets:
```shell
python cli.py auth providers
```
If you prefer a temporary one-shell-session setup instead of saving the key locally:
```shell
export CODEXSAVER_PROVIDER=deepseek
export CODEXSAVER_API_KEY=YOUR_API_KEY
python cli.py install
python cli.py doctor
```
One Message To Codex
If Codex is already open in this repository, you can just say:
Save my worker provider API key for CodexSaver, run `python cli.py auth set --provider deepseek --api-key ...`, then run `python cli.py install` and `python cli.py doctor`, and tell me whether it is ready.
For repo-local setup:
Save my worker provider API key for CodexSaver, install CodexSaver only for this repo, run `python cli.py auth set --provider deepseek --api-key ...`, `python cli.py install --project`, then `python cli.py doctor`, and summarize the result.
Ready means:
- `~/.codex/config.toml` contains the global `codexsaver` MCP server, or `.codex/config.toml` exists in the repo
- `~/.codexsaver/codexsaver_mcp.py` exists for global installs
- provider settings are available from env vars or `~/.codexsaver/config.json`
- `python cli.py doctor` reports `CodexSaver is ready`
60-Second Demo
Global MCP config created by `python cli.py install`:

```toml
[mcp_servers.codexsaver]
command = "python"
args = ["/Users/you/.codexsaver/codexsaver_mcp.py"]
startup_timeout_sec = 10
tool_timeout_sec = 120
```
Then tell Codex:
Use CodexSaver for safe low-risk tasks.
Add unit tests for user service.
Or call the CLI directly:

```shell
python cli.py delegate "Explain the routing logic briefly" --files codexsaver/router.py --workspace .
```

Dry run:

```shell
python cli.py "add unit tests for user service" --files src/user/service.ts --workspace . --dry-run
```

Real run:

```shell
python cli.py "add unit tests for user service" --files src/user/service.ts --workspace .
```
Verified Setup Flow
Measured on May 8, 2026 with the global install and local-key workflow:
| Check | Command | Result |
|---|---|---|
| Full test suite | `PYTHONDONTWRITEBYTECODE=1 python -m pytest -q -p no:cacheprovider` | 86 passed in 0.23s |
| Global install | `python cli.py install --workspace .` | `status=ok`, global config points at `~/.codexsaver/codexsaver_mcp.py` |
| Local provider persistence | `python cli.py auth set --provider deepseek --api-key ...` | saved to `~/.codexsaver/config.json` |
| Workspace doctor | `python cli.py doctor --workspace .` | `provider_api_key_source=local_config:deepseek`, workspace ready |
| Global launcher check | `python ~/.codexsaver/codexsaver_mcp.py` with MCP initialize | returned `serverInfo.name=codexsaver` |
| Real DeepSeek call | `python cli.py delegate "Explain the CodexSaver router..." --files codexsaver/router.py --workspace .` | `route=deepseek`, `status=success`, verification passed |
This is the intended workflow:
- Save the key once
- Install CodexSaver globally
- Confirm readiness with `doctor`
- Use real delegated calls without re-exporting API keys
Provider Matrix
Built-in presets cover the common hosted and local routes:
| Provider | Style | Default model | API key |
|---|---|---|---|
| `deepseek` | OpenAI-compatible | `deepseek-chat` | required |
| `openai` | OpenAI | `gpt-4o-mini` | required |
| `anthropic` | native Messages API | `claude-3-5-haiku-latest` | required |
| `gemini` | OpenAI-compatible endpoint | `gemini-2.0-flash` | required |
| `qwen` | OpenAI-compatible endpoint | `qwen-plus` | required |
| `ollama` | local OpenAI-compatible endpoint | `llama3.1` | not required |
| `lmstudio` | local OpenAI-compatible endpoint | `local-model` | not required |

Run `python cli.py auth providers` for the complete list.
Post-Setup Usage Ratio
After setup was complete, I measured the tasks actually routed in this working session, counting only tasks that truly entered model routing, not local commands like `pytest`, `git`, `install`, `doctor`, or README editing.
Result:
- DeepSeek: 7 / 8 = 87.5%
- Codex: 1 / 8 = 12.5%
Why not 100%?
One test-writing prompt originally included the phrase "production logic". That triggered the router's intentional high-risk keyword guard and returned the task to Codex. This was not a failure; it was the protection logic working as designed.
If you only count the later standardized five-task benchmark with natural low-risk phrasing, the delegation ratio was:
- DeepSeek: 5 / 5 = 100%
- Codex: 0 / 5 = 0%
Takeaway:
- In real usage, CodexSaver defaulted to DeepSeek for most low-risk work
- It still preserved a strict fallback path for risky wording and protected domains
Five-Task A/B Benchmark
Method:
- A = counterfactual `Codex-only` baseline with normalized cost index fixed at `1.00`
- B = `CodexSaver` mode with the live router and DeepSeek worker
- latency is wall-clock time for the real CodexSaver execution
- savings come from the current `CostEstimator`, so this is a reproducible routing benchmark, not invoice-grade billing data
Summary:
- All 5 tasks were typical low-risk development chores: explanation, docs, tests, and README maintenance
- All 5 delegated successfully after using natural low-risk phrasing
- Average live latency was `6.18s`
- Average estimated savings were `48.4%`
- Average normalized cost moved from `1.00` to `0.52`
- Estimated relative reduction was `48.0%`
| Task | Type | Route | Latency | A: Codex-only Cost Index | B: CodexSaver Cost Index | Estimated Savings | Output Shape |
|---|---|---|---|---|---|---|---|
| Explain router logic | explain | deepseek | 2.13s | 1.00 | 0.55 | 45% | read-only summary |
| Document router module | docs | deepseek | 3.13s | 1.00 | 0.55 | 45% | 1-file patch |
| Add cost tests | write_tests | deepseek | 9.29s | 1.00 | 0.55 | 45% | test patch |
| Explain verifier flow | explain | deepseek | 2.30s | 1.00 | 0.55 | 45% | read-only summary |
| Update install docs | docs | deepseek | 14.06s | 1.00 | 0.38 | 62% | README patch |
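The summary figures can be recomputed directly from the table. One small note: averaging the per-task savings column gives 48.4%, while the quoted 48.0% relative reduction follows from first rounding the mean cost index to 0.52.

```python
# Per-task values from the benchmark table above.
cost_b = [0.55, 0.55, 0.55, 0.55, 0.38]   # B: CodexSaver cost index
savings = [45, 45, 45, 45, 62]            # estimated savings, percent

mean_cost = sum(cost_b) / len(cost_b)               # 0.516, rounds to 0.52
mean_savings = sum(savings) / len(savings)          # 48.4
reduction = (1.00 - round(mean_cost, 2)) * 100      # 48.0, from the rounded index

assert round(mean_cost, 2) == 0.52
assert mean_savings == 48.4
assert round(reduction, 1) == 48.0
```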
Figure: gray bars are the Codex-only baseline fixed at 100; green bars are the CodexSaver cost index for the same task. Lower bars mean lower estimated Codex spend.
Interpretation:
- Read-only explain tasks were the fastest, cleanest wins
- Small docs edits delegated well and returned compact, reviewable patches
- Test generation had higher latency than explanation, but still stayed in the low-risk savings band
- Larger-context documentation work produced the biggest estimated savings because the Codex-only context cost would be higher
Routing Rules
Good Tasks To Delegate
- repo scanning and code search
- code explanation and summarization
- writing unit tests
- fixing lint or type errors
- documentation updates
- boilerplate generation
- small localized refactors
Tasks Kept In Codex
- architecture decisions
- auth, security, payment, billing, or permissions logic
- database migrations
- deployment and production operations
- ambiguous product requests
- final review before applying changes
Why Some Medium-Risk Tasks Still Delegate
CodexSaver does not just ask:
"Is this code work?"
It asks:
"Is this code work cheap enough to delegate without losing judgment quality?"
That creates a deliberate asymmetry:
- read-only understanding can be cheap
- writes in sensitive domains are expensive in risk even if the diff is small
- ambiguity defaults to Codex, not delegation
That is why "Explain auth code" may still delegate while "Refactor auth service" stays in Codex.
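The asymmetry above can be sketched as a toy rule. The keyword lists here are assumptions for illustration, not CodexSaver's actual router tables:

```python
# Read-only intent can delegate even near sensitive code; write intent in a
# sensitive domain cannot; anything unrecognized defaults to Codex.
READ_ONLY = {"explain", "summarize", "describe"}
WRITE = {"refactor", "fix", "add", "migrate", "deploy"}
SENSITIVE = {"auth", "security", "payment", "billing", "permissions"}

def route(task: str) -> str:
    words = set(task.lower().split())
    if words & READ_ONLY:
        return "delegate"                     # reading is cheap, even near auth code
    if words & WRITE:
        return "codex" if words & SENSITIVE else "delegate"
    return "codex"                            # ambiguity defaults to Codex

assert route("explain auth code") == "delegate"
assert route("refactor auth service") == "codex"
```

Note the ordering: the ambiguous fall-through lands in Codex, never in delegation.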
How It Works
```text
User
  ↓
Codex
  ↓ MCP tool call
CodexSaver
  ├─ Router
  ├─ Context Packer
  ├─ Worker LLM Provider
  ├─ Verifier
  └─ Cost Estimator
  ↓
Codex review / apply / finalize
```
Core modules:
- `Router`: classify tasks and assign risk
- `ContextPacker`: bound file context before delegation
- `ProviderClient`: call the configured worker model
- `Verifier`: validate output shape, protected paths, and suggested commands
- `CostEstimator`: estimate relative savings bands
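As an illustration of the `Verifier`'s protected-path role, here is a minimal sketch. The directory list is an assumption for the example, not CodexSaver's shipped configuration:

```python
PROTECTED = ("src/auth", "migrations", "deploy")  # illustrative protected roots

def verify_paths(touched_files: list[str]) -> str:
    """Reject a worker patch that touches any protected path."""
    for f in touched_files:
        if any(f == p or f.startswith(p + "/") for p in PROTECTED):
            return "needs_codex"  # hand the task back to Codex
    return "verified"

assert verify_paths(["src/user/service.test.ts"]) == "verified"
assert verify_paths(["migrations/0042_add_index.sql"]) == "needs_codex"
```

Matching on path prefixes rather than substrings avoids false hits like `src/author/`.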
Security And Persistence
- `python cli.py auth set --provider ... --api-key ...` saves provider settings to `~/.codexsaver/config.json`
- the config file is written with local-user-only permissions
- `doctor` shows whether the key comes from the environment or local config, and only prints a masked preview
- live calls use local config automatically if no env key is exported
- if verification fails, CodexSaver falls back to `needs_codex`
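A minimal sketch of the persistence behavior described above, with an assumed file layout, field names, and masking format (the real schema may differ):

```python
import json
import os
import tempfile

def mask(key: str) -> str:
    """Masked preview for display; never print the full key."""
    return key[:4] + "..." + key[-2:] if len(key) > 8 else "***"

def save_config(path: str, provider: str, api_key: str) -> None:
    """Persist provider settings, then restrict to local-user-only permissions."""
    with open(path, "w") as f:
        json.dump({"provider": provider, "api_key": api_key}, f)
    os.chmod(path, 0o600)  # rw for the owning user only

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "config.json")
    save_config(path, "deepseek", "sk-abcdef123456")
    assert os.stat(path).st_mode & 0o777 == 0o600
    print(mask("sk-abcdef123456"))  # → sk-a...56
```

Tightening permissions after the write keeps the saved key out of reach of other local users on shared machines.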
Commands
```shell
python cli.py auth providers
python cli.py auth set --provider deepseek --api-key YOUR_API_KEY
python cli.py install
python cli.py install --project
python cli.py doctor
python cli.py delegate "Explain the routing logic briefly" --files codexsaver/router.py --workspace .
```
Roadmap
- MCP server
- rule-based routing
- bounded context packing
- DeepSeek default worker integration
- multi-provider OpenAI-compatible worker support
- local API key persistence
- interaction-aware tool responses
- end-to-end verification flow
- cost-aware dynamic routing
- cost-aware provider selection
If This Saves You Money
Star the repo.