@tom_doerr: Reduces Claude Code and Cursor token costs by 60-95% https://github.com/yvgude/lean-ctx
Summary
lean-ctx is an open-source, Rust-based context runtime that cuts token costs for AI coding agents such as Claude Code, Cursor, and Copilot by 60–95% by compressing file reads and optimizing shell output. It runs as a Shell Hook and MCP Server exposing 56 tools and ten read modes.
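The headline savings come from bounding what the agent actually reads. Below is a minimal sketch of the idea in Rust, assuming a hypothetical head/tail truncation scheme; lean-ctx's real compression is richer (95+ patterns, dedicated read modes), and the function name and limits here are illustrative only:

```rust
/// Illustrative sketch of shell-output compression: keep the first and last
/// lines of long output and elide the middle. This is NOT lean-ctx's actual
/// algorithm; the name `compress_output` and the limits are hypothetical.
fn compress_output(raw: &str, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = raw.lines().collect();
    if lines.len() <= head + tail {
        return raw.to_string(); // short output passes through unchanged
    }
    let elided = lines.len() - head - tail;
    let mut out = lines[..head].join("\n");
    out.push_str(&format!("\n… [{elided} lines elided] …\n"));
    out.push_str(&lines[lines.len() - tail..].join("\n"));
    out
}

fn main() {
    // A 1,000-line build log shrinks to 20 lines plus an elision marker.
    let log: String = (1..=1000).map(|i| format!("line {i}\n")).collect();
    println!("{}", compress_output(&log, 10, 10));
}
```

Because the agent pays per token on every read, shrinking a 1,000-line log to ~20 lines before it enters the context window is where multiplicative savings in the 60–95% range can come from.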
The context layer for AI coding agents
Reduce token waste in Cursor, Claude Code, Copilot, Windsurf, Codex, Gemini & more by 60–95% (up to 99% on cached reads).
Shell Hook + MCP Server · 56 tools · 10 read modes · 95+ patterns · Single Rust binary
Website · Docs · Install · Demo · Benchmarks · Cookbook · Security · Changelog · Discord
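On the MCP side, wiring a local server into a client follows the standard `mcpServers` config shape (shown here for a project-level `.mcp.json` as used by Claude Code). The `lean-ctx` binary name and `mcp` subcommand below are assumptions for illustration, not documented invocations:

```json
{
  "mcpServers": {
    "lean-ctx": {
      "command": "lean-ctx",
      "args": ["mcp"]
    }
  }
}
```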
See it in action: the demo GIFs in the repository are generated from reproducible VHS tapes in demo/.
Similar Articles
@_avichawla: Claude Code used 3x fewer tokens with one change: - Before: 10.4M tokens · 10 errors · $9.21 - After: 3.7M tokens · 0 e…
By switching to Insforge Skills + CLI as the backend context layer, a user cut Claude Code token usage by 64%, eliminated all errors, and reduced cost from $9.21 to $2.81.
zilliztech/claude-context
Zilliz releases Claude Context, an open-source MCP plugin that adds semantic code search to Claude Code and other AI coding agents, enabling cost-effective deep context from entire codebases via vector search.
@appliedcompute: https://x.com/appliedcompute/status/2052826576723841292
Applied Compute introduces ACL-Wiki, a continual learning memory system built on their Context Engine that logs coding agent interactions from Cursor, Claude Code, and Codex to build an improving Contextbase, roughly doubling the Critical Memory Rate over two weeks. The system uses a Remember-Refine-Retrieve pipeline exposed via MCP server to give coding agents institutional memory that improves with use.
A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression
TACO introduces a self-evolving compression framework that automatically learns to shrink redundant terminal-interaction history, cutting token overhead by roughly 10% while improving accuracy by 1–4% on TerminalBench and other code-agent benchmarks.
After hitting Claude’s limits for months, I finally found a better workflow
The author shares a personal workflow adjustment combining Claude for reasoning and Gemini CLI for execution to bypass usage limits and reduce AI subscription costs.