Show HN: Lowfat – pluggable CLI filter that saved 91.8% of my LLM tokens
Summary
Lowfat is a lightweight CLI filter that reduces AI token costs by stripping unnecessary output before it reaches your agent, claiming up to 91.8% token savings. It provides shell integration, plugin support, and transparent rewriting for tools like Claude Code and OpenCode.
Cached at: 06/05/26, 02:08 PM
zdk/lowfat
Source: https://github.com/zdk/lowfat
lowfat is a lightweight CLI tool that reduces AI token costs by filtering unnecessary CLI output before it reaches your agent.
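To make the idea of filtering CLI output before it reaches the agent concrete, here is a minimal sketch in Python. It is not lowfat's implementation: `filter_git_status`, `estimate_tokens`, `percent_saved`, and the sample text are hypothetical names chosen for illustration, and the word count is only a crude stand-in for a real tokenizer.

```python
def filter_git_status(raw: str) -> str:
    """Drop blank lines and git's parenthesised usage hints (illustrative rules only)."""
    kept = []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no information for the agent
        if stripped.startswith('(use "git'):
            continue  # hint lines aimed at a human at the terminal
        kept.append(line.rstrip())
    return "\n".join(kept)


def estimate_tokens(text: str) -> int:
    """Assumption: approximate tokens by whitespace-separated words."""
    return len(text.split())


def percent_saved(raw: str, filtered: str) -> float:
    """Share of the estimated tokens removed by filtering, in percent."""
    before = estimate_tokens(raw)
    if before == 0:
        return 0.0
    return 100.0 * (before - estimate_tokens(filtered)) / before


# Hypothetical sample of `git status` output.
SAMPLE = """On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
\tmodified:   src/main.rs

no changes added to commit (use "git add" and/or "git commit -a")
"""

if __name__ == "__main__":
    filtered = filter_git_status(SAMPLE)
    print(filtered)
    print(f"estimated savings: {percent_saved(SAMPLE, filtered):.1f}%")
```

The savings figure depends entirely on the command and the tokenizer, so the number printed here says nothing about the 91.8% reported for lowfat.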
Core focus
- Lightweight — Small single binary, small core; but extensible.
- Local-first — No telemetry; you own your data.
- Composable — UNIX-style pipes, mix built-ins and your own filters; not magic.
- User-owned — `lowfat history` shows what you run most, so you can customize filters for your use case.
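The "composable" point above can be sketched as small filters chained like UNIX pipes. This is an illustration of the pattern only, not lowfat's pipeline DSL; `drop_blank`, `dedupe_consecutive`, `head`, and `pipeline` are hypothetical names.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# A filter takes a stream of lines and returns a (usually shorter) stream.
Filter = Callable[[Iterable[str]], Iterable[str]]


def drop_blank(lines: Iterable[str]) -> Iterator[str]:
    """Remove empty and whitespace-only lines."""
    return (line for line in lines if line.strip())


def dedupe_consecutive(lines: Iterable[str]) -> Iterator[str]:
    """Collapse runs of identical adjacent lines into one."""
    prev = None
    for line in lines:
        if line != prev:
            yield line
        prev = line


def head(n: int) -> Filter:
    """Build a filter that keeps only the first n lines."""
    def _head(lines: Iterable[str]) -> Iterator[str]:
        for i, line in enumerate(lines):
            if i >= n:
                break
            yield line
    return _head


def pipeline(*stages: Filter) -> Filter:
    """Compose stages left to right, like `a | b | c` in a shell."""
    def run(lines: Iterable[str]) -> Iterable[str]:
        return reduce(lambda acc, stage: stage(acc), stages, lines)
    return run


if __name__ == "__main__":
    compact = pipeline(drop_blank, dedupe_consecutive, head(2))
    print(list(compact(["a", "", "a", "b", "c"])))
```

Because each stage only consumes and produces lines, a built-in stage and a user-written one are interchangeable in the chain.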
Install
cargo install lowfat
# or
brew install zdk/tools/lowfat
Pre-built binaries are available on GitHub Releases.
Setup
Pick one of:
Claude Code hook — add to .claude/settings.json:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "lowfat hook" }]
      }
    ]
  }
}
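Conceptually, a hook like this sits between the agent and the shell and can rewrite a Bash command before it runs. The sketch below shows one way such a rewrite could work; it is not what `lowfat hook` does internally. It assumes the hook receives a JSON event with `tool_name` and `tool_input.command` fields, and `WRAPPED`, `rewrite_command`, and `handle_event` are hypothetical.

```python
import json
import shlex
from typing import Optional

# Hypothetical allow-list of commands worth wrapping.
WRAPPED = {"git", "docker", "ls"}


def rewrite_command(command: str, wrapper: str = "lowfat") -> str:
    """Prefix the command with the wrapper if its first word is in WRAPPED."""
    try:
        words = shlex.split(command)
    except ValueError:
        return command  # unbalanced quotes: leave the command untouched
    if not words or words[0] == wrapper or words[0] not in WRAPPED:
        return command
    return f"{wrapper} {command}"


def handle_event(payload_json: str) -> Optional[str]:
    """Return the rewritten command for Bash events, or None for other tools.

    Assumption: the event is JSON with `tool_name` and `tool_input.command`.
    """
    event = json.loads(payload_json)
    if event.get("tool_name") != "Bash":
        return None
    command = event.get("tool_input", {}).get("command", "")
    return rewrite_command(command)


if __name__ == "__main__":
    event = {"tool_name": "Bash", "tool_input": {"command": "git status"}}
    print(handle_event(json.dumps(event)))
```

This sketch only looks at the first word, so a compound command such as `git status && ls` would have just its first part wrapped; a real hook would need a more careful parse.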
Shell integration — auto-activates inside agent environments (CLAUDECODE=1, CODEX_ENV), or set LOWFAT_ENABLE=1 to force it on any shell:
echo 'eval "$(lowfat shell-init zsh)"' >> ~/.zshrc # or ~/.bashrc
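The activation rule described for shell integration can be read as a small predicate over environment variables. The sketch below is one interpretation of that sentence, not lowfat's code: `should_activate` is a hypothetical name, and treating the mere presence of `CODEX_ENV` as sufficient is an assumption, since the source gives no value for it.

```python
import os
from typing import Mapping


def should_activate(env: Mapping[str, str]) -> bool:
    """Decide whether filtering should be active for this shell."""
    if env.get("LOWFAT_ENABLE") == "1":
        return True  # explicit opt-in, works in any shell
    if env.get("CLAUDECODE") == "1":
        return True  # running inside Claude Code
    # Assumption: any value of CODEX_ENV counts as "inside an agent".
    return "CODEX_ENV" in env


if __name__ == "__main__":
    print(should_activate(os.environ))
```

Gating on the environment keeps an ordinary interactive shell untouched unless you opt in.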
OpenCode plugin — one command, no config editing:
lowfat opencode install # writes ~/.config/opencode/plugins/lowfat.ts
Restart OpenCode; commands are rewritten transparently before they run.
Remove it anytime with lowfat opencode uninstall.
Direct usage — prefix any command:
lowfat git status
lowfat docker ps
lowfat ls -la
Pi agent — in ~/.pi/agent/settings.json:
{ "shellCommandPrefix": "eval \"$(lowfat shell-init zsh)\"; " }
Usage highlights
# See what's configured and how loud each filter is being
lowfat info # status badge + active filters
lowfat info git # pipeline for `git`
lowfat info --config # full resolved config
# See what lowfat has saved you
lowfat stats # lifetime token savings
lowfat stats --audit # recent plugin executions
lowfat history # rank commands by potential savings
# Dial the aggressiveness
lowfat level ultra # max compression
LOWFAT_LEVEL=lite lowfat git log # one-off override
# Write a plugin
lowfat plugin new terraform # scaffold ~/.lowfat/plugins/terraform/
lowfat plugin doctor # check plugins (and pre-install any Python deps)
# Test a plugin against a sample without installing it
cat samples/git-diff-full.txt | lowfat filter --explain ./filter.lf --sub=diff --level=ultra
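Since plugins can carry Python dependencies, a Python filter with PEP 723 inline metadata might look roughly like the sketch below. The plugin contract is an assumption here (lines in on stdin, filtered lines out on stdout, level read from `LOWFAT_LEVEL`), the actual interface is described in docs/PLUGINS.md, and `filter_plan` and its rules for `terraform plan` output are hypothetical.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
import os
import sys
from typing import Iterable, Iterator


def filter_plan(lines: Iterable[str], level: str) -> Iterator[str]:
    """Trim `terraform plan` output; "ultra" keeps only headlines (illustrative rules)."""
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip():
            continue  # blank line
        if "Refreshing state..." in text:
            continue  # per-resource refresh noise
        if level == "ultra":
            stripped = text.strip()
            # Keep only "# resource will be ..." headlines and the summary.
            if not (stripped.startswith("# ") or stripped.startswith("Plan:")):
                continue
        yield text


def main() -> None:
    # Assumption: the level reaches the plugin through LOWFAT_LEVEL.
    level = os.environ.get("LOWFAT_LEVEL", "")
    for line in filter_plan(sys.stdin, level):
        print(line)


# Skip filtering when run interactively with no piped input.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Outside of lowfat, the script can be tried on its own with something like `terraform plan -no-color | LOWFAT_LEVEL=ultra python filter.py`, where `filter.py` is whatever name you saved it under.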
Learn more
- docs/ARCHITECTURE.md — high-level diagram: CLI, Runner, Plugins, Builtins
- docs/CONFIG.md — `.lowfat` file, env vars, pipeline DSL, built-in processors, the `history` ranking
- docs/PLUGINS.md — lf-filter (the `.lf` plugin DSL), shell escape hatches, PEP 723 + uv, AI agent prompt
Alternatives
License
Apache-2.0
AI notice
Multiple AI tools were used for this project.