Bigger context windows seem to be solving a different problem than understanding

Reddit r/artificial Tools

Summary

Repowise is an open-source tool that indexes codebases into five intelligence layers—dependency graph, git history, auto-generated docs, architectural decisions, and code health—and exposes them to AI coding agents via MCP tools for more accurate context and fewer tool calls.

One thing I've been wondering lately: We often talk about larger context windows as if they're equivalent to better understanding. But in practice those feel like different problems. Access to information keeps improving. Understanding relationships between pieces of information still feels much harder. I notice this most when working with larger software projects. You can give a model access to a huge amount of code, but that doesn't necessarily mean it understands how the system evolved, which components are tightly coupled, or where risk actually lives. Curious whether others think these are fundamentally different problems or if larger context eventually solves both. Been exploring this while working on RepoWise: https://github.com/repowise-dev/repowise
Original Article
View Cached Full Text

Cached at: 06/05/26, 09:13 PM

repowise-dev/repowise

Source: https://github.com/repowise-dev/repowise

repowise — the codebase intelligence layer for your AI coding agent

The intelligence layer that gives your AI agent context, ownership, decisions — and a code-health score proven to predict real bugs.

Five intelligence layers · Nine MCP tools · 15 languages · Multi-repo workspaces · One pip install

Live demo — repowise.dev

PyPI version License: AGPL v3 Python 3.11+ MCP compatible GitHub stars

Hosted for teams → · Docs · Discord · Contact

Layers · Code Health · Benchmarks · Languages · Quickstart · MCP tools · Comparison · Hosted


repowise demo — repowise init → Claude Code querying via MCP tools

Your AI coding agent reads files. It doesn’t know which ones change together, which ones are dead, or why they were built the way they were. It has the source code and no memory of how the codebase got there.

repowise fixes that. It indexes your codebase into five intelligence layers — dependency graph, git history, auto-generated docs, architectural decisions, and code health — and exposes them to Claude Code, Codex, and any MCP-compatible agent through nine task-shaped tools. The result: your agent answers “why does auth work this way?” instead of “here is what auth.ts contains” — with fewer tool calls, fewer file reads, and lower cost per query, at comparable answer quality (benchmarks ↓).


The five layers

repowise runs once, builds everything, then keeps it in sync on every commit. Each layer is queryable from the CLI, the MCP tools, and the local dashboard.

LayerWhat it gives youEdge
◈ Graphtree-sitter dependency graph across 15 languages · two-tier file + symbol nodes · 3-tier call resolution · Leiden communities · PageRank / centrality / execution flows · framework-aware route→handler edgesA real graph most tools never build
◈ Githotspots (churn × complexity) · ownership % · co-change pairs (hidden coupling) · bus factor · contributor profiles · module health · reviewer suggestionsBehavioral signals static analysis can’t see
◈ DocsLLM-generated wiki per module/file · incremental on every commit · freshness + confidence scoring · hybrid RAG search (FTS + vector via RRF)Stays current — rebuilt every commit
◈ Decisionsarchitectural decisions mined from 8 sources, evidence-backed (verified / fuzzy / unverified), linked to graph nodes, connected by supersedes/refines/conflicts_with edges, tracked for staleness★ Captured nowhere else
★ Code Health25 deterministic biomarkers, 1–10 score per file · defect-calibrated weights · coverage ingestion · trend alerts · refactoring targets · zero LLM, <30s★ Defect-validated — our edge ↓

Full deep-dive on every layer (graph, git, docs, decisions, hooks, auto-sync, dead code, CLAUDE.md generation): docs/INTELLIGENCE_LAYERS.md →


★ Code Health — the layer nobody else nails

Code health is repowise’s deepest differentiator — the one layer with no real equivalent, and the only one we can prove predicts real bugs.

repowise scores every file 1–10 from 25 deterministic biomarkers — McCabe complexity, deep nesting, brain methods, class cohesion (LCOM4), god classes, native Rabin–Karp clone detection, untested hotspots, function-level churn, code-age volatility, ownership dispersion, change entropy, co-change scatter, prior-defect history, test-quality smells, and more.

Zero LLM calls. Zero cloud requirement. Zero new runtime dependencies. Pure Python over tree-sitter + git data — finishes in under 30 seconds on a 3,000-file repo. The biomarker weights are calibrated against a real defect corpus, not hand-tuned; only the learned constants ship and the runtime stays fully deterministic.

repowise health                       # KPIs + lowest-scoring files
repowise health --coverage cov.lcov   # ingest LCOV/Cobertura/Clover → untested-hotspot
repowise health --refactoring-targets # ranked by impact / effort
repowise health --trend               # snapshots + declining / predicted-decline alerts

Does the score actually find bugs? Yes — and it out-ranks the leading commercial code-health tool. On the same 2,770 files across 9 languages, scored at the same leakage-free commit against the same defect labels:

Axis (head-to-head, paired tests)repowiseLeading commercial tool
Recall @ 20%-of-lines budget0.1730.074
Effort-aware ranking (Popt)0.6070.462
Defect density, size-normalized (defects/KLOC, Alert:Healthy)2.18×0.56×
Discrimination (ROC AUC)0.7310.705

Ranking by repowise health surfaces 2.3× the defects under a fixed review budget (Popt Δ +0.144, recall Δ +0.098, density Δ p = 0.003 — all paired, significant). Full methodology & CIs →

User guide & per-biomarker reference: docs/CODE_HEALTH.md


Benchmarks

Reproducible, on public codebases — repowise-bench →

1 · Agent efficiency — repowise does the exploration once, offline

Most of a coding agent’s spend goes to exploration — greping for symbols, reading candidate files, re-reading them as context grows. repowise does that work once so the agent skips it on every query. Paired SWE-QA runs on real repositories (same model, same harness, with vs without repowise’s MCP tools):

−70% tool calls  ·  −89% file reads  ·  −36% cost per query  ·  answer quality at parity

Best case shown; across the two benchmarks the range is −49% to −70% tool calls, −69% to −89% file reads, and −29% to −36% cost. Bonus: feeding an agent a commit via get_context costs 2,391 tokens vs 64,039 for the raw changed files — ~27× fewer. Reports: flask48 · sklearn48

2 · Distill — index-aware output distillation

Most of what an agent reads from a shell command is noise: 300 lines of passing tests around 4 failures, full commit bodies for “what changed recently”. repowise distill <cmd> compresses command output before the agent reads it — errors-first, exit code preserved, and every omission reversible via an inline [repowise#<ref>] marker (repowise expand <ref>). Paired runs on a public OSS repo, per command:

CommandRaw → distilled tokensSaved
pytest -q (11 failures)3,374 → 1,31761% — all 11 failure lines preserved
git log -503,064 → 33189%
git diff (30 commits)62,833 → 8,63586%

Small outputs pass through untouched (net-positive guard), and in an end-to-end spot-check the agent reached the identical root-cause diagnosis from distilled output as from raw. Opt-in Claude Code hook rewrites noisy commands automatically (shown for approval); repowise saved tracks tokens and dollars saved. Full guide: docs/DISTILL.md →

3 · Code health predicts real defects

Health scores are collected at a historical commit (T0); bug-fixing commits are counted over the following 6 months; the two are correlated — strictly no leakage. Across 21 open-source repositories spanning all 9 Full-tier languages:

  • Cross-project mean ROC AUC 0.74 [95% CI 0.68–0.79] at identifying the files that go on to receive bug-fixes — up to 0.90 on individual repos.
  • Survives controlling for file size (partial Spearman ρ = −0.16) — it is not just “flag the big files.”
  • Significantly out-discriminates recent churn (+0.10 AUC) and prior-defect history (+0.12 AUC), DeLong p < 1e-9.
  • Holds up on an external published dataset it has never seen (PROMISE/jEdit CK-metrics: AUC 0.76–0.78, within ~0.03 of the dataset’s own tuned model).

Full report: health-defect/BENCHMARK_REPORT.md →


Local dashboard

repowise serve starts a full web UI alongside the MCP server — no separate setup.

repowise web UI

Highlights: Chat (natural-language Q&A) · Docs (wiki with Mermaid + graph sidebar) · Graph (interactive, 2,000+ nodes, community coloring, path finder) · C4 Architecture (Context → Containers → Components) · Risk (hotspots, ownership heatmap, module health, dead code, blast radius) · Contributors (per-author profiles) · Decisions (evidence drawer, evolution timeline, decision-graph) · Health (biomarker scores, coverage, trends) · Security (local pattern scan) · Costs · Workspace (cross-repo contracts & co-changes). Full view-by-view list in docs/USER_GUIDE.md.


Supported languages

15 languages parsed to AST · 9 at the Full tier · framework-aware across all of them.

Full tier   Python TypeScript JavaScript Java Kotlin Go Rust C++ C#

Good tier   C Ruby Swift Scala PHP  · Partial   Luau

TierLanguagesWhat works
FullPython · TypeScript · JavaScript · Java · Kotlin · Go · Rust · C++ · C#AST parsing, import resolution, named bindings, call resolution, heritage extraction, docstrings; multi-project workspace resolvers; framework-aware edges; per-language dynamic-hint extractors; code-health biomarkers
GoodC · Ruby · Swift · Scala · PHPAST parsing, import resolution, named bindings, call resolution, heritage (mixins / derive / extensions / traits), docstrings; dedicated workspace-aware resolvers; Rails / Laravel / TYPO3 framework edges; dynamic-hint extractors
Config / dataOpenAPI · Protobuf · GraphQL · Dockerfile · Makefile · YAML · JSON · TOML · SQL · Terraform · Markdown · ShellIncluded in the file tree; special handlers extract endpoints / targets where applicable
Git-blame onlyObjective-C · Elixir · Erlang · Dart · Zig · Julia · Clojure · Haskell · OCaml · F# · …Tracked in git history (blame, hotspots, co-change); no AST parsing yet

Adding a language needs one .scm query file and one config entry — no changes to the parser core. Full per-language matrix, code-health checklist, and the contributor recipe: docs/LANGUAGE_SUPPORT.md →


Who it’s for

Start here
Individual developerspip install repowiserepowise init → query from Claude Code in minutes. 100% local, BYO API key, free under AGPL-3.0.
Teamsrepowise.dev hosted — zero ops, hosted MCP endpoint, auto re-index on every commit, plus the free Repowise PR Bot that comments on hotspots, hidden coupling, and declining health per PR.
EnterprisesOn-prem topology, SSO/SCIM, RBAC, CVE-aware security layer, workflow integrations, and commercial licensing (no AGPL obligation) — see docs/COMMERCIAL.md.

Quickstart

pip install repowise          # or: uv tool install repowise

Single repo

cd your-project
repowise init        # builds all five intelligence layers (one-time)
repowise serve       # starts MCP server + local dashboard

Multi-repo workspace

cd my-workspace/     # parent dir containing backend/, frontend/, shared-libs/
repowise init .      # scans for git repos, indexes each, runs cross-repo analysis
repowise serve       # workspace dashboard + per-repo pages

repowise init automatically registers the MCP server, installs a PostToolUse hook in ~/.claude/settings.json, generates .mcp.json at the project root, and offers a post-commit hook that keeps everything in sync. If the Codex CLI is installed and logged in, interactive runs also offer to write project-local .codex/config.toml, .codex/hooks.json, and a managed AGENTS.md; non-interactive runs require --codex. Skip Codex setup with --no-codex; force or skip AGENTS.md with --agents / --no-agents.

Claude Code plugin. Prefer a one-command setup? Install the plugin from the marketplace — it registers the MCP server and hook and adds /repowise:* slash commands (init, health, risk, dead-code, decision, …):

/plugin marketplace add repowise-dev/repowise
/plugin install repowise@repowise

To add the MCP server to another editor manually:

{
  "mcpServers": {
    "repowise": { "command": "repowise", "args": ["mcp", "/path/to/your/project"] }
  }
}

Init time: the graph, git, dead-code, and code-health layers build in minutes with zero LLM calls — run repowise init --index-only for a queryable index almost immediately. The one-time cost is the documentation layer (LLM-generated wiki pages, can run in the background). After that, every commit-triggered update takes under 30 seconds and only regenerates the pages your change touched.

Docs: Quickstart · User Guide · CLI Reference · Codex · MCP Tools · Distill · Workspaces · Auto-Sync · Config


Nine MCP tools

Most tools are designed around data entities — one module, one file, one symbol — forcing agents into long chains of sequential calls. repowise tools are designed around tasks: pass multiple targets in one call, get complete context back. Every response carries an _meta envelope with index_age_days, indexed_commit, and a stale_warning that fires only when the indexed HEAD diverges from live .git/HEAD.

ToolWhat only this tool answers
get_overview()Architecture summary, module map, entry points, git health, community summary. First call on any unfamiliar codebase.
get_answer(question)Hybrid retrieval (FTS + vector via RRF) + PageRank bias + 1-hop graph expansion → a cited answer with calibrated retrieval_quality. Returns structured best_guesses on low confidence. Collapses search → read → reason into one round-trip.
get_context(targets, include?)Triage card for files / modules / symbols: title, summary, signatures, hotspot bit, governing_decisions, and symbol_ids. include opens callers/callees, ownership, metrics, decisions, full_doc. Batch many targets.
get_symbol("file.py::Name")Raw source bytes for one indexed symbol with exact line bounds — cheaper and safer than Read + offset math.
search_codebase(query, kind?)Semantic search over the wiki, filterable by kind (implementation / test / config / doc), tagging each result’s search_method.
get_risk(targets, changed_files?)Hotspot scores, dependents, co-change partners, ownership, test gaps, security signals. Pass changed_files for PR mode → a directive block (will_break, missing_cochanges, missing_tests, governance_risk).
get_why(query?, targets?)Architectural decision records, status, evidence spans, and the supersession lineage chain. Falls back to git archaeology when no ADRs exist.
get_dead_code(...)Unreachable code by confidence tier with cleanup-impact estimates; cross-repo consumer detection in workspace mode.
get_health(targets?, include?)25-biomarker scores per file. Dashboard mode → KPIs + lowest-scoring files + module rollup; targeted mode → per-file findings. include: coverage, refactoring, trend.

Worked example (“Add rate limiting to all API endpoints” in 5 calls instead of ~30 greps+reads) and the full reference: docs/MCP_TOOLS.md →


How it compares

repowiseGoogle Code WikiDeepWikiSwimmCodeScene
Self-hostable, open source✅ AGPL-3.0❌ cloud only❌ cloud only❌ Enterprise only✅ Docker
Private repo — no cloud❌ in development❌ OSS forks only✅ Enterprise tier
Auto-generated documentation✅ Gemini✅ PR2Doc
MCP server for AI agents✅ 9 tools✅ 3 tools
Proactive agent hooks✅ Claude + Codex hooks
Auto-generated AI instructions (CLAUDE.md, AGENTS.md)
Code health score (1–10)✅ 25 biomarkers✅ 25–30
Brain Method / LCOM4 / god class
Test-coverage intelligence✅ LCOV/Cobertura/Clover
Untested-hotspot detection✅ coverage × hotspot
Health trend + declining alerts✅ rolling snapshots
Refactoring recommendations✅ deterministic
Git intelligence (hotspots, ownership, co-change)
Bus factor analysis
Dead code detection
Architectural decision records
Multi-repo workspace intelligence✅ co-changes, contracts, federated MCP
Local dashboard❌ IDE only

repowise is the intersection: behavioral git intelligence + a defect-validated code-health score + auto-generated docs + agent-native MCP + architectural decisions + multi-repo workspace intelligence — self-hostable and open source. Detailed breakdown: docs/COMPETITIVE_ANALYSIS.md.


For teams & enterprises

repowise.dev is the same engine, fully managed — at feature parity with self-hosted: every CLI command, every MCP tool, the full dashboard. We dogfood it on our own codebase: live snapshot → · explore public repos →.

On top of self-hosting:

  • Zero ops — managed deploys & webhooks, auto re-index on every commit.
  • Hosted MCP endpoint — point any MCP client at one URL, no local server.
  • Repowise PR Bot — free GitHub App, one deterministic comment per PR (hotspot touches, hidden coupling, declining health, dead code), zero LLM calls. Install → · Learn more →
  • CVE-aware security layer, cross-repo intelligence at scale, and integrations (Slack, Jira/Linear, Confluence/Notion, PagerDuty) (rolling out).

What’s GA / in development / planned, on-prem topology, SSO/SCIM/RBAC, and pricing: docs/COMMERCIAL.md · Get in touch →


Privacy

  • Self-hosted: your code never leaves your infrastructure. No telemetry. No analytics.
  • BYOK: bring your own Anthropic / OpenAI key. We never see your LLM calls. Zero data retention via Anthropic’s API policy.
  • What’s stored: the NetworkX graph, LanceDB embeddings (non-reversible vectors), generated wiki pages, git metadata. Raw source is processed transiently and never persisted.
  • Fully offline: Ollama + a local embedding model = zero external API calls.

CLI & configuration

repowise init [PATH]      # index codebase (one-time; --index-only skips LLM)
repowise serve [PATH]     # MCP server + local dashboard
repowise update [PATH]    # incremental update (<30s; --workspace for all repos)
repowise query "<q>"      # ask anything from the terminal
repowise health           # code-health KPIs + lowest-scoring files
repowise risk main..HEAD  # score a branch / PR range for defect risk
repowise dead-code        # unreachable-code report
repowise distill pytest   # compact errors-first output (reversible) — saves 60–90% tokens
repowise saved            # tokens & dollars saved by distillation
repowise doctor           # check setup, API keys, store drift

repowise init generates .repowise/config.yaml (provider, model, embedder, reasoning mode, exclude patterns, git commit depth). Full command set: docs/CLI_REFERENCE.md · config reference: docs/CONFIG.md.


Contributing

git clone https://github.com/repowise-dev/repowise
cd repowise
uv sync --all-packages
uv run repowise --version
uv run pytest tests/unit/

Full guide, including how to add languages and LLM providers: CONTRIBUTING.md.


License

AGPL-3.0. Free for individuals, teams, and companies using repowise internally.

For commercial licensing — the enterprise security & compliance layer, SSO/SCIM, RBAC, workflow integrations, priority support and SLA, or embedding repowise in a product without AGPL obligations — see docs/COMMERCIAL.md or contact [email protected].


Built for engineers who got tired of watching their AI agent cat the same file for the fourth time.

repowise.dev · Explore → · Discord · X · [email protected]

Similar Articles

Why do coding agents keep reopening files they already should understand?

Reddit r/AI_Agents

The author observes that coding agents often fail to maintain a persistent understanding of large codebases, leading to redundant reads and pattern mismatches. They introduce RepoWise, an experimental tool that leverages repository signals like dependencies and commit history to address this.

Feels like coding agents are good at finding code but bad at understanding projects

Reddit r/AI_Agents

Discusses the observation that while coding agents are effective at locating code, they struggle with deeper project understanding, such as component relationships and project style. The author introduces RepoWise, a tool that provides repository-level signals like dependency graphs and git history to address these issues.

@rwayne: Context Mode solves the other half of AI Agent context issues: sandboxed tool outputs + persistent sessions. A 56 KB Playwright snapshot compressed to 299 bytes, 98% of data never entering the context. Every file edit, Git operation, task decision is stored into…

X AI KOLs Timeline

Context Mode is a tool that solves AI agent context problems by sandboxing tool outputs and persisting sessions, achieving up to 98% compression of Playwright snapshots and using BM25 retrieval to reduce context window usage. It supports 15 platforms including Claude Code, Gemini CLI, VS Code Copilot, and is used by major tech companies.