@_avichawla: A smarter Claude model burns more tokens, not fewer! And it's not a minor 3-5% difference. But 54% higher token usage. …


Summary

The article analyzes why smarter AI agents like Claude consume more tokens when interacting with human-centric backends like Supabase due to inefficient context discovery. It introduces InsForge, an open-source backend tool designed for agents that provides structured context to significantly reduce token usage and manual interventions.

A smarter Claude model burns more tokens, not fewer! And it's not a minor 3-5% difference, but 54% higher token usage. It sounds counterintuitive, but MCPMark V2 benchmarks confirmed this across 21 backend tasks.

The reason has nothing to do with the model itself. Instead, it has to do with what the agent needs to know before it can start building. When you're building a full-stack app, Claude Code (CC) must understand the entire backend:

- what tables already exist
- what RLS policies are active
- what storage buckets are available
- which auth providers are configured
- what edge functions are deployed

Most backends don't hand over this info cleanly. For instance, with Supabase, asking for OAuth setup via MCP returns the entire auth docs, including sections on email/password, magic links, phone auth, SAML, and SSO. That's 5-10x more tokens than the agent actually needed, and it happens on every MCP call across every domain.

The agent then discovers the state through separate calls to list_tables, execute_sql, and list_extensions, each returning a partial view. Some info, like which auth providers are configured, isn't queryable through MCP at all. And when something breaks, Supabase returns the same error code whether the rejection came from the platform layer or from the function code. The agent has no way to tell the difference, so it cycles through code-level fixes for a problem that might not be in the code at all.

A better model has no magical way to skip these gaps. In fact, it tries even harder to fill them, which means more discovery queries, more reasoning, and more retries. That's why the token cost went up with a better Claude model.

A smarter approach is implemented in InsForge, an open-source backend (self-hostable via Docker) that offers the same primitives as Supabase but structures everything around the assumption that an agent is operating the backend, not a human on a dashboard.
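To make the discovery overhead concrete, here is a minimal Python sketch with mock data. The call names and backend state are illustrative, not the real Supabase MCP tools or InsForge CLI; the point is only the shape of the problem: several partial views versus one structured snapshot.

```python
# Mock of the backend state an agent must learn before it can build.
BACKEND = {
    "tables": ["users", "documents"],
    "rls_policies": ["documents_owner_only"],
    "buckets": ["uploads"],
    "auth_providers": ["email", "google"],  # not queryable in some MCP setups
}

def discover_piecemeal():
    """Human-centric backend: each round trip returns one partial view."""
    calls = 0
    state = {}
    for key in ("tables", "rls_policies", "buckets"):
        calls += 1                      # one discovery call per domain
        state[key] = BACKEND[key]
    # auth_providers is never returned: the agent must guess, then retry.
    return state, calls

def discover_topology():
    """Agent-centric backend: one call returns the whole topology."""
    return dict(BACKEND), 1

partial, n_partial = discover_piecemeal()
full, n_full = discover_topology()

assert n_partial > n_full
assert "auth_providers" not in partial      # the gap that triggers retries
assert "auth_providers" in full
```

A stronger model does not close the `auth_providers` gap; it just spends more tokens probing around it, which is the 54% overhead in miniature.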
Before writing any code, a single CLI call returns the full backend topology in ~500 tokens. The agent sees every table, auth provider, storage bucket, and available AI model in one structured response.

Instead of one broad skill that triggers on everything (as with Supabase), it has four narrowly scoped skills:

- Creating tables activates only the CLI skill.
- The debug skill activates only when code breaks.
- Building the frontend activates only the SDK skill.
- Wiring third-party auth activates only the integrations skill.

This keeps the agent's cognitive load lean, since it only loads what matches the current task. The CLI returns structured JSON with semantic exit codes on every operation, so the agent always knows whether something succeeded or failed, and why. There are no ambiguous 401s that could mean three different things.

I tested both backends on the same full-stack RAG app and recorded the full sessions.

Supabase:
- consumed 10.4M tokens
- needed 10 manual interventions

InsForge:
- consumed 3.7M tokens
- completed the entire build without any errors

This isn't a Supabase-specific problem. Most backends were designed for humans who can see dashboards and interpret raw errors. When an agent operates the backend instead, every missing piece of context needs a discovery call, and every ambiguous error enters a retry loop. Fixing this requires giving agents structured backend context before they start writing code. InsForge is an open-source implementation of exactly this, and you can self-host it via Docker.

GitHub repo (9k+ stars): https://github.com/InsForge/InsForge (don't forget to star it!)

You can find my walkthrough on building the full-stack RAG app with Supabase and InsForge in the article below.
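Why semantic exit codes matter can be shown in a few lines. The code values and JSON fields below are a hypothetical scheme, not InsForge's actual one; what matters is that the exit code alone tells the agent which layer failed, so it never has to guess between "fix my code" and "fix the platform config".

```python
import json

# Hypothetical semantic exit codes; InsForge's real scheme may differ.
EXIT_OK, EXIT_CODE_ERROR, EXIT_PLATFORM_ERROR = 0, 10, 20

def interpret(exit_code: int, payload: str) -> str:
    """Decide the agent's next action from a structured CLI result."""
    result = json.loads(payload)
    if exit_code == EXIT_OK:
        return "continue"
    if exit_code == EXIT_CODE_ERROR:
        return f"fix code: {result['error']}"
    if exit_code == EXIT_PLATFORM_ERROR:
        return f"fix config: {result['error']}"
    return "unknown: fall back to retry loop"   # the ambiguous-401 situation

assert interpret(0, '{"status": "ok"}') == "continue"
assert interpret(20, '{"error": "auth provider not enabled"}').startswith("fix config")
```

With an ambiguous error code, every branch above collapses into the last one, and the agent burns tokens cycling through code-level fixes for a platform-level problem.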

Cached at: 05/10/26, 04:24 AM



InsForge/InsForge

Source: https://github.com/InsForge/InsForge

InsForge

The all-in-one, open-source backend platform for agentic coding.


InsForge gives your coding agent database, auth, storage, compute, hosting, and AI gateway to ship full-stack apps end-to-end.

Demo video: https://github.com/user-attachments/assets/345efbc6-ca63-4189-bde0-12ef3bda561b

How it works

Coding agents interact with InsForge through one of two interfaces:

  • MCP Server (self-hosted and cloud): exposes InsForge’s operations as tools any MCP-compatible agent can call.
  • CLI + Skills (cloud only): a command-line interface paired with Skills that agents invoke directly from the terminal.

Both interfaces let coding agents operate the backend like backend engineers:

  • Read backend context and state: Pull documentation, schemas, metadata (deployed functions, bucket contents, auth config), and runtime logs, so the agent has what it needs to write code, verify what it built, and debug when something breaks.
  • Configure primitives: Deploy edge functions, run database migrations, create storage buckets, set up auth providers, and configure other backend resources directly.
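The read/configure split above implies a simple operate loop: read state, make a change, then verify the change against the same state view. Here is a minimal Python sketch over mock data; every function name is illustrative, not a real InsForge operation.

```python
# Mock of the operate loop: read backend state, change it, verify the change.
state = {"tables": [], "logs": []}

def read_context():
    """Stand-in for pulling schemas, metadata, and logs in one view."""
    return dict(state)

def run_migration(table: str):
    """Stand-in for a configure-primitives call (e.g. a DB migration)."""
    state["tables"].append(table)
    state["logs"].append(f"created table {table}")

def operate(table: str) -> bool:
    before = read_context()
    if table in before["tables"]:        # context read prevents redundant work
        return True
    run_migration(table)
    after = read_context()               # verify against the updated state
    return table in after["tables"]

assert operate("documents") is True
assert operate("documents") is True      # idempotent thanks to the context read
assert state["tables"].count("documents") == 1
```

The verify step is the key: because the agent can re-read state and logs after each change, it confirms what it built instead of assuming the call worked.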
graph TB
    AG[AI Coding Agents] --> SL[InsForge]

    SL --> AUTH[Authentication]
    SL --> DB[Database]
    SL --> ST[Storage]
    SL --> EF[Edge Functions]
    SL --> MG[Model Gateway]
    SL --> CP[Compute]
    SL --> DEP[Deployment]

Core Products:

  • Authentication: User management, authentication, and sessions
  • Database: Postgres relational database
  • Storage: S3-compatible file storage
  • Model Gateway: OpenAI-compatible API across multiple LLM providers
  • Edge Functions: Serverless code running on the edge
  • Compute (private preview): Long-running container services
  • Site Deployment: Site build and deployment

⭐️ Star the Repository


If you find InsForge useful or interesting, a GitHub Star ⭐️ would be greatly appreciated.

Quickstart

Cloud-hosted: insforge.dev


Self-hosted: Docker Compose

Prerequisites: Docker + Node.js

1. Setup

You can run InsForge locally using Docker Compose. This will start a local InsForge instance on your machine.


To run from source:

# Run with Docker
git clone https://github.com/insforge/insforge.git
cd insforge
cp .env.example .env
docker compose -f docker-compose.prod.yml up

2. Connect InsForge MCP

Open http://localhost:7130 and follow the steps to connect the InsForge MCP Server.

3. Verify installation

To verify the connection, send the following prompt to your agent:

I'm using InsForge as my backend platform, call InsForge MCP's fetch-docs tool to learn about InsForge instructions.

4. Running Multiple Projects

You can run multiple InsForge projects on the same host by using different ports and project names.

# Create a separate env file for each project
cp .env.example .env.project1
cp .env.example .env.project2

Edit .env.project2 with different ports:

POSTGRES_PORT=5442
POSTGREST_PORT=5440
APP_PORT=7230
AUTH_PORT=7231
DENO_PORT=7233

Start each project with a unique name:

docker compose -f docker-compose.prod.yml --env-file .env.project1 -p project1 up -d
docker compose -f docker-compose.prod.yml --env-file .env.project2 -p project2 up -d

Each project gets its own isolated database, storage, and configuration. Manage them with:

docker compose -f docker-compose.prod.yml --env-file .env.project1 -p project1 ps      # status
docker compose -f docker-compose.prod.yml --env-file .env.project1 -p project1 logs -f  # logs
docker compose -f docker-compose.prod.yml --env-file .env.project1 -p project1 down     # stop

One-click Deployment

In addition to running InsForge locally, you can also launch InsForge using a pre-configured setup. This allows you to get up and running quickly with InsForge without installing Docker on your local machine.

  • Deploy on Railway
  • Deploy on Zeabur
  • Deploy on Sealos

Contributing

Contributing: If you're interested in contributing, check our guide in CONTRIBUTING.md. We truly appreciate pull requests, and all types of help are welcome!

Support: If you need help, we're responsive on our Discord channel, and feel free to email us at [email protected] as well.

Documentation & Support

Documentation

Community

  • Discord - Join our vibrant community
  • Twitter - Follow for updates and tips


License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.



Badges

Show your project is built with InsForge.

Made with InsForge


Markdown:

[![Made with InsForge](https://insforge.dev/badge-made-with-insforge.svg)](https://insforge.dev)

HTML:

<a href="https://insforge.dev">
  <img
    width="168"
    height="30"
    src="https://insforge.dev/badge-made-with-insforge.svg"
    alt="Made with InsForge"
  />
</a>

Made with InsForge (dark)


Markdown:

[![Made with InsForge](https://insforge.dev/badge-made-with-insforge-dark.svg)](https://insforge.dev)

HTML:

<a href="https://insforge.dev">
  <img
    width="168"
    height="30"
    src="https://insforge.dev/badge-made-with-insforge-dark.svg"
    alt="Made with InsForge"
  />
</a>

Star us on GitHub to get notified about new releases!
