Tag
The article analyzes a 2026 paper by Bai et al. showing that subagents and context bloat cause token costs in long agent runs to be ~1000x higher than chat, and presents three practical fixes (PLAN.md, read budget, out-of-band notes) that reduce token usage by 70-90%.
Goldman Sachs predicts that AI agent token use will grow 24-fold by 2030. The forecast arrives amid cost concerns, as Uber and Microsoft rethink expensive agent usage, and it highlights a key challenge for the AI boom.
An analysis of AI coding agent costs reveals that agentic workflows can use up to 3,500x more tokens than a simple ChatGPT call, with most waste coming from redundant context loading. The article suggests tracking repeated file actions and using efficient models to cut costs.
Token costs are emerging as a key enterprise concern for AI adoption, with CIOs struggling to manage spending across different models and use cases. OpenAI announced Guaranteed Capacity to address long-term compute access.
Google's Antigravity 2.0 uses 96 AI agents to autonomously create a functional operating system in 12 hours for under $1K in token costs. The resulting operating system can run the game Doom.