How I stopped context window bloat in continuous Anthropic agent loops (Opus + Sonnet architecture)

Reddit r/AI_Agents Tools

Summary

A developer shares an architectural pattern to manage context window bloat in continuous Anthropic agent loops, using KV caching, dynamic tool schema loading, and decoupling executor/advisor roles with Claude 3.5 Sonnet and Claude 3 Opus.

I’ve been spending a lot of time deploying multi-agent architectures, and one of the biggest bottlenecks in running continuous agentic loops is hitting context limits and the resulting API latency spikes. I wanted to share an architectural pattern that has been working well for me to manage memory and compute using Claude 3 Opus and 3.5 Sonnet. Here are the three main components of the setup: * **KV Prompt Caching for Latency:** Instead of sending the full system prompt on every turn, I'm utilizing KV caching to isolate latency. The core instructions and static context stay cached, which significantly speeds up the loop iteration. * **Defer Loading Tool Schemas:** Stuffing the initial context with every possible tool schema is what usually causes bloat. I shifted to dynamically loading tool schemas only when the agent's initial routing dictates they might be needed. * **The "Advisor Strategy" (Decoupling roles):** To balance cost and reasoning, I decoupled the execution and advisory layers. I use Claude 3.5 Sonnet as the high-speed "Executor" for standard routing and tool calling. When the logic gets too complex or an error needs debugging, the context (after going through a memory compaction/summarization step) is routed to Opus, which acts purely as the "Advisor" before handing control back to Sonnet. I'd love to hear how you all are handling memory compaction and long-running transcripts in your own agent loops. Are you doing summarize-and-replace, or something else?
Original Article

Similar Articles

Effective harnesses for long-running agents

Anthropic Engineering

Anthropic introduces a two-part solution using an initializer agent and a coding agent to enable the Claude Agent SDK to effectively handle long-running tasks across multiple context windows by maintaining a clean, incremental state.