Fable 5 just made cost-aware model routing mandatory for agent builders

Reddit r/AI_Agents Models

Summary

Anthropic released Fable 5, a powerful new model with high pricing, making cost-aware routing essential for agent builders due to token fan-out and high output costs.

Anthropic dropped Fable 5 today, their new Mythos-class model above Opus. Pricing is $10/M input and $50/M output, exactly double Opus 4.8. If you build agents, the rate card is not the part that should worry you. The part that should worry you is fan-out.

One user "question" to an agentic system is never one completion. It's a planning pass, a handful of sub-agent spawns, tool call loops, retries, and self-verification passes. Anthropic is explicitly marketing Fable 5 for multi-day autonomous sessions with sub-agent delegation. A single complex request can fan out into tens of millions of tokens, and at $50/M output that's a four-figure bill for what the end user experiences as asking one question.

I tested this firsthand on the consumer side. On the Max 20x plan I was burning roughly 2% of my usage window per minute during a heavy session. Same workloads on Opus 4.8 never came close to limits. The model thinks longer and writes more per turn, so the effective cost per task is well above the 2x the price sheet suggests.

What this changes for agent architecture:

The flat default-to-the-best-model approach is dead at this tier. You need a router in front: cheap model (Haiku/Sonnet class) for classification, extraction, and glue work, mid-tier for standard reasoning, Fable only for the steps that genuinely need frontier capability. Prompt caching matters more than ever (90% input discount). Token budgets and per-task cost ceilings need to be first-class citizens in your orchestration layer, not an afterthought. And you need observability on cost per task, not just cost per call, because the fan-out is where budgets die. Uber reportedly blew through their annual AI budget in four months, before this pricing tier even existed.

To be clear, the model is genuinely a step up and for hard long-horizon problems it probably pays for itself. But "which model" is now an economic decision your orchestrator makes per step, not a config value you set once.
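
To make the router, per-task ceiling, and cost-per-task tracking above concrete, here is a rough Python sketch, not a real implementation. Everything named in it (PRICES, ROUTES, route, call_cost, TaskBudget, the step-type labels) is hypothetical. The frontier prices and the 90% cache discount come from the numbers above, the mid tier is just half the frontier rate (since Fable 5 is double Opus 4.8), and the cheap-tier prices are made-up placeholders.

```python
from dataclasses import dataclass, field

# USD per million tokens as (input, output).
# "frontier" is the Fable 5 rate card quoted in the post; "mid" is half of it.
# "cheap" is a placeholder, NOT a real price.
PRICES = {
    "cheap": (1.0, 5.0),
    "mid": (5.0, 25.0),
    "frontier": (10.0, 50.0),
}
CACHED_INPUT_MULTIPLIER = 0.10  # 90% input discount on cached tokens, per the post

# Hypothetical static rules: step type -> model tier.
ROUTES = {
    "classify": "cheap",
    "extract": "cheap",
    "glue": "cheap",
    "reason": "mid",
    "plan_hard": "frontier",
}


def route(step_type):
    """Pick a tier for a step; unknown step types fall back to mid, not frontier."""
    return ROUTES.get(step_type, "mid")


def call_cost(tier, input_tokens, output_tokens, cached_input_tokens=0):
    """USD cost of one completion. cached_input_tokens is a subset of input_tokens."""
    in_price, out_price = PRICES[tier]
    fresh = input_tokens - cached_input_tokens
    billable_input = fresh + cached_input_tokens * CACHED_INPUT_MULTIPLIER
    return (billable_input * in_price + output_tokens * out_price) / 1e6


class BudgetExceeded(Exception):
    """Raised once a task's accumulated spend passes its ceiling."""


@dataclass
class TaskBudget:
    """Tracks spend across every call one user request fans out into."""
    ceiling_usd: float
    spent_usd: float = 0.0
    calls: list = field(default_factory=list)

    def record(self, step_type, input_tokens, output_tokens, cached_input_tokens=0):
        """Log a finished call, then stop the task if the ceiling is blown."""
        tier = route(step_type)
        cost = call_cost(tier, input_tokens, output_tokens, cached_input_tokens)
        self.spent_usd += cost
        self.calls.append((step_type, tier, cost))
        if self.spent_usd > self.ceiling_usd:
            raise BudgetExceeded(
                f"task spent ${self.spent_usd:.2f}, ceiling ${self.ceiling_usd:.2f}"
            )
        return cost
```

With these numbers, call_cost("frontier", 0, 20_000_000) comes out to $1,000, which is the four-figure bill from 20M output tokens alone. The orchestrator would create one TaskBudget per user request and call record after every completion, so sub-agents, retries, and verification passes all count against the same ceiling.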
How is everyone handling routing today? Static rules per task type, an LLM judge picking the model, or just eating the cost?