@DeRonin_: i ran Fable 5 the whole day and still haven't touched my limits why? i stopped paying surgeon rates for small talk here…
Summary
A user shares a detailed workflow strategy for efficiently using multiple AI models (Fable, Opus, Codex, DeepSeek, GLM, Qwen, Kimi) by delegating tasks based on cost and capability, using a single CLAUDE.md routing table, and avoiding small talk to reduce token usage.
Cached at: 07/02/26, 08:27 PM
i ran Fable 5 the whole day and still haven’t touched my limits
why?
i stopped paying surgeon rates for small talk
here’s how i actually run it:
- i don’t small-talk it
every “thanks!” makes it re-read the whole conversation at its price point
it’s an architect, not a roommate. with Fable on top, Opus 4.8 is effectively the new Haiku, so i push everyday tasks (standups, cleanup, small edits) to Opus and save Fable for the real problem
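a rough sketch of why the small talk adds up. it assumes a stateless chat API that re-sends the full history on every turn, with no prompt caching. the function name and the token counts are made up for illustration, and model replies are ignored to keep the arithmetic simple:

```python
def input_tokens_billed(turn_tokens):
    """Total input tokens processed when every turn re-sends the full history.

    turn_tokens: token counts, one per message added to the chat.
    Assumption: no prompt caching, so the whole history is read again each turn.
    """
    total = 0
    history = 0
    for tokens in turn_tokens:
        history += tokens   # the new message joins the conversation
        total += history    # the model reads the whole conversation again
    return total


# Hypothetical numbers: a 50,000-token working session, then a 3-token "thanks!"
before = input_tokens_billed([50_000])
after = input_tokens_billed([50_000, 3])
print(after - before)  # input tokens read just to process the "thanks!"
```

under those assumptions the 3-token message costs 50,003 input tokens, because the whole session rides along with it. caching would lower the real number, but the shape of the cost is the same.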
- i route everything from one CLAUDE.md
this is the big one. i keep a single routing table in my CLAUDE.md and let Fable act as the orchestrator that reads it and dispatches
rough shape of the file:
- Fable → planning, architecture, reviewing every stage
- Opus / Codex → implementation labor
- DeepSeek + GLM + Qwen (dirt cheap) → bulk grunt work: boilerplate, test writing, data cleaning, translations, first-draft docs
- Kimi / long-context models → reading huge files so Fable never spends its tokens on it
Fable never touches the cheap work directly. it plans, delegates each task to the right tier, then checks the result against the plan. the expensive brain only spends tokens deciding
that one file is why my bill went DOWN while my output went up
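the real routing table lives in CLAUDE.md as plain text, and the post doesn't show the file itself. as a minimal sketch of the same logic, here it is as a lookup table. the task kinds, tier labels, and the fallback rule are all hypothetical names chosen for this example:

```python
# Hypothetical tiers mirroring the routing table described above.
ROUTES = {
    "planning": "fable",
    "architecture": "fable",
    "review": "fable",
    "implementation": "opus-or-codex",
    "boilerplate": "cheap-bulk",       # DeepSeek / GLM / Qwen tier
    "tests": "cheap-bulk",
    "data-cleaning": "cheap-bulk",
    "translation": "cheap-bulk",
    "draft-docs": "cheap-bulk",
    "read-huge-file": "long-context",  # Kimi / long-context tier
}


def route(task_kind):
    """Return the tier for a task kind.

    Assumption: unknown kinds fall back to the orchestrator so it can decide.
    """
    return ROUTES.get(task_kind, "fable")


def dispatch(plan):
    """Group (task_kind, description) pairs by the tier that should handle them."""
    by_tier = {}
    for kind, description in plan:
        by_tier.setdefault(route(kind), []).append(description)
    return by_tier


plan = [
    ("architecture", "design the new storage layer"),
    ("implementation", "write the storage adapter"),
    ("tests", "unit tests for the adapter"),
    ("read-huge-file", "summarize the legacy schema dump"),
]
print(dispatch(plan))
```

the point of keeping it as one table is that the orchestrator only makes the routing decision. the volume work lands on whichever tier is cheapest for that kind of task.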
- one big brief, not twenty tiny prompts
it holds hours of context in its head, so i hand it the whole messy thing at once: full context, the constraints, what i’m actually scared of
i gave it a refactor i’d been dreading for weeks in a single brief and it came back done. drip-feeding it line-by-line wastes the one thing it’s best at
- i frame requests defensively to dodge the classifier
the new one is jumpy and government-mandated. harmless prompts (especially security or bio stuff) get silently downgraded to Opus
learned this the hard way. now i phrase it defensively: “review this for compliance” instead of “find vulnerabilities”
and if it trips anyway, i don’t argue with it. fresh chat, rephrase neutrally, move on
- i never ask it to explain its reasoning
that one request can trip the same filter, and your work quietly gets handled by a weaker model while you think you’re still on Fable
- i give it a finish line it can’t fake
instead of “make it work” i write “run the tests, paste the output, or stop after 25 turns”
i skipped the brake once and watched it burn through my afternoon. someone else got billed $960 on a single prompt. the pasted-proof line also kills fake “done” reports
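the same finish line can be sketched as a loop with a hard turn cap and a check that only passes on real test output. this is a minimal illustration, not how any particular agent tool implements it. `step`, `check`, and `run_until_proven` are hypothetical names, and `run_tests` just wraps the standard `subprocess.run`:

```python
import subprocess


def run_tests(cmd):
    """Run the test command and return (passed, combined stdout + stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def run_until_proven(step, check, max_turns=25):
    """Repeat step() until check() passes, or stop after max_turns.

    step:  hypothetical callable that performs one agent turn.
    check: callable returning (passed, output), e.g. lambda: run_tests([...]).
    Returns (passed, turns_used, last_output) so the proof is always attached.
    """
    output = ""
    for turn in range(1, max_turns + 1):
        step()
        passed, output = check()
        if passed:
            return True, turn, output  # done means the tests actually passed
    return False, max_turns, output    # the brake: give up instead of burning on
```

usage would look something like `run_until_proven(agent_turn, lambda: run_tests(["pytest", "-q"]))`, where `agent_turn` is whatever performs one turn in your setup. the returned output is the pasted proof, and the cap is what stops a runaway session.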
- i save it for the hardest thing on my desk
if you use it like a cheaper model, it performs like one and just costs more
the gap only shows up on problems hard enough to reveal it. so i bring it the thing i’ve been postponing for 3 weeks
use the specialist like a specialist and your limit lasts all week
gl