@AlchainHust: https://x.com/AlchainHust/status/2064676532212097418
Summary
This article provides a detailed review of Anthropic’s newly released Claude Fable 5 model and walks through how the author used it to build a Mac app, 翻箱 (FanBox), in one day. The author reports significant improvements in code generation and stability.
Cached at: 06/10/26, 07:56 PM
Claude Fable 5 Hands-On Deep Test! In 5 Hours, I Built a Mac App I’d Been Thinking About for a Long Time!
Woke up this morning, and Anthropic dropped another new model: Claude Fable 5.
Claude 5 is great, but why the hell introduce another name like Fable… Wasn’t the Opus, Sonnet, Haiku naming scheme annoying enough?
And why not just call it Mythos? Anthropic sure has that classic Silicon Valley big-company habit of throwing out random names.
But jokes aside, this model is indeed powerful. One sentence to sum it up: Fable 5 is basically Mythos with safety guardrails. Mythos was Anthropic’s model previously only available to governments and invited institutions, known only by name externally. Now they’ve added guardrails and released it publicly.
And the most important thing to know: From now until June 22, Fable 5 is directly included in your Claude subscription – free for Pro and Max users. Starting June 23, it will be removed from the subscription quota, and you’ll need to pay separately on a usage basis. If you want to use it for free, you only have this two-week window.
One picture is enough to understand its capabilities:
The SWE-Bench Pro test throws real bugs from open-source projects at the model and has it fix them on its own. A score of 80.3% means it can independently fix about eight out of ten real bugs, leaving second place 11 percentage points behind. The official announcement repeatedly emphasizes that the longer and more complex the task, the larger its lead. The most vivid official example is Stripe: migrating a 50-million-line Ruby codebase, which the team estimated would take two months, was finished by Fable 5 in one day.
The first half of this article explains the model clearly. The second half shows you a practical test you won’t see elsewhere: I used it to actually build an app I’d been thinking about for a long time, from starting to packaging the installer – all in one day.
After about 5 hours, my feeling is: Fable 5’s code stability is significantly better than before. As long as you clearly describe your requirements or the problem you want to solve, it can basically handle it in one go.
The Backstory of Fable
In April this year, Anthropic launched a project called Project Glasswing, making Mythos available to government network defense departments and critical infrastructure providers. Only invited organizations could use it; the outside world only knew its name.
The June 9 release actually launched two models at once: Claude Mythos 5 and Claude Fable 5. They are the same underlying model and differ only in the safety classifier. Think of the classifier as a security guard standing at the model’s gate, checking incoming questions and outgoing answers and stopping dangerous topics like bioweapons or cyberattacks outright. Mythos 5 has a lenient guard and remains available to invited partners only; Fable 5 has a strict guard and is publicly available. The official definition is straightforward: a Mythos-class model made safe for general use.
The names are complementary: Mythos is Greek for “myth,” and Fable comes from the Latin fabula, “story” – one Greek, one Latin, both about storytelling. The same story, two ways of telling it; the tightness of the guardrails is directly embedded in the product name.
I understand the design logic, but I still want to complain: Haiku, Sonnet, Opus – users finally memorized which is which, and now two new names, Fable and Mythos, pop up. The model selector is starting to look like a literature elective. It’s sentimental, but really unnecessary. Just call it Claude 5 – the sky won’t fall.
Let me list a few key specs: context window 1M tokens, single output up to 128K, thinking mode forced on and cannot be turned off. This “cannot be turned off” will be mentioned later – it directly relates to my user experience.
Is It Expensive? Let’s Talk About My Actual Usage Today
The API pricing is the most discussed topic: $10 per million input tokens and $50 per million output tokens, exactly double Opus 4.8. Currently, in Claude Code, it also consumes tokens twice as fast as Opus models.
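To make those per-million prices concrete, here is a rough cost sketch. The Fable 5 rates are the ones quoted above; the Opus 4.8 rates are simply derived from the “exactly double” remark, and the token counts are a made-up workload, not my real usage.

```typescript
// Per-million-token prices in USD.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Fable 5 rates as quoted in this article.
const FABLE_5: Pricing = { inputPerMillion: 10, outputPerMillion: 50 };
// Assumption: half of Fable 5, derived from "exactly double Opus 4.8".
const OPUS_4_8: Pricing = { inputPerMillion: 5, outputPerMillion: 25 };

// Estimate the API cost of a workload from its token counts.
function estimateCostUSD(pricing: Pricing, inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputPerMillion;
  const outputCost = (outputTokens / 1_000_000) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

// Hypothetical session: 2M input tokens and 0.5M output tokens.
const fableCost = estimateCostUSD(FABLE_5, 2_000_000, 500_000); // 20 + 25
const opusCost = estimateCostUSD(OPUS_4_8, 2_000_000, 500_000); // 10 + 12.5
console.log(`Fable 5: $${fableCost}, Opus 4.8: $${opusCost}`);
```

Output tokens dominate quickly at these rates, which is why long agent sessions are where the bill would add up.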
Let me share my own situation: I woke up at 9 AM today and have been working on 2-3 projects simultaneously, all with Fable 5, and haven’t hit any limits on the $200 Max plan yet.
But there’s a caveat: Anthropic reset usage when they released the model, essentially giving everyone a fresh start. I hope OpenAI fucking jumps into the competition soon so these resets come even more frequently.
As for after June 23, if it’s truly charged at API prices, that would be quite expensive. So my attitude is: use it to death for these two weeks, and deal with June 23 when it comes.
I Just So Happen to Have a Project I’ve Been Thinking About
Done with the model talk. For me, the criterion to judge a model is simple: Can it actually build what I want to build?
Here’s the thing: I never write code myself; all my products are written by AI. This workflow has a specific side effect: AI helps you start ten projects in an afternoon, but they’re scattered across folders with unidentifiable names, and you can’t see what the agent changed.
In my daily scenarios, my needs are quite basic: I want to easily open and view projects started by agents; I have many writing projects where drafts need repeated revisions, so I need a handy editor; for drawing and design tasks, I need to view the agent’s generated results one by one; when something the agent made goes wrong or doesn’t meet my expectations, I want to easily take screenshots and drag reference files to feed it.
None of these tasks involve coding, but every single one is stuck between the file system and the agent: Finder can see files but isn’t good for feeding the agent; the terminal can feed the agent but can’t see files. So my daily routine involves switching back and forth between Finder, Cursor, and browser windows, spending ages finding a file generated yesterday.
What I’ve always wanted is a true link between the file system and the agent: browse and preview local files on the left, a real terminal running the coding agent on the right – every time the agent modifies a file, the left side lights up instantly. A vibe coding cockpit.
I had previously built a web-based prototype: a local file browser page that could search and preview, and nothing more. Everything I really wanted got stuck after that: an embedded real terminal, file monitoring, an editor, packaging and signing – the workload of a full desktop app. It wasn’t impossible before, just too tedious. One requirement would take many rounds of back-and-forth, and eventually I put the project aside.
On the day Fable 5 was released, I decided to give it a try with this project.
One Day, From Idea to Installer
Here’s the timeline:
On the afternoon of June 9, I first made a basic version with Opus 4.8. An Electron desktop shell, embedded terminal, and three-way linkage between files, terminal, and preview. But some core experiences never quite worked.
On the morning of June 10, after getting access to Fable, I started a major overhaul: code editor, Markdown WYSIWYG, image annotation editing, and a full layout redesign. Then I packaged, signed, and generated a .dmg installer.
I didn’t skip any validation. My delivery standard for this project was a panel of 5 independent AI sub-agents, each playing one role: a heavy vibe coder, a native-design aesthete, a zero-documentation new user, a ten-year terminal veteran, and a destructive quality officer. Each scored the finished product, real screenshots, and code, and every score had to be ≥90 with no red lines to pass. The first round was rejected: aesthetics hit a red line, terminal robustness wasn’t enough, and there were data security gaps. After fixes and re-reviews, it finally passed in the fourth round.
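The pass rule itself is simple enough to write down. This is only a sketch of the rule as described above, not the actual review harness; the type names are hypothetical and the scores in the example are invented for illustration.

```typescript
// One sub-agent's verdict on the build.
interface Review {
  reviewer: string;
  score: number; // 0 to 100
  redLines: string[]; // blocking issues, empty if none
}

const PASS_THRESHOLD = 90;

// The build passes only if every reviewer scores >= 90 and nobody raises a red line.
function gateResult(reviews: Review[]): { passed: boolean; blockers: string[] } {
  const blockers: string[] = [];
  for (const r of reviews) {
    if (r.score < PASS_THRESHOLD) {
      blockers.push(`${r.reviewer}: score ${r.score} is below ${PASS_THRESHOLD}`);
    }
    for (const issue of r.redLines) {
      blockers.push(`${r.reviewer}: red line: ${issue}`);
    }
  }
  return { passed: reviews.length > 0 && blockers.length === 0, blockers };
}

// Invented scores, loosely shaped like the rejected first round.
const firstRound: Review[] = [
  { reviewer: "heavy vibe coder", score: 92, redLines: [] },
  { reviewer: "native-design aesthete", score: 85, redLines: ["aesthetics"] },
  { reviewer: "zero-documentation new user", score: 93, redLines: [] },
  { reviewer: "ten-year terminal veteran", score: 88, redLines: [] },
  { reviewer: "destructive quality officer", score: 91, redLines: ["data security gap"] },
];

const verdict = gateResult(firstRound);
console.log(verdict.passed, verdict.blockers);
```

Returning the list of blockers, rather than just a boolean, gives the next fix round a concrete to-do list.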
It’s not a demo or a prototype. It’s sitting in my Applications folder. As I write this article, it’s open.
In the terminal at the bottom right, you can see Fable 5’s launch notification.
FanBox: What It Looks Like
The app is called 翻箱 (FanBox). You can also read it as an agent box: a tool to better manage agents and the file system, bringing “find files → run agent → see what it changed” into one window.
The design goal is that every file “looks like itself” – you know what it is without opening it.
Here are a few capabilities I use most:
Live Dashboard. Every time an agent writes a file, that file’s card ripples on the spot and glows based on modification frequency. When multiple projects run agents in parallel, the light follows the agent’s progress – “watching AI work” finally feels real-time.
Session Replay. In the changes panel, there’s a play button. Drag the timeline like scrubbing a video to replay what files the agent changed step by step during that period. After the agent runs a long task for 30 minutes, just drag the timeline to see what it did.
Drag Files to Feed Agent. Drag files or folders from the file list into the terminal – the path is automatically inserted into the input line. Select a piece of text in the preview, click once to send it to the terminal as context for the agent. Conversely, file paths that appear in the terminal are clickable and open in FanBox.
⌘K to Find. Remember a fragment of the name and you can search for files and folders. The top right of folder cards automatically shows project type badges like node/web/py – you can instantly recognize ten projects started in an afternoon.
Quick In-Place Editing. Code and JSON use Monaco (same engine as VS Code), Markdown is Notion-style WYSIWYG, and images can be directly annotated with arrows, blur, etc. Edit wherever you see, no need to open another editor.
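For the curious, the Live Dashboard and Session Replay above can both be thought of as views over one log of file-change events. The sketch below is an illustration of that idea only; the names, the five-change saturation point, and the time window are all assumptions, not FanBox’s actual code.

```typescript
// One entry per file write observed by a file watcher.
interface ChangeEvent {
  path: string;
  timestampMs: number;
}

// Count changes per file inside the trailing window (nowMs - windowMs, nowMs].
function changeCounts(events: ChangeEvent[], nowMs: number, windowMs: number): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    if (e.timestampMs > nowMs - windowMs && e.timestampMs <= nowMs) {
      counts.set(e.path, (counts.get(e.path) ?? 0) + 1);
    }
  }
  return counts;
}

// Map a change count to a glow intensity in [0, 1], saturating at maxCount.
function glowIntensity(count: number, maxCount: number = 5): number {
  return Math.min(count, maxCount) / maxCount;
}

// Session replay: every change up to the scrub position, oldest first.
function replayUpTo(events: ChangeEvent[], scrubMs: number): ChangeEvent[] {
  return events
    .filter(e => e.timestampMs <= scrubMs)
    .sort((a, b) => a.timestampMs - b.timestampMs);
}

// Hypothetical log: the agent touches main.ts twice and README.md once.
const log: ChangeEvent[] = [
  { path: "src/main.ts", timestampMs: 1000 },
  { path: "README.md", timestampMs: 2000 },
  { path: "src/main.ts", timestampMs: 3000 },
];

const counts = changeCounts(log, 3000, 5000);
console.log(glowIntensity(counts.get("src/main.ts") ?? 0)); // brighter than README.md
console.log(replayUpTo(log, 2000).map(e => e.path));
```

Keeping a single append-only log means the glow and the timeline can never disagree about what the agent did.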
This article was written in FanBox – previewing drafts on the left, Claude Code running in the terminal on the right:
Here’s a screenshot of my desktop cluttered with screenshots and screen recordings – without the terminal agent window open, it’s not much different from Finder:
After an agent made two changes in a folder, the card lights up like this:
Oh, and it has three themes that switch the color scheme, fonts, icons, and code highlighting as a whole: a fluorescent green/charcoal black terminal style, a cream paper/terracotta orange archive style, and a black/white/red index style.
My own definition of this product’s boundary is: FanBox doesn’t compete with Finder for file management; it doesn’t do plugins or debugging. Heavy work continues to be handled by the IDE. It only makes the single chain of “find + preview + quick edit + command agent” smooth. Everything runs locally, zero external network requests, data doesn’t leave your machine.
That said, it’s still a very early version, mainly solving my own problems. The aesthetic and features are tailored to my own needs – not intended to please anyone. I expect it to remain a simple personal project for a long time.
Real Feelings About Fable 5 During Development
Back to the model itself. The first day of development used Opus 4.8 to build the base, and the next day used Fable 5 for a major overhaul. Using two generations of models back-to-back on the same project makes the difference concrete: it is not faster, but it makes fewer mistakes and solves the problem in one go.
A typical example: FanBox’s image thumbnail feature. The initial version would freeze for several seconds when clicking into a folder with many images. Performance issues like this used to be the most annoying: the model guesses a cause, makes a change, it gets better but not fixed, another guess, another change – three or four rounds and the code gets messier.
This time, I just described, “Directories with lots of images lag when clicked.” It identified two stacked root causes: thumbnails were loading the entire original image file, and each click was rebuilding the entire file grid. Then it fixed both in one go: added a thumbnail interface with caching, changed clicks to only toggle selection style without rebuilding. Click response dropped to under 0.1 seconds – imperceptible. One round, no rework.
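I don’t have the exact code to paste here, so the following is a minimal sketch of the two ideas in that fix: cache small thumbnails instead of reloading originals, and on click update only the cards whose selection state changed. The class and function names are hypothetical, and the LRU eviction policy is my own assumption rather than something Fable 5 necessarily chose.

```typescript
// Small LRU cache. A Map keeps insertion order, so its first key is the least recently used.
class ThumbnailCache<T> {
  private entries = new Map<string, T>();
  private capacity: number;
  private makeThumbnail: (path: string) => T;

  constructor(capacity: number, makeThumbnail: (path: string) => T) {
    this.capacity = capacity;
    this.makeThumbnail = makeThumbnail;
  }

  get(path: string): T {
    const cached = this.entries.get(path);
    if (cached !== undefined) {
      // Refresh recency by moving the entry to the end.
      this.entries.delete(path);
      this.entries.set(path, cached);
      return cached;
    }
    const thumb = this.makeThumbnail(path); // the expensive step, done once per file
    this.entries.set(path, thumb);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
    return thumb;
  }
}

// On click, return only the cards whose style must change instead of rebuilding the grid.
function cardsToRestyle(previous: string | null, clicked: string): string[] {
  if (previous === clicked) return [];
  return previous === null ? [clicked] : [previous, clicked];
}

// Stand-in loader that just counts how often the expensive path runs.
let loads = 0;
const cache = new ThumbnailCache<string>(2, path => {
  loads += 1;
  return `thumb:${path}`;
});
cache.get("a.png");
cache.get("a.png"); // served from cache
cache.get("b.png");
console.log(loads, cardsToRestyle("a.png", "b.png"));
```

With this shape, a click touches at most two cards no matter how many images the folder holds.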
The garbled Chinese directory names in the terminal were a similar case. This involves xterm.js’s wide-character handling, which is quite obscure. It pointed straight to the unicode11 addon and warned that it relies on an experimental API that must be explicitly enabled. It hit niche problems this accurately many times during the day.
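Why wide characters are tricky at all: a terminal lays text out on a grid, and CJK characters occupy two cells each, so the renderer needs an accurate width table. The sketch below illustrates that width calculation with a deliberately tiny, simplified set of ranges. It is not xterm.js’s implementation and not the fix itself; real width tables follow the Unicode data files.

```typescript
// Simplified assumption: treat a few common East Asian ranges as two cells wide.
const WIDE_RANGES: Array<[number, number]> = [
  [0x1100, 0x115f], // Hangul Jamo initial consonants
  [0x3040, 0x30ff], // Hiragana and Katakana
  [0x4e00, 0x9fff], // CJK Unified Ideographs
  [0xac00, 0xd7a3], // Hangul syllables
  [0xff01, 0xff60], // Fullwidth forms
];

// Number of terminal cells a string occupies under the simplified table above.
function cellWidth(text: string): number {
  let width = 0;
  for (const ch of text) {
    const codePoint = ch.codePointAt(0) as number;
    const isWide = WIDE_RANGES.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
    width += isWide ? 2 : 1;
  }
  return width;
}

console.log(cellWidth("翻箱")); // 4 cells for 2 characters
console.log(cellWidth("docs/翻箱")); // 9 cells for 7 characters
```

If the terminal and the shell disagree on these widths, the cursor position drifts and the line redraws over itself, which is the kind of garbling a directory name like 翻箱 can trigger.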
The packaging phase was even more striking. Electron packaging in China’s network environment is a series of pitfalls: blocked binary downloads, native module compilation failures, new Node versions breaking build tools. Previously, environmental issues like this could waste a whole evening. This time, it switched to mirror sources, changed the compilation approach, and adjusted the packaging configuration one step at a time, working around the entire chain of problems. I just watched.
Why is this? I suspect the answer lies in that “cannot be turned off” mentioned earlier. Fable 5’s thinking mode is forced on, and individual responses clearly take longer than Opus. Slowness and accuracy are likely two sides of the same coin: think longer before acting, lower error rate, lower error rate means no rework. What it saves isn’t typing time – it’s rework.
To be honest: I didn’t run controlled experiments feeding the same problem to old models. These are subjective comparisons, not same-problem benchmarks. But the difference between “five rounds of back-and-forth before, one round now” is large enough that no instrument is needed.
Later I saw Boris Cherny, the author of Claude Code, evaluate Fable 5: “The biggest step since Opus 4.5.” He also emphasized judgment and debugging ability. This completely matches my own feeling.
Finally
FanBox’s installer is ready and has been open-sourced. You can also take my open-source code and modify it to create your own agent box 👇
https://github.com/alchaincyf/fanbox
As for Fable 5, my advice is simple: if you’re a Pro or Max subscriber, before June 23, switch the model to Fable 5 in Claude Code and feed it a project you’ve always wanted to do but thought was “too much trouble.” You have the free quota – I used it to build FanBox.
If you haven’t tried Claude Code yet, you can start with my Orange Book “Claude Code: From Beginner to Expert,” available on WeChat Reading.
The cockpit I’d been thinking about for a long time – from getting Fable 5 to completing the macOS app packaging – took 5 hours.
That project you’ve had on hold for a while might just be one weekend away.