Big Idea

The argument in one line.

The unlock with Fable 5 is not switching to it but routing each stage of your build to the cheapest model that still clears the bar, because effort tier and model choice are two separate dials most users never touch.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You use Claude Code and are on a Fable 5 subscription, or are planning to upgrade.
You have burned through weekly credits faster than expected and want a systematic fix.
You run multi-agent or dynamic workflows and need a model-routing strategy across planning, execution, and verification stages.
You want a concrete cheat sheet for when to use Fable max vs medium vs Sonnet vs Opus 4.8.

SKIP IF…

You want benchmark comparisons against GPT, Gemini, or other providers -- the video explicitly skips benchmarks.
You do not use Claude Code or the Anthropic Claude platform.

TL;DR

The full version, fast.

Fable 5 is the most capable model Anthropic has shipped, but treating it as a daily driver will exhaust your credits in days. The central insight: Anthropic already auto-downgrades Fable to Opus 4.8 for sensitive requests, routing by risk. You apply the same mechanic, routing by cost. Use Fable at high or max only where decisions compound -- planning and final verification. Run all execution volume through Opus 4.8, Sonnet, or local models. Switch mid-session via /model so the spec stays in the thread. At medium effort Fable already beats Opus 4.8 at max, so you rarely need the top tier for anything but the plan.

Chapters

Where the time goes.

00:00 – 01:12

01 · With great power comes token burn

Hook framing: Fable 5 is real, but so is the cost. The premise is set without benchmarks.

01:12 – 02:10

02 · The real trap -- June 22 meter and addiction

Anthropic announcement on screen: Fable 5 stays in the flat-rate subscription through June 22 and becomes metered from June 23. The trap is forming a dependency before the meter hits.

02:10 – 04:09

03 · System prompt x-ray: Fable 5 vs Opus 4.8

Leaked system prompt compared against Opus 4.8. 80 percent identical. Five new rules around self-harm and life sciences. Filler-word instructions removed -- suggesting retraining.

04:09 – 04:54

04 · Anthropic routes down for safety, you route down for cost

Fable auto-downgrades to Opus 4.8 on cybersecurity and health requests. Steal that mechanic: route bulk work to cheaper models voluntarily.

04:54 – 06:11

05 · When to use each tier of effort (the Goldilocks zone)

Core framework. Fable medium beats Opus 4.8 max. Effort tier is a separate dial from model choice. Planning and shipping = high/max. Execution = medium or lower, Sonnet, or Opus.

06:11 – 07:53

06 · The tactical loop: plan expensive, /model, execute cheap

Concrete mechanic: run Fable on max for planning, create a spec file, type /model to switch to Sonnet or Opus for execution, then re-invoke Fable to probe edge cases.

07:53 – 10:41

07 · Three real recipes: marketing site, 3D website, CRM

Three worked examples. Marketing site: Fable high plan, Opus medium build, Fable low verify. 3D website: Fable x-high plan, Opus plus Sonnet agents build, Fable high verify. CRM: Fable max plan, x-high dynamic workflows, high-tier model verify.

10:41 – 11:07

08 · Don't be tribal: mixing models and providers

Brief section against model loyalty. Fable plus Codex via OpenAI plugin extension. Use whatever clears the task at lowest cost.

11:07 – 12:21

09 · The one workflow to walk away with

Consolidated: Fable max/high planning, orchestrate execution with agents using skills/MCPs/CLIs, Fable verification, ship, iterate. The paradigm persists as model names change.

12:21 – 12:43

10 · Fable vs Opus, tier by tier

Comparison: Fable low ties Opus 4.8 max. Fable medium beats it. Fable max crushes it.

12:43 – 14:13

11 · How to use Fable 5 responsibly

Wrap. Benchmarks can be manufactured; your results are what matter. Fable short-circuits on cybersecurity. Opus is still more reliable day-to-day. CTA for free cheat sheet and community.

Atomic Insights

Lines worth screenshotting.

Fable 5 at medium effort already beats Opus 4.8 at maximum effort -- throttling down costs less and still wins.
Anthropic auto-downgrades Fable to Opus for cybersecurity and health requests; you can apply the same routing logic to every bulk task you run.
80 percent of the Fable 5 system prompt is identical to Opus 4.8 -- new rules are almost entirely around self-harm and life sciences safety categories.
Filler-word instructions were removed from the Fable 5 system prompt, suggesting those habits were retrained into the base model rather than injected at runtime.
Most Claude Code users never touch the effort dial -- they pick a model and leave effort at default, leaving easy token savings on the table.
Using /model mid-conversation lets you shift from Fable to Sonnet without losing context -- the spec lives in the thread, not in the model.
Fable still short-circuits on cybersecurity-adjacent requests even when benign and above board, making Opus the safer daily driver for now.
Sub-agents in a dynamic workflow should almost never run on Fable; the orchestrator sets the plan, cheap models do the volume.
The verification stage scales with risk: a marketing site gets Fable low, a 3D website gets Fable high, a CRM gets a high-tier model for auth and integration checks.
Planning is where decisions compound -- it is the only stage that earns max-effort spend. Execution volume does not.
Benchmarks can be manufactured to grab headlines; the only signal that matters is whether the output meets your specific task bar.
You can combine Fable with Codex via the OpenAI plugin extension -- model loyalty is a budget liability, not a virtue.

Takeaway

Model choice and effort tier are two different decisions.

WHAT TO LEARN

Picking the smartest model is only half the decision -- the effort dial is equally powerful and almost universally ignored.

01 · With great power comes token burn

Power and cost scale together -- Fable 5 is genuinely better, but every token costs more, so defaulting to the best model for everything is unsustainable.

02 · The real trap -- June 22 meter and addiction

Anthropic is moving Fable 5 to metered billing after a flat-rate trial window -- forming a dependency before the meter starts is the trap to avoid.

03 · System prompt x-ray: Fable 5 vs Opus 4.8

80 percent of the Fable 5 system prompt is identical to Opus 4.8, with additions concentrated in self-harm and life sciences safety rules.
Filler-word instructions were removed from the prompt, suggesting those tendencies were retrained into the model rather than patched at the prompt layer.

04 · Anthropic routes down for safety, you route down for cost

Anthropic already auto-downgrades Fable to Opus for risky asks -- routing by task type, not by habit, is the transferable lesson.

05 · When to use each tier of effort (the Goldilocks zone)

Fable 5 at medium effort already beats Opus 4.8 at maximum effort -- throttling down costs less and still wins.
Effort tier is a separate dial from model choice; planning and shipping earn high/max, execution volume earns medium or lower.

06 · The tactical loop: plan expensive, /model, execute cheap

The /model command lets you downgrade mid-session without losing context -- use it after the spec is locked to shift execution to a cheaper model.
After execution, re-invoke Fable to probe edge cases rather than trusting the cheaper model to self-certify.
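
The loop above can be sketched as a toy model of a session. This is only an illustration of the idea that the thread outlives the model switch: the `Session` class, its methods, and the model labels such as "fable-5" are hypothetical names, not the Claude Code API.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy stand-in for a chat session; hypothetical, not the Claude Code API."""
    model: str
    effort: str
    thread: list[str] = field(default_factory=list)

    def say(self, message: str) -> None:
        # Each message is tagged with whichever model/effort was active.
        self.thread.append(f"[{self.model}/{self.effort}] {message}")

    def switch(self, model: str, effort: str) -> None:
        # Mirrors the idea behind /model: the dials change, the thread stays.
        self.model, self.effort = model, effort


session = Session(model="fable-5", effort="max")   # plan expensive
session.say("Draft the spec and save it as a plan file.")

session.switch("sonnet", "medium")                 # execute cheap
session.say("Implement the plan step by step.")

session.switch("fable-5", "high")                  # verify smart
session.say("Which edge cases were not tested?")

for line in session.thread:
    print(line)
```

All three messages end up in the same thread, so the cheaper model still sees the spec that the expensive model wrote.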

07 · Three real recipes: marketing site, 3D website, CRM

Complexity of the plan determines how high you go on the first Fable call; sub-agents always use cheaper models regardless of project complexity.
Verification scales with integration risk -- the more moving parts that can silently fail, the higher the model tier needed to spot them.

08 · Don't be tribal: mixing models and providers

Provider loyalty is a cost liability -- Fable and Codex can run side by side; use whichever clears the specific task at the lowest tier.

09 · The one workflow to walk away with

The paradigm of plan-expensive / execute-cheap / verify-smart will persist as model names change -- it is the routing logic that matters, not the specific models.

10 · Fable vs Opus, tier by tier

Fable low ties Opus max, Fable medium beats it, Fable max crushes it -- even a heavily throttled Fable run outperforms a fully maxed Opus run.
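
One way to read this comparison is as an ordinal ranking. The sketch below encodes only the three relationships stated in the video (ties, beats, crushes); the rank numbers are placeholders for ordering, not benchmark scores, and the model labels are hypothetical. The high and x-high tiers are left out because the comparison does not place them.

```python
# Ordinal ranks taken only from the comparison above.
# They are placeholders for ordering, not measured scores.
RANK = {
    ("opus-4.8", "max"): 1,
    ("fable-5", "low"): 1,      # "ties" Opus 4.8 max
    ("fable-5", "medium"): 2,   # "beats" it
    ("fable-5", "max"): 3,      # "crushes" it
}

FABLE_EFFORTS = ["low", "medium", "max"]  # cheapest first


def lowest_fable_effort(baseline, strictly_better=True):
    """Return the cheapest Fable effort that beats (or at least matches) the baseline."""
    target = RANK[baseline]
    for effort in FABLE_EFFORTS:
        rank = RANK[("fable-5", effort)]
        if rank > target or (not strictly_better and rank == target):
            return effort
    return None


print(lowest_fable_effort(("opus-4.8", "max")))                         # medium
print(lowest_fable_effort(("opus-4.8", "max"), strictly_better=False))  # low
```

The point of the lookup is the same as the chapter's: start from the bar you need to clear and walk up from the cheapest tier, rather than starting at max.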

11 · How to use Fable 5 responsibly

Benchmarks can be manufactured for headlines; only your specific task results matter.
Fable still short-circuits on cybersecurity-adjacent requests even when benign, making Opus the safer default for daily coding.

Glossary

Terms worth knowing.

Effort tier: A setting within Claude Code (low / medium / high / x-high / max) that controls how much compute the model spends reasoning before answering. Distinct from the model choice itself.
Tokenomics: The economics of token consumption -- how cost scales with input/output length, model tier, and frequency of use.
Dynamic workflow: A Claude Code pattern where an orchestrator model spins up a variable number of sub-agents at runtime based on the task.
Ultracode: The highest-tier agentic execution mode in Claude Code, combining dynamic workflows with tool-use orchestration.
Metered access: Usage-based billing where each token consumed is charged separately, as opposed to a flat-rate subscription with a usage cap.
/model command: A slash command in Claude Code that lets you change the active model mid-conversation without starting a new session, preserving all prior context.
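
To make the dynamic workflow entry concrete, here is a minimal sketch of an orchestrator fanning tasks out to sub-agents at runtime. The function name, the round-robin assignment, and the model labels are assumptions made for illustration; the video only says that the orchestrator can be Opus 4.8 and that sub-agents should run on cheaper models such as Sonnet and Opus.

```python
from itertools import cycle

ORCHESTRATOR = "opus-4.8"                 # the video suggests Opus 4.8 here
WORKER_MODELS = ["sonnet", "opus-4.8"]    # sub-agents stay off Fable


def fan_out(tasks):
    """Assign each task to a cheaper worker model, round-robin (an assumed policy)."""
    models = cycle(WORKER_MODELS)
    return [{"task": task, "model": next(models)} for task in tasks]


# The number of sub-agents is decided at runtime by the size of the task list.
assignments = fan_out(["hero section", "pricing page", "contact form"])
for item in assignments:
    print(item["model"], "->", item["task"])

# No sub-agent should ever land on the most expensive model.
assert all(item["model"] != "fable-5" for item in assignments)
```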

Resources

Things they pointed at.

01:30 · link · Anthropic Fable 5 subscription announcement ↗

02:10 · link · Leaked Fable 5 system prompt (GitHub repo)

07:45 · product · Claude Code Living Course (Early Adopters community) ↗

13:40 · product · Fable 5 effort-tier cheat sheet (free) ↗

Quotables

Lines you could clip.

00:51

“The average person will run out of credits by the time they say good morning to Fable.”

Punchy, quotable, lands the cost problem in one sentence → TikTok hook

13:17

“Benchmarks don't matter. They can be doctored. They can be manufactured. They grab the headlines, but the only thing that matters are your results.”

Contrarian take that lands hard, zero setup needed → TikTok hook

06:48

“Fable five on low is still a very competent model.”

Counter-intuitive claim -- short, specific, shareable → newsletter pull-quote

The Script

Word for word.

So if you're planning on using the brand new Fable five model, you have to remember that with great power comes great responsibility and an even greater amount of token burn. So this video isn't designed to walk you through a series of benchmarks because I'm sure you've seen enough of those already and you're bored of them.

It is designed, however, to show you when and where it makes sense to take advantage of such a brand new powerful level of intelligence. Because this model not only marks a shift into a brand new family of models for Anthropic, but also gives you a glimpse into what the new world of tokenomics might look like.

And now that we're in this territory, it's not as easy anymore as just switching to the brand new latest model, going on max mode, and calling it a day.

It's about learning when and where it makes sense to use different tiers of such a potent, but eye wateringly expensive model so you can actually use it in your day to day workflows. Because the average person will run out of credits by the time they say good morning to Fable, and you don't want to be in that position unnecessarily.

So if you watch this video till the very end, I promise you you'll get a very nuanced take on where to use different tiers of effort for things like Fable and Opus, and how to use them all in unison. So if you want to be ahead of the pack, then let's dive in.

Now, what I'm about to show you is the real trap of going all in with this model and just spending all your time building with it, assuming you even have enough credits to do so. So you have up until June 22 to push this model using your existing subscription before it becomes metered, meaning you have to pay for additional access via API. So if you start getting addicted to this level of intelligence and using it for everything from building specs to building apps to answering a question on what you should have for lunch today, you're not gonna be having a good time.

And knowing that Anthropic is planning on IPO-ing this year, I pay very close attention to sentences like this where it says, if capacity allows. Will they bring the Fable model back into the subscription that you'll be paying for every month? Because for now, a flat rate combined with a resetting of limits leads you down a path where you get so used to this new standard of output that you don't want to give it up.

So the goal is to supercharge your productivity, not to get you addicted to a brand new model. Now within a few hours of the model releasing, a team allegedly extracted the entire system prompt from Fable five.

So what I did is I spun up a brand new session using Fable and I took this system prompt that apparently comes from the model and compared it to the 4.8 Opus version to look for where they differ and where they have very similar qualities. Because theoretically, if you were to reach this AGI level model, you would need less of a system prompt not more because a lot of those instructions should be baked in or synthesized or learned during the training process.

When you take a look at the Fable five system prompt it's still the entire length of Harry Potter or The Hobbit. So this tells me that it is better, but it still needs a lot of hand holding. And when you compare it to Opus 4.8, around 80% of the system prompt is still pretty much the same.

So despite it being apparently way smarter than Opus 4.8, again 80% of the documents are pretty much the same. The five new rules that are in Fable are really around life sciences, self harm, and preventing that.

And interestingly, it has more explicit instructions to prevent self harm and find different nuanced ways that people are trying to bypass those specific restrictions.

So assuming that the system prompt is legit, these are examples of explicit additions to the Fable system prompt. And all of them, like I said, revolve around things like self harm, look alikes, naming a diagnosis, crisis management, disorders, etcetera, and the things that are removed are things around using filler words like genuinely, honestly, actually.

So when I see those being removed that tells me maybe the model has now been trained to smooth out that kink or that habit from the core model itself. So when it comes to raw competency leaps for things like coding, that's where these improvements happen within the model itself.

But I noticed that for things like behavior, those are things that are harder to adjust over time. And based on reading the model card, this model seems to be identical to Mythos except it has a very hard line on safeguards, where every single request that passes through this specific API gets checked: is this a cybersecurity question, a life sciences question, a health question? And if so, it automatically downgrades your model from Fable to Opus 4.8.

So if it already does that on its own, what can we learn from this exact behavior to voluntarily apply it to the way we work? In this case, if you have bulk work, you don't want to spend your tokens time and money on Fable. You could use Fable to plan the approach, but actually execute all of those different tasks with cheaper models with Opus 4.8, with Sonnet, with different tiers of effort, which again, most people don't play with.

They just default to choosing the model, but not choosing the combination of the model with the level of effort necessary. Which leads us to the crux of the video, which is when and where do we use different tiers of effort and where is the Goldilocks zone depending on the specific task you're trying to execute. Now the average person will upgrade their Claude Code and then use Fable on high mode by default and then one shot a website or two they don't actually need to see if it works, and then they'll lose all their credits for the week if not the month.

But you, however, could use something like Medium because Medium actually beats Opus 4.8 on its max mode. So understanding the level of power difference will let you spend way less tokens to accomplish a much better result.

If you needed some heuristics or a cheat sheet on when to use different tiers of effort, if you're planning or you're in the final stages of shipping and deploying, it makes sense to use something like high or max mode on something like Fable.

But when it comes to things like execution, you can not only use lower tiers of Fable, but you can offload those to Sonnet, to Opus, and even local models.

Because based on where Agentic Engineering seems to be headed, we will have incredibly smart models in the future where you might use it for one or two prompts, finish your limit for the day or the week, and then actually do the core execution with lower level models using that provider or using your core local computer.

So if you're gonna have to adopt that habit then, why not build it now? And to reiterate this tactically, you would start something like Fable on max effort or high effort, whatever you could get away with at the lowest level, and then create a spec file or a plan file.

And based on going back and forth to create a plan, sometimes I spend an hour, two hours purely just planning. Once I have that plan, you can then offload this to a lower model. All you'd have to do tactically in a session is go to your models, do slash model in the middle of the conversation.

Then instead of Fable, either you can go left or right to lower it to something like medium effort or low effort and you'd be surprised. Fable on low is still a very competent model.

Or you can change it to something like Sonnet on high effort or medium effort or you can go to the default Opus. So depending on your workflow, it makes a lot more sense nowadays to switch models in the middle of a conversation.

And if you want to close the loop using something like Fable, then after these models have run and executed this plan, then you can push them. You can probe them and ask them, have you tested these different fringe cases? Have you looked for these tiers of edge cases?

And then you can run Fable to look through all of the executed code or structure or document or whatever it is that you produced to look over it with this high tier model. And by the way, if you like the way that I break down these concepts and focus on the signal rather than all the noise and hype, then you'll love everything we have in the early adopters community.

I update our Claude Code Living Course every single week, and we have tons of resources and coaches to help you with things like cybersecurity, monetizing AI, and practically teaching AI to businesses. So if that sounds interesting, check out the first thing down below and maybe I'll see you inside.

Alright. Back to the video. And to make this even more tactical, let's go through three different examples.

So let's say you want to build a general marketing website, then a 3D website, which needs additional frameworks and dependencies, and then a CRM app. For the very first one, the marketing site, we could start with planning on high mode using Fable and then we could execute it using Opus on medium, high, max or Fable on medium.

You have to use some judgment, some taste. Is this a really complicated task or are you just generating a PowerPoint file? You probably don't need a nuclear bomb to generate a PowerPoint file.

So using this judgment and being proactive will save you tons of time, money, and the ability to still use this model even when it's metered very sparingly when it makes the most sense. And then if you want to fan out or spawn multiple sub agents, each one of those sub agents will use different lower level models. You don't even have to worry about using Fable for those models.

And then you could end off by running a verification step on Fable low mode where it goes through, maybe uses the Claude in Chrome MCP to open up a brand new browser, click around your website, take a look at the accessibility, the readability of the text, the overall look and feel, and then go through that agentic loop to improve itself.

For the 3D website, because 3D websites use this framework called Three.js, which is a JavaScript framework, you would maybe use x-high for planning or even max on planning depending on the complexity of it and how likely you want it to one shot this entire website. Then when it comes to building, because again it might be more complicated, you might even use a dynamic workflow.

So you don't want to use Fable for a dynamic workflow, maybe use Opus 4.8 and then it spins out a series of Sonnet and Opus agents to help you take care of it. And then after this, when it comes to assembling all the different parts of the website, this might use some sub agents as well. Then you could go through a verification step on high mode this time because you want to verify that while you're not at your computer, this 3D-rendered website still looks legitimate, looks clean, and looks exactly like what your plan reflected.

And last but not least, when it comes to a CRM. A CRM has a lot of moving parts. You have different tabs, different endpoints, you have different security needs, you have different ways of inputting data and potentially exporting data.

So that plan might need infinitely more love and it makes more sense to use Fable max mode to produce that plan. And to build it, it might make sense to use x-high to fan out a series of workflows using dynamic workflows. Most likely, it won't use Fable for those sub agents.

It will use different tiered models. When it comes to verification again, it might make sense to still use a high level model. So you can see here depending on the use case, depending on the subsequent steps and sequential steps, you might wanna use different tiers of intelligence using different models and you can even use them from different providers.

Because I also use Codex side by side with Opus using the plugin extension from OpenAI. So you can use a combination of Fable with Codex 5.5, 5.6, whatever comes out there.

So you don't have to be tribal. You just have to be very efficient at understanding what is the right tool for the right job. So if you wanted to walk away from this video understanding a core possible workflow that you could execute, you could start with Fable five on either max or high mode, then go into planning, then orchestrate any of the execution with a series of agents that have access to things like skills, like additional agents, MCPs, command line interfaces.

Then you could use something like Fable to execute a verification step. And then you can get to the verification stage, you ship it, and then you can go through this loop for different iterations, different features. But understanding this, even if the names change, which they will, if not by the end of today, maybe by the end of this month, the entire paradigm will persist.

So when it comes to things like Ultracode and dynamic workflows, today this is the state of the art, but this entire snapshot is meant to be modular. So if we get something brand new that's infinitely better, this comes out and you swap that in.

And the same thing here, when we inevitably get another literature based named model like Soliloquy six, then this will be the exact same paradigm. It will probably be eye wateringly expensive, especially since this will be post IPO.

It will be very expensive. So we will have to be very responsible as solo practitioners, as organizations, as entrepreneurs to use the right model at the right price for the right task.

And to give you the TLDR of the TLDR, these are different scenarios where Fable five, depending on the effort level, might be directionally equivalent to a certain tier of Opus. So Fable five on low could be very tied with Opus 4.8 max.

Medium would technically beat 4.8 max, and then Fable five max would completely eclipse Opus 4.8. Each model will naturally have its own different behavior.

So something like Fable as of today still short circuits when it gets any request that has anything related to cybersecurity, even if it's not the core goal or it's not malicious and it's completely above board. So depending on your workflow, I still don't see Fable as a daily driver model based on initial testing.

Opus is more trusted for now, but using it and understanding the flow of when to use different effort levels will help you immensely. And to end off, it's important to remember that benchmarks don't matter. They can be doctored.

They can be manufactured. They grab the headlines, but the only thing that matters are your results. So hopefully that gives you a good primer on how to use this Fable model responsibly.

It's definitely an amazing model. I've already used it to one shot all kinds of bugs that I have struggled with using existing models, so there's no question that it's amazing, but there is a huge question as to how long it would be sustainable to use it for your day to day tasks. And to help you solidify everything I walked you through, I made a cheat sheet guide of where and when to use these different effort levels in the second link down below, so feel free to grab that for free to help you with your next project.

And as always, if you want to go infinitely deeper on things like Claude Code, agentic workflows, and how to use AI in general practically in enterprises and businesses, then check out the first thing down below, and I'll maybe see you inside my early adopters community. And for the rest of you, if you found this to be a breath of fresh air from all the hype that's being pushed down your throat, then feel free to leave me a like, leave a comment, and I'll see you in the next one.

The Hook

The bait, then the rug-pull.

With great power comes an even greater token bill. Unlike most AI tool videos that lead with benchmarks, this one leads with the credit limit. The title is a warning, the hook is a price tag, and everything that follows is a routing strategy.

Frameworks

Named ideas worth stealing.

04:54 · model

The Effort Tier Matrix

Plan: Fable high/max
Orchestrate: Fable medium or Opus
Execute: Sonnet / Opus / local
Verify: Fable low to high (scales with risk)
Sub-agents: always cheaper models

Effort and model are two separate dials. Most users pick a model and never touch the effort dial.

Steal for: Any agentic Claude Code build where token budget matters
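
The matrix above can be written down as a small lookup table. This is a rough sketch under stated assumptions: the model labels ("fable-5", "opus-4.8", "sonnet", "local") and the `pick` helper are hypothetical, `None` marks an effort level the matrix does not name, and mapping risk levels one-to-one onto verification effort is an illustrative choice rather than something the video specifies.

```python
from typing import Optional

Option = tuple[str, Optional[str]]  # (model, effort); None = effort not named

# Options per stage, copied from the matrix above, cheapest listed first.
MATRIX: dict[str, list[Option]] = {
    "plan":        [("fable-5", "high"), ("fable-5", "max")],
    "orchestrate": [("fable-5", "medium"), ("opus-4.8", None)],
    "execute":     [("sonnet", None), ("opus-4.8", None), ("local", None)],
    "verify":      [("fable-5", "low"), ("fable-5", "medium"), ("fable-5", "high")],
}

RISK_LEVELS = ["low", "medium", "high"]


def pick(stage: str, risk: str = "low") -> Option:
    """Pick a (model, effort) pair for a stage; only verification scales with risk."""
    options = MATRIX[stage]
    if stage == "verify":
        return options[RISK_LEVELS.index(risk)]
    # For every other stage, start with the first (cheapest) listed option.
    return options[0]


print(pick("plan"))             # ('fable-5', 'high')
print(pick("execute"))          # ('sonnet', None)
print(pick("verify", "high"))   # ('fable-5', 'high')
```

Keeping model and effort as two fields of one tuple reflects the framework's main claim that they are separate dials.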

04:09 · concept

The Routing Mechanic (steal from Anthropic)

Anthropic auto-downgrades risky asks from Fable to Opus 4.8. Apply the same mechanic for cost: route high-stakes planning to Fable, route volume execution down to cheaper models.

Steal for: Multi-step builds where planning and verification need intelligence but execution is mechanical

07:53 · list

Three Recipe Cards

Marketing site: Fable high plan / Opus med build / Fable low verify
3D website: Fable x-high plan / Opus plus Sonnet agents build / Fable high verify
CRM: Fable max plan / x-high dynamic workflows / high-tier model verify

Complexity of the project drives how high you go on planning; sub-agents always use cheaper models; verification scales with integration risk.

Steal for: Scoping any Claude Code project before starting
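
As a quick check on the recipe cards, the sketch below stores the three recipes as data and lists which stages of each one call Fable. The `Recipe` class and `fable_stages` helper are hypothetical names introduced for illustration; the stage values are copied from the cards as plain text.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Recipe:
    plan: str
    build: str
    verify: str


# Values copied from the three recipe cards above.
RECIPES = {
    "marketing site": Recipe("Fable high", "Opus medium", "Fable low"),
    "3D website": Recipe("Fable x-high", "Opus plus Sonnet agents", "Fable high"),
    "CRM": Recipe("Fable max", "x-high dynamic workflows", "high-tier model"),
}


def fable_stages(project: str) -> list[str]:
    """List the stages of a recipe whose entry names Fable."""
    stages = asdict(RECIPES[project])
    return [name for name, choice in stages.items() if choice.startswith("Fable")]


for project in RECIPES:
    print(project, "->", fable_stages(project))
```

In all three cards the build stage never names Fable, which matches the rule that execution volume goes to cheaper models.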

CTA Breakdown