Modern Creator
Jan Marshal · YouTube

I Spent $200 Testing Claude Fable 5 (I'm Not Sure It's Worth It)

A head-to-head build of the same finance app in Fable 5, Opus 4.8, and GPT 5.5 - same prompts, same workflow, $200 in 24 hours.

Posted
today
Duration
Format
Review
educational
Views
985
37 likes
Big Idea

The argument in one line.

Claude Fable 5 is the best coding model available, but the quality delta over Opus 4.8 is small enough that its 2x+ price and lack of fast mode make it a hard sell for anyone who codes with AI every day.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • You already use Claude Code or Cursor daily and are deciding whether to upgrade to Fable 5 API credits.
  • You are benchmarking frontier models for a real shipping workflow, not synthetic evals.
  • You want a side-by-side UI comparison of what Fable 5 vs Opus 4.8 actually produce from the same prompt.
  • You are evaluating GPT 5.5 as an alternative to Anthropic models for front-end code generation.
SKIP IF…
  • You are on a free or Pro plan and just want to know if Fable 5 is included - the answer is yes until June 22.
  • You need a deep architectural review of code quality - this is a workflow and UI comparison, not static analysis.
  • You are new to AI coding tools and have no baseline for comparison.
TL;DR

The full version, fast.

Fable 5 produces cleaner code, tighter PRD questions, and better UI than Opus 4.8 - but the margin is narrower than the price suggests. In a direct build-off of the same Finance Hub app, Fable 5 asked 5 precise questions where Opus asked 21, generated DRY-er code, and delivered better UI out of the box. The catch: it is slow (30 minutes per complex module vs. 7 in Opus fast mode), has no fast mode, and will cost 2-3x more per month for daily use. GPT 5.5 finished last on UI quality by a significant margin. For most builders, Opus 4.8 with fast mode is the better daily driver.

Chapters

Where the time goes.

00:00 - 00:47

01 · Cold open

Sets up the premise: $200 spent, same app built three times with Fable 5, GPT 5.5, and Opus 4.8.

00:47 - 04:20

02 · What is Fable 5 and the Mythos class

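Explains the Mythos class via the Project Glasswing blog post. Claude Mythos Preview went only to large enterprises because Anthropic considered it too capable for public release; Fable 5 is the public release with added safeguards, which the creator finds too strict.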

04:20 - 07:00

03 · Pricing and the June 22 deadline

Free on Pro/Max/Team through June 22. After that: $10/M input, $50/M output, usage credits required. Creator spent $200 in 24 hours.

07:00 - 07:48

04 · CursorBench: Fable 5 vs the field

Chart walkthrough. Fable 5 is number 1 at high/max reasoning effort. Medium and low effort: use Opus 4.8 instead.

07:48 - 07:56

05 · Sponsor: TestSprite

AI testing agent sponsor segment. Not core content.

07:56 - 12:55

06 · Live app comparison: Finance Hub UI

Side-by-side demo of all three Finance Hub builds. Fable 5 rated best. Opus 4.8 close behind. GPT 5.5 called ugly and unshippable.

12:55 - 16:40

07 · PRD workflow comparison

Fable 5: 5 precise questions. Opus 4.8: 21 questions, some redundant. GPT 5.5: 0 questions without explicit prompting.

16:40 - 20:00

08 · Speed and the fast mode gap

Fable 5 has no fast mode. Complex Fable 5 back-end module: 30 minutes. Same module in Opus fast mode: 7 minutes.

20:00 - 23:20

09 · Code quality deep dive

Fable 5 produces cleaner DRY-er code with typed props. Opus adds redundant console.log useEffects. GPT 5.5 scatters types into 600-line files.

23:20 - 25:50

10 · Safety filter encounter

Security audit request hit a safety filter mid-session and auto-switched to Opus 4.8. Guardrails currently too aggressive; expected to loosen.

25:50 - 28:44

11 · Final verdict and sign-off

Fable 5 is the best model. Not groundbreakingly better. Expensive, slow, temporary on subsidized plans. Creator returning to Opus 4.8 as daily driver.

Atomic Insights

Lines worth screenshotting.

  • Fable 5 asked 5 precise PRD questions; Opus 4.8 asked 21 - including many that were unnecessary.
  • A single Fable 5 back-end module took 30 minutes to generate; Opus 4.8 with fast mode did the same in 7.
  • Fable 5 has no fast mode - you get one speed: slow, expensive, and thorough.
  • GPT 5.5 generated a 600-line file with types scattered throughout instead of shared modules - a maintainability problem at scale.
  • The creator spent $200 in Fable 5 API credits in 24 hours and estimates that daily use would cost $2,000-3,000 per month.
  • Fable 5 is free on Pro/Max/Team plans through June 22 - after that it requires usage credits at $10 per million input tokens.
  • Opus 4.8 fast mode costs 2x but runs 4x faster - the cost/time ratio beats Fable 5 for most workflows.
  • The Fable 5 safety filter blocked a security audit mid-session and auto-switched to Opus 4.8 - the guardrails are currently too aggressive.
  • Fable 5 Finance Hub used Shadcn UI charts; Opus used Recharts - both shipped production-ready UI, just different stacks.
  • If you show both UIs to a non-technical user, they would say they look similar - the delta is real but not groundbreaking.
  • Fable 5 is a Mythos-class model - the same underlying capability as the invitation-only Claude Mythos Preview, with safety guardrails added for public release.
  • GPT 5.5 did not ask any questions during PRD creation until the creator explicitly invoked a grill-the-doc skill - the PRD was weak without it.
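The pricing figures in this list can be turned into a rough budget check. The sketch below is a minimal illustration, assuming only the list prices quoted in the video ($10 per million input tokens, $50 per million output tokens). The helper names and the 10M/2M token split are hypothetical, chosen only so the total lands on the creator's $200 test day; the creator's real token mix is not stated.

```typescript
// Prices quoted in the video for Fable 5 API usage (USD per million tokens).
const INPUT_USD_PER_MILLION = 10;
const OUTPUT_USD_PER_MILLION = 50;

// Hypothetical helper: cost of one workload at the quoted list prices.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  );
}

// Hypothetical helper: average daily spend implied by a monthly budget.
function impliedDailyUsd(monthlyUsd: number, daysPerMonth: number = 30): number {
  return monthlyUsd / daysPerMonth;
}

// Assumed token mix (not from the video): 10M input + 2M output tokens.
const testDayUsd = estimateCostUsd(10_000_000, 2_000_000);
console.log(testDayUsd.toFixed(2));

// The creator's $2,000-3,000 per month estimate, averaged over 30 days.
console.log(impliedDailyUsd(2000).toFixed(2), impliedDailyUsd(3000).toFixed(2));
```

Under that assumed mix the test day comes to $200, and the monthly estimate works out to roughly $67 to $100 per day when averaged over 30 days.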
Takeaway

When the better model is not the right daily driver

MODEL SELECTION

A head-to-head build proves Fable 5 is the best coding model available - and also reveals exactly why most people should not use it every day.

  • The quality difference between Fable 5 and Opus 4.8 is real but incremental - a non-technical person looking at both UIs would not call it groundbreaking.
  • How many precise questions a model asks before starting a PRD is a fast, cheap proxy for its reasoning depth on your specific workflow.
  • Fable 5 has no fast mode, making it 3-4x slower than Opus 4.8 for complex generation tasks - speed is a real cost at daily workflow scale.
  • At medium or low reasoning effort, Fable 5 does not beat Opus 4.8 on benchmarks - high effort is the only tier where the premium is justified.
  • Safety guardrails on new frontier models start aggressive and loosen over time - a model that blocks your security audit today may handle it fine in a month.
  • GPT 5.5 fails on code architecture: it scatters type definitions into large individual files rather than shared modules, creating a maintainability problem invisible in demos.
  • The cheapest way to evaluate a new model is to build something real - synthetic benchmarks do not catch workflow-level friction like PRD quality or slow generation.
  • The difference between Fable 5 and Opus 4.8 as a daily driver compounds to roughly $1,500-2,000 per month - a number that changes the calculus for most independent builders.
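The speed and price gaps cited in this breakdown reduce to two ratios. The sketch below is a minimal illustration that uses only numbers stated in the video: 30 versus 7 minutes for the back-end dashboard module, and CursorBench costs of $10.80 versus $4.40 at high effort. The `ratio` helper and the constant names are hypothetical.

```typescript
// Figures quoted in the video; the names are illustrative only.
const FABLE_MODULE_MINUTES = 30; // back-end dashboard module, Fable 5
const OPUS_FAST_MODULE_MINUTES = 7; // same module, Opus 4.8 in fast mode
const FABLE_HIGH_BENCH_COST_USD = 10.8; // CursorBench cost, Fable 5 high
const OPUS_HIGH_BENCH_COST_USD = 4.4; // CursorBench cost, Opus 4.8 high

// Hypothetical helper: how many times larger `a` is than `b`.
function ratio(a: number, b: number): number {
  if (b === 0) throw new Error("denominator must be non-zero");
  return a / b;
}

const slowdown = ratio(FABLE_MODULE_MINUTES, OPUS_FAST_MODULE_MINUTES);
const costMultiple = ratio(FABLE_HIGH_BENCH_COST_USD, OPUS_HIGH_BENCH_COST_USD);

console.log(`Fable 5 took ${slowdown.toFixed(1)}x as long on this module`);
console.log(`Fable 5 high cost ${costMultiple.toFixed(2)}x as much on CursorBench`);
```

This prints a slowdown of about 4.3x and a cost multiple of about 2.45x. The video does not say whether the benchmark cost reflects fast mode, so the two ratios may not share the same settings.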
Glossary

Terms worth knowing.

Mythos class model
A category designation Anthropic uses for its most capable frontier models. Fable 5 is the first public Mythos-class release, following the invitation-only Mythos Preview (Project Glasswing).
Fast mode
A Claude Code setting that doubles cost but significantly increases generation speed. Available on Opus-tier models, not Fable 5.
CursorBench
A community benchmark measuring AI model performance on real coding tasks in Cursor IDE, scored across different reasoning effort levels.
PRD (Product Requirements Document)
A structured document defining what a software product should do before any code is written. In AI coding workflows, the model often generates and critiques its own PRD.
Grill the doc
A workflow skill where the AI model interrogates a PRD by asking clarifying questions, surfacing ambiguities before implementation begins.
DRY principle
Don't Repeat Yourself - a software design principle that each piece of knowledge should have a single, unambiguous representation. GPT 5.5 violated this by duplicating types and functions.
Shadcn UI
A component library for React providing pre-built, customizable UI components. Used by the Fable 5 build in this comparison.
Recharts
A composable charting library built on React. Used by the Opus 4.8 build instead of Shadcn UI charts.
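The shared-module idea behind the DRY entry can be sketched in a few lines. This is a hypothetical reconstruction, not code from any of the three builds: the `AccountType` union, the label map, and the file paths in the comments are assumptions loosely based on names the creator reads aloud, and the account kinds are invented for illustration. Both parts sit in one block here only so the example runs on its own.

```typescript
// --- shared/account-types.ts (hypothetical path) ---
// Single source of truth: the union type is derived from one const array.
const ACCOUNT_TYPES = ["checking", "savings", "investment"] as const;
type AccountType = (typeof ACCOUNT_TYPES)[number];

// Record<AccountType, string> makes the compiler flag a missing label
// whenever a new account type is added to the array above.
const ACCOUNT_TYPE_LABELS: Record<AccountType, string> = {
  checking: "Checking",
  savings: "Savings",
  investment: "Investment",
};

// Runtime guard for untyped input such as form values or API payloads.
function isAccountType(value: string): value is AccountType {
  return (ACCOUNT_TYPES as readonly string[]).includes(value);
}

// --- features/accounts/format.ts (hypothetical path) ---
// In a real project this part would import the shared type
// instead of redefining it locally.
interface AccountSummaryProps {
  name: string;
  type: AccountType;
}

function formatAccount(props: AccountSummaryProps): string {
  return `${props.name} (${ACCOUNT_TYPE_LABELS[props.type]})`;
}

console.log(formatAccount({ name: "Main", type: "checking" }));
console.log(isAccountType("crypto"));
```

Deriving the union from a single array keeps the type, the labels, and the runtime check in step, which is the property the 600-line file described in the video lacks.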
Resources

Things they pointed at.

00:52 · link · Project Glasswing (Anthropic blog)
07:00 · tool · CursorBench
07:48 · product · TestSprite
Quotables

Lines you could clip.

01:29
If you want to use this model every single day, then be ready to spend about 2 to $3,000 per month.
Concrete number that reframes the pricing debate instantly · TikTok hook
24:45
The back end dashboard creation literally took thirty minutes to create. On the other hand, Opus did it in about seven minutes.
Specific time numbers make an abstract speed complaint real · IG reel cold open
15:25
Both will get you from a to z. One will do it a bit better. But is it worth it? I don't think so.
Punchy verdict that stands alone with no context · TikTok hook
25:45
Fable is not a fast model. It's slow. It reasons a lot. It gets you good results, but it's not really a joy to use.
Honest nuanced critique from someone who actually shipped with it · newsletter pull-quote
12:10
Oh my god. What is this? Oh, this looks so ugly. It's hideous. Just look at it.
Visceral reaction to GPT 5.5 UI - genuinely funny and shareable · TikTok hook
The Script

Word for word.

00:00Anthropic just released Fable five, their new Mythos class model. And apparently, it might be the smartest coding model available right now.
00:11So naturally, I made a very financially responsible decision.
00:16I spent $200 in API credits to build the same finance tracker three times.
00:23Once with Fable five, once with GPT 5.5, and finally, once with Opus 4.8.
00:31Same app idea, same workflow, same prompts, same development process.
00:37And after testing all three, I can say this. Fable five is probably the best coding model I've ever tested. Period.
00:47But I'm not sure if I will actually keep using it going forward. But okay.
00:53What even is Fable five and why is it interesting? Well, as you already know, Fable five is a normal model, an LLM. But what makes it different from Opus 4.8 and GPT 5.5?
01:07Well, essentially, Fable five is a Mythos class model. But what is a Mythos class model? Well, here's the thing.
01:15Anthropic released this blog article about two months ago. Project Glasswing.
01:21Securing critical software for the AI era. Blah blah blah.
01:26Whatever that means. And inside of here, they said, hey. Today we are releasing a new model called Claude Mythos Preview.
01:34It's the best model we have ever created. And inside of here, Anthropic said, hey. The model is so good that we can't release it to the public.
01:43It will just end up with a huge mess and people will do the wrong things. So they released this model just to the biggest companies out on the market, enterprises. But hey, surprise surprise.
01:55Two months later, we now get this model, Fable five, which is a Mythos class model. So now you might ask me, okay, Jan. And what has changed compared to two months ago?
02:06Why is now Fable five a public release? Well, essentially, they now added a lot of safeguards. If you want to just test your application, the coding agent, the model might say, hey.
02:18I'm not sure if this is safe. I will now stop the session, and I can't continue any further. The safeguards right now are not good at all.
02:26They are way too strict, and they don't allow you to just use or work on your application normally. So here, they have a full on section on the new safeguards. Again, currently, they are not very good.
02:37They are not very lenient. Hopefully, this will change in the future, but for now, it is what it is.
02:43But now you might say, okay. Change in the future. This means the model will be available forever.
02:49Right? No. No.
02:50No. Not so fast, my friend. Now this is something that I find super funny because this model won't be available forever.
02:58So from today through June 22, Fable five is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost. But on June 23, they will remove Fable five from these plans, and using it after that will require usage credits.
03:18So if you are a Claude Code user and if you are subscribed to any of these plans, then, yes, you get access right now. But in two weeks, it's time to say goodbye, best model.
03:30Bye bye. Because after that, you will only be able to use the model based on API usage. This means it will require usage credits, and we both know that this is quite expensive because this also means that you won't get any subsidized usage.
03:46Oh, man. This is annoying. So now you might say, okay, Jan.
03:50So if I don't get any subsidized usage in two weeks anymore, is the model at least cheap? No.
03:57Not really. So Fable five is being offered at $10 per million input tokens and $50 per million output tokens, less than half the price of Claude Mythos Preview.
04:09So is this cheap? No. Is it cheaper than the preview model that we never got access to?
04:15Yes. Is it also cheaper than Opus 4.1, I think it was?
04:19Yes. Because when Opus 4.1 got released, it was super expensive. So this model, in general, is not cheap.
04:26It's not super expensive. It will cost you quite a bit if you use it based on API usage, but it's not a huge deal breaker.
04:35Nevertheless, in my testing alone, in twenty four hours, I spent over $200 in API usage and API credits. So that's important to remember.
04:45If you want to use this model every single day, then be ready to spend about 2 to $3,000 per month if you use it every single day. And for most people, 99 percent of people, this is a deal breaker, including me.
05:00I won't pay that at all. So as a next step, you might say, okay, Jan. Mhmm.
05:05Mhmm. This makes all sense, but how does the model perform? Well, let's look at CursorBench.
05:12As you see here, I added a few models. Fable five, Opus 4.8, GPT 5.5, and Composer 2.5.
05:21Now I think we both see right away that Fable five is the best model right here. Now please look at the y axis. The higher the score, the better.
05:31Then look at the x axis. The more it goes to the left, the more expensive everything is. And inside of here, you will see, yes, Fable five is the best, especially if you use the highest reasoning effort max.
05:44But it's also the most expensive. Surprise. Surprise.
05:48Now one thing I have realized after testing Fable five and also looking at multiple benchmarks and tests is that Fable five only makes sense if you at least use high reasoning effort. So as you see here, if you use medium or low reasoning effort, I wouldn't recommend using Fable five. It makes more sense to use Opus 4.8 with either high or extra high reasoning.
06:14The results are super similar, but it's way cheaper. So this is one recommendation that I have and something that I have also learned while testing the model quite thoroughly.
06:25And here you will also see that GPT 5.5 is just no comparison. It's way worse.
06:31Composer 2.5 is also a way worse model. And if we now add, I don't know, where is Gemini, the model I hate the most, then you will see here that Gemini well, let's forget it.
06:42It's not a comparison. It isn't able to compete with Fable five. Fable five is definitely the best model out on the market, and Anthropic did not lie with this blog article where they said, hey.
06:53This is the best model we have ever created. It is the best model, but don't forget, it's the most expensive model. And I wouldn't say that there is a huge jump between Opus 4.8 and Fable five.
07:06It's better, but it's not crazy better. Now before we continue, there is one thing we have to talk about first. Testing.
07:15Because with models like Fable five, code generation is not really the bottleneck anymore. The real question is, can we catch the problems before they hit production? And that's where today's sponsor TestSprite comes in.
07:31TestSprite is an AI testing agent for engineering teams.
07:36You can connect it to your app through the MCP workflow or the web portal. Give it a PRD, a spec, or just the area you want tested. And it builds a focused test plan for you.
07:49But the important part is this. TestSprite does not just read your code and guess. No.
07:56No. No. It actually opens the app, clicks through the real user flow, tests the front end, checks back end behavior, and runs everything in parallel.
08:08In the new TestSprite 3.0 dashboard, you can watch those agents explore your app live, see the tests running, and when something fails, you get the error, trace cause, and suggested fix in one place. You can even replay the agent session and watch exactly where the bug happened.
08:29So when AI generated code breaks something, you're not just staring at a vague failed check, but you can actually see what went wrong. So at this point, the question isn't anymore can the model write code? It can.
08:44The real question is can we trust the shipped code? And in this case, TestSprite is exactly what helps us close the loop. And the best part, TestSprite has a super generous free tier, so you can already get started today.
09:01Check it out using the link below. But now, let's continue with the video. Now you might say, okay, Jan.
09:09So we now looked at the theoretical side of things, and we can definitely say that Fable five is a good model. Again, looking at this chart, we can see that Fable five outperforms every single model out on the market with every reasoning effort besides low.
09:25And looking again at the cost, yes, max reasoning effort is expensive. I wouldn't recommend using it, but Fable five high is a very nice middle ground, which is not super expensive and kinda manageable.
09:39But here's the thing. Benchmarks don't always portray reality. Some models might perform very well in benchmarks.
09:47Gemini, I'm looking at you. But they might perform quite bad in real life with your specific workflow, with your specific ideas.
09:55And that's exactly why I took the time and tested the three best models out on the market. Fable five, Opus 4.8, and GPT 5.5.
10:06I built the same application three times, Finance Hub, and, essentially, it's the one-stop solution to all of your finances.
10:14Now let's first of all analyze the very first example, the very first application, and I won't tell you what model it is. And let's for now focus on the UI.
10:25I will do a hard refresh. We have a beautiful animation, this nice little mock up, then we get a few cards.
10:32This all looks good. It's clean, and it does not look tacky or something like that. I can now also sign in, and I will get redirected to the dashboard.
10:41We get a suspense fallback. I get a few cards. This chart, which of course works, then this chart.
10:48Then we have recent activities. This looks fine. The layout is not great, but it's manageable.
10:53Right? Let's go to the transaction. Let's get a general feeling of the UI.
10:58This looks good. Let's look at the spending. Oh, nice.
11:02We have a beautiful chart. Let's go to the assets. What do we get here?
11:06This all is fine. Right? It works.
11:08It looks good. Everything is fast. We have a very uniform design.
11:13We have one specific design language. This does not look bad. This is something that you can ship into production, and users won't complain, which is important.
11:22Let's now look at the second example. So this again is Finance Hub. If I do a hard refresh, we don't really get a beautiful animation, which isn't good, but hey, it is what it is.
11:33We get again a very similar mock up, but which is actually interactive, which is something that you don't see every day. We again have a few animations, looks good, and it's very similar to the first example.
11:45Let's check out the dashboard. What do we have here? Now this already looks way better, at least in my opinion.
11:52We, of course, have a theme toggle, and this here has way more info. Net worth cash flow. We can have a chart, then right here, these charts.
12:00And what I see instantly is that this here uses Shadcn UI charts. If I go back to the first example, this does not use Shadcn UI.
12:10This uses Recharts. So this is something important to remember. Let's go to the transactions.
12:17Right here, we get two issues, but that's fine. Let's go to the spending. Uh-huh.
12:22Interesting. Okay. Let's go to the subscriptions.
12:25So this again looks very similar to the first example. Though I would say that this here is a bit more refined. Again, if I do a hard refresh, look at this.
12:34It looks beautiful. And I would also say that this application is a bit faster. So there is probably better code.
12:41And now let's go ahead and check out the third application. Three, two, one. Oh my god.
12:48What is this? Oh, this looks so ugly. It's hideous.
12:53Just look at it. The theme is ugly. Everything is ugly.
12:57We don't even have to really dive into it. Let's quickly open the dashboard just to look at this hideous dashboard. Wow.
13:05I mean, this is something that you should never ship into production. This will make every user unsubscribe instantly. Minus MRR.
13:13No. I'm joking. But I think we both know what model created this application.
13:18It was GPT 5.5 with high reasoning effort. I even used my signature workflow.
13:24I used all of the skills. I used the same workflow as with the other two applications, and still the UI is hideous.
13:31It's not comparable, and this is just something that does not look good. It's as simple as that.
13:37And again, this is GPT 5.5, a leading model, and still we get such a bad result. Let's again go back to the previous example.
13:46So this is the second application I showed you guys, and this is probably my favorite one because it looks very refined and there are a lot of small details which are very important. Like, if we go back to the net worth page, we have the net worth history. But I can also instantly view the assets versus debts.
14:07These are the small details which change a lot, and that's very important. So what model created this application here? Well, surprise surprise, it was Fable five with high reasoning effort.
14:20This here is a very good result, and this is something that you can definitely ship into production. And with that, the third application has been created by Opus 4.8 with high reasoning effort. And let's again look at it.
14:34This here looks good. Right? This is something that you can ship into production.
14:38And the most important thing is that this here looks very similar to Fable five. And that's something that I have realized while testing the two models. The results are always super similar.
14:49The Fable results are better. That's important to remember. But the foundation is always the same.
14:55The responses by the models are very similar. Yes. Fable outperforms the model in every shape and form in terms of code, in terms of UI, in terms of reasoning.
15:06But it's always very similar. And that's something that I like because Opus models in general work very well for my specific workflow. And Fable five, in this case, also works very well with my signature workflow.
15:19It's very similar, and I get similar results, which are even better than what I'm used to. But this still does not answer the question, is Fable five worth it? Especially considering the price.
15:32And I'm not sure because it gives you better results. But is it groundbreaking? Does it give you a huge leap?
15:39Not really. If you would now show these two websites to one person who does not know what models are, what LLMs are, and does not really know what UI and UX is, would the person say that this here is a huge leap and that this is groundbreakingly better than this?
15:56Probably not. This person would say, yeah. Both look kinda similar.
16:00This here looks a bit better. It has maybe better small details, but it does not really matter. And I can give you another example.
16:07Let's look at these two phones. This here is Opus 4.8. This is Fable five.
16:12Like, this here is a better phone. It has a better camera, a better processor, a better everything.
16:17But at the same time, they are kinda the same thing, the same foundation, the same brand, the same operating system. This here will give you a better result, but it isn't groundbreakingly better than this result.
16:29And that's what I want you to kinda remember. Both will get you from a to z. One will do it a bit better.
16:36But is it worth it? I don't think so. And that's my honest opinion.
16:40Some people will disagree with me. That's totally fine. But I don't see the need for Fable five, especially for the price.
16:48Again, let's look at CursorBench. The cost for Fable five high is $10.80. The cost for Opus 4.8 high, let's see where is it, is $4.40.
16:59So Fable five costs more than double. Is it worth it? I don't think so.
17:05But now you might say, Jan, wait wait wait. The UI isn't everything. What about the code quality?
17:12How does the model feel? Is it quick? Is it fast to respond?
17:15Well, let's check it out. As mentioned, for every application, I used the same workflow.
17:22I let the agent create a PRD. I grilled the PRD. I then created the back end implementation plan.
17:28I let the agent create the authentication pages. I used this skill very heavily when letting the agent implement front end UI, so I always used the same workflow. And what I've realized is that the results between Opus 4.8 and Fable five are almost the same.
17:47So this here is my Opus 4.8 session where I asked the LLM, the agent, to create a PRD. And inside of here, I then let the agent also grill me.
17:57Now please look at the questions. The word account is overloaded. The asset is also overloaded.
18:04Then what else? Do subscriptions double count against expenses? What else do we have here? When does the subscription hit a specific month's expenses?
18:13Then also please look at the general structure. We get these bullet points, then my recommended answer, blah blah blah. This here looks fine, I guess.
18:22Right? Now let's go to the Fable example. Inside of here, I first of all asked the agent to create a PRD.
18:29It did so. Nice. As a next step, I again asked it to grill me. And look at the questions.
18:35Checking account: one concept or two? We again have the same format, and again my recommendation. Then inside of here, do transactions move asset values?
18:45We again have the same layout. And the thing is also, if you compare the questions, they are very similar.
18:51I got the same questions asked by Opus 4.8 with different wording. The one thing that I've realized is that Fable is a little bit more, you could say, refined.
19:02It's a bit smarter. Inside of here, it only asked me five questions, which were super important.
19:08But if I go back to the Opus example, it asked me 21 questions, and a lot of them were not needed.
19:16So Opus does not think as well as Fable, and we saw that from the benchmarks. But is that again a huge game changer? No.
19:24Not really. It's fine. I can answer 21 questions if the result is similar.
19:29And it is. Yes. Fable five in this case was a bit faster because I was able to just get over this whole PRD quicker.
19:36I did not have to answer as many questions, but it's not a huge leap. Another thing is the model speed in general.
19:43When using Opus 4.8, I always use fast mode as you see here. Now fast mode is more expensive, two times more expensive, but you get significantly faster speeds.
19:55Now if I go back to the Fable example, the thing is you don't get access to fast mode. Like, you only have normal mode and that's it.
20:04And the thing is Fable five, with a 1,000,000 token context window and high reasoning effort, is slow as hell. It takes forever.
20:13The back end dashboard creation literally took thirty minutes to create. On the other hand, Opus did it in about seven minutes.
20:21Again, I used fast mode and all of that, but still, I feel like Fable is not a fast model. It's slow. It reasons a lot.
20:29It gets you good results, but it's not really a joy to use. I like fast models, and I get fast mode with Opus 4.8, which I use every single day.
20:38So if you want a snappy model, then Fable five is not for you. Let's also quickly look at GPT 5.5. Inside of here, I also asked the agent to create a PRD and then to grill me.
20:51And here's the thing, I wasn't a fan of the experience. First of all, GPT 5.5 did not ask me any questions when creating the PRD.
21:01Only once I invoked the grill-the-doc skill, it started asking me valid and needed questions. And this also means initially the PRD was very lackluster and very bad.
21:13Only after invoking the grill-the-doc skill, I got a somewhat good result. Still, I did not like how the model felt. It did not work very well with my specific workflow.
21:24It just was kinda I don't know. It just wasn't a joy to use. Another thing you will realize is that GPT 5.5 does not really explain things very well.
21:34Like, here we have question and then the recommended answer. That's it. But if I go back to the Fable example, it first of all explains the question, then it gives me options, the recommended option.
21:46This here is way more detailed. This here gives me options. It explains things, and it works better with my workflow.
21:53As an engineer, I want to know what options I have and what option is maybe recommended. With GPT 5.5, I just get question and a recommended answer.
22:03It does not give me any room to breathe, which I don't like. GPT 5.5, in general, is just a model that I don't love.
22:11It gives okay ish results in the back end. The front end is trash. Let's forget it.
22:17But it's not a good all rounder. If you want an all rounder, then don't use GPT 5.5. You won't get the results that you need.
22:25So my takeaway from this first session was that I instantly felt a difference. Not because Fable did something magical. No.
22:33But because it knew when to stop, when to not ask any further questions. Because if I go back to the GPT 5.5 session, I had to do something funny.
22:43Let me scroll to the bottom. After the twenty sixth question, I had to say, hey.
22:48This is enough questions. Stop. I can't anymore.
22:52And that's something that you will just realize when using Fable five. It is a smart model, and it can just do things, or it can reason about things better, and it feels way more like a senior engineer that is confident in its ability, which I appreciate a lot.
23:08Another thing we have to talk about is the general code quality. And surprise surprise, Fable five in general creates better code than Opus 4.8 and GPT 5.5. Now is this perfect code?
23:23No. I don't want to do a full-on code review now, but the code in general is clean. You can use it.
23:29Nevertheless, I would still recommend you to review all of the code because the agent, the LLM in this case, still likes to duplicate things, and it does not always follow the DRY principle. Don't repeat yourself. But it's definitely better than Opus 4.8 and GPT 5.5.
23:47One reason for that is probably, again, the high reasoning and the general smartness, if that makes sense. The model understands things way better on a way deeper level, and therefore, the code also becomes better.
24:01In this example, we always have metadata set. We have props. The props are always typed.
24:07So this is good. This is good code that you can ship. I would still recommend you to review it because nothing is safe, but it's better than what the competitors do.
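A framework-free sketch of what "metadata set, props always typed" can look like. The video does not show the actual source, so `PageMetadata`, `AccountCardProps`, and `renderAccountCard` are hypothetical stand-ins, and the render function returns a string rather than real UI so the example runs without any UI library.

```typescript
// Hypothetical stand-in for a framework's page metadata type.
interface PageMetadata {
  title: string;
  description: string;
}

// Every page module declares its metadata explicitly.
const metadata: PageMetadata = {
  title: "Accounts",
  description: "Overview of linked accounts",
};

// Props are a named, typed interface instead of an implicit "any" object.
interface AccountCardProps {
  name: string;
  balanceCents: number;
  currency: "USD" | "EUR";
}

// Stand-in for a component: returns the text a UI layer would render.
function renderAccountCard(props: AccountCardProps): string {
  const amount = (props.balanceCents / 100).toFixed(2);
  return `${props.name}: ${amount} ${props.currency}`;
}
```

Because the props are typed, passing a currency outside the declared union or omitting `balanceCents` fails at compile time rather than at runtime.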
24:16Opus 4.8 also generates good code. Is it as good as the code created by Fable? Definitely not.
24:23There are duplicate functions. The agent loves to repeat itself and create weird useEffects that are not needed. Like, for example, look at this useEffect that console-logs an error.
24:33I don't need that. This here is not needed, but the agent created the code anyway. Opus 4.8 is not as smart.
24:40You feel it right away, but there is not a huge margin. The code in most cases is good enough, but it likes to add weird useEffects, weird comments, weird console logs, and duplicate functions that you have to clean up yourself.
24:55This is something that you have to know. And GPT 5.5, well, this is my least favorite model.
25:01I don't hate OpenAI, by the way. Great company.
25:03But GPT 5.5 performs the worst out of the bunch. As an example, this here is a random file, financial something, and we have this type, account type. Why do we have it inside of here?
25:15I am 100% sure that this type has been defined somewhere else. This should live in some sort of shared file.
25:22It doesn't. Why do we have so many types inside of here? This is not needed.
25:26Again, constant account type labels, constant account type icons, constant blah blah blah, function blah blah blah. We don't need all of these functions inside of this one file.
25:37This file is huge, like 600 lines. These types shouldn't live in this file. And that's something that you will just realize when using GPT 5.5.
25:47It does not provide or it does not generate the cleanest code. You have to be very on top of the agent to get good code. The code works.
25:57That's fine. But it isn't clean. And if you don't do a thorough review, then good luck with your project in a couple of months because it won't be maintainable.
26:06You will have functions like, again, look at this investment asset types. Why does it live inside of here?
26:12Why does this not live in some sort of shared file? Why do we hard code this right here? This is not needed.
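One way the shared-file approach could look, as a sketch. The type name echoes the "account type" mentioned above, but the concrete values, labels, and helper names are assumptions, since the actual file contents are not shown.

```typescript
// Hypothetical contents of a single shared module that feature files
// would import from, instead of redefining these per file.
const ACCOUNT_TYPES = ["checking", "savings", "investment"] as const;
type AccountType = (typeof ACCOUNT_TYPES)[number];

// Record<AccountType, string> makes the compiler flag any account type
// that is missing a label.
const ACCOUNT_TYPE_LABELS: Record<AccountType, string> = {
  checking: "Checking",
  savings: "Savings",
  investment: "Investment",
};

// Type guard for untrusted input such as form values or API payloads.
function isAccountType(value: string): value is AccountType {
  return ACCOUNT_TYPES.some((t) => t === value);
}

function labelFor(type: AccountType): string {
  return ACCOUNT_TYPE_LABELS[type];
}
```

Deriving the union from one constant array means adding an account type is a one-line change, and every label map keyed by `AccountType` stops compiling until it is updated.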
26:17And that's what I don't like about GPT 5.5. And that's why I pretty much stopped using it. Another thing I want to talk about quickly are the safeguards and restrictions.
26:28Right now, Fable 5 is kinda annoying to use. If you want to do a security review, then good luck.
26:35In this case, this user wanted to run a security, vulnerability, bug, performance, and privacy audit. And as you see here, Fable 5 hit a safety filter, and the conversation was automatically switched to Claude Opus 4.8.
26:50Now this is something that will probably change in the future, and probably the model will become a bit more lenient. But right now, the filters are, as the user says, crazy. It's not nice.
27:00It's not enjoyable to use. And you have to always think like, hey, how should I, like, maybe instruct the agent so that it does not get the wrong idea of what I want? I don't need any filters.
27:11Blah blah blah. It's not good. But I think it will be fixed in the future, so this is something to look forward to.
27:16But, yeah, that's all I did. These are my thoughts on Fable 5. It's a very good model, and it's the best model out on the market.
27:24It writes better code. It asks better questions. It has fewer bugs.
27:28The UI looks better. It works right away. This here was a one-shot, and this is quite cool to see.
27:34You see how powerful LLMs can be. Nevertheless, it's slow, it is expensive, and it's kinda temporary.
27:42Because if you don't pay for the API usage, then please say bye-bye to the model. I personally will probably move back to Opus 4.8 because it's a great daily driver.
27:53I can get close enough to these results, and it's way cheaper. And that's kinda important because I already pay a lot for Opus 4.8, and Fable would just double the cost, which would then lead to thousands of dollars per month, which I can't spend on an LLM.
28:08But, yeah, what do you think? Please let me know in the comments. I would love to know what your thoughts are.
28:14What do you think of Fable 5? How does it compare to all of the models that you used previously? What do you think about the pricing?
28:21The security restrictions? The general two-week timeline, which I find annoying and weird? But, hey, it is what it is.
28:28Let me know. I read every comment. Also, please don't forget to like and subscribe.
28:33It would mean a lot to me and my heart. So please do it. And with that out of the way, enjoy your day and see you in the next video.
28:41Over and out. Bye bye.
The Hook

The bait, then the rug-pull.

Two hundred dollars in API credits. Twenty-four hours. One app built three times with three different models. That is what it took to get an honest answer about whether Fable 5 - Anthropic's new Mythos-class model - actually earns its price tag over the Opus 4.8 most builders already rely on.

Frameworks

Named ideas worth stealing.

00:35 · model

Same-app three-model build-off

  1. Same app (Finance Hub)
  2. Same workflow: PRD to grill to backend to auth to frontend
  3. Same prompts
  4. Three models: Fable 5, Opus 4.8, GPT 5.5

Build the identical product end-to-end with each model to control for workflow variables and isolate model quality differences.

Steal for: Any model evaluation - removes confirmation bias by forcing you to ship the same thing three times

13:05 · concept

PRD question count as a reasoning proxy

How many questions a model asks before writing a PRD reveals its reasoning depth. Fable 5: 5 precise. Opus 4.8: 21. GPT 5.5: 0 without prompting.

Steal for: Use question count and precision as a cheap proxy benchmark for model quality before committing to API costs

07:00 · model

Reasoning effort sweet spot matrix

  1. Fable 5 max: best results, most expensive
  2. Fable 5 high: best value within Fable ($10.80 per task)
  3. Opus 4.8 high: 80% of Fable quality at $4.40 per task
  4. Fable 5 medium/low: not recommended over Opus

At medium or low reasoning effort, Fable 5 does not justify its premium over Opus 4.8. Only at high effort does the quality gap become defensible.

Steal for: Model routing rules: Opus for routine daily work, Fable 5 only for highest-stakes single sessions
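The routing rule and the per-task prices in the matrix can be turned into a rough spend estimate. This is a sketch under stated assumptions: the two dollar figures are the ones quoted in the matrix, while the model labels, the `pickModel` rule, and the task counts in the usage note are hypothetical and are not real API model identifiers.

```typescript
// Labels are illustrative only, not real API model identifiers.
type Model = "fable-5-high" | "opus-4.8-high";

// Per-task costs in USD, as quoted in the matrix above.
const COST_PER_TASK: Record<Model, number> = {
  "fable-5-high": 10.8,
  "opus-4.8-high": 4.4,
};

// Hypothetical routing rule: only highest-stakes sessions go to Fable 5.
function pickModel(stakes: "routine" | "high"): Model {
  return stakes === "high" ? "fable-5-high" : "opus-4.8-high";
}

// Estimated spend for a mix of routine and high-stakes tasks,
// rounded to whole cents.
function estimateSpend(routineTasks: number, highStakesTasks: number): number {
  const total =
    routineTasks * COST_PER_TASK[pickModel("routine")] +
    highStakesTasks * COST_PER_TASK[pickModel("high")];
  return Math.round(total * 100) / 100;
}

// Price ratio between the two high-effort options (10.80 / 4.40).
const priceRatio =
  COST_PER_TASK["fable-5-high"] / COST_PER_TASK["opus-4.8-high"];
```

For example, `estimateSpend(100, 5)` models 100 routine tasks on Opus 4.8 plus 5 high-stakes tasks on Fable 5, and `priceRatio` comes out at roughly 2.45, in line with the "2x+" price gap stated in the summary.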
CTA Breakdown

How they asked for the click.

VERBAL ASK
27:40 · subscribe
please don't forget to like and subscribe. It would mean a lot to me and my heart.

Standard subscribe CTA at the very end. No product pitch, no newsletter. Sponsor (TestSprite) handled mid-video.

Storyboard

Visual structure at a glance.

00:00 · hook · open
00:52 · context · Project Glasswing reveal
04:20 · value · pricing window
07:00 · evidence · CursorBench chart
07:56 · demo · live app demos
12:55 · value · PRD comparison
23:20 · caveat · safety filter hit
25:50 · cta · verdict