Big Idea

The argument in one line.

Sonnet 5 comes close enough to Opus 4.8 that running Opus for every task is now wasteful. The right call is Opus for planning and Sonnet 5 for execution, which cuts costs roughly in half while keeping output quality nearly identical.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You are currently spending $100+ per month on Claude API tokens inside Hermes, OpenClaw, or Claude Code and want to understand where Sonnet 5 can replace Opus without hurting output.
You use Claude Code for building apps and want a concrete two-model workflow that separates expensive planning from cheaper execution.
You want a current benchmark comparison — Sonnet 5 vs Sonnet 4.6 vs Opus 4.8 — across agentic coding, reasoning, computer use, and knowledge work.
You are tracking the Fable 5 situation and want the freshest leaked information on availability and access requirements.

SKIP IF…

You do not pay for Claude API usage and have no plans to — the cost-optimization framing will not be relevant.
You are looking for a deep technical explanation of how Sonnet 5 works internally; this is a practical usage video, not a model-architecture breakdown.

TL;DR

The full version, fast.

Claude Sonnet 5 is a full generational upgrade — not a point release — that beats Opus 4.6 on every benchmark and comes within 5% of Opus 4.8 on most tasks, at roughly half the cost. The presenter's recommended workflow splits the two models: use Opus 4.8 in ultra plan mode (with workflow-spawned sub-agents) to generate a detailed architecture, then switch to Sonnet 5 on medium to execute that plan cheaply. The same logic applies to Hermes and OpenClaw users who can swap the Claude API model string directly. For Fable 5, leaked Claude Code strings suggest the model will return soon behind a US-only identity verification gate and API-only pricing.
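
The split described above can be sketched as a tiny routing function. This is only an illustration of the idea, not anything shown in the video: the `ModelChoice` type, the `pick_model` function, and the labels `"opus-4.8"` and `"sonnet-5"` are hypothetical names, not real API model strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    """A model label plus the effort setting to run it at (both hypothetical)."""
    model: str
    effort: str


# Labels only: these are NOT real API model identifiers.
PLANNER = ModelChoice(model="opus-4.8", effort="ultra")
EXECUTOR = ModelChoice(model="sonnet-5", effort="medium")


def pick_model(phase: str) -> ModelChoice:
    """Route planning to the expensive model and execution to the cheaper one."""
    if phase == "plan":
        return PLANNER
    if phase == "execute":
        return EXECUTOR
    raise ValueError(f"unknown phase: {phase!r}")


for phase in ("plan", "execute"):
    choice = pick_model(phase)
    print(f"{phase}: {choice.model} ({choice.effort})")
```

The point of keeping the routing in one place is that the expensive model is only ever reached through the planning phase.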

Chapters

Where the time goes.

00:00 – 00:41

01 · Intro — the claim

Frames Sonnet 5 as the best bang-for-buck model, previews benchmarks, Hermes/OpenClaw usage, best practices, and Fable 5.

00:41 – 03:41

02 · What Sonnet 5 is

Bullet breakdown: beats Opus 4.6, nearly matches Opus 4.8, fraction of the price, significantly faster, big upgrades in reasoning, tool use, coding, and knowledge work. Real cost data shown — $1,375 API spend dashboard.

03:41 – 06:25

03 · Benchmarks and cost

Side-by-side benchmark table across agentic coding, multidisciplinary reasoning, computer use, and knowledge work. Cost: $8 Opus 4.8 medium vs ~$4 Sonnet 5 for 5% less performance.

06:25 – 10:25

04 · Demo — Sonnet 5 vs ChatGPT 5.5

Same 3D boat simulator prompt in Claude Code and ChatGPT 5.5. Sonnet 5 wins visually; ChatGPT 5.5 produces a static boat with broken controls.

06:32 – 10:15

05 · Best practices — two-model workflow

Opus 4.8 ultra + plan mode for architecture (sub-agent workflows spawn), then Sonnet 5 medium for execution. Covers Hermes/OpenClaw model-string swap.

10:15 – 11:56

06 · Fable 5 intel

Leaked Claude Code strings: Fable 5 returning soon, likely behind API-only pricing plus US-only identity verification.

Atomic Insights

Lines worth screenshotting.

Sonnet 5 is a full generational jump, not a patch — Anthropic assigned it a whole number, not a decimal, which signals a meaningfully different capability tier.
Sonnet 5 destroys Opus 4.6 on every published benchmark, making Opus 4.6 effectively obsolete for anyone still running it in production.
Half the price of Opus 4.8 with only a 5% pass-rate drop is not a tradeoff — it is a free upgrade for any task that does not require the top 5%.
If your planning is thorough enough, the model you use for execution barely matters — good specs reduce the cognitive load that costs money.
Spending $1,300 per month on Claude tokens inside an agent platform like Hermes is real and common — Sonnet 5 cuts that bill without changing the workflow.
Sonnet 5 scored 63.2% on agentic coding benchmarks against Opus 4.8's 68.2% — close enough to be interchangeable for most app-building tasks.
Switching Hermes or OpenClaw to Sonnet 5 requires nothing more than updating the Claude API model string in your agent config — no platform change needed.
Ultra mode in Claude Code spins up sub-agent workflows automatically for planning tasks — that compute investment pays off because it reduces the execution burden downstream.
Claude still leads every other provider in agentic performance; Sonnet 5 extending that lead at a lower price point widens the moat further.
Fable 5's return appears to require both API pricing and US-only identity verification — a meaningful access restriction that could exclude a large share of current users.
ChatGPT 5.5 produced a static 3D ship with non-functional controls; Sonnet 5 produced one with moving waves, crashing physics, and working sliders in the same prompt.
Sonnet 5 is not a replacement for Opus 4.8 everywhere — it is a strategic replacement in cost-sensitive and speed-sensitive contexts only.
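
The model-string swap mentioned above might look roughly like this. It is a sketch under stated assumptions: the config keys (`provider`, `model`, `max_tokens`) and the `swap_model` helper are hypothetical, and both model strings are placeholders because the source does not give the real identifiers (the presenter says to look the Sonnet 5 string up online).

```python
from copy import deepcopy

# Placeholders only: the real identifiers are not given in the source.
OPUS_MODEL = "<opus-4.8-model-string>"
SONNET_5_MODEL = "<sonnet-5-model-string>"


def swap_model(config: dict, new_model: str) -> dict:
    """Return a copy of an agent config with only the model string changed."""
    updated = deepcopy(config)
    updated["model"] = new_model
    return updated


# Hypothetical agent config; a real platform's keys may differ.
agent_config = {
    "provider": "anthropic",
    "model": OPUS_MODEL,
    "max_tokens": 4096,
}

new_config = swap_model(agent_config, SONNET_5_MODEL)
print(new_config["model"])
```

Returning a copy rather than mutating the original keeps the previous config around in case the cheaper model turns out not to be good enough for a given agent.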

Takeaway

Use the right model for the right job.

WHAT TO LEARN

Paying for Opus-level compute on every task is like hiring a senior architect to sweep the floor — Sonnet 5 makes the separation economically obvious.

Not every task needs the most expensive model — the biggest cost savings come from identifying which work requires deep reasoning and which just requires reliable execution.
A detailed plan generated at high compute dramatically reduces the cognitive load needed at execution time, making cheaper models viable for the majority of the work.
When using any Claude-backed agent platform, you control which model the agent calls by updating the API model string — you are not locked to whatever default the platform sets.
Benchmark numbers are useful context but pass rate alone does not capture what matters — for most production tasks, the 5% gap between Sonnet 5 and Opus 4.8 is invisible in practice.
Model access restrictions such as identity verification and API-only gating are a meaningful signal about where AI providers see their highest-risk use cases and who they are optimizing for.
A real $1,300 monthly API spend on a single agent platform is not unusual for active builders — any model upgrade that cuts that in half compounds into thousands of dollars saved annually.
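
As a rough check on the last point, the arithmetic can be written out. This is a back-of-the-envelope sketch: the function name `annual_savings` is made up, and it assumes the whole bill moves to a model costing about half as much, which is the best case rather than what a mixed Opus and Sonnet workflow would deliver.

```python
def annual_savings(monthly_spend: float, cost_ratio: float) -> float:
    """Yearly savings if the whole bill moves to a model costing `cost_ratio` of the old one."""
    return monthly_spend * (1 - cost_ratio) * 12


# $1,300 per month, new model at roughly half the price (figures from the video).
print(annual_savings(1300, 0.5))  # 7800.0
```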

Glossary

Terms worth knowing.

Hermes: A third-party AI agent platform that allows users to build and run Claude-powered autonomous agents; pricing is based on the underlying Claude API token usage.
OpenClaw: Another agent platform in the Claude ecosystem, similar in structure to Hermes, where swapping the model string changes which Claude model the agent uses.
Ultra mode: A Claude Code setting that enables maximum compute allocation for a session, including the ability to spawn workflow-based sub-agent clusters for complex tasks.
Plan mode: A Claude Code session mode where the model generates a structured plan and asks clarifying questions rather than immediately writing code — used to front-load architectural thinking.
Workflow (Claude Code): Claude's sub-agent orchestration feature: from a single session, the model can spin up multiple parallel agents to divide and process complex work simultaneously.
Fable 5: Anthropic's most capable model tier, positioned above Opus; currently unavailable through standard channels, with leaked code strings suggesting an imminent return behind identity verification.
Pass rate: The percentage of benchmark tasks a model completes correctly; used here to compare Sonnet 5 to Opus 4.8 on standardized coding and reasoning evals.
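
One way to read pass rate together with price is cost per passed task. The sketch below is an assumption-laden illustration: it pairs the roughly $8 and $4 per-task figures with the 68.2% and 63.2% agentic coding pass rates quoted in this breakdown, although the source does not say those numbers come from the same evaluation, and the function name `cost_per_pass` is made up.

```python
def cost_per_pass(cost_per_task: float, pass_rate: float) -> float:
    """Expected spend per successfully completed task."""
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    return cost_per_task / pass_rate


# Figures quoted in this breakdown; pairing them is an assumption.
opus = cost_per_pass(8.0, 0.682)
sonnet = cost_per_pass(4.0, 0.632)

print(f"Opus 4.8: ${opus:.2f} per passed task")
print(f"Sonnet 5: ${sonnet:.2f} per passed task")
```

Under these assumptions the cheaper model still comes out well ahead per successful task, which is the presenter's argument in numeric form.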

Resources

Things they pointed at.

00:00 · product · Claude Sonnet 5

04:20 · product · ChatGPT 5.5

11:40 · product · Vibe Coding Academy

01:05 · tool · Hermes agent platform

01:05 · tool · OpenClaw agent platform

Quotables

Lines you could clip.

00:02

“Claude Sonnet five has released, and it is by far the best bang for your buck in AI right now.”

Strong opening claim with no setup needed → TikTok hook

02:03

“I've spent $1,300 in the last month on Claude tokens inside Hermes.”

Concrete personal spend number, instantly relatable and shareable → IG reel cold open

07:49

“You're paying about half the price for roughly a little bit worse performance than Opus four eight. That's pretty good.”

One-liner that encapsulates the entire cost argument → newsletter pull-quote

11:50

“It is not a full replacement for Opus. It is a replacement in very strategic areas.”

Nuanced takeaway that counters the hyperbolic positioning; a credibility builder → newsletter pull-quote

The Script

Word for word.

00:00It happens. Claude Sonnet five has released, and it is by far the best bang for your buck in AI right now. It has almost the performance of Opus four eight, but for a fraction of the price.

00:12In this video, we'll go through every single change with Sonnet five, but more importantly, tell you how you should be using it now: should you be using it in Hermes, should you be using it in OpenClaw, should you be using it in Claude Code, and where you shouldn't even be touching it at all. I'll also give you a few tips on how you can get the absolute most out of this model immediately.

00:33And then on top of that, we'll go into what this potentially means for Claude Fable five. Now let's lock in and get into it. So this is a big one.

00:42It is a full number upgrade. This isn't a four eight four nine. No.

00:46It is Sonnet five, a full number upgrade, and it comes with a lot of big changes. First of all, let's talk performance. One, it blows Opus four six out of the water.

00:56I don't know if you remember. It was, like, a month and a half ago, Opus four six comes out, and it was really good. This lightweight, cheaper, quicker model destroys it.

01:06That is massive. And that is big because Claude has been number one when it comes to agentic models forever.

01:14When it comes to OpenClaw and Hermes, nothing comes close. And a lot of people have been upset because you gotta pay API pricing for Claude models inside Hermes. Well, now you don't need to pay as much because you can use Sonnet five.

01:27It is almost as good as Opus four eight, and I'll show you some of the numbers in a second here, so stick around for that. But it's almost as good as Opus four eight, which Opus four eight, I believe, is the smartest model on planet Earth right now if you don't count Fable five. More on that later.

01:41Here's a big one, fraction of the price. Everyone and their mothers has been crying that AI has gotten too expensive lately. Everyone's been crying again about paying API prices for Claude with Hermes, with OpenClaw.

01:54Fraction of the price, you're saving a ton. If you're anything like me, your usage with Hermes is out of control when you use Claude. I've spent $1,300 in the last month on Claude tokens inside Hermes.

02:07So, yeah, this is a big welcome upgrade. It is significantly faster.

02:12That is a big sticking point with Opus as well. It can get pretty slow at times, and the big upgrades come to reasoning, tool use, coding, and knowledge work. Basically, the four horsemen of agentic work.

02:24I really think this is the model Anthropic puts out to really nail OpenClaw and Hermes use cases. This isn't your Claude Code, agentic loop, infinite autonomy model.

02:37This is your you're working with Hermes. You're working with OpenClaw. You're doing basic coding tasks, and it is your agent that is partnering with you.

02:45So let's talk about the numbers real quick. Absolutely destroys Sonnet four six on basically every single measurable benchmark there is.

02:54When it comes to Opus four eight, doesn't beat it in any specific benchmark, but it comes really, really close. A little bit better on knowledge work, but computer use, everything else, it is very, very close. But, again, that is amazing because you're not paying nearly as much for Sonnet five as you are for Opus.

03:11That's why you're now plugging this into everything you do. Now let's look at cost versus performance. If you take a look here for similar tasks, you're paying out $8 for Opus four eight on medium.

03:23Similar task, you are paying about half that price for Sonnet five and only getting a small 5% downgrade on the pass rate. So you're paying about half the price for roughly a little bit worse performance than Opus four eight. That's pretty good, especially if you're using any sort of agents.

03:39So let's do this. First, we're gonna do a quick performance test of Sonnet to see how it fares against ChatGPT 5.5, and then we'll go into how to use it best. I'll show you some best practices with using it for Claude Code as well as Hermes.

03:52So I am in Claude Code desktop. I believe this is the best way to be using Claude. A lot of people use a CLI.

03:58I like the desktop. You can monitor all your sessions really well. The user experience is really nice.

04:03You can plug in anything you want. I use Linear, a whole bunch of other plugins. So I'm using Claude desktop.

04:08I recommend you use the same thing. If you go to the bottom right, you'll see it right there. Sonnet five. Miss you, Fable five.

04:14Sonnet five, boom. I'm going to put in a prompt.

04:17I will also be giving it to ChatGPT 5.5 so we can run this test. I think those are two comparable models. This with ChatGPT 5.5.

04:25This is going to build a really nice 3D boat simulator. I'm gonna put the prompt down below if you wanna copy it and, uh, run it yourself as well. I'm gonna hit enter on that.

04:36I am also gonna be putting this in Codex and giving this to just ChatGPT 5.5 medium. I'm gonna hit send on that at the same time, but we're gonna see what performs better, ChatGPT 5.5 or Sonnet, then I'm gonna give you all those master tips on how to be getting the most out of Sonnet. Alright.

04:52Let's do this. Let's start with ChatGPT 5.5. This was ChatGPT 5.5's 3D ship simulator.

05:00In the prompt, if you take a look at it, it says allow you to configure the rain, the wind, the waves. A little disappointing. One, I can't move the camera at all.

05:09Two, the ship is not moving. Three, the water is not moving. The rain actually is quite impressive, to be quite honest with you.

05:15That is a lot of rain. Let's see. We go wave height, nothing happens.

05:19Rain density, that is a lot of rain. The weather's nice.

05:23Everything else, nights, not so much. Let's take a look at what Sonnet five did here. I like it better.

05:29The waves are moving. The ship is moving. Is there any rain going on at the moment?

05:34No. You kinda see thunder there, but I like the way the ship and the waves look a lot more. Let's bump up the wind.

05:41Yeah. It makes the waves. Oh, the other way oh, the waves are crashing into that ship.

05:45Wow. The ship is going nuts. I would not wanna be on that ship.

05:48This is rather impressive. It looks like it is better than 5.5, to be quite honest with you, based on this test. So one thing to note here as well from a pricing perspective, even though I like these results better from Sonnet five than from ChatGPT, and the UI is still way better with Claude Sonnet than ChatGPT.

06:06I don't know why ChatGPT can't figure out the UI side. I will say this. The pricing is still significantly better with ChatGPT in almost every single aspect.

06:16So if price is important to you, if you're paying for API usage for your pure coding efforts like this, I still probably lean... So we're back in Claude Code desktop here again, what I believe is the best way to use Claude and Claude Code. Here's how you wanna use Sonnet. You want to use Sonnet for all your basic coding tasks, and you wanna use Opus four eight in the ultra mode for a lot of your planning and really complex tasks.

06:42So for instance, I'm starting out this new project. It is a productivity app. What I'm going to do is I'm going to go into plan mode, and what I'll also do is go into Opus four eight and go into ultra code.

06:54Now this is probably only safe for you if you're in the 20x Max. If you're anything lower, you probably just wanna go into max mode here. But if you have the 20x, you go into ultra code.

07:04This is where you're going to do the planning of the entire app. So if you're building an application, you're planning some monster functionality out, some monster app out, you go into plan mode, you go into ultra code. The reason why I like ultra code is it can spin up workflows at any time.

07:21For those who don't know, workflows is basically Claude's sub-agent functionality where it spins up potentially thousands of sub-agents to do work for you. So I like the ultra code mode.

07:31So I have a prompt to build this productivity app, basically a Notion clone. I'm gonna hit enter on that in plan mode in ultra code. It is going to use tons of compute to make sure this is planned out well.

07:42This is the key here. When you're doing actual execution, you don't need a ton of compute if the plan mode was done with a lot of compute.

07:52Right? So if you have a really nice detailed plan, you don't need the smartest model in the world to do the execution. So we still use Opus four eight for the planning, but what we're gonna do in a second is use Sonnet five for execution of that detailed plan.

08:07So we're going through the plan mode. It's asking a ton of great questions around, do I want it to be multiplayer? What kind of writing support does it have?

08:16Is it a writing system? Yep. We're gonna give it Notion AI functionality.

08:19This is a great strategy. If you pay for any apps, just rebuild it in Claude code. Right?

08:24You'll save tons of money. We're gonna do MVP first. So it's going in.

08:27It's building a plan, and here is what I love. It is starting a workflow to design the architecture. So what you'll see here is it actually spin up tons of sub-agents to design the architecture.

08:38This doesn't happen with Sonnet. Right? So this is why you wanna be doing this with Opus, so you get the maximum compute in the important part, which is the planning.

08:46Look at this. Five agents working, tons of tokens, tons of tool use.

08:50This is awesome. Alright. Looks like it built out the entire plan, put it in a markdown file, which is sick.

08:56I'm going to go into Sonnet five. And because we have such a good plan built out, we can go in, do Sonnet five on medium. So this can be dirt cheap.

09:06And we can say, okay. Now execute on the plan and Sonnet five will get to work.

09:12If we were doing Opus four eight with this, this would cost us way more money. This would be very expensive to do. But now that Sonnet five is past Opus four six, almost four eight, we can run it on Sonnet.

09:22We'll get the same quality project done for way less. This is great if you're on one of the cheaper plan models. As for Hermes agent or OpenClaw, if you are watching this video shortly after I put it out, Sonnet five probably won't be in your directory of new models.

09:41But what you can do is go to your agent. If you're already using the Claude API, just say, hey. Switch the Claude API to the Sonnet five string.

09:50Look it up online, and it can switch in the back end for you, and you'll be good to go. You'll be on the new Sonnet five model. I recommend using Sonnet five now in your Hermes and OpenClaw.

10:00Claude, again, makes the best models when it comes to agents. It really isn't close. ChatGPT 5.5 is usable.

10:07Claude is the goat, though. The issue, again, very expensive. Sonnet five brings those costs down.

10:12I'd recommend using Sonnet five through the API. As for Fable five, it looks like it is going to return soon. A bunch of strings have been found in the Claude Code around Fable five, including looking like it is going to require API usage as well as looking like it is going to require verification.

10:32So you're actually going to need to identity verify to make sure you're in the United States of America. If you're outside the US, I'm so sorry. If you're inside the US, prepare to give up your identity in order to use it, which is fine.

10:45I guess it is what it is. I just want the model back. I personally will be giving the identification so I can use the model.

10:51I don't do anything crazy or illegal with AI, so I really have nothing to fear. But good news, looks like Fable five's coming back very, very soon. The bad news is, uh, you're gonna have to pay API pricing, and you're gonna have to give up your identity in order to use it.

11:05It is what it is. That is Sonnet five. It is not replacing Opus four eight for me.

11:10It's only replacing Opus four eight for cheap and quick and easy tasks in times in which I'm looking to save money, like with using OpenClaw and Hermes because I am spending thousands a month on those. This should bring down my bills by a little bit while getting me comparable performance to Opus four six. So it is not a full replacement for Opus.

11:30It is a replacement in very strategic areas. If you learned anything at all, leave a like down below, subscribe, turn on notifications. All I do is make amazing videos about AI. Doing a full live boot camp on Sonnet five this week in the Vibe Coding Academy.

11:45Link for that is down below. It's the number one community in AI on the entire Internet. Make sure to join.

11:50You will learn a ton. It'll be the best time of your life. Sign up for that.

11:53Hope this was helpful. See you in the next video.

The Hook

The bait, then the rug-pull.

Sonnet 5 is not a patch. It is a full generational number — and in the first thirty seconds, the host has already declared it the highest-value model in AI right now. What follows is the evidence: benchmarks, a live head-to-head against ChatGPT 5.5, a real $1,300-per-month bill cut in half, and a two-model workflow that separates the expensive thinking from the cheap execution.

Frameworks