
Nate Herk | AI Automation · YouTube

Claude Fable 5 Made This Entire Video By Itself

A 5-minute video that proves its own thesis: one prompt, no filming, no editing, a finished YouTube video.

Posted: June 12th (1 month ago)
Duration: 05:46
Format: Demo, educational
Views: 9K
Likes: 943

Part of the collection: The Fable 5 Playbook. All 45 Fable 5 breakdowns, synthesized into one page.

Read the playbook

Big Idea

The argument in one line.

Claude Fable 5's long-horizon focus makes it the first model capable of coordinating a complete multi-tool video production pipeline (script, voice, avatar, motion graphics, editing, and self-verification) from a single prompt without human oversight.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You use Claude Code professionally and want to understand what the Mythos-class Fable 5 tier unlocks versus Sonnet or Opus.
You are building AI-powered content production pipelines and want a real end-to-end example with real token costs.
You work with ElevenLabs, HeyGen, or ffmpeg and want to see how Claude orchestrates them inside a /goal session.
You are evaluating whether a $200-per-month Max plan is worth it for agentic video production use cases.

SKIP IF…

You want a step-by-step tutorial you can copy: the creator says the same prompt probably will not reproduce his results without his pre-built Hyperframe skills.
You are not already familiar with Claude Code's /goal command and skill system — this video assumes fluency with both.

TL;DR

The full version, fast.

Nate Herk used Claude Fable 5 and a single /goal prompt to generate a complete YouTube video (researched script, ElevenLabs voice clone, HeyGen avatar, ffmpeg editing, GSAP motion graphics) in about one hour. The model self-verified by rendering frames and re-rendering failures. Total cost: roughly 380K tokens, or about 40% of a $200-per-month Max plan. Honest caveats: the same prompt probably will not reproduce the result without his pre-built Hyperframe skills, the verification sub-agents ran on models other than Fable 5, and Sonnet is probably sufficient now that the pipeline exists.


Chapters

Where the time goes.

00:00 – 00:30

01 · The artifact plays

AI-generated video runs with on-screen overlays revealing AVATAR SYNTHETIC, VOICE CLONED, SCRIPT CLAUDE.

00:30 – 01:56

02 · Fable 5 capability overview

Mythos tier, Stripe 50M-line migration, Pokemon FireRed vision test, Slay the Spire long-horizon test, pricing.

01:56 – 03:02

03 · Pipeline walkthrough

Script via voice playbook, ElevenLabs chunks, HeyGen Avatar 5, Playwright workaround, ffmpeg stitch, GSAP graphics, self-verification.

03:02 – 03:56

04 · Fourth wall break

Creator steps out of the AI artifact and speaks directly — confirms the video was entirely AI-generated.

03:56 – 05:46

05 · Actual Claude Code session

Screen recording: ~380K tokens, 40% of a $200/month plan consumed in one hour, full /goal prompt on screen, caveats about replicability.

Atomic Insights

Lines worth screenshotting.

Claude used Playwright to click through HeyGen's UI while the API did not yet expose Avatar 5; browser automation is the escape hatch when APIs lag behind. For this video the newer API exposed the model, so the render went straight through.
Voice generation chunks must stay under 60 seconds or the cloned voice starts to drift — a constraint that shapes the entire pipeline architecture.
The /goal prompt's most effective line was a reputation stake: 'you should only stop when you are 100% confident — it will damage my reputation.' Consequences outperform specifications.
Self-verification via frame rendering — render, review visually, re-render failures — is the quality gate that makes fully unsupervised production viable.
Sub-agents in the verification workflow ran on models below Fable 5; only the orchestrator was Mythos-class, which kept costs to 380K tokens.
One hour of Fable 5 work consumed 40% of a $200-per-month Max plan; the economics only work for content you would otherwise pay a human editor significant money to produce.
The Mythos tier was locked to vetted security partners until Fable 5 — this video is a first-week stress test by a practitioner, not a controlled benchmark.
Claude built every motion graphic card as live HTML animated with GSAP, then rendered it into the video — code as compositing tool, not just logic.
Fable 5 beat Pokemon FireRed using only raw screenshots with no maps or navigation aids — older Claude models needed a full helper harness.
In the Stripe example from the announcement, a 50-million-line Ruby codebase migration that would have taken a whole team over two months by hand ran in a single day.
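The render, review, re-render gate described in the insights above can be sketched as a small retry loop. This is a minimal sketch under stated assumptions, not the creator's workflow: the video does not show how frames were rendered or judged, so `render` and `passes_review` are hypothetical callables supplied by the caller, and the retry cap of three is an arbitrary choice.

```python
from typing import Callable, Iterable


def verify_scenes(
    scene_ids: Iterable[str],
    render: Callable[[str], object],                # hypothetical: renders a frame for a scene
    passes_review: Callable[[str, object], bool],   # hypothetical: visual check of that frame
    max_attempts: int = 3,                          # assumed cap, not from the video
) -> dict[str, int]:
    """Render each scene, review it, and re-render failures.

    Returns the number of render attempts each scene needed.
    Raises RuntimeError if a scene never passes within max_attempts.
    """
    attempts: dict[str, int] = {}
    for scene in scene_ids:
        for attempt in range(1, max_attempts + 1):
            frame = render(scene)
            if passes_review(scene, frame):
                attempts[scene] = attempt
                break
        else:
            # The inner loop finished without a passing review.
            raise RuntimeError(
                f"scene {scene!r} still failing after {max_attempts} renders"
            )
    return attempts
```

Capping the attempts matters in an unsupervised run: without a limit, a scene that can never pass would keep consuming tokens.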

Takeaway

One prompt built a finished video — here is what the cost means.

WHAT TO LEARN

The pipeline is real, but the roughly $80 of plan usage (40% of $200) and the pre-built skill dependency are the honest constraints behind the headline.

Chunking TTS generation into sub-60-second segments prevents voice drift — a non-obvious constraint that shapes the entire pipeline architecture for any AI voiceover workflow.
When an API does not expose a feature you need, Playwright browser automation is the escape hatch — Claude drove HeyGen's UI by hand until the API caught up.
Self-verification via frame rendering — render frames, review visually, re-render failures — is the quality gate that makes fully unsupervised production viable without a human review loop.
A reputation stake in the prompt ('it will damage my reputation') outperforms a specification list as a quality signal — giving the model context for why quality matters produces better judgment.
Sub-agents handling verification ran on models below Fable 5; only the orchestrator was Mythos-class — a cost-management pattern worth copying for any long-horizon agentic workflow.
One hour of Fable 5 orchestration consumed 40% of a $200-per-month Max plan; the economics only hold for content you would otherwise spend significant human time and money producing.
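The cost figures above reduce to simple arithmetic. The sketch below uses only numbers quoted in the video (a $200 plan, 40% consumed, about 380K tokens, $10 and $50 per million input and output tokens); the function names are illustrative. The video does not say how plan usage is metered or how the 380K tokens split between input and output, so the API-price range is only a bracket and need not match the plan share.

```python
PLAN_PRICE_USD = 200.0        # Max plan price quoted in the video
PLAN_SHARE_USED = 0.40        # share of the plan the session consumed, per the creator
SESSION_TOKENS = 380_000      # approximate session total, per the creator
INPUT_USD_PER_MTOK = 10.0     # API list price quoted in the video
OUTPUT_USD_PER_MTOK = 50.0    # API list price quoted in the video


def plan_cost(price_usd: float, share_used: float) -> float:
    """Dollar value of the plan share consumed by one session."""
    return round(price_usd * share_used, 2)


def api_cost_bounds(tokens: int, in_rate: float, out_rate: float) -> tuple[float, float]:
    """Bracket the API list-price cost: all tokens as input vs. all as output."""
    mtok = tokens / 1_000_000
    return (round(mtok * in_rate, 2), round(mtok * out_rate, 2))


print(plan_cost(PLAN_PRICE_USD, PLAN_SHARE_USED))  # 80.0
print(api_cost_bounds(SESSION_TOKENS, INPUT_USD_PER_MTOK, OUTPUT_USD_PER_MTOK))  # (3.8, 19.0)
```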

Glossary

Terms worth knowing.

Mythos class: Anthropic's model tier above Opus, previously restricted to vetted security partners. Fable 5 is the first Mythos-class model on a standard paid plan.
Hyperframes: A Claude Code skill system for generating HTML/GSAP animated cards that render as video motion graphics inside ffmpeg compositions.
/goal: A Claude Code slash command that frames the session around a single outcome the model pursues autonomously, spawning sub-agents and tools until the goal is complete.
Avatar 5: HeyGen's newest motion engine for AI avatar video rendering. For a while it could not be selected through HeyGen's API; by the time of this video the new API exposed it.
Long-horizon focus: The ability to maintain task coherence across millions of tokens — the primary improvement Anthropic cites for Fable 5 over Opus 4.
Voice drift: Degradation in voice clone quality when a TTS generation runs too long; mitigated by splitting scripts into sub-60-second chunks.
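The sub-60-second chunking described under voice drift can be sketched as a splitter that groups whole sentences under an estimated speaking-time budget. This is a sketch under assumptions, not the creator's code: the 150 words-per-minute rate, the 55-second budget, and the function name `chunk_script` are all illustrative, and no TTS API is called.

```python
import re


def chunk_script(
    script: str,
    max_seconds: float = 55.0,        # assumed budget, just under a minute
    words_per_minute: float = 150.0,  # assumed speaking rate
) -> list[str]:
    """Group sentences into chunks whose estimated spoken length stays under max_seconds.

    A single sentence longer than the budget becomes its own oversized chunk.
    """
    max_words = int(max_seconds * words_per_minute / 60.0)
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]

    chunks: list[str] = []
    current: list[str] = []
    current_words = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk when this sentence would exceed the budget.
        if current and current_words + n > max_words:
            chunks.append(" ".join(current))
            current, current_words = [], 0
        current.append(sentence)
        current_words += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries rather than at a fixed word count keeps each generated clip from starting or ending mid-sentence, which should make the later stitch less audible.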

Resources

Things they pointed at.

01:56 · tool · ElevenLabs
02:10 · tool · HeyGen Avatar 5
02:35 · tool · Playwright
02:45 · tool · ffmpeg
02:45 · tool · GSAP

Quotables

Lines you could clip.

00:00

“I just typed one prompt into Claude Code and walked away. And everything else — the research, the script, the voice, the avatar, the motion graphics — all of it happened on its own.”

Self-contained, no setup needed, captures the thesis in two sentences → TikTok hook

03:56

“This ate up about 40% of my $200 a month plan. So in one hour, it ate up almost half of the plan.”

Honest cost disclosure: the counterpoint everyone needs after the flex → IG reel cold open

05:10

“I said, you should only stop when you are 100% confident that this is a high quality video. This will be going out to my YouTube channel. So if it doesn't look good, it will damage my reputation.”

The prompt engineering insight: reputation stakes as quality signal → newsletter pull-quote

The Script

Word for word.


So I literally just opened up Claude Fable, gave it this slash goal, went down to the gym, and came back to this. What you're watching right now was not filmed. This avatar is AI.

The voice you're hearing is a clone of mine, and every single word of the script was written by Claude. I didn't write this. I didn't film it.

I didn't edit it. And while it was being made, I never saw a single frame of it. I just typed one prompt into Claude Code and walked away.

And everything else, the research, the script, the voice, the avatar, the motion graphics, all of it happened on its own. So this week, Anthropic released Claude Fable 5, and that's basically the only reason this video can exist. It's the first time a Mythos-class model, that's the tier above Opus, has been available to anyone on a paid plan.

Until now, that tier was locked to vetted security partners, and it's state of the art on nearly every benchmark they tested. So let me show you guys what this thing is actually good at and then exactly how it made this video.

So the coding numbers first because they're kind of nuts. Stripe said Fable 5 compressed months of engineering into days. And in the announcement, there's a 50-million-line Ruby codebase where it ran a full migration in a single day, a job that would have taken a whole team over two months by hand.

And Vision took a big jump too. It can rebuild a web app's source code just from screenshots, and it actually beat Pokemon FireRed start to finish on raw screenshots alone.

No maps, no navigation aids, where older Claude models needed a whole helper harness just to play. But the one that matters most for this video is long horizon focus. This thing stays locked in across millions of tokens.

Anthropic gave it a file based memory, like literally just files it could write notes to and had it play Slay the Spire, and it reached the final act three times more often than Opus four eight. Now it's not cheap. $10 per million input tokens, 50 on the output, but you guys are about to see what that buys you.

Okay. So real quick. How did this video actually get made?

First, the script. Claude read Anthropic's full announcement, fact checked every claim you just heard, and wrote this entire thing in my voice using a voice playbook built off my actual transcripts. Then the voice.

It sent that script over to ElevenLabs where I've got a voice clone trained on my real videos. And the trick is you can't just generate, like, four straight minutes of audio because the longer a generation runs, the more the voice starts to drift. So Claude split the script into chunks just under a minute each and generated them separately.

Then every chunk went to HeyGen to render on my avatar on the Avatar 5 model, their newest motion engine. And for a while, you couldn't even select Avatar 5 through the API, so the workaround was Claude literally driving a browser with Playwright and flipping every video by hand. Their new API finally exposes it, so this one went straight through.

But at that point, it's just a pile of raw avatar clips, and nothing's been edited yet. And then the editing, which is usually the part that takes a human days. Claude stitched the avatar clips together with ffmpeg, ran a word-level transcription, and built every motion graphic in this video as actual code, HTML animated with GSAP inside Hyperframes, timed to the exact words I'm saying.

Then it checked its own work. It rendered out frames from every scene and visually reviewed them, and anything that looked off got fixed and rerendered until it all passed. So one prompt went in and a finished, fully edited YouTube video came out the other side.

That's what a Mythos class model does the same week it comes out. But anyways, that's gonna do it for this one. So if you guys enjoyed the video or learned something new, please give it a like.

It definitely helps me out a ton. And as always, I appreciate you guys making it to the end of the video. I'll see you on the next one.

Thanks, everyone. I mean, isn't that amazing? Even all those sound effects, everything in there, one shot by Claude Fable 5.

Now two quick things to keep in mind. First of all, if you copy that exact same prompt, I'm not convinced you would get the exact results because I've got a few different, like, hyperframe skills that are already in there. And then number two, I don't think you actually need Fable to do all this.

I could definitely replicate this style now that I've already built it out once. I could build a skill around it, but I think that I could replicate that style with probably even Sonnet. This is the actual session that I ran.

This only took an hour as you can see, and I used slash goal. So the goal was achieved in an hour. It took about 400,000, 380,000.

But keep in mind, I did have it spin up a dynamic workflow at the end to verify everything. So it had a bunch of agents taking screenshots and verifying everything.

Even those sound effects, all of the sound effects in that final render was built right in here with Claude Fable and Hyperframes. You'll also notice that I was on max, so obviously there was a lot of energy being put into here.

But when it spun up the sub agents, all of those sub agents in the workflow were not Fable. But you do seriously have to be careful.

This was obviously me doing an experiment, and I just wanted to see what it could do. This ate up about 40% of my $200 a month plan.

So in one hour, it ate up almost half of the plan. So obviously, be careful. You can see here it says done.

The video is ready to upload. This is where it lives. This is how long it is.

Here's what I built. Here's how it was verified. And it has this weird thing where I'm trying to scroll up to show you guys the prompt, it like cut off.

So let's see if I can recover that. Okay. So here is the exact goal prompt that I set.

I'm not going to read this entire thing, but you guys can pause the video and read it if you want. You'll notice here at the end what I did is I gave it context. I said, you should only stop when you are 100% confident that this is a high quality video.

This will be going out to my YouTube channel. So if it doesn't look good, you know, it's high risk. It will damage my reputation.

Now obviously, like with slash goal, you wanna do things that are pretty objective, but I have found that when I give it context so it understands why we're doing something, it tends to understand a little bit better. I also said here, now after you build it, verify it. Use a dynamic workflow to visually verify and validate that the entire video is perfect.

The motion graphics come in on time. There's nothing out of bounds. Everything is aesthetic, and everything fits within the goal of a completely finished and fully vetted and reviewed YouTube video.

So anyways, this was my Glido word vomit into Claude goal, and then that's what we got. So that is gonna do it for today. You'll even notice that my HeyGen ended the video the exact same way I always do, which is if you enjoyed the video or you learned something new, please give it a like.

Definitely helps me a ton. And as always, I appreciate you guys making it to the end of the video. I'll see you all in the next one.

Thanks, guys.

The Hook

The bait, then the rug-pull.

What you're watching was not filmed. The avatar is synthetic, the voice is a clone, and every word of the script was written by Claude. The creator typed one prompt, walked away, and came back to a finished video he had never seen.

Frameworks

Named ideas worth stealing.

01:56 · list

AI video production pipeline

Script: Claude reads source, fact-checks, writes in creator voice
Voice: ElevenLabs, chunked under 60s to prevent drift
Avatar: HeyGen Avatar 5, Playwright workaround if API lacks it
Edit: ffmpeg stitch plus word-level transcription
Motion graphics: HTML/GSAP via Hyperframes, rendered into video
Verification: frame render, visual review, re-render failures

Six-stage pipeline Claude orchestrated autonomously in one /goal session
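The stitch stage of the pipeline above can be illustrated with ffmpeg's concat demuxer. The video does not show the actual command Claude ran, so this is an assumed approach: the helper names and file names are hypothetical, and `-c copy` only works when every clip shares the same codec and encoding parameters (otherwise a re-encode is needed). The sketch only builds the list file contents and the argument list; running it requires ffmpeg on the PATH.

```python
def concat_list(clips: list[str]) -> str:
    """Build the text of an ffmpeg concat-demuxer list file, one clip per line."""
    lines = []
    for clip in clips:
        # Inside single quotes, a literal quote is written as '\'' in the list file.
        escaped = clip.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_path: str, output_path: str) -> list[str]:
    """Argument list for a stream-copy concat: no re-encode, clips joined in order."""
    return [
        "ffmpeg",
        "-f", "concat",   # use the concat demuxer
        "-safe", "0",     # allow paths the demuxer would otherwise reject as unsafe
        "-i", list_path,
        "-c", "copy",     # copy streams instead of re-encoding
        output_path,
    ]


# Hypothetical clip names, one per sub-60-second voice chunk.
print(concat_list(["chunk_01.mp4", "chunk_02.mp4"]), end="")
print(concat_command("clips.txt", "stitched.mp4"))
```

The argument list can be passed to `subprocess.run(..., check=True)` once the list text has been written to `clips.txt`.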

Steal for: Any AI content production workflow; the chunk-to-prevent-drift insight alone is worth stealing

CTA Breakdown

How they asked for the click.

VERBAL ASK

03:03 · subscribe

“if you guys enjoyed the video or learned something new, please give it a like”

Delivered by the AI avatar before the creator breaks the fourth wall — structurally clever because the AI delivered the CTA before the human revealed the trick

MENTIONED ON CAMERA

01:56 · tool · ElevenLabs
02:10 · tool · HeyGen Avatar 5
02:35 · tool · Playwright
02:45 · tool · ffmpeg
02:45 · tool · GSAP


Storyboard

Visual structure at a glance.

00:00 · hook · open
00:19 · hook · reveals
00:36 · value · model tier
02:03 · value · pipeline
03:27 · value · voice drift
04:06 · value · code = cards
05:03 · hook · fourth wall
05:46 · cta · session
