Big Idea

The argument in one line.

Workflows do not change what subagents are; they move the orchestrator out of the chat context and into a deterministic script, which is what finally lets agent fan-out scale without the main session forgetting itself.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

An engineer who already uses subagents and keeps hitting context bloat when the main session has to manage six or more of them at once.
Someone deciding whether to reach for a multi-agent workflow or a plain skill for a recurring task, who wants the actual tradeoff rather than hype.
A Claude Code user on a Pro plan who needs to know why workflows are off by default and what the token bill actually looks like before turning them on.
A non-developer curious whether massively parallel agents are useful outside of code, such as for deep research or idea stress-testing.

SKIP IF…

You have never used Claude Code and want a from-scratch setup guide; this assumes you already know what a subagent is.
You need a stable production system today; the recommendation here is to keep using skills, not workflows.
You are looking for prompt-engineering tactics rather than an architecture-level explanation of how orchestration moved into a script.

TL;DR

The full version, fast.

Claude Code's Dynamic Workflows move orchestration out of the chat context and into a deterministic workflow.js script that holds state in variables, runs loops, and returns only final answers to the main session. The subagents themselves are unchanged; what changed is the manager, which is why fan-out to hundreds of agents no longer makes the main window forget or compact under load. A live deep-research run shows the cost honestly: 105 agents and 3 million tokens over fifteen minutes, most of it spent on adversarial three-vote fact-checking. You control it at three levels (prompt, inspect, edit the script) and can set a different model per phase. Anthropic's own guidance is to reach for a workflow only when a task fans out across many similar items or needs deterministic, resumable orchestration. For everything else, a skill or plain chat is cheaper and more reliable.
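
To make the mechanism concrete, here is a minimal sketch of a script-side orchestrator. It is not the real workflow.js API, which this breakdown does not show: `runAgent`, the `phases` array, and the use of plain strings for model names are all hypothetical stand-ins. The only point is the shape: intermediate results live in ordinary variables inside the script, and one short answer is what goes back to the main session.

```javascript
// Hypothetical stand-in for spawning a subagent. A real agent would do
// expensive work in its own context; here it just returns a short answer.
function runAgent(model, task) {
  return `[${model}] ${task}: done`;
}

// Assumed phase layout: a cheaper model for high-volume phases and a
// stronger one for final synthesis. Names and structure are illustrative.
const phases = [
  { name: "brainstorm", model: "haiku", tasks: ["angle A", "angle B", "angle C"] },
  { name: "score", model: "sonnet", tasks: ["rank the angles"] },
  { name: "synthesize", model: "opus", tasks: ["write the final report"] },
];

function countAgents(phaseList) {
  return phaseList.reduce((total, phase) => total + phase.tasks.length, 0);
}

function runWorkflow(phaseList) {
  // State lives in a plain variable inside the script, not in the chat context.
  const results = {};
  for (const phase of phaseList) {
    results[phase.name] = phase.tasks.map((task) => runAgent(phase.model, task));
  }
  // Only the last phase's output is handed back to the main session.
  const lastPhase = phaseList[phaseList.length - 1];
  return {
    finalAnswer: results[lastPhase.name][0],
    agentsUsed: countAgents(phaseList),
  };
}

const outcome = runWorkflow(phases);
console.log(outcome.finalAnswer);
console.log(`agents used: ${outcome.agentsUsed}`);
```

Everything in `results` except the final entry is discarded when the function returns, which is the property the breakdown credits for keeping the main context small.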

Chapters

Where the time goes.

00:00 – 00:17

01 · Intro

Workflows, not Opus, were the most valuable announcement; what the video will cover.

00:17 – 04:22

02 · How Workflows Actually Work

Subagent recap, why Claude-as-orchestrator breaks at scale, the shift to a workflow.js manager, runtime, journal, and hard limits (16 concurrent, 1000 total, no shell from the script).

04:22 – 09:00

03 · Live Demo: deep-research + Startup Forge

Runs /deep-research on vitamin C through its five phases, then a Claude-invented Startup Forge workflow that ideates, judges, stress-tests, and pitches.

09:00 – 14:59

04 · Inside The Script + How To Control It

Model-per-phase shown in the .js, editing the script, the deep-research run finishing at 105 agents and 3 million tokens, and where most agents went.

14:59 – 16:16

05 · When To Actually Use This (And When Not To)

Three control levels, four ways to start and three to turn off, default-on/off by plan, and Anthropic's verbatim criteria for when a workflow beats a skill.

Atomic Insights

Lines worth screenshotting.

Workflows do not change what subagents are; they only change who orchestrates them, moving the manager from Claude's context into a script.
A subagent runs an expensive task in its own isolated context and returns only the small final answer, so a 60,000-token job costs the main session about 500 tokens.
Claude breaks down as an orchestrator at scale because it has to hold every subagent's intermediate state, routing decision, and result inside its own context window.
A workflow.js script holds state in variables and runs deterministic loops, so only final answers return to the chat and nothing bloats the main context.
A run journal tracks completed agents, so you can pause a workflow and resume it later with finished work returned from cache instead of rerun.
Workflows cap at 16 concurrent agents but allow up to 1,000 agents total per run, so a large swarm still completes, just not all at once.
The script itself has no direct filesystem or shell access; only the agents it spawns can touch files, which keeps the orchestrator sandboxed.
The deep-research run used 105 agents and 3 million tokens in fifteen minutes, and roughly 75 of those agents went to verification alone.
Verification dominates token cost because each of the top 25 claims gets three independent fact-checkers, and a claim is killed only when two of the three refute it.
Isolating work into subagents does not make it cheaper; the tokens still count fully against your usage, so a single run can still cost around three million tokens.
You can assign a different model to each phase, such as Haiku for brainstorming, Sonnet for scoring, and Opus for synthesis, by editing the script or asking up front.
There are four ways to start a workflow and three ways to turn it off, including a config toggle, a settings.json flag, and an environment variable.
Workflows are on by default for Max and team plans but off by default on Pro, because an unbounded run can consume an entire plan's budget in a day.
Anthropic's own rule is to use a workflow only when a task fans out across many similar items, needs deterministic loops, or must be resumable and repeatable.
For recurring production work, a skill beats a workflow because you want the same deterministic behavior every day without paying the multi-agent token premium.
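
The 16-concurrent, 1,000-total limits above describe a familiar pattern: a bounded worker pool. The sketch below shows one way such a cap could be enforced in plain JavaScript. It is an assumption about the general technique, not the runtime's actual implementation; `runWithCap` and the simulated `worker` are hypothetical names.

```javascript
const MAX_CONCURRENT = 16; // concurrency cap quoted in the breakdown
const MAX_TOTAL = 1000; // total-agents-per-run limit quoted in the breakdown

// Runs every task, never more than `limit` at once. `worker` is a
// hypothetical async function standing in for one agent.
async function runWithCap(tasks, worker, limit = MAX_CONCURRENT) {
  if (tasks.length > MAX_TOTAL) {
    throw new Error(`run exceeds ${MAX_TOTAL} agents`);
  }
  const results = new Array(tasks.length);
  let next = 0;
  // Each lane pulls the next unclaimed task until none remain.
  async function lane() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await worker(tasks[index], index);
    }
  }
  const lanes = Array.from({ length: Math.min(limit, tasks.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}

// Demo: 105 simulated agents, tracking how many run at the same time.
async function demo() {
  let active = 0;
  let peak = 0;
  const tasks = Array.from({ length: 105 }, (_, i) => `task ${i + 1}`);
  const results = await runWithCap(tasks, async (task) => {
    active++;
    peak = Math.max(peak, active);
    await new Promise((resolve) => setTimeout(resolve, 1)); // simulated work
    active--;
    return `${task}: finished`;
  });
  return { count: results.length, peak };
}

demo().then(({ count, peak }) => {
  console.log(`${count} agents finished, peak concurrency ${peak}`);
});
```

A swarm larger than the cap still completes; the extra tasks simply wait for a free lane, which matches the "just not all at once" behavior described above.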

Takeaway

When agent fan-out earns its token bill.

WHAT TO LEARN

Workflows are worth reaching for only when a task fans out across many similar items or needs deterministic, resumable orchestration, because everything else costs less as a skill or plain chat.

A subagent runs its expensive work in an isolated context and hands back only the small answer, so the orchestrating session never sees the bloat.
The chat window fails as an orchestrator at scale because it must hold every agent's intermediate state, routing, and results in its own limited context.
Moving orchestration into a script keeps state in variables and returns only final answers, which is the actual mechanism that lets fan-out scale.
A run journal records completed agents, so a workflow can pause and resume with finished work returned from cache rather than rerun from scratch.
Concurrency caps at 16 agents while a run can total up to 1,000, so large swarms still finish but are throttled rather than run all at once.
Isolating work into agents does not lower the bill: the deep-research demo still burned three million tokens across 105 agents in fifteen minutes.
Verification is the cost sink because each top claim gets three independent fact-checkers and is only discarded when two of them refute it.
Assign a cheaper model to high-volume phases and an expensive one only to final synthesis to keep multi-agent runs affordable.
Reach for a workflow when the work fans out, loops deterministically, needs resumability, or repeats; otherwise a skill or chat is cheaper and steadier.
Treat agent count and token spend as the real constraint, not capability, since the limiting factor is cost long before the feature runs out of power.
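
The verification arithmetic is easy to check by hand, and a few lines of code make the rule explicit. This is a sketch of the voting rule as the breakdown describes it (three checkers per claim, two refutations to discard), not code taken from the deep-research script; the function names are hypothetical.

```javascript
const CHECKERS_PER_CLAIM = 3; // independent fact-checkers per claim
const REFUTES_TO_KILL = 2; // refutations needed to discard a claim

// votes: array of booleans, where true means "this checker refuted the claim".
function survives(votes) {
  const refutes = votes.filter(Boolean).length;
  return refutes < REFUTES_TO_KILL;
}

// Number of verification agents needed for a given number of claims.
function verificationAgents(claimCount) {
  return claimCount * CHECKERS_PER_CLAIM;
}

console.log(verificationAgents(25)); // 75
console.log(survives([true, false, false])); // true
console.log(survives([true, true, false])); // false
```

With 25 claims the rule accounts for 75 agents, which lines up with the roughly 75 of 105 agents the demo spent on verification.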

Glossary

Terms worth knowing.

Subagent: A separate Claude session spawned with its own isolated context window to perform one specific task and return only its result, keeping the main conversation's context clean.
Dynamic Workflow: A Claude Code feature where a generated script orchestrates many subagents instead of the chat session doing it, holding state in variables and running deterministic loops.
workflow.js: The JavaScript script that acts as the workflow's orchestrator. It defines phases, which model runs each phase, parallelism, and budget limits, and can be inspected or hand-edited before running.
Journal: A state record kept during a workflow run that tracks which agents have completed, enabling a run to be paused and resumed with finished work returned from cache.
Orchestrator: Whatever holds the plan and coordinates subagents. The point of workflows is that this role moves from Claude's chat context into the deterministic script.
Compaction: When a long Claude session nears its context ceiling, older detail is summarized to make room. The summary is lossy, so earlier information is not fully recoverable.
Adversarial fact-checking: A verification pattern where multiple independent agents try to refute each claim, and the claim is discarded only if a majority of them succeed in refuting it.
Ultracode: An effort setting that auto-triggers a workflow for every substantive task rather than waiting for you to ask for one explicitly.
Persona prompting: Assigning each agent a distinct point of view or role, such as a skeptic with a specific lens, so parallel agents attack a problem from genuinely different angles.
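
The Journal entry above can be illustrated with a small resume-from-cache sketch. The real journal format is not shown in this breakdown, so the `Map` keyed by task id, `runWithJournal`, and the `agent` stub are all assumptions made for illustration.

```javascript
// Hypothetical journal: maps a task id to its finished result.
function runWithJournal(tasks, journal, runAgent) {
  const output = [];
  let executed = 0;
  for (const task of tasks) {
    if (!journal.has(task.id)) {
      // Record the result as soon as the agent finishes.
      journal.set(task.id, runAgent(task));
      executed++;
    }
    // Finished work is always read back from the journal.
    output.push(journal.get(task.id));
  }
  return { output, executed };
}

const journal = new Map();
const tasks = ["source 1", "source 2", "source 3", "source 4"].map(
  (prompt, i) => ({ id: `task-${i}`, prompt })
);
const agent = (task) => `summary of ${task.prompt}`; // stand-in for a subagent

// Simulate a pause after two tasks, then resume with the full list.
const firstPass = runWithJournal(tasks.slice(0, 2), journal, agent);
const resumed = runWithJournal(tasks, journal, agent);

console.log(firstPass.executed, resumed.executed); // 2 2
console.log(resumed.output.length); // 4
```

On resume only the two unfinished tasks run again; the first two come straight from the journal, which is the behavior the glossary describes as returning finished work from cache.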

Resources

Things they pointed at.

05:05 · tool · /deep-research skill (Anthropic pre-built workflow)

09:00 · product · Claude Code Dynamic Workflows

00:00 · product · Opus 4.8

15:25 · link · Anthropic workflows docs (code.claude.com/docs/en/workflows)

Quotables