Big Idea

The argument in one line.

Subagents are not a power-user edge case — they are the correct default any time a task would dump context into your main session that you will never re-read, and the five-signal decision framework tells you exactly when that threshold is crossed.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You use Claude Code daily and have wondered when to reach for subagents versus keeping work in your main session.
You have hit context-window pressure mid-project and want a systematic way to prevent it.
You want to reduce API costs by routing research and summarization tasks to cheaper models.
You are building repeatable workflows (security audits, research passes, doc generation) and want them to fire automatically without explicit invocation.
You understand Claude Code basics — YAML front matter, the .claude folder — and are ready to go deeper on agent architecture.

SKIP IF…

You have never used Claude Code — this assumes fluency with the tool's project structure and slash commands.
You need multi-agent team orchestration where agents talk to each other; subagents here are strictly one-to-one with the main session.

TL;DR

The full version, fast.

A subagent is a fresh, isolated Claude session your main session delegates work to — it keeps your context clean, can run a different model tier to save money, and fires automatically when its YAML front-matter description matches the task at hand. You build one by creating a single Markdown file with a name, description, model, and tools in the YAML block. The description is the trigger — too vague and it misfires, too broad and it collides with skills. The decision rule is simple: if the task would dump a pile of output you will never re-read into your main session, it belongs in a subagent.

Chapters

Where the time goes.

00:00 – 01:24

01 · Cold open: five-persona book review

Five subagents — beginner, software engineer, business owner, publisher, enterprise exec — run in parallel reviewing a book chapter. Demo is live before any explanation.

01:24 – 03:31

02 · What is a subagent

Orchestrator/specialist framing. Main session delegates; subagents report back. Context-clean benefit explained. Token counter shown live.

03:31 – 05:31

03 · Built-in vs custom agents + live research demo

Built-in agents fire automatically; custom agents are Markdown files. Desktop app demo: researcher subagent spun up to investigate Fireflies.ai using Haiku while Opus stays in main session.

05:31 – 06:55

04 · Anatomy of a subagent file

Single Markdown file = YAML front matter + instructions body. Walkthrough of a real ClickUp-searcher agent. Color field shown live (green badge in desktop app).

06:55 – 08:40

05 · The four settings that matter

name, description, tools, model. Description is the trigger — precision prevents misfires. Claude Code docs shown as the canonical reference for all YAML options.

08:40 – 09:31

06 · How to write one that actually works

Precise description, then a tight instructions body. Agents and skills compose — skills can invoke subagents and vice versa. Iteration is required; first try will not be perfect.

09:31 – 10:51

07 · Skills vs subagents

Core difference: clean context window + ability to run a different model + parallel execution. Skills fire in the main session; subagents get their own. They are complementary.

10:51 – 12:51

08 · Project-level vs global-level

Project agents travel with the repo. Global agents are personal tools usable across every project. Can exist in both locations simultaneously. session-handoff skill example used.

12:51 – 18:11

09 · Building one live with /agents

Slash command walkthrough: create, scope (project), generate with Claude, trim description, set tools to read-only, pick Haiku, set memory to project. Real misfire diagnosed: unclosed YAML quote broke the front matter silently. Fix demonstrated.

18:11 – 20:38

10 · Subagents as specialists + awesome-claude-code-subagents repo

Assembly-line of specialists framing. GitHub repo of community-built subagents (API designer, backend dev, GraphQL architect, TypeScript, SQL, etc.). Warning: check for prompt injections in downloaded Markdown.

20:38 – 22:11

11 · Save money + read-only permission layers

Smart boss / cheap workers model: Opus orchestrates, Haiku does the reading. Tool restrictions are a hard permission layer; prompting 'don't write files' is just a suggestion. Max turns explained.

22:11 – 23:46

12 · When to use one — the five signals

Decision framework: reading many files, generating walls of output, repeating jobs, parallel independent tasks, unbiased reviewer. Skip if: quick edit, sequential steps, agents need to talk to each other, or the subagent needs the full conversation context.

23:46 – 26:42

13 · Dynamic workflows + recap

Dynamic workflows (UltraCode keyword) can spin up 40+ agents at once — powerful but burns session limits fast. Recap slide: share via project repo, personal ones go global, cheap workers model, parallel jobs, UltraCode for giant parallelism.

Atomic Insights

Lines worth screenshotting.

A subagent is just a Markdown file — the same format as a skill, but with a clean context window and the option to run on a different model.
The description field in your YAML front matter is the trigger, not documentation — precision there is what separates reliable invocation from constant misfires.
Progressive disclosure means Claude reads only the YAML front matter first; it pulls the full body only when it decides to invoke the agent, saving tokens on every prompt.
Subagents cannot talk to each other — if five agents need shared state, you need an agent team, not a collection of subagents.
A read-only subagent enforced by tool restrictions is categorically safer than a prompting instruction that says 'don't write files' — one is a permission layer, the other is a suggestion.
Delegating a 300-page research read to a Haiku subagent and receiving only a summary is the cheapest way to keep Opus focused on reasoning rather than ingestion.
Dynamic workflows (keyword: UltraCode) can spin up 40+ subagents at once — they eat through session limits fast enough that casual use will surprise you.
Skills can invoke subagents, and subagents can invoke skills — they compose, they do not compete.
The parallel-jobs use case is the most underused: if 15 independent tasks can run simultaneously, doing them serially is just wasted time.
A subagent with no memory (memory: none) wakes up completely blind every time — useful when you need an unbiased reviewer who has not seen your prior decisions.
Project-level agents travel with the repo; global agents are personal tools available in every project. A subagent placed in both locations is valid.
The YAML front matter must be closed correctly — an unclosed quote silently breaks the entire agent the same way it breaks JSON.
When a subagent finishes, only the summary it returns pollutes your main context — not the 22,000 tokens it consumed doing the work.
The 'inherit from parent' model option means a subagent scales automatically with your main session's model without manual updates.
You can launch a terminal session already inside a subagent via the --agent-name flag — useful for focused work but rarely needed in practice.

Takeaway

One Markdown file, one decision rule, one cost trick.

WHAT TO LEARN

Subagents are not advanced infrastructure: they are a single file with a handful of YAML fields, and the only real decision is whether the work would dump context you will never re-read.

01 · Cold open: five-persona book review

Spinning up five subagents with distinct personas in parallel is a one-prompt operation — the demo is live before any explanation to show, not tell.

02 · What is a subagent

A subagent is a fresh session: it has no memory of your main context, runs independently, and only the summary it returns can pollute your main window.
The context-clean benefit is most visible with token counter monitoring — watching the main session stay flat while a subagent handles a research pass makes the value concrete.

03 · Built-in vs custom agents + live research demo

Built-in agents (general-purpose, Explore, Plan, and others) fire automatically in the background, and you have probably seen them invoked without asking; custom agents extend this to your own workflows.
Routing a research task to a Haiku subagent while your main session runs on Opus is the core cost-saving pattern: cheaper model for ingestion, expensive model for synthesis.

04 · Anatomy of a subagent file

An agent's color badge in the Claude Code desktop app is purely visual identification — it is set in the YAML as a single field and has no effect on behavior.
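
As a rough illustration of the anatomy described here (settings in the YAML front matter, instructions below), the sketch writes a minimal agent file into a project-level agents folder. The agent name `plan-roaster`, its description wording, the tool names, and the `color` value are illustrative assumptions, not the ClickUp-searcher file from the video; check the Claude Code subagents documentation for the fields and values your version accepts.

```python
from pathlib import Path
import tempfile

# Hypothetical agent definition: YAML front matter (settings) followed by the
# instructions body. Every field value here is an illustrative assumption.
AGENT_TEXT = """\
---
name: plan-roaster
description: Critiques a plan and lists what could go wrong. Use when the user asks to roast or review a plan.
tools: Read, Grep, Glob
model: haiku
color: pink
---

You are a devil's advocate reviewer.

1. Read the plan you were given.
2. List every weakness, risk, and unstated assumption you can find.
3. Return a short report, most serious problems first.
"""


def write_agent(project_root: Path, filename: str, text: str) -> Path:
    """Write an agent file under <project_root>/.claude/agents and return its path."""
    agents_dir = project_root / ".claude" / "agents"
    agents_dir.mkdir(parents=True, exist_ok=True)
    path = agents_dir / filename
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Use a throwaway directory so the sketch never touches a real project.
    with tempfile.TemporaryDirectory() as tmp:
        created = write_agent(Path(tmp), "plan-roaster.md", AGENT_TEXT)
        print(created.relative_to(tmp))
```

The same text could just as easily be saved by hand; the point is only that the whole agent is one Markdown file in one known folder.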

05 · The four settings that matter

Of all the YAML fields, description deserves the most iteration — it is the only field that determines whether Claude invokes the agent automatically or ignores it entirely.
The official Claude Code docs list every supported YAML field; using Claude to read them and generate front matter for a new agent is faster than reading the spec manually.

06 · How to write one that actually works

A subagent body is just a prompt with ordered steps — write it the same way you would write a detailed task instruction, then tighten after the first real use.

07 · Skills vs subagents

Skills and subagents are structurally identical files; the meaningful difference is that a subagent gets a fresh context window and can run a different model, while a skill runs inside the main session.
They compose: a skill can spin up subagents, and a subagent can invoke skills — the two are additive layers, not competing approaches.

08 · Project-level vs global-level

Project-level agents are the right default for team or client work — they ship with the repo so collaborators get the same agent behavior without manual setup.
Global agents are the right default for personal tools you use everywhere: session handoff, research, critique routines that are not codebase-specific.
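
To make the two scopes concrete, here is a small sketch that reports where a given agent file exists. It assumes the file is named after the agent (`<name>.md`), which is a simplification for illustration; the function name and the `researcher` example are hypothetical.

```python
from pathlib import Path


def agent_locations(name: str, project_root: Path, home: Path) -> dict[str, Path]:
    """Return the scopes ("project", "global") in which <name>.md exists."""
    candidates = {
        # Project-level: lives in the repo and ships with it.
        "project": project_root / ".claude" / "agents" / f"{name}.md",
        # Global: lives in the user's home folder, usable from any project.
        "global": home / ".claude" / "agents" / f"{name}.md",
    }
    return {scope: path for scope, path in candidates.items() if path.is_file()}


if __name__ == "__main__":
    found = agent_locations("researcher", Path.cwd(), Path.home())
    for scope, path in found.items():
        print(f"{scope}: {path}")
    if not found:
        print("not found in either scope")
```

An agent showing up under both keys is fine; as noted above, the same file can exist in both locations.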

09 · Building one live with /agents

The /agents slash command in Claude Code generates a complete Markdown file from a natural-language description — the generated description will be too long; trim it to the essential trigger phrases.
An unclosed quote in YAML front matter silently breaks the entire agent — the same class of bug as a malformed JSON key, and equally invisible until you go looking for it.
When an expected agent does not fire, the diagnostic is to explicitly ask Claude why it did not invoke it and what the description said — that conversation is the fastest way to tune the trigger.
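
Since an unclosed quote is easy to miss by eye, a quick pre-flight check can help. The sketch below is a deliberately simple checker, not a YAML parser: it only understands flat `key: value` lines, and the function name is hypothetical. It makes no assumption about how Claude Code itself reports such errors.

```python
def check_front_matter(text: str) -> list[str]:
    """Return a list of problems found in a file's YAML front matter.

    Deliberately simple: handles flat `key: value` lines only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["file does not start with '---'"]
    try:
        # Index of the closing delimiter, searching from the second line on.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return ["front matter is never closed with '---'"]

    problems = []
    for number, line in enumerate(lines[1:end], start=2):
        key, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue
        quote = value[0]
        # A value that opens with a quote must end with the same quote.
        if quote in "\"'" and (len(value) == 1 or value[-1] != quote):
            problems.append(
                f"line {number}: value for '{key.strip()}' opens with {quote} "
                "but never closes it"
            )
    return problems


if __name__ == "__main__":
    broken = '---\nname: plan-roaster\ndescription: "Roast my plan\nmodel: haiku\n---\nBody\n'
    for problem in check_front_matter(broken):
        print(problem)
```

Running this over every file in an agents folder before a session is one cheap way to catch the class of bug described above.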

10 · Subagents as specialists + community repo

Community-built subagents (awesome-claude-code-subagents on GitHub) are free Markdown files covering dozens of specialties — the only due diligence required is checking for prompt injections before adding any file to your .claude folder.
Treating each subagent as a subject-matter specialist rather than a generic helper is what makes the assembly-line model work — narrow scope in the instructions produces more reliable outputs.

11 · Save money + read-only permission layers

Tool restrictions in YAML are a hard wall — they prevent actions regardless of what the agent is prompted to do. Relying on prompt instructions alone to restrict behavior is not a permission layer.
Max turns is the guardrail for looping agents — set it when a subagent might iterate on a large codebase or research corpus and you want a forced return to the main session.

12 · When to use one — the five signals

The five signals (reading many files, generating walls of output, repeating jobs, parallel jobs, unbiased review) are the operational version of the rule of thumb; any single signal is sufficient to reach for a subagent.
The skip criteria are equally important: if the task requires agents to communicate with each other, a subagent chain is the wrong architecture; use an agent team instead.
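
The signals and skip criteria can be written down as a small checklist function. This is only a sketch of the framework as summarized here; the field names are made up for illustration, and treating any skip criterion as overriding the signals is an assumption, since the video does not spell out precedence.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # The five "use a subagent" signals.
    reads_many_files: bool = False
    generates_walls_of_output: bool = False
    is_repeating_job: bool = False
    has_parallel_independent_parts: bool = False
    needs_unbiased_review: bool = False
    # The skip criteria.
    is_quick_edit: bool = False
    steps_are_sequential: bool = False
    agents_must_talk_to_each_other: bool = False
    needs_full_conversation_context: bool = False


def should_delegate(task: Task) -> bool:
    """True when at least one signal is present and no skip criterion applies."""
    signal = any([
        task.reads_many_files,
        task.generates_walls_of_output,
        task.is_repeating_job,
        task.has_parallel_independent_parts,
        task.needs_unbiased_review,
    ])
    skip = any([
        task.is_quick_edit,
        task.steps_are_sequential,
        task.agents_must_talk_to_each_other,
        task.needs_full_conversation_context,
    ])
    # Assumption: a skip criterion wins over any signal.
    return signal and not skip


if __name__ == "__main__":
    research_pass = Task(reads_many_files=True, generates_walls_of_output=True)
    print(should_delegate(research_pass))
```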

13 · Dynamic workflows + recap

Dynamic workflows triggered by 'UltraCode' are the scaling tier above subagents — they auto-generate the delegation plan and can parallelize across 40+ agents, but they consume session quotas fast enough to notice immediately.
The summary rule applies at every scale: even a 210-agent dynamic workflow should return a distilled result to the main session, not the raw output of every agent.

Glossary

Terms worth knowing.

Subagent: A fresh, isolated Claude session spun up by a main session to handle a delegated task. It has its own context window, can run a different model, and communicates results back to the main session only when done.
Orchestrator: The main Claude Code session that assigns work to subagents and synthesizes their results. It never talks directly to other subagents — all communication routes through it.
Progressive disclosure: Claude Code's behavior of reading only a skill or agent's YAML front matter first, then pulling the full body only when it decides to invoke. Saves tokens on every prompt that doesn't match.
YAML front matter: The metadata block at the top of a subagent or skill Markdown file, delimited by triple dashes. Sets the agent's name, description, model, tools, color, and memory scope.
Dynamic workflow: A Claude Code feature (triggered by the keyword 'UltraCode') that automatically generates a multi-subagent execution plan, potentially spinning up dozens of agents in parallel for a complex task.
Tool restrictions: YAML fields (tools / disallowedTools) that explicitly define which actions a subagent can and cannot take. A read-only agent enforced this way cannot write or delete files regardless of what it is prompted to do.
Project-level vs global-level: Project-level agents live inside a repo's .claude/agents folder and ship with the codebase. Global agents live in the user's home .claude/agents folder and are available across every project on the machine.
Max turns: An optional YAML setting that caps how many back-and-forth exchanges a subagent can make before it must return to the main session, preventing runaway research or code-review loops.

Resources

Things they pointed at.

18:11 · link · awesome-claude-code-subagents (GitHub repo)

06:55 · link · Claude Code subagents documentation

03:31 · tool · Fireflies.ai

26:00 · product · Nate Herk's free Skool community (AI Automation Society)

Quotables

Lines you could clip.

22:30

“Is this about to dump a pile of stuff into my chat that I'll never read again? If that's ever yes, delegate it to a subagent.”

The entire video distilled into one decision rule. No setup required. → TikTok hook or IG reel cold open

20:38

“Have your smart boss — the Opus model you talk to on the day to day — working with a bunch of little Haiku agents. It's gonna save you a lot of money.”

Memorable framing of tiered model economics, immediately actionable. → Newsletter pull-quote or TikTok hook

14:30

“There's a big difference between a permission layer being explicit tools it's allowed to use and just prompting 'hey, don't do that, or you don't need to read that.' One is a wall. The other is a suggestion.”

Cuts through a common misconception about AI safety. Self-contained. → Newsletter pull-quote

08:40

“The description is basically the trigger. The more precise your descriptions are, the more often Claude Code will actually trigger them, and you won't get misfires.”

Single most actionable tip in the video. Short and standalone. → TikTok caption or IG carousel slide

The Script

Word for word.

00:00So I don't know what's going on up here, but I just told Claude Code to spin up five different sub agents, and they all have different personalities. One is gonna be a complete beginner. One will be a software engineer.

00:08One will be a business owner. One will be a publisher. And it comes back and it says, okay.

00:12I'm kicking off all five now, each with a distinct persona and lens. These will run in parallel. You can see that this is now running four agents.

00:19The fifth one's about to spin up. And on the bottom, if I click into a different session, so we've got the main or we've got like the beginner, and I enter this conversation, we can actually see what's going on here.

00:28Meaning if I scroll up, I can see the actual prompt that the main session kicked off to this sub agent. So here we have Linda, 58 years old, a retired elementary school teacher, you are a complete beginner to AI. And then we see the actual task which is to read all the chapters and give a bit of a review.

00:44And so all of the other sub agents probably have a very similar prompt if I go to, like, the enterprise exec. So same exact prompt except for here, you're role playing as David, 52, a COO at a 12,000 person Fortune 500 financial services company. So anyways, the point being, what we can do is have our main session up here, and the main session can delegate to as many different sub agents as we want.

01:04And all the sub agents can have different chat models, different personas, different skills, different subject matter expertise. And if you watched my video where I ranked all of my favorite Claude Code features, sub agents ranked number six. So today, what I'm gonna do is I'm gonna tell you guys exactly how to use them, what they are, when you need to use them, and how you can use them better than 99% of people using Claude Code.

01:22So let's not waste any time and get straight into today's video. Okay. So what is a sub agent?

01:26You guys just saw a demo. We have a main chat. So right here is where I said, hey.

01:29Can you spin up five different sub agents? And what it did is it right here kicked off five different ones, and then it comes back with an overall review. Apparently, I need to do some work on this book because I only got about an eight.

01:39More info on my book will be coming soon. But anyways, the main session is basically the orchestrator. It says, okay.

01:44Cool. So I am the one who's actually talking to Nate, but what I can do is I can spin up a bunch of sub agents that can only talk to me, and I can assign them work. Go read these files.

01:52Go do this research. Go fix that bug. And then you come back to me with a report of what you did, and I'll communicate that back to Nate.

01:58So there are a ton of different reasons why these sub agents are useful and why they exist. So let's just start with this first one, which is that it keeps your context clean. So let's say I'm in Claude Code.

02:06Right? And I'm just talking, hello. How are you doing?

02:08What's going on? Let's build something. Right?

02:10Like, maybe we're doing research. Maybe we're building an app, whatever it is. You start to fill up your context window, which you guys can see right here with my status line.

02:16You can see right now we're about 48,000 tokens in, 5% of the way up. And so as this starts to fill up, it starts to get polluted with information. But if you kick something off to a sub agent, as you guys saw earlier, it's a completely fresh chat.

02:30So just to show you guys another real quick visual demo, I'm in the desktop app, which is a little bit, you know, easier to see, and it's visually more pleasing than the terminal sometimes. But let's say I said, hey, Claude Code. Go ahead and kick off a sub agent to do some research for me about a product called Fireflies.ai.

02:46And so this is my main session. You know, I can talk to this thing. It'll help me do research on different tools.

02:50And then what happens is it kicks off a researcher agent to do the research. And what's cool is right now you'll notice I'm using Opus, right, which is obviously the most expensive model, but we can have the sub agent kick off and do research with Haiku or Sonnet. So we're getting this research for cheaper, and we're getting a fresh context.

03:05So if I click on this agent here, you can see this is basically the prompt that the main agent sent over to this sub agent, which was, hey. Research Fireflies.ai. Give us what it is, core features, how it works, pricing, give us all this stuff.

03:17And now this agent is the one over here searching the web and creating its opinions rather than our main session. So this helps preserve your main context in case you're ever doing a ton of research or reading a ton of stuff that you don't want to fill up your main context window. Right?

03:30So that's one thing. There's also built in sub agents, which is the one we just saw. Right?

03:34That was, like, basically a built-in Claude Code research agent that you've probably seen get invoked automatically without you even asking it to be invoked. And then you've got custom sub agents that you're actually able to build yourself. And if you guys remember earlier in the demo when we spun up those different agents, I said, hey.

03:49One should be a software engineer. One should be a beginner, blah blah blah. You remember those all said general purpose.

03:55So those were still built in native generic agents that just had a prompt. So that doesn't mean that we built those custom agents.

04:02That was just a general purpose agent that Claude Code prompted differently. If we wanted to actually build a custom agent, that would be a markdown file. So if I open up my VS Code, you guys know in the .claude folder, we have different things.

04:14And the one you probably know the best is called skills. So in the skills folder, let's just take a look at real quick my agent builder skill. What this is is it's a markdown file.

04:22This lives as markdown so that I could send it to you guys. I could put it in my community. I could send it to my team.

04:27And all someone has to do is put this markdown file in their .claude in a skills folder, and then they're able to use it. And so a sub agent is the exact same actual tangible thing as a skill .md file.

04:39It's just called something else. You know, we've got the YAML front matter up here, and then we have the instructions of what the skill does and the actual steps to take. So if I open up my agents folder, also in my .claude, you can see I've got a couple different agents here.

04:51Right? So this one, let's just look at, is called ClickUp-searcher.md, and that's an agent that's called ClickUp searcher. We've got the YAML front matter up here, name, ClickUp searcher.

05:00We've got the description. We've got the model, which I've defined here. We've got the color, which means if I actually use the ClickUp searcher agent, it shows the color.

05:08So, actually, let me just show you. Can you go ahead and use the ClickUp searcher agent to show me what we've talked about today in the weekly commitments channel? And so what you'll notice is I invoked that with completely natural language.

05:17I'll have the ClickUp searcher agent pull today's messages, and then right here I can see the green color. So that's all it means when you actually assign an agent a color. It's just so you can actually see it right there.

05:27And down here, you know, earlier, right here is where it said general purpose. What it says now is ClickUp searcher, and that's how we know that that's a custom agent that we built ourselves. So, anyways, those are the two differences.

05:37And like I said, it's just one markdown file. And what this is called up here, the YAML front matter, that's called progressive disclosure, which basically means if you say, hey.

05:46Go do x, y, and z for me. Claude Code will naturally go search through your sub agents and your skills to see if you have any sub agents or skills to use. And so for the rest of this video, I'm just gonna say sub agents, not skills, but they both work with this kind of progressive disclosure process.

06:01But the idea is that Claude Code is able to read just the front matter, just the name and the description, and then decide does this apply to this prompt. If so, I'll pull in the sub agent, and I'll run all of the extra stuff and read it.

06:13But otherwise, I'm not gonna waste my tokens by reading everything if I'm not going to end up invoking that sub agent. So that's why we have the YAML front matter and that's why that's very important. Besides the fact that it also defines things like tools, model, and then there's tons of other levers you can pull there, but gonna dive into that right now.

06:29So, anyways, settings up top and then your instructions go below that. And these are the four that I think matter the most. The name, obviously, so you can reference the sub agent.

06:37The description is really important. This is basically the trigger, and this is how you can make sure that your sub agents are getting invoked without you actually saying, hey. Go invoke this x y and z sub agent.

06:47So the more precise that your descriptions are, the more often Claude Code will actually trigger them, and you won't get misfires. Misfires basically meaning you want it to invoke a sub agent, but it doesn't, or you don't want it to invoke a sub agent, but it does.

07:01And so sometimes the only way that you can really make sure that you're tuning the actual front matter so that you're not getting these misfires is you just have to test it out. I mean, you just have to use it more and more. And then, like, when it doesn't fire and you think it should, you just think about, okay.

07:13Why didn't that happen? And then you update the description, and then same thing if it's the opposite way around.

07:18And then if you go to the actual Claude Code documentation on these sub agents, you can see all of the different things that you can actually put in the front matter. You can put tools like we just mentioned, but you can also put disallowed tools. So if you don't want it to ever write or edit files, you can put that so that these sub agents are explicitly read only.

07:34You can also define things like which MCP servers it's allowed to use, and you can even give it skills. So, basically, any setting that you want to configure for your custom sub agents, you can pretty much do.

07:45Just come to the documentation, have Claude Code read the documentation, and say, hey. I wanna set up a sub agent that does x, y, and z. It should not be able to do x, y, and z.

07:52It should be able to look at this data, not look at this data, and it will help you build the right YAML front matter. So how do you actually write a great sub agent? So obviously, not having a weak description.

08:03So having, you know, a very precise type of description. You can even say something like use proactively if you want it to fire off, you know, pretty generously. And then after you have the actual front matter dialed in, it's all about the body.

08:15The body is the way that the sub agent actually works, what skills it invokes. Because, yes, sub agents can invoke skills, and skills can invoke sub agents, so keep that in mind. They work together.

08:23They're not, you know, competitors. And you have to have that same idea once again that you have to iterate.

08:29It's not gonna be perfect on the first try. Every time you use your sub agent, you have the opportunity to give it feedback on what it didn't do well and how to make that better, and then what it did really well and how to make sure that it does that every time. And real quick, what's the difference between a skill and a sub agent?

08:43Well, honestly, at their core, they're very similar because you're able to define, do x, y, and z in this order. You know, here's a prompt. Here's a persona, whatever.

08:51But the main difference really is that one has a clean context window and one doesn't. And you can run a ton of different sub agents in parallel in, you know, independent sessions as we saw earlier, whereas the skill is typically more of something that I'm kind of triggering in my main session all the time. But once again, that doesn't mean that I don't have a great LinkedIn research skill that I hand off to sub agents to use.

09:10You know what I mean? So, really, I think of it as kind of like the parallel use and the clean context window and, of course, the ability to use a different model. Now there is something that I'm gonna show you guys real quick in Claude Code, which is like it allows you to build agents very easily with a slash command.

09:25You can also do it with natural language, but I'll show you that in a sec. Before we show you that, I did wanna kind of go over this real quick, which is project level versus global level sub agents. And this is the same, you know, if you understand how the Claude Code, like, settings files work or the Claude Code, like, hooks and skills, MCP servers even, it's all the same.

09:43You always have project level stuff or you have global level stuff. So project level stuff is basically what lives in your project, in that repo. So right here, we're in my Herc two project, and anything that you see inside of my .claude right here is project level.

09:56So all of these agents are project level. All of these skills are project level. And then I've got other sub agents and other skills that are global.

10:03So for example, if I say, hey. Where does my session handoff skill live?

10:08That's gonna find that globally. Because if I go to my skills, there's no skill in here called session handoff as you guys can see. But right here, the session handoff skill lives in your global skills directory at the, you know, the user level.

10:19And so global ones are usable by every project on your machine. So no matter which project or repo I'm working in, I can always use that session handoff skill or I can always use that, you know, sub agent that I've built and it belongs to me. So if I share this GitHub repo to someone, they won't get that skill or they won't get that sub agent.

10:35It's not a big deal because you can easily say, oh, you know, you accidentally made that sub agent globally, but I actually wanted it in this project. Can you just move it? And because it's just a markdown file, they move super easily.

10:43You can even have them both. You know, you can have them in both spots. But the reason why I wanted to explain that before I showed you this is because if you know, you have to choose.

10:51So if you do a slash agents, you can look at what agents are currently running. If you've got a bunch of sub agents, you can go to your library, and you can see a bunch of different built-in agents that are down here, like Claude, Claude Code guide, Explore, Plan, and you can also look at your project level agents.

11:05So for example, we could look at, you know, the AI trend hunter. We can look at carousel planner.

11:10You'll notice that some of these have different models, like all of these are Sonnet, but then some of these have different project memory. You know, this one has project memory. This one has none.

11:17But anyways, what I wanted to show you guys, if you go to create a new agent, you choose here if it's a personal or global or if it's a project. So let's just make a new project one right now. In order to create it, we can generate with Claude, or we can do manual configuration.

11:29So I would probably come in here and choose generate with Claude. And then you basically just describe what this agent should do and when it should be used, and it says to be comprehensive for the best results. Create me a sub agent that criticizes all of my work.

11:40I basically want to be able to hand it off ideas, and I want it to not agree with me, but I want it to, um, criticize it. I want it to roast it. I want it to play devil's advocate and look for every possible hole in the plan and what could go wrong and give me back basically that report.

11:56I want this thing to be invoked whenever I say roast my plan or review my plan, anything like that. So that's my prompt. Obviously, that's pretty concise.

12:04So, like, if you really had a good sub agent use case, you probably wanna give it some more detail and some more nuance there. But I just wanna show you how it's able to generate this file from the description. And because I chose project level, it's gonna create that in the agents folder within my dot Claude.

12:18So in a sec here, we'll see an agent pop up. It'll probably be called, like, devil's advocate dot m d or roast agent dot m d, something like that.

12:26Oh, but before that, it also says what tools do we want it to be able to use. So, like, for example, in this one, maybe I only want it to go with read only tools. So I could say just read only, and then, you know, we could look at some advanced options too, which would be all these MCP servers and a bunch of other things.

12:43And even, like, individual tools, whoops, even individual tools like bash, cron create, cron delete, like, you can get really granular here. But in this case, I'm just gonna go ahead and hit continue with read only tools. And then we're able to choose the model.

12:54And in this case, we're gonna go with Haiku, but you can also inherit from the parent. So if the parent's running on Haiku, that sub agent will inherit Haiku, and same thing with Opus or Sonnet. And then finally, we can choose our background color.

13:06I'm just gonna go ahead and choose pink. And then we get to choose the memory, so whether that's project, none, user, or local.

13:13And so really for the sub agents that I'd be creating and the ones that I would recommend you guys do, I'd probably just say project scope unless you want all these sub agents to be completely innocent, wake up completely blind, no memory at all, then you would choose none. But as far as between project, user, and local, I'm probably just gonna always choose project.

13:29Alright. There we go. So I'm gonna go ahead and save this new agent, and you can see it just pops up right here.

13:33It's called the plan roaster. And what happens when you create them with Claude is it makes this huge because it doesn't yet understand what you might say and how you want it to trigger.

13:44So my first recommendation would be trim this down a little bit because once again, this is part of the progressive disclosure, so there's no need for the description to be so long. So I'm literally gonna delete all of this.

13:55I mean, it's good to look at, but I'm gonna delete all of this up to here. And really, in my case, this is good enough. Right?

13:59Use this agent when Nate wants an adversarial critique of an idea, plan, strategy, blah blah blah. Trigger on phrases like roast my plan or review my plan. We've got the tools, the model, the color, the memory, and the name.

14:10So now I'm just gonna open up a new session of Claude, and let's real quick just say, so I've got this plan, and I want to create an ice cream stand in, you know, Chicago.

14:21I wanna create this ice cream stand on Oak Street Beach, and I don't yet have a refrigerator. And I wanna sell the ice cream all day long for about, you know, $20 a pop.

14:32And it's just a little a little piece of ice cream. So go ahead and roast my plan.

14:37This is actually interesting. So I created a skill called roast, and it's gonna use that instead.

14:41So it defaults to that because it thinks that it's good enough. And in the skill, the roast skill, I actually have it spinning up five different sub agents. So that's a good demo.

14:51I didn't mean for this to happen. These are all general purpose sub agents that live within my roast skill, but I'm gonna go ahead and cancel that. I'm gonna run this prompt again.

14:58But this time, I'm going to explicitly tell it to not use a skill. But like I said, that's a good example of showing you that in a skill, you can have it fire off a bunch of sub agents. Anyways, the whole reason why the roasting thing is so top of mind is because Claude Code and AI in general can be a little bit of a sycophant and can just be a yes-man.

15:13So having things worked out like a roast skill or like a plan roaster sub agent is pretty helpful. Okay. So look what happened here.

15:20It did not invoke our plan roaster sub agent. So what I'm gonna say is go ahead and take a look within our dot Claude agents folder.

15:29We've got a sub agent called plan roaster dot m d, and you didn't invoke it here. And I'm not sure exactly why. Go ahead and read the description of that and look back at my prompt and help me understand why did you not fire off the sub agent so we can make this better.

15:43Because that exact prompt is something where I'd want you to use that sub agent. And so that's really the way that I think about iterating on my descriptions both for skills and for sub agents.

15:52Just understanding, like, why didn't it fire or why did it fire, and how do we then rework the description. I do think there's a little bit of, you know, foul play here because my roast skill got invoked earlier, and it's probably, like, defaulting to those skills before a sub agent.

16:08So, you know, maybe that wasn't the best example, but I guess it's good that it happened so I can show you guys the way that you might think about improving your YAML front matter.

16:17Okay. So completely my fault. I didn't close out the front matter.

16:22So good tip, you have to close off the quotes if you open them up. Right? That can break your JSON.

16:28It can break other things as well. So it will break your YAML front matter. It said the problem wasn't judgment.

16:34It was mechanical. It also said, hey. You know, you do have a roast skill, so maybe there was a little bit of, you know, cloudiness there.
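A mechanical failure like an unclosed quote can be caught before the agent is ever loaded. A rough Python sketch of such a check, under the assumption that each front-matter value sits on a single line; `front_matter_problems` is a hypothetical helper, and it is a crude lint, not a YAML parser (a quoted value that legitimately spans several lines would be flagged):

```python
def front_matter_problems(text: str) -> list[str]:
    """Return human-readable problems found in a Markdown file's front matter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["file does not start with a '---' front matter fence"]
    try:
        # Index of the closing fence, searching from the second line onward.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return ["front matter is never closed with a second '---'"]
    problems = []
    for number, line in enumerate(lines[1:end], start=2):
        # An odd number of unescaped double quotes on a line means a quote
        # was opened and never closed (or closed and never opened).
        quotes = line.replace('\\"', "").count('"')
        if quotes % 2 == 1:
            problems.append(f"line {number}: unbalanced double quote")
    return problems
```

Running this over every file in the agents folder before starting a session is one way to separate "the description was too vague" from "the description never parsed".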

16:39So I completely get that, but it went ahead and it updated the description. You can see it made it a little bit longer, but there's still collision between the roast skill and the plan roaster. Right?

16:48They both get invoked kind of similarly. So, really, the best thing to do here is you would combine them by updating the skill to say, hey, whenever you run the skill, also invoke the plan roaster agent.

16:57But for the sake of the demo, I am just going to actually be way more specific about what to use. So there goes our prompt once again.

17:04The copy and pasting out of the terminal is horrible. So usually if I wanna copy and paste something from the terminal, I will tell it to write it to a text file, or I will just use it in the desktop app. But either way, I was way more explicit here.

17:15You can see I said use the plan sub agent, not the roast skill. And now it's initializing our pink plan roaster sub agent, and what I can do is I can go down to this section down here. I can open up this other terminal, and we can see the exact prompt that got sent over to our plan roaster, which was roast this business plan hard.

17:32Here it is in full. I wanna create an ice cream stand in Chicago, blah blah blah. So it's basically exactly what I said.

17:37Tear it apart, hit every flaw, the missing refrigerator, the absurd $20 price, and then the sub agent already finished up. So it sent us back to the main session, and now the main terminal is going to interpret what the plan roaster sub agent said and then give us the rundown.

17:50And what you'll notice here is the plan roaster took 22.8 k tokens, but those 22.8 k tokens did not pollute our main session. All we got was basically this much, which is pretty awesome.

18:01So anyways, that's a real quick, a little bit of a sloppy example, but hopefully it shows you guys the different elements to play with and, you know, the way that you think about using these sub agents, but that's what it looks like in Claude Code. The way that I like to think about these is the same way that I've thought about AI since the beginning of my YouTube channel, which is, you know, it's very fun and cool to have one mega personal assistant agent that can do everything, but really the best way to do it is to have each AI be a specialist.

18:24And that's where your main general ones can be pretty good at, you know, a jack of all trades because of skills. Right? You invoke a skill and now it's good at LinkedIn posts.

18:31Now it's good at doing research. Now it's good at scripting videos, whatever. But really the sub agents are actual specialists.

18:37They have subject matter expertise. So you can have one that's a security auditor. You can have one that does tests.

18:42You can have one that writes docs. You can have one that's an expert with databasing, whether that's the architecture or the queries or anything like that. And you can just silo basically this assembly line or parallel work of a bunch of agents that are good at one thing and really, really good at that one thing.

18:55And the other thing that's cool about that is you can borrow subject matter expertise from other people. This is just one of the hundreds of thousands of examples out there, but there's a GitHub repo, which I'll link in the description, and this one's called awesome-claude-code-subagents.

19:08So if you scroll down here, you can see there's a bunch of sub agents that you can use and in different categories. Right? You've got an API designer, a back end developer, a GraphQL architect.

19:17We've got other language specialists like TypeScript or SQL. You know, you can scroll through and find a lot of these custom sub agents that other people have already built, the same way that you look for skills from other people, and they maybe know a lot more about CLI developing than you do. So they've put all their subject matter expertise into a sub agent, and now you can just use that because all it is is a markdown file.

19:37Now, yes, because everything's open source and because all these markdown files are out there, you wanna be careful. Right? Like, you're downloading a file or you're putting it into your system, just make sure there's no prompt injections in there.

19:46Make sure there's nothing, you know, malicious. And you can even do it by having maybe a sub agent that verifies open source repos, and it's read only. It can never send data.

19:55It can never do anything. And all it does is verify that there's nothing malicious inside of that markdown file. So, anyways, we looked a little bit at how Claude picks out the agents.

20:03It can be automatic, and it can automatically invoke things when you are, like, looking through your code base or whether you are doing research. It'll automatically chuck some out there. You can also have them very proactively use sub agents if you have things like that in the description so it fires frequently.

20:18You can also list them explicitly by name. You know, you can tag the agent name or you can say, hey. Use the plan roaster sub agent like I just did in that example.

20:25And you can also launch a session as a sub agent. If you do Claude with a flag of the sub agent's name, it'll actually put you right in a terminal right away with one of those sub agents. And, honestly, I never do this, but it's nice to know that that feature exists.

20:38So once again, just wanted to hit on the point that you can do read only sub agents, which is pretty cool, just by using tool restrictions and giving them only certain things. It's always nice to have basically the mindset of if my AI could touch data or could read data, I have to assume that it will. Even if I never prompt it, I have to assume that it will.

20:56And that's the difference between a permission layer being explicit tools that it's allowed to use and explicit MCP servers it's allowed to use and just prompting and saying, hey. Don't do that, or you don't need to read that. Don't worry about it.
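The contrast between a hard permission layer and a prompt-level request can be made concrete. A small Python sketch, assuming an allowlist that is checked in code before any tool runs; `AgentPermissions` and `ToolNotAllowed` are hypothetical names, and the tool names are only illustrative:

```python
from dataclasses import dataclass


class ToolNotAllowed(Exception):
    """Raised when an agent asks for a tool outside its allowlist."""


@dataclass(frozen=True)
class AgentPermissions:
    allowed_tools: frozenset[str]

    def check(self, tool: str) -> None:
        # Enforced in code: no wording in the prompt can bypass this branch.
        if tool not in self.allowed_tools:
            raise ToolNotAllowed(f"{tool} is not in the allowlist")


# A read-only profile: the agent can look at things but cannot run or write.
READ_ONLY = AgentPermissions(frozenset({"Read", "Grep", "Glob"}))
```

A prompt that says "don't use Bash" relies on the model complying; `READ_ONLY.check("Bash")` raises regardless of what the model decides, which is the property the talk is pointing at.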

21:08There's a big difference between those types of permission layers. And then, of course, you have the ability to save a ton of money here. Let's say you have to read a 300 page research report and just get, you know, maybe three fun facts from it or just get a summary.

21:20There's probably no reason, unless it's a really, really, you know, technical report, to use Opus for that, probably not even Sonnet. So delegate that to a Haiku sub agent to read everything and then send back just a small summary to your main session. And that's how you have this system where you have your smart boss, which is the Opus model that you talk to on the day to day that just works with a bunch of little Haiku agents.

21:40It's gonna save you a lot of money. It's gonna keep things moving faster, and that's the way that you wanna start utilizing these things.

21:46Another way that you can also keep them from getting out of control is you can have a max turns set on these sub agents. So maybe they're starting to do loops of research or they're doing loops of reviewing through a code base. You can say, hey.

21:56Max turns equals 10. Honestly, I don't use this very often because most of my sub agent delegation is research or very specific workflows where I'm not really worried about a loop, and I'm keeping my hands on either way. But that is, once again, just another nice lever to pull.
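The idea behind a turn cap is simply a bounded loop. A generic Python sketch of that idea, not Claude Code's implementation; `run_with_turn_cap` and the `step` callback are hypothetical names for illustration:

```python
from typing import Callable, Optional


def run_with_turn_cap(step: Callable[[int], Optional[str]],
                      max_turns: int) -> tuple[Optional[str], int]:
    """Call step(turn) until it returns a result or the cap is reached.

    Returns (result, turns_used); result is None if the cap stopped the loop.
    """
    for turn in range(1, max_turns + 1):
        result = step(turn)
        if result is not None:
            return result, turn
    return None, max_turns
```

With `max_turns=10`, a worker that keeps looping without ever producing a result is cut off after ten calls instead of running indefinitely.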

22:11So then after we've seen all these benefits, hopefully, it's starting to become a little bit more clear, but a lot of people might also still wonder, okay. So when do you actually use a subagent? When is it really better to?

22:20So one core question you can think about is, is this about to dump a pile of stuff into my chat that I'll never read again? If that's ever yes, delegate it to a sub agent. If it's no, then maybe you keep it inline.

22:31But there's also some other things to think about too. Right? So let's look at some signals.

22:34If you're about to read a lot of files, do some sub agents. If you're gonna spit out a wall of output, maybe do sub agents.

22:41If it's a job that you keep repeating, build a custom sub agent for it. If it is independent stuff and you can run a ton of things in parallel, like, you know, maybe you have 15 chapters of a book and you want each chapter to be reviewed, and, like, it doesn't have to be in chronological order, all of them can be reviewed at the same time, then that's parallel.

22:58And then you can go ahead and do those independent jobs. And also if you want, like, an unbiased reviewer, because once again, sub agents can wake up, no context, completely fresh, no memory, and you can get an honest review.
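The chapter-review case is a plain fan-out: independent jobs, each seeing only its own input, with short results collected at the end. A Python sketch of that shape using a thread pool; `review_chapter` is a stand-in for a subagent call (here it only counts words), and nothing in it talks to Claude Code:

```python
from concurrent.futures import ThreadPoolExecutor


def review_chapter(chapter: str) -> str:
    """Stand-in for one subagent: sees only its own chapter, returns a short summary."""
    words = len(chapter.split())
    return f"{words} words reviewed"


def review_all(chapters: list[str], max_workers: int = 5) -> list[str]:
    """Review every chapter independently and return the summaries in chapter order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even though jobs may finish in any order.
        return list(pool.map(review_chapter, chapters))
```

No worker reads another worker's output, which mirrors the constraint that subagents report only to the main session.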

23:09Now you don't need a sub agent if you're just doing a quick edit, if the steps depend on each other. Right? So if it's, like, one, two, three, then four.

23:16If the agents need to talk to each other, then that's when you would need more of, like, an agent team or a different type of orchestration. I've made a video on agent teams before. They are more expensive than sub agents because they're talking and stuff like that, but they share a task list and everything.

23:28Sub agents do not work that way. It's just a one to one relationship between sub agent and main session, not like a one to many.

23:34If you've got five sub agents running, they cannot talk to each other. You would also skip them if you need the sub agent to have, like, the context of the entire conversation or if it needs to ask you a question because you don't really get to talk to the sub agents.

23:45You know? The main agent is the orchestrator. Now there's also something to think about, which is a fairly newer feature.

23:49It was with the release of Opus 4.8, which is the dynamic workflows. And what that does is it spins up a workflow that typically delegates to a ton of different sub agents in parallel. So remember the idea is that the main chat is the orchestrator, and you've got a bunch of different sub agents running, whether that's three or whether that's 40.

24:06A lot of times if you're asking for a big project and it decides to use a dynamic workflow, then all that's doing is it's creating a bunch of sub agents and it's delegating to them all at one time. So I made a video about those.

24:17I will tag that right up here if you wanna check out the dynamic workflows video. You'll see an example I did where it spun up 41 sub agents at the same time and just ran them. I've also done some examples, not on video, but, like, when I was testing it out where I did some workflows, and one of them spun up, like, 210 sub agents at the same time, which was great, but it ate through my context or sorry.

24:36It ate through my session limit like crazy. So you definitely wanna be careful when you're spinning up dynamic workflows. They actually then, a few days after this came out, they said, hey.

24:44We changed the trigger word for dynamic workflows from workflow to UltraCode. You can still say to use a workflow for this, but when you're clearly referring to something else, Claude won't kick off a dynamic workflow. So you wanna make sure that you are being very careful about when you kick off those dynamic workflows because, like I said, they are expensive.

25:00So, anyways, that is pretty much all of the stuff that I wanted to talk about here with sub agents.

25:06So the whole thing on one slide, to do a quick recap, if it's just one quick thing, you don't need a sub agent. Right?

25:12Just because this feature is awesome, which it really is a great feature, that doesn't mean to force it. Because sometimes if you're forcing too many sub agents, you're gonna get worse results. So play around with them, understand the benefits, and start to kick them off when you really do need them.

25:23If you wanna share them with your team, keep them in your projects, keep them in your repo. If you wanna keep sub agents just for you that you can use across any project, then put them in your home folder, kind of, you know, make them global or personal. You can save a lot of money by having cheap workers with one smart lead.

25:36You can get better results by letting a fresh agent review your work or do work in parallel. If you wanna do a giant parallel job, go ahead and check out a dynamic workflow. Just be careful of your session limit.

25:46And if you're not sure, if it's a pile of stuff that you're never gonna reread, then go ahead and spin off a sub agent. Whether you are using Claude Code in the terminal or whether you're using it in the desktop app or even the VS Code extension or, you know, on the web, everywhere that you use Claude Code can run sub agents, and the principles that I just talked about are always the same.

26:06This is where they live. That's how you invoke them. They're always YAML front matter, and those are pretty much the best practices.

26:12So I know we covered a ton of information. If you guys want to download this exact slide deck, all you have to do is join my free Skool community. The link for that is down in the description.

26:20Once you join here, all you have to do is click on the classroom, click on all YouTube resources, and then you'll be able to find everything that I've dropped in here for free. GitHub repos, skills, templates, slide decks, whatever you want. It's all in there for free.

26:31But that is gonna do it for today. So if you guys enjoyed the video or you learned something new, then please give it a like. It helps me out a ton.

26:36And as always, I appreciate you guys making it to the end of the video, and I'll see you on the next one. Thanks, guys.

The Hook

The bait, then the rug-pull.

The video opens in the middle of something already running: five subagents with distinct personas — a retired teacher, a COO, a software engineer — simultaneously reviewing a book chapter. The demo is live before the intro card hits. It is a deliberate provocation: most people trigger one agent at a time and wonder why it feels slow.

Frameworks

Named ideas worth stealing.

22:11 model

Five-Signal Decision Framework

It'll read a LOT of files
It'll spit out a wall of output
It's a job you keep repeating
Independent jobs that run in parallel
You want an unbiased reviewer

Use a subagent when any of these five signals are present. The override rule of thumb: '10+ files, or output you'll never re-read = subagent.' Scale to dynamic workflow when the signals cover dozens of parallel jobs or a codebase-wide pass.

Steal for: Any agentic workflow decision; filters the 'should I delegate this?' question down to a checklist.
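The five signals, plus the skip conditions from the talk (quick edit, dependent steps, needing the whole conversation), can be written down as a checklist function. A Python sketch; the `Task` fields and `should_delegate` are hypothetical names, and the threshold of 10 files comes from the rule of thumb quoted above:

```python
from dataclasses import dataclass


@dataclass
class Task:
    # The five signals
    files_to_read: int = 0
    wall_of_output: bool = False
    repeated_job: bool = False
    independent_parallel_jobs: int = 0
    wants_unbiased_review: bool = False
    # Reasons to stay in the main session
    quick_edit: bool = False
    steps_depend_on_each_other: bool = False
    needs_full_conversation: bool = False


def should_delegate(task: Task, many_files: int = 10) -> bool:
    """Return True when at least one signal is present and no skip condition applies."""
    if (task.quick_edit or task.steps_depend_on_each_other
            or task.needs_full_conversation):
        return False
    return any([
        task.files_to_read >= many_files,
        task.wall_of_output,
        task.repeated_job,
        task.independent_parallel_jobs > 1,
        task.wants_unbiased_review,
    ])
```

Treating the skip conditions as an override is a design choice of this sketch; the talk presents them as a separate list without ranking them against the signals.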

20:38 model

Smart Boss / Cheap Workers

One Opus session as the orchestrator (the smart boss)
Multiple Haiku subagents handling reading, summarizing, researching (cheap workers)
Only the summary returns to the main context

Match model tier to task complexity. Research ingestion and summarization do not need Opus. Routing them to Haiku and returning only the distilled output to the main session saves money and keeps context tight.

Steal for: Any pipeline that combines heavy reading with reasoning; cost goes down, quality stays the same.
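The saving is easy to estimate with back-of-the-envelope arithmetic. A Python sketch for the 300-page report example from the talk; the prices are made-up placeholders (not real rates), and 500 tokens per page and a 500-token summary are assumptions chosen only to make the numbers concrete:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical price in USD per million tokens (not real rates)."""
    input_per_mtok: float
    output_per_mtok: float


def cost(input_tokens: int, output_tokens: int, price: Price) -> float:
    """Cost in USD of one call at the given per-million-token prices."""
    return (input_tokens * price.input_per_mtok
            + output_tokens * price.output_per_mtok) / 1_000_000


def direct_cost(report_tokens: int, summary_tokens: int, boss: Price) -> float:
    """The boss model reads the whole report itself and writes the summary."""
    return cost(report_tokens, summary_tokens, boss)


def delegated_cost(report_tokens: int, summary_tokens: int,
                   boss: Price, worker: Price) -> float:
    """A worker reads the report; the boss only reads the returned summary."""
    return cost(report_tokens, summary_tokens, worker) + cost(summary_tokens, 0, boss)


BOSS = Price(input_per_mtok=10.0, output_per_mtok=50.0)    # placeholder
WORKER = Price(input_per_mtok=1.0, output_per_mtok=5.0)    # placeholder
REPORT_TOKENS = 300 * 500   # 300 pages at an assumed 500 tokens per page
SUMMARY_TOKENS = 500        # assumed summary length
```

Under these placeholder prices, `direct_cost(REPORT_TOKENS, SUMMARY_TOKENS, BOSS)` is roughly ten times `delegated_cost(REPORT_TOKENS, SUMMARY_TOKENS, BOSS, WORKER)`; the real ratio depends entirely on current pricing and on how long the report actually is.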

06:55 model

Subagent File Anatomy

name: human-readable identifier, used for explicit invocation
description: the trigger — precision prevents misfires and collisions with skills
tools / disallowedTools: hard permission layer (not a prompt suggestion)
model: specific tier or 'inherit from parent'
memory: project | user | local | none

Every subagent is one Markdown file. The YAML front matter drives all runtime behavior. The body is the instruction set. Progressive disclosure means Claude reads only the front matter until it decides to invoke.

Steal for: Template for every custom agent you build; covers the five decisions that matter most.
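The anatomy above maps directly onto a small file generator. A Python sketch that writes the front matter and body for one agent; the field names follow the list above, while the comma-separated `tools` format and the accepted values for `model` and `memory` are assumptions to verify against the Claude Code documentation. `render_agent_file` is a hypothetical helper:

```python
import json


def render_agent_file(name: str, description: str, tools: list[str],
                      model: str, memory: str, body: str) -> str:
    """Render one subagent Markdown file: YAML front matter, then the instructions."""
    front_matter = [
        "---",
        f"name: {name}",
        # json.dumps emits a double-quoted string with escapes and the closing
        # quote in place, which is also a valid YAML double-quoted scalar.
        f"description: {json.dumps(description)}",
        f"tools: {', '.join(tools)}",
        f"model: {model}",
        f"memory: {memory}",
        "---",
    ]
    return "\n".join(front_matter) + "\n\n" + body.strip() + "\n"
```

Generating the description with a serializer rather than typing it by hand avoids the unclosed-quote failure shown in the demo.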

CTA Breakdown

How they asked for the click.

VERBAL ASK

26:00 product

“If you want to download this exact slide deck, all you have to do is join my free Skool community. Click on the classroom, click on all YouTube resources, and you'll be able to find everything I've dropped in here for free.”

Soft, earned pitch. Free community, not a paid product. Slide deck as the specific value hook. Low friction ask that matches the tutorial's generous information density.

MENTIONED ON CAMERA

18:11 link: awesome-claude-code-subagents (GitHub repo)

06:55 link: Claude Code subagents documentation

03:31 tool: Fireflies.ai

26:00 product: Nate Herk's free Skool community (AI Automation Society)

Storyboard