Modern Creator
Mark Kashef · YouTube

Anthropic's NEW Claude Architect Guide In 39 Minutes

A 39-minute walk-through of Anthropic's new Claude Certified Architect exam guide, translated from a 40-page PDF into five domains, three demos, and five rules.

Posted
2 months ago
Duration
Format
Tutorial
educational
Views
60.8K
1.8K likes
Big Idea

The argument in one line.

The Claude Certified Architect exam guide is really a working syllabus for Claude Code, and five cross-cutting rules - hooks over prompts for anything high-stakes, structured errors, 4-5 tools per agent, separate-session review, and few-shot over instructions - capture roughly 80% of its value.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • A developer using Claude Code daily who wants to stop guessing at best practices and lock in a mental model that matches Anthropic's own.
  • An engineer thinking about taking the Claude Certified Architect exam who wants a 39-minute orientation before paying for the official prep materials.
  • A founder or team lead building agentic features in production who keeps hitting reliability issues and suspects the problem is architectural, not prompt-level.
  • Anyone shipping MCP servers, sub-agents, or CI/CD-driven Claude workflows who wants to know which patterns the exam treats as correct.
SKIP IF…
  • You have never used Claude Code or the Anthropic SDK - the video assumes you already know the basics of agents, tools, and prompts.
  • You want a deep tutorial on writing hooks, MCP servers, or CI scripts - this is a conceptual map, not a hands-on build.
TL;DR

The full version, fast.

Anthropic released a five-domain Claude Certified Architect exam, and the official 40-page guide doubles as the best Claude Code syllabus currently in print. The video walks through every domain - Agentic Architecture (27%), Tool & MCP (18%), Claude Code Configuration (20%), Prompt Engineering (20%), and Context Management & Reliability (15%) - pulling out the highest-leverage idea from each. The closing five rules are the through-line: when a behavior must hold 100% of the time use a hook not a prompt, always return structured errors, cap each agent at four to five tools, review code in a fresh session that did not write it, and show two or three concrete examples instead of writing more instructions.

Members feature

Chat with this breakdown.

Modern Creator members can chat with any breakdown — ask for the hook, quote a framework, find the exact transcript moment. Unlocks at T2: refer 3 friends + add your own API key.

Create a free account →
Chapters

Where the time goes.

00:0001:39

01 · Anthropic just dropped a real certification

Cold open + pitch: a 40-page exam guide is the best Claude Code syllabus available, and the host has read it cover to cover.

01:3902:03

02 · The 5 domains explained

Pie-chart breakdown of the exam's five domains and their weighting.

02:0303:25

03 · Domain 1: Agentic Architecture (27%)

Why agent architecture is the single most important domain - how Claude thinks, coordinates, and enforces rules.

03:2504:07

04 · The Agentic Loop

Request -> response -> tool_use -> execute -> repeat. The only field that decides when the loop ends is stop_reason.

04:0705:18

05 · Anti-patterns to avoid

Three mistakes: parsing text for 'I'm done', setting hard iteration caps, and ignoring stop_reason.

05:1806:38

06 · Hub-and-Spoke: coordinator + subagents

One coordinator agent decomposes the task, subagents run in isolated contexts, results merge at the end.

06:3807:34

07 · The narrow decomposition mistake

Coordinators that decompose too narrowly miss whole branches of the problem - fix by giving broad goals, not narrow checklists.

07:3408:49

08 · Live demo: spawning 3 subagents in parallel

Terminal demo: a single prompt spawns video/written/audio research subagents in Claude Code, each using its own tools and tokens, with a final synthesis.

08:4910:00

09 · Prompts vs Hooks (most important concept)

Prompts are best-effort suggestions; hooks are deterministic scripts that physically block actions. The exam guide's most contentious section.

10:0010:55

10 · When to use hooks over prompts

Prompts for style/tone/formatting; hooks for compliance, financial, and security. When money is on the line, hooks every time.

10:5511:42

11 · Live demo: /hooks in Claude Code

Walk-through of /hooks in the terminal and the claude-code-guide agent as a second way to discover the right hook.

11:4212:07

12 · Domain 2: Tool Design & MCP (18%)

Tool descriptions are the single highest-leverage thing you can fix - they decide which tool fires when descriptions overlap.

12:0713:10

13 · Why tool descriptions matter most

Bad descriptions cause 40% misrouting; good descriptions with explicit do-not-use clauses drop misrouting to under 2%.

13:1013:54

14 · The tool overload problem

Project vs user-level MCP, where to put .mcp.json, how to keep API keys in environment files.

13:5415:28

15 · MCP server scoping: project vs user level

How project-level and user-level MCP differ; community vs custom servers; checking what is wired up with claude mcp list.

15:2816:57

16 · tool_choice: auto, any, forced

An agent with 18 tools makes worse decisions than one with 5. tool_choice can be auto, any, or forced - force the first move, then loosen the leash.

16:5718:51

17 · Domain 3: Claude Code Configuration (20%)

Most people dump everything into one CLAUDE.md. The guide splits it into three layers: user, project, and path-specific.

18:5120:04

18 · The 3-layer CLAUDE.md hierarchy

Layer 1 user-level (~/.claude/CLAUDE.md), Layer 2 project-level (.claude/CLAUDE.md, version-controlled), Layer 3 path-specific (.claude/rules/*.md).

20:0421:06

19 · Path-specific rules (.claude/rules/)

Rule files load only when their glob matches - testing rules for test files, API rules for the API folder. Keeps CLAUDE.md lean.

21:0622:25

20 · Commands vs skills vs plan mode

Commands are reusable slash prompts; skills are scoped, isolated mini-agents with their own tool allow-list; plan mode is for ambiguous multi-file work.

22:2523:40

21 · Claude Code in CI/CD pipelines

Using claude -p with --output-format json to make Claude Code run non-interactively inside a GitHub Action.

23:4024:31

22 · The -p flag and --output-format json

How the -p and --output-format json flags turn Claude Code from a chat tool into something a pipeline can drive.

24:3125:45

23 · Why you need a separate review session

A session biased by writing the code is bad at reviewing it. Always spin a fresh, stateless session for review.

25:4526:26

24 · Domain 4: Prompt Engineering (20%)

When Claude is inconsistent, the instinct is to write more rules. The exam guide says: show 2-3 examples instead.

26:2627:15

25 · Few-shot examples vs instructions

Two or three concrete input/output examples beat a paragraph of detailed instructions every single time. Claude learns the underlying pattern.

27:1528:06

26 · Guaranteed JSON with tool_use

Define a tool with a JSON schema, set tool_choice to forced, and extract from the tool_use response. Eliminates syntax errors, not semantic ones.

28:0629:23

27 · The validation loop

When the model misreads data, retry with specific feedback - the original document, the extracted field, the literal mismatch. Not just 'try again'.

29:2329:45

28 · Domain 5: Context Management (15%)

Why Claude pays attention at the beginning and the end of context but goes fuzzy in the middle, and what to do about it.

29:4530:37

29 · The lost in the middle effect

The first 40% of context is well-primed, the end has recency bias, and the middle drifts. Tool outputs push the important stuff into the fuzzy zone.

30:3731:46

30 · 3 ways to fix context bloat

Pin a key-facts block at the top, trim verbose tool outputs, delegate messy work to subagents whose context never pollutes yours.

31:4632:24

31 · /memory in Claude Code

/memory shows what is actually in scope right now - project memory, user memory, auto memory - so you can audit before a fresh session.

32:2433:41

32 · When to escalate to a human

Three escalation scenarios: customer asks for a human (escalate immediately), policy is ambiguous (escalate with a structured handoff), or issue is straightforward (resolve, but still offer a human).

33:4135:00

33 · Error propagation done right

Never return a generic 'failed'. Include what broke, what was tried, what partially worked, and what alternatives exist.

35:0037:12

34 · The 5 rules that apply across every domain

Closing recap: hooks for high-stakes work, structured errors, 4-5 tools per agent, separate session for review, few-shot examples beat instructions.

37:1238:07

35 · Interactive study prompts (X post resource)

A community-built prompt set turns Claude Code into an interactive instructor that quizzes you on each domain.

38:0739:01

36 · Free study guide + where to go next

Mark plugs his own mega-guide and Skool community as deeper next-step resources.

Atomic Insights

Lines worth screenshotting.

  • Anthropic's Claude Certified Architect exam weights its five domains at 27/18/20/20/15 - agent architecture is by far the biggest single chunk of the test.
  • The only reliable signal that Claude is finished with an agentic loop is the stop_reason field; parsing text for 'I'm done' or capping iterations both break in production.
  • Sub-agents do not communicate with each other - each runs in its own context, and the main agent only sees their final summaries when it stitches the answer together.
  • The most common coordinator failure is decomposing the task too narrowly; the fix is to give broad goals and let the sub-agents pick their own subtasks.
  • In Anthropic's own example, a prompt-based refund check fails 12% of the time and refunds the wrong customer; a hook drops that to zero because it physically blocks the action.
  • Prompts are for style, tone, and formatting; hooks are for anything where one failure causes real harm - financial, compliance, or security.
  • Tool descriptions are the highest-leverage thing in MCP work: ambiguous descriptions misroute 40% of the time, while good ones with explicit do-not-use clauses drop misrouting under 2%.
  • An agent with 18 tools makes worse decisions than one with 5, because each extra option is a chance for the model to wander outside its lane.
  • tool_choice has three modes - auto, any, forced - and the guide recommends forcing the first move so the agent always starts the same way, then loosening the leash.
  • CLAUDE.md should be split across three layers: user preferences in your home directory, team rules in the project, and path-specific rules that only load when a matching file is open.
  • Path-specific rules in .claude/rules/ are the underused unlock - testing rules only load for test files, API rules only load in the API folder, and your token bill drops.
  • claude -p combined with --output-format json turns Claude Code from a chat into something a CI pipeline can call non-interactively, returning structured JSON.
  • Never let the Claude session that wrote the code also review it - the writer is biased toward its own work, and a fresh session catches what the first one misses.
  • Two or three concrete few-shot examples beat a full page of written instructions every time, because Claude learns the underlying pattern instead of memorising the examples.
  • Forcing tool_use with a JSON schema eliminates syntax errors like malformed JSON but does not eliminate semantic errors - you still need a validation loop for wrong values.
  • Claude pays high attention to the start and end of a context window and gets fuzzy in the middle; tool outputs piling up in the middle quietly degrade reliability over time.
  • The three context-management fixes are: pin a key-facts block at the top, trim verbose tool outputs, and delegate messy work to sub-agents whose context never pollutes yours.
  • If a customer asks for a human, escalate immediately - sentiment scoring will misread sarcasm and cultural tone, and the policy line is clearer than the model.
  • When escalating to a human, send a structured handoff - customer ID, root cause, what was tried, recommended action - not a 'sorry, transferring' message.
  • Generic 'failed' errors are useless; every error should include what failed, what was tried, what partially worked, and what alternatives exist so the next agent can act.
Takeaway

Five rules that decide how reliable your Claude agents will be.

WHAT TO LEARN

Reliable Claude systems come from picking the right level of determinism for each problem - hooks where it must work, prompts where it just needs to read well, and structured handoffs everywhere in between.

04The Agentic Loop
  • Check stop_reason after every Claude response - it is the only reliable signal that the loop is done.
  • Do not parse the assistant's text for phrases like 'I'm done' or 'task complete' - they break.
  • Do not cap iterations at a fixed number; you do not know in advance how deep a task needs to go.
06Hub-and-Spoke: coordinator + subagents
  • Sub-agents run in isolated contexts and never see each other's work; only the coordinator merges results.
  • Give the coordinator broad goals, not narrow checklists, so it does not decompose the task too narrowly.
  • Let each sub-agent pick its own tools and tokens within its narrow goal - that is what makes them precise.
09Prompts vs Hooks (most important concept)
  • Anthropic's own example shows a prompt-based refund check fails 12% of the time and refunds the wrong customer.
  • Use prompts for style, tone, and formatting - anything where 90% is good enough.
  • Use hooks for compliance, financial, and security - anything where one failure causes real harm.
13Why tool descriptions matter most
  • Ambiguous overlapping descriptions cause misrouting in roughly 40% of cases.
  • Adding explicit 'use INSTEAD OF X when Y' clauses drops misrouting under 2%.
  • The description is the interface - fix descriptions before adding new tools or routing layers.
16tool_choice: auto, any, forced
  • An agent with 18 tools makes worse decisions than one with 5; less choice means better decisions.
  • Force the first tool call so step one is always consistent, then loosen the leash for the rest.
  • Use tool_choice = forced for extraction pipelines where the schema must be filled exactly.
18The 3-layer CLAUDE.md hierarchy
  • Layer 1 user-level stays in your home directory and is never shared via git.
  • Layer 2 project-level lives at .claude/CLAUDE.md and is version-controlled for the whole team.
  • Layer 3 path-specific rules in .claude/rules/ only load when an open file matches their glob.
23Why you need a separate review session
  • The session that wrote the code is biased toward thinking it is correct.
  • Spin up a fresh, stateless Claude session for review - it catches what the writer never will.
  • claude -p with --output-format json lets a CI pipeline run that review automatically.
26Few-shot examples vs instructions
  • Two or three concrete examples beat a full page of written instructions every time.
  • Claude does not memorise the examples; it generalises the underlying pattern.
  • Show null for one missing field once, and Claude uses null for all missing fields after.
29The lost in the middle effect
  • The first 40% of context is well-primed and the end has recency bias - the middle drifts.
  • Pin a key-facts block at the top so critical state stays in high-attention territory.
  • Trim verbose tool outputs and delegate messy work to sub-agents whose context never leaks back.
31When to escalate to a human
  • If the customer asks for a human, escalate immediately - do not try to resolve first.
  • If policy is ambiguous, escalate with a structured handoff package, not a vague apology.
  • If the issue is straightforward, resolve it but still offer the option of a human.
34The 5 rules that apply across every domain
  • Hooks beat prompts whenever a behaviour must hold 100% of the time.
  • Always return structured errors - never a generic 'failed' message.
  • Cap each agent at 4-5 tools to keep decisions sharp.
  • Use a fresh Claude session to review code; the writer is too biased.
  • Show two or three examples instead of writing more instructions.
Glossary

Terms worth knowing.

Claude Certified Architect
A new Anthropic certification exam, pass/fail, organised around five weighted domains covering how to design and operate Claude-based agentic systems.
Agentic loop
The repeating cycle of request, response, tool_use, execute, and continue that drives every Claude agent - the loop ends when stop_reason indicates completion.
stop_reason
The field on a Claude response that signals why the agent stopped - 'tool_use' means it wants to call a tool, 'end_turn' means it is done. The only reliable end-of-loop signal.
Hub-and-spoke architecture
A coordinator agent at the centre delegates subtasks to specialised sub-agents that run in isolated contexts and return summaries the coordinator stitches together.
Sub-agent
A Claude agent spawned by a coordinator with its own context window, its own tool set, and no memory of what other sub-agents are doing.
Hook
A deterministic script Claude Code runs before or after an action; if the hook returns a block, the action physically cannot happen. The opposite of a prompt suggestion.
MCP server
Model Context Protocol server - a process exposing tools or data to Claude through a standard interface, scoped at either project (.mcp.json) or user level.
Tool description
The natural-language text attached to a tool that tells Claude when to call it; the single largest lever for reducing misrouting between similar tools.
tool_choice
An API parameter with values auto, any, or forced that controls whether Claude can decide to skip tools, must call some tool, or must call one specific tool.
CLAUDE.md
A markdown file Claude Code loads into context on every session; the exam guide recommends splitting it across user, project, and path-specific layers to avoid loading everything every time.
Path-specific rules
Rule files in .claude/rules/ that declare a glob pattern at the top and only load when an open file matches, keeping CLAUDE.md lean.
Skill
A scoped Claude-Code capability defined in its own file with a frontmatter declaring allowed tools, running in an isolated context so its messy work does not clutter the main conversation.
Plan mode
A Claude Code mode where the agent explores, reads, and proposes changes without modifying any file, so a human can approve or redirect before execution.
CI/CD pipeline
Continuous Integration / Continuous Deployment - an automated chain that builds, tests, and ships code without manual intervention; the guide covers how to drop Claude Code into this chain.
claude -p
The Claude Code flag that runs a single task non-interactively with no confirmation prompts, suitable for scripting and CI.
Few-shot examples
A prompting technique where you show the model two or three concrete input/output pairs instead of describing the rules in prose; Claude generalises the pattern.
Lost in the middle
The phenomenon where a long-context model pays close attention to the start and end of its context but degrades on information buried in the middle.
Forced tool use
Setting tool_choice to a specific tool so Claude must produce a tool_use block in that schema; eliminates JSON syntax errors but not semantic ones.
Resources Mentioned

Things they pointed at.

00:22linkAnthropic Claude Certified Architect exam guide
16:58toolWarp (terminal used in live demo)
15:40toolclaude mcp list (Claude Code command)
31:38tool/memory (Claude Code command)
38:07linkX post: I want to become a Claude code architect (community prompt set)
Quotables

Lines you could clip.

16:58
Giving an agent 18 tools is like hiring a brand new employee and giving them access to every single system from day one.
Vivid analogy that explains tool overload in one sentence.TikTok hook↗ Tweet quote
09:56
Prompts are suggestions and hooks are laws.
The whole prompts-vs-hooks debate compressed into eight words.IG reel cold open↗ Tweet quote
24:26
Fresh eyes, even AI eyes, catch more.
A clean one-liner for the separate-review-session rule.newsletter pull-quote↗ Tweet quote
28:03
Claude doesn't just copy paste your examples. It learns the underlying patterns behind them.
Reframes few-shot prompting in a way most beginners miss.TikTok hook↗ Tweet quote
29:23
Two to three examples will beat a full page of instructions each and every single time.
A defensible, controversial claim about prompting.IG reel cold open↗ Tweet quote
32:08
It's infinitely better to start a brand new session with a summarized version of outputs from before versus pushing through a conversation even if you're at that million context window.
Counters the 'just use a longer context window' instinct.newsletter pull-quote↗ Tweet quote
12:40
The description of the tool is really the interface of tooling.
Reframes tool descriptions as API design.TikTok hook↗ Tweet quote
The Script

Word for word.

metaphoranalogy
00:00Anthropic just released something huge, which is a brand new certification program called the Claude certified architect. It's a real exam, pass or fail, and one of the most important things is that it's based on five core domains. Now whether you're trying to get certified or you just want to get better at using Claude code and becoming a master of it, then this entire syllabus that they've put together will act as the best resource for you to learn exactly how to go from zero to hero.
00:26So here's the deal. I went through the entire exam guide myself, not through an LLM. I read through every single page.
00:32So I'm going to synthesize this entire exam guide into this video and break down each and every concept for you. And as a bonus, I'll share some resources that should supercharge your learning journey.
00:43So this is the very guide that I was discussing before. It is 40 pages and it walks through everything from the target candidate description of what it's like to be a master of Cloud Code in Anthropics. Each and every paradigm you should master and be prepared for.
00:57There are some exam content response types. There's some preamble around what the exam entails, what the format is, and this is the most important part, which is the content outline. And this is where it breaks down the distribution of each and every part.
01:10And beyond that, in each area, it walks through a series of scenarios, and you'll see that it's very specific. It's not just about knowing MCP tools, it's about understanding exactly how to conceptualize each and every part of said tool.
01:24If you scan through all of this, you'll see exactly how detailed it is and it can be overwhelming to many, which is why I wanna break down each and every concept and go through every domain with a fine tooth comb. So like I said, there are five different domains that Anthropic thinks that you need to know before you can call yourself a Claude code master.
01:42So the biggest one, which is 27% of the exam, is agent architecture. So this is basically how Claude thinks step by step, how it coordinates with other agents, and most importantly, how you enforce rules that just telling Claude in a prompt can't guarantee.
01:57If you only study one domain, this is the one. Now tool and MCP integration is at 18%. So this is how Claude connects to the outside world, your databases, your APIs, your file systems.
02:09And the number one reason why agents call the wrong tool is actually embarrassingly simple, and we'll get to that shortly. So Claude code configuration sits at 20%. This is your Claude MD files, your skills, your commands, and things like plan mode.
02:23And most people dump everything into one giant file, but the exam guide teaches you how to split that into three layers so Claude isn't loading irrelevant tools every single time you ask us something. Ultimately, we have the prompt engineering section, and this isn't just about writing better prompts.
02:42The certification is actually very specific here. If you want Claude to give you consistent output, show it two or three real examples of exactly what you want. That works better than writing a whole paragraph of detailed instructions every single time.
02:56And last but not least is context management and reliability. So here's the thing. Claude reads the beginning and the end of what you give it really well, but the stuff in the middle sometimes can get fuzzy, and that's called lost in the middle logic.
03:10So now that we've set the tone for what's important, let's go through every single one, and we'll naturally start with the largest and most important, which is the agentic architecture and orchestration. Let's start with the very engine that powers each and every Claude agent, which is the agentic loop. So whether you're using Claude code, the Anthropic SDK, or any agentic framework that is built on top of Claude, this is what's happening each and every single time you run an agentic workflow.
03:36So your code first sends a request to Claude, and then Claude naturally responds. The most important thing to keep aware of is this stop reason right here. You wanna check this all the time.
03:48If it says tool use, that means Claude wants to go use a tool, like reading a file or running a command. You can execute said tool, feeds the result back, and then it goes again into this endless loop. But in this case, it's not actually endless.
04:00It goes up until a very certain point. If it says something like end turn, then that's essentially Claude saying that I'm done. That's pretty much the entire engine over and over again.
04:10Now the exam guide points three different areas where people make mistakes in understanding the agentic loop, and these are basically the anti patterns. So first, reading Claude's text looking for phrases like I'm done or task complete.
04:24That's unreliable and it breaks all the time. Second, you don't wanna set limits like stopped after 10 loops. You don't know the level of depth that Claude code needs to do yet to accomplish a specific task.
04:36So you might be cutting off work that genuinely needs 11 steps. And third, you don't wanna look at what Claude said to figure out if it's finished. There's a very specific field, like I said, stop underscore reason that exists for exactly this purpose.
04:51It's the only thing that you should be checking. Now if you're using Cloud Code in a terminal, then sometimes you won't see this stop underscore reason.
04:58Every time Cloud Code reads a file, executes a tool, runs a command, or spins a sub agent, this is exactly the process and the patterns that drive that forward. Understanding this will really help conceptualize everything else we're about to cover. So when Claude needs to do complex, like research a topic from multiple angles or process a really large project, You don't need to send one agent to do everything.
05:21You basically have one agent that sits in the center, and this is the main agent. It breaks the task down, hands off pieces to specialized agents, and then combines the result at the very end. So these would be examples of the other sub agents.
05:35So one synthesis agent that uses tools to verify and write, another search agent that uses tools like search and fetch URL, and then you have the analysis agent that uses tools like read doc and extraction. Now the exam guide mentions this specific concept. So each one of these agents has its own separate window, its own separate set of tasks, and technically its own world.
05:56So there's no communication in between different sub agents. That's actually what the newer agent teams feature was designed to do, which is enable that communication through providing the equivalent of an email inbox to each agent so they can email each other, see who's blocking who, and execute the task in unison.
06:14So it's important to understand that sub agents don't maintain track of memories of other sub agents at the same time. So sub agent a will have no idea what sub agent b did. It will all kind of come together at the very end once the main agent takes the TLDR of each's outputs and or findings and then disseminates that back to you.
06:34Now there's one major mistake that many people make when it comes to understanding sub agents, and it's the following. So let's zoom in here. Even though you have a main agent, it's completely possible that the coordinator could break down tasks too narrowly, and this is something you have to look out for.
06:50So in practice, it could look like this. You say research AI in creative industries, and the coordinator only creates subtasks about visual art. So digital art, graphic design, or photography, but it completely misses music, writing, film, and game design.
07:07So the sub agents did their job perfectly, but the coordinator just scoped it wrong. It's basically like having a bad manager for a great team.
07:16So the fix is to give broad goals, not narrow checklists to this main coordinator agent. You wanna let the sub agents figure out how to break down all the sub tasks based on their narrow goals that are defined by this broader goal.
07:30So instead of me just breaking this down conceptually, let's go down into the terminal and see an example of this. So if we go into warp here, I'm going to copy paste this prompt and we'll send it over. And while we send it over, I'll read it through.
07:43So it says, I want you to research the impact of AI on content creation by spawning three sub agents in parallel. The first sub agent is to research how AI is changing video content creation. Creation.
07:55The second one is to research how AI is changing a written content creation. And the last one is how it's changing audio. And then we say each sub agent should search the web and return a three bullet summary.
08:06So it's very broad in terms of what they're looking for. We're not micromanaging exactly how they're going to do said thing, but we're giving them the overall assignment.
08:14So you've taken the equivalent of an employee, you've onboarded them, and then after onboarding them, you trust them enough to give them a very well situated task and allow them to execute independently. And after completes, you could see right here all three agents are finished.
08:29Each one use its own set of tools, its own set of tokens, and then we have the results from each one, and then we have the overall synthesis. So this is the main agent compiling all the results of the sub agents, and this is how this coordinator sub agent pattern works.
08:43So this might be the most important part of the entire exam guide where they differentiate between prompts and hooks and when and where to use each. So prompts are what I call best effort. You can tell Claude something like always verify customer before processing a refund, and most of the time it works, but sometimes it doesn't.
09:02If we even hop into their exact scenario, they have this question where they say production data shows that in 12% of use cases, your agent skips this invocation of a function get customer entirely and just go straight to look up an order based off of the stated name occasionally leading to misidentified accounts. So from a business perspective, this is not okay.
09:2412% of accidentally provided refunds to the wrong person or to people trying to take advantage of this becomes a big issue.
09:31Now hooks are completely different. A hook is basically a small script that runs automatically before or after Claude tries to do something, and it can literally block Claude from taking an action unless a specific condition has been met. So it's not 99%.
09:46It's not 99.9%. It has to be a 100%. And the action physically can't happen if the hook says no.
09:54So you can think of prompts as suggestions and hooks as laws. So the exam guide draws a very clear line on when and where to use prompts versus hooks.
10:04And when it comes to what it's good for, it's primarily around style, tone, and formatting. These are things that you can execute well 90% of the time, and it won't land you necessarily in an area of harm or a land of hurt. Hooks are optimized for compliance, financial stuff, and security.
10:21So anywhere where one single point of failure can cause some real issues. And this overall concept is where a lot of people go wrong because they just think that if something failed 90% of the time, they can just tweak the prompt to perfection. And as running a company called prompt advisers where initially, all I would do for companies is help them prompt engineer every system prompt in a system, in a production use case for content creation, there is a level where a prompt is just not good enough over a thousand iterations or 5,000 iterations.
10:51If you're not as familiar with hooks and you want a little bit of a debrief, then you have two options. If you pop into a terminal, you could always do slash hooks, and then this will show you each and every way that you can invoke a different tool.
11:04And this list goes on and on and on. And if you click on one and you click enter, it'll show you exactly what it would do. And option two is, I showed this in a prior video, you can use my favorite function in Claude code, one of the most underrated, which is the Claude code guide agent, and then you can ask it what are the best hooks for x use case.
11:26And then it will go through with full knowledge of what hooks it has at its capacity and which one is optimized for your use case and whether or not you should be using a hook or a prompt to begin with. So whereas the last concept, the enforcement piece is probably the most important, this is the highest leverage, which is getting tool descriptions correctly, which is basically giving Claude code, which you provide it with whatever tools you want at its arsenal as well as its native tools, the right tool at the right time with the right description for the right use case.
11:55So tools are basically how Claude decides which tool to use when it has multiple options, And that's typically not a small feat because you can have two tools that have vague overlapping descriptions, like one that retrieves customer information and another that retrieves exactly what the order entails.
12:13So that could lead to some form of communication issues. So Claude has to essentially guess, and the exam guide covers this very clearly.
12:20Ambiguous descriptions cause frequent misrouting. So Claude ends up calling the wrong tool way more often than you'd expect. And one really important thing to note is that when you invoke these tools, sometimes you see the final result being executed properly.
12:36But back when everyone was using no code tools like n eight n, you would have the agent in that platform work and execute the workflow, and you would see it fire the right result. So you could get the exact result you're looking for in Cloud Code, but you have no idea that it actually did the wrong thing three or four times to eventually do the right thing.
12:55So it's not just about the outcome, but also the efficiency in getting to that outcome because as it tries through all the ways it doesn't work, it spends your tokens and you wanna be as token efficient as possible. So to give you something more tangible, let's say this is one of your functions, get customer, you would basically say that you wanna use this tool whenever you need customer ID and profile data, and you want to use the lookup order instead when you have an order number and need a shipping status.
13:21So you essentially want to be intentional in saying do not use this tool when this happens versus just saying when it should use that tool. And this is pretty much the highest leverage tip from the entire guide, which is the description of the tool is really the interface of tooling and fixing the descriptions to make sure it knows the optimal path, the critical path the most critical thing that you can do in your workflow.
13:43Now while MCP servers are increasingly falling out of favor for different use cases, there are times where they make sense. So the exam guide does cover where to use different levels of scope for your MSP servers.
13:55You have project level and you have user level, and I'll walk you through the difference of when and where to use both. And this essentially allows Claude to connect to external tools like GitHub, Slack, Outlook, whatever it is.
14:08It's one of the vectors that you can use. And if you've ever used Claude Chat or Claude Cowork and use their connectors feature, it's essentially using an MCP under the hood. So project level MCP lives in a file called the dot m c p dot JSON at the root of a project.
14:24So any passwords or API keys go in what are called environment variables, where they're denoted as dot n for dot environment, and they never directly end up in the core file. So for example, if you had an MCP server for GitHub, which is essentially code version history, you would have an environment variable.
14:41It would be set to the token of your GitHub, and this would be written to an environment file. So every single time that an agent would try to use an MCP server, it would then be auto authenticated through this file, then go and invoke this specific service. User level MCP lives in a file in your home directory.
14:58So this is basically your personal sandbox. You have experimental tools, personal API keys, things you're testing before rolling them out to the rest of the team.
15:06Now the practical takeaway from the exam guide is that essentially you can use as many community based MCPs. These are not necessarily open source MCP servers, sometimes aren't too safe, but more so the native MCP servers from the platforms themselves. So if you look at the major providers like Salesforce, GitHub, etcetera, everyone has some form of instruction for using MCP servers, and only build custom servers when you absolutely need to.
15:32And it's important to remember that an MCP server is purely a function. So if you just need your functions executed in a slightly different way or different order, you might not need a custom MCP. Now real quick on the terminal side of things, all you'd have to do is go into your terminal, and you could do one of these two things.
15:48You could say, Claude MCP list. This would go and invoke if you have any MCP servers whatsoever.
15:55Now personally, I've migrated my entire ecosystem to skills, CLIs, etcetera. So you won't find any that are already authenticated. You'll find just the shell of the ones that I used to use.
16:04So the Gmail, Google Calendar, Canva, and Zapier, all of them I used to use, but now I've migrated all of them to use the skills primarily just for token efficiency, security, etcetera. But if you wanted to see which ones you had out of the box, that's the way you do it. If you're using MCP servers at the the project level, then you could just paste the command just like this, where you could say, show me the dot MCP dot JSON file in this project and explain the MCP server configuration.
16:31And then you get this response where in this case, I don't have in this particular project an MCP dot JSON file, and it walks through what needs authentication like we saw before, how to configure it, and there's that command that I showed you before, Claude m c p list. It basically invoked that.
16:45So whether you're asking for it through natural language or going straight to the source with this command, then you can have full visibility on what's happening with your MSP servers. The next principle in the exam guide is the tool overload problem, and this is essentially making better decisions by having less options.
17:01So you can think of it like this. Giving an agent 18 tools is like hiring a brand new employee and giving them access to every single system from day one. They're gonna use things that they shouldn't call tools outside their lane.
17:14You wanna keep each agent down to a maximum of four to five tools that are directly relevant to what they're doing. That constraint is really what makes them precise. And if you need a reminder, earlier, I showed you an example of spinning up three sub agents, and you'll notice that all of them used four or five tools at max.
17:32So this is really a paradigm that's built into Cloud Code, and that allows it to have a process, create SOPs. So this one would be search, fetch, extract, and save. So the goal is being precise, reliable, and always on task.
17:46Now there's also a setting called tool choice that controls how Claude picks his tools. There are three main modes. One of them is auto where basically Claude decides on its own whether to use a tool or not.
17:58And then you have another one called any, and this is essentially forcing Claude to use a tool, but it has to pick which one. And finally, have forced, meaning we are making it.
18:09Use this tool and there are no options. It's not just independence, it is forced dependence on a particular outcome.
18:16So the guide alludes to the fact that you can force a tool call to make sure that step one is always consistent and predictable, and then you can loosen that proverbial leash of Cloud Code to run freely and make more autonomous decisions as long as you know you've steered it in the right direction. So you're essentially putting guardrails on its first move or couple moves and then allowing it to run freely and really tap into that power of the agentic harness.
18:39Now next up is one of the most contentious topics in Claude code, which are Claude MDs, which are the heart and soul of your operating system, your air traffic control of your repo or project if you will. And pretty much it covers all the different layers, three different layers, the user level, the project level, and the path specific rules.
18:58Now most people dump everything they know into Cloud MD, they think that it's a proxy for a knowledge base or rag, but essentially it's not. People dump their preferences, their rules, their style, their tone all in one place, and then complain why there's so many tokens being wasted all the time.
19:14The big issue is that every single time you open a brand new session, Claude auto injects that straight into memory. So you're wasting time and you're wasting tokens. So the guide splits it into three different layers.
19:26One is the user level, the next is the project level, and the last one are path specific rules. So you can treat your top layer as your personal preferences file. This lives in your core home directory.
19:36So you have your editor settings, how you like your explanations formatted. So this one's just for you and not meant to be shared with anyone or through something like GitHub. So the middle layer is a project level CloudMD, and this is where you have things like team rules, coding conventions, architecture decisions, and this essentially allows you to share it with your team assuming you have one so that everyone's on the exact same page.
19:59So this is where having some version control makes a lot of sense. And finally, we have the bottom layer here, and this is really the golden nugget of the three levels. These are path specific rules.
20:09So you create a small rule file that lives in the dot claud rules folder, and at the top of each file, you put a pattern that says when to load it. So when something like only load this when I'm editing files is a very good example. So your testing rules only show up when you're writing tests, and your API rules only show up when you're in the API folder.
20:30And lastly, if you have something like React components, if you're a developer, then you know what that is. If not, then don't worry about it. The TLDR is this is huge because Cloud Code can get focused.
20:40So you can have a lean and mean Claude MD and rely on rules to cure the path forward for any nuances that need to be taken account for a specific use case. So I know I'm throwing a lot at you right now, but the next section tries to bring everything together into cohesion. So it's really about when to use what because we haven't even started speaking about things like skills, like commands, plan mode versus direct execution, when to use each.
21:05So commands are basically reusable prompts. You save them once and you can trigger them with a slash command. So you can have slash review PR slash generate tests slash morning if you wanna execute a walk through of what your day looks like based on your calendar, your Gmail, anything you've hooked up maybe using the Google CLI.
21:23But one thing to note is that team wide commands go in a commands folder in your project so everyone can use them via something like git, whereas personal ones will end in your root folder, and these are your personal flash commands. So these are specific to you and tailored to exactly what you wanna do day to day.
21:39Now we've gone through skills at length in this channel, but just in case you and I are meeting each other for the first time, we'll go through that as well. So skill is a step above a command. A skill has its own file that defines what it can do, what tools it's allowed to use, and it runs in its own separate context.
21:55So you can think of it like this. A skill can do messy exploratory work like research files, do pretty much anything you want, and none of that clutter ends up in your main conversation. It's like sending someone to go do research in another room, and you're just bringing the summary back to main conversation.
22:11Now moving on to another existential question that many people ask and the guide goes through, to use or not to use plan mode. So if the task touches multiple files, it's ambiguous, or it could go in a few different directions, then using plan mode is the way to go. Claude explores, reads, and proposes changes without actually modifying anything.
22:31Just review it, approve it, or tell it to go in a different direction. But if it's a very obvious and straightforward single file fix, then you can just let Claude execute it directly.
22:41So you don't have to over plan in this case the same way many people will over engineer things. Now this next part is fairly advanced. So if you're nontechnical, this part might leave you squinting a little bit, but I'll try to explain it as best I can.
22:53So this is about using Claude code in what's called a CICD pipeline. What this stands for is continuous integration and continuous development.
23:02So if were to break down this concept into one sentence to make it as accessible as possible to everybody, it would be the CICD pipeline is an automated conveyor belt where a developer will push code, that code will be reviewed, and then it will be shipped and pushed to the end user, all without any form of buttons being pressed along the way.
23:21So the guide really focuses on this step three right here, but we'll get to that in a second. Step one is, like I said, you have a developer that pushes some code. Then this triggers the CI, the continuous integration pipeline to go and check it.
23:35Step three is where the magic happens, and this is what's called Claude dash p. Claude dash p is not a very straightforward concept. So, again, I'll try to break it down.
23:45The dash p essentially allows Claude code to run without asking you for anything. So no prompts, no confirmation. It's essentially bypassed permissions mode in a way, and it just runs the task you give and it gives you the result back.
23:59And then you have a flag that gives you a clean structured output that other tools can read. It's actually called the dash dash output format JSON flag. When you put these together, these two flags turn Claude code from something you can chat with solely into something that you can use to automate different parts your process.
24:16So the main learning here is that you can trigger this from any CICD pipeline, any system that essentially tests and deploys your code. Now it's hard to make that part less gibberishy, but this part will be the main takeaway from that section.
24:29And this is their important note on using separate clawed code sessions for reviewing code versus writing code because there is some level of pollution. When you write code, you essentially are biasing the language model to say, yeah. Yeah.
24:43I wrote amazing code. Because why would the language model write poor code on purpose? So you need a stateless session to go and review any form of code, anything that was produced in session one assuming you're doing something more on the technical end of things.
24:56So if you need a little anchor to remember, then you can remember that fresh eyes, even AI eyes can catch more. Two heads are better than one. In Claude code's case, five, ten, 15 heads sometimes are better than one at reviewing code as long as it's in a separate session.
25:10So for example, if I said claw dash p list all Python files in this project and summarize what each one does, output this format, then you will see it goes through every single Python function in my folder, which I won't get into in-depth, and then it comes back with the full key patterns here. All scripts use Gemini three pro preview for images.
25:31This is my thumbnails generation folder and dated output folders report lab for PDFs and one script per video topic design. So when it comes to making outputs reliable, this is a whole portion of the guide that's dedicated to dealing with inconsistency in Claude's responses.
25:48So your instinct when Claude gives you inconsistent outputs is to write more instructions. So for whatever reason, your instructions involve number crunching, something like handling different currencies, different decimal places, you try to shove all of that in there.
26:02But Claude interprets it differently each time. One response can give you one number versus another depending on the day, the model you choose, etcetera. So you can have the same set of instructions, but three different results because people keep forgetting that this isn't magic, these are language models.
26:19Now to fix this, Anthropic recommends going to few shot examples. If you're not familiar with what few shot are, these are from the beginning of prompt engineering time, one of the best ways to get consistent outputs.
26:31So you give an example. In this case, the input could be Acme Corp reported 4,200,000 in revenue for 2024, and this is the output you want exactly.
26:41So you give it exactly the parameters. In this case, we're putting it in JSON. This could be in whatever format you want, and same thing for example two.
26:48So multi shot gives it enough of a hint to generalize and better understand which direction you're going for. And the interesting thing here is that Cloth doesn't just copy paste your examples.
26:58It learns the underlying patterns behind them. That's why two to three examples will beat a full page of instructions each and every single time. Now in the same vein of consistent and reliable outputs, this also generalizes to JSON, which stands for JavaScript Object Notation.
27:14Very common when you're dealing with agents, with toolings, and tool calls. So this is also more of an intermediate to advanced use case, but it's important to know because it's covered in the guide. So I'll move from left to right and, again, try to make this as accessible as possible.
27:27So step one is you define a tool, which basically acts as your template, providing the exact structure you need. So every field, every data type, and whether or not it's required.
27:38Leaving something as optional is beneficial for Claude because otherwise if you don't tell that it's optional, then Claude will make it up. So making it optional allows Claude to say, I don't know in a very legal way.
27:51So you're a legalized way of allowing it to say, I don't have this. I don't know what to do with this. In step two, building on what we referred to before, you can force Claude to use a specific tool.
28:01So there's no option to respond with plain text, no option to use a different tool. It has to fill your template as is. So just as a takeaway, this eliminates syntax errors.
28:10So anything like malformed JSON, this is just broken JSON or markdown wrapping, but it does not eliminate semantic errors. So anything with a wrong value in set field. So step one is you extract the data, and assuming it's correct then obviously this is all done.
28:26But if it's not correct, this is the part where you really need to dial in. You're not meant to just say try again. You're meant to actually send very specific feedback.
28:34So instead of saying retry, you would say the original document, the field extraction, and the specific error. And this is how you would frame the specific error.
28:43Revenue field says $0, but document clearly says 4,200,000. So now you're giving it multiple areas to zero in on and see what might be happening.
28:53But just like everything, there's nuance. You don't wanna just keep going in this endless loop. If the answer isn't there, if the information isn't in the source document, then retrying even with the best of instructions won't help.
29:04So it's not just knowing how to validate and test, but also knowing when to stop. Now as a segue to the next section, back when I used to drive a Honda, once in a while you'd get this notification which politely asks you to take a break and typically it's because they want you to have attention to the wheel especially for longer drives.
29:22Drives. You can take this exact same paradigm in mind for the next principle. So instead of worrying about driving sharply, this focuses on keeping Claude sharp throughout the lifetime of a context window.
29:34So when you give something to Claude code to read, it pays really close attention at the beginning. The first 40% of context window is really well primed. You have the system prompt.
29:44You have the first messages. You have your Claude MD injected at the beginning as well, and it really pays attention. And this is also true near the very end where you have recency bias towards the latest messages.
29:55But the context in the middle or the monkey in the middle starts to get a little bit fuzzy. So information buried in the middle starts to get compartmentalized in a way where it can't maintain that full fluency or flow of thought.
30:07Now the problem can get worse over time because every time Claude uses a tool, the result is added to this middle section. So a customer comes back with 40 fields when you only need five.
30:18Each one pushes the important stuff further into the fuzzy zone. So naturally, how do we fix this? Well, Anthropic comes up with three different ways to accomplish this.
30:27First, you can pull out the key facts and put them at the very top of a conversation, essentially pinning them in a place where Claude will always see them. So you can think of it as a key fact summary block.
30:38Another method you can employ is trimming verbose tool outputs. And what the word verbose means here is you get a series of data from a tool. A lot of this data could just be pure metadata that doesn't actually move anything forward, and you can get rid of it.
30:52And by trimming it progressively, you just keep the tool outputs that matter which will flood the context window less. And the third way is to delegate tasks to sub so they can maintain all of their messy output in their own individual context and box. It's all isolated and boxed off, and you just get a clean summary back.
31:11The guide actually mentions explicitly that it's infinitely better to start a brand new session with a summarized version of outputs from before versus pushing through a conversation even if you're at that million context window because you have all of this different set of information, tool calls, different trials, pivots in the conversation that pollute your context window.
31:32If you're ever curious at what is in your memory at a single point in time, you can always go into Cloud Code, do slash memory, then in here it will tell you that auto memory is on, what your project memory looks like, if you wanna check-in at your Cloud MD, the fact that you have some certain rules here, some level of user memory, and then you can also open your auto memory folder to see exactly what's in there.
31:54So you can click on enter right here. This will open it up in another window, then you'll be able to see a series of markdown files that denote exactly what it's remembering about your current session. And to close the loop on reliable outputs, there's a section related to human in the loop, which is basically when do you escalate a particular scenario to an agent.
32:14So if you have some form of chat agent in the wild and someone asks to speak to a human, then the goal should not be to try to fix the issue first using a language model and not to try one more thing. It should be to respect this request and execute it right away.
32:29And it's important to zero in on this because it explicitly says that this will probably trip up people because you'll try to get creative with how AI can answer something, but if someone asks for handoff, you give them handoff. Then the second scenario, the rules could be unclear and the agent could be unsure about what policy applies.
32:46The prescribed action here is to escalate, but escalate using what's called a full package. This full package includes what the customer information is, the ID, the root cause, what was attempted and tried, and what is the recommended action. So very similar to managing an actual customer system like Zendesk or similar, you would execute this in a very similar way.
33:06So the agent would technically come to the conclusion that it can't make any meaningful progress, and this is what it could look like in terms of a final package to hand off. And for the third scenario, if it's a straightforward issue, the policy is clear, which is to allow the agent to resolve it.
33:21But it comes with a big caveat. Even if it resolves it perfectly, it should still ask, would you prefer I transfer you to human agent?
33:30So you wouldn't want to give the agent itself a confidence score and escalate it when it's low. And one of the many reasons why it thinks that sentiment analysis can miss the mark is it can misread sarcasm, cultural differences, and tone.
33:43So I actually had to double check whether or this was legit. So I noticed that this is the question in the example guide here which says your agent achieves fifty five percent first contact resolution well below the 80% target.
33:55And you can see here it says that sentiment doesn't correlate with case complexity which is the actual issue. Alright. And we're getting near the end here, and this portion of the guide just covers error propagation.
34:05Basically, what to do when things go wrong. Now compare that to a detailed error that includes what went wrong, what was attempted, any partial results that came back, and what else could be tried.
34:16So now the main agent can actually make smart decisions, meaning trying a different search using data from a previous run, switching to a completely different source, or just basically noting that gap and moving on.
34:28So the TLDR of the TLDR is this just breaks down how to allow your agents to fail gracefully. Meaning failing in a way where you get meaningful errors, you can get meaningful outputs and meaningful retries. And just to bring everything together, because we've had all kinds of thoughts and examples and paradigms, what are the five rules that you can take away that will set you on the right path?
34:48Whether you're just learning Claude code or you're preparing to dive into the Claude code architecture exam. So rule numero uno is if it has to work a 100 of the time, whether it's money related, security, legal, don't rely on telling Claude in a prompt.
35:04Use a hook that physically blocks the action. So prompts are suggestions and hooks are the laws.
35:10Rule number two is when something breaks never return a generic error. Always include what broke, what tried, what partially worked, and what else could be done. Rule three is keep your agents focused.
35:22Four to five tools max per agent and an agent with 18 tools makes infinitely worse decisions than one with five that are directly So less choice and better decisions. Number four is review your code in a separate Claude session.
35:37The one that wrote the code is naturally biased toward thinking it's correct. A fresh session with no history catches what the first one will never. So two or three real examples of what you want produce more consistent results than a full page of written instructions.
35:55Claude learns the pattern, just not the format. And the real trick here is understanding that although these are five separate rules, they're all kind of the same concept, which is if you need to rely on building proper agentic systems in the wild, then you want to focus on the right tool that has the right level of determinism, which is its ability to execute something predictably every single time.
36:17And the main thing to take away from this is that although these are five different rules, they're essentially the exact same concept just showing up in different patterns. And the TLDR of it is to be structured, to be explicit, and to not have a what if or a probably will work with something like a prompt when you need the firepower of something like a hook.
36:37So if you nail understanding these core five principles, it'll give you the 80 of the eighty twenty. And more importantly, it'll give you the foundation to keep adding on additive knowledge.
36:46Now you might think that I'm gonna end off there. You might be even hoping for it, but I'm gonna leave you with one more thing before we depart for this video. So I found this really good guide by this user on x.
36:58I can't pronounce the username, but he came up with this article here that says, I want to become a Claude code architect. And in it, he came up with a series of prompts that break down each and every section of the official architecture guide. He's created these very bespoke prompts, I would imagine, using AI, and you can just pull this up and go into Cloud Code.
37:21You can paste it, and then it will ask you and interview you on your competence on a particular domain. So if we take this behemoth prompt for section one, this just says you are an expert instructor teaching domain one, architecture and orchestration of the Claude certified architects certification exam.
37:38And then at the bottom, it says welcome. It tells you the weighting of this particular section, and it asks you how familiar you are with AgenTic systems. If you say something like none, then it will create a custom learning path for you to start going back and forth through these concepts.
37:53So you see here, it breaks down what an AgenTic loop is, and at the very bottom, there's a concrete example. The critical field, we already alluded to this, the stop reason, the anti pattern, the correct pattern.
38:05It's gonna keep going telling you which part of the guide to reference, and this is awesome. So kudos to this individual. I'll leave the link for you with some other goodies that I'm about to tell you right now.
38:14Now as I recover from filming this video, I'm gonna leave you with a mega guide going through everything I walked through today with the actual visuals themselves, a breakdown of the concept, hopefully, in a better way than I was even able to articulate. And I'll make that available to you in the second link in the description below.
38:30And for those that wanna go infinitely deeper on Claude code and be in a whole ecosystem where you have coaches, myself, a brand new upcoming course, which is bound to blow your mind in terms of what you can do with Cloud Code, then you'll wanna check out the first link in the description below and maybe join me in my early AI adopters community.
38:47And for the rest of you, if you found this to be a helpful labor of love, one thing that you could do as a thank you is just leave a like and a comment on the video. If you like my stuff and you want me to go deeper on these kind of concepts, then subscribe to the channel and let me know. I'll see you in the next
The Hook

The bait, then the rug-pull.

Anthropic just shipped a real, pass-or-fail Claude Certified Architect exam, and the 40-page exam guide doubles as the best Claude Code syllabus in print. The host reads every page, then translates the five domains into plain English with diagrams and live demos so you do not have to.

Frameworks

Named ideas worth stealing.

01:39list

The Five Exam Domains

  1. Agentic Architecture & Orchestration (27%)
  2. Tool Design & MCP Integration (18%)
  3. Claude Code Configuration & Workflows (20%)
  4. Prompt Engineering & Structured Output (20%)
  5. Context Management & Reliability (15%)

Anthropic's exam weighting tells you where to spend prep time - agent architecture alone is more than a quarter of the test.

Steal forany internal team training on Claude Code
03:20model

The Agentic Loop

  1. Code sends request
  2. Claude responds with stop_reason
  3. If stop_reason = tool_use, execute the tool
  4. Feed result back, loop again
  5. If stop_reason = end_turn, stop

Every Claude agent - SDK, Claude Code, custom framework - runs this loop. The exit signal is the stop_reason field, not text in the response.

Steal forexplaining how agentic systems actually work to a non-technical stakeholder
05:18model

Hub-and-Spoke (Coordinator-Subagent) Pattern

  1. Coordinator decomposes the task
  2. Specialised sub-agents run in parallel, each in its own context
  3. Sub-agents do NOT communicate with each other
  4. Coordinator merges results at the end

The canonical architecture for complex multi-step work. Each sub-agent keeps its own tools and tokens; the coordinator only sees summaries.

Steal forany research, scraping, or analysis pipeline that branches
08:50concept

Prompts vs Hooks

Prompts are best-effort suggestions (probabilistic, ~88% reliable in Anthropic's refund example). Hooks are deterministic scripts that physically block actions. Style/tone -> prompt. Money/compliance/security -> hook.

Steal forany time you would write 'always verify X before Y' in a system prompt
12:07concept

Tool Description Anti-Pattern

Vague overlapping descriptions cause ~40% misrouting between similar tools. Adding explicit 'use INSTEAD OF X when Y' clauses drops misrouting under 2%. The description IS the interface.

Steal forany MCP server you ship or any internal tool registry
18:51model

The 3-Layer CLAUDE.md Hierarchy

  1. Layer 1: user-level (~/.claude/CLAUDE.md) - personal preferences, never shared
  2. Layer 2: project-level (.claude/CLAUDE.md) - team rules, version-controlled
  3. Layer 3: path-specific (.claude/rules/*.md) - conditional, only loaded when matching files are open

Split CLAUDE.md across three layers so Claude only loads what is relevant to the current task. Most people dump everything into one giant file and waste tokens on every session.

Steal forany team CLAUDE.md that has grown past a single screen
35:00list

The Five Cross-Cutting Rules

  1. If it has to work 100% of the time, use a hook not a prompt
  2. Always return structured errors - what failed, what was tried, what alternatives exist
  3. Cap each agent at 4-5 tools max
  4. Always review code in a separate Claude session from the one that wrote it
  5. Two or three concrete examples beat a paragraph of instructions

The closing distillation - five rules that apply across every domain. The unifying principle is determinism: pick the tool with the right level of predictability for the job.

Steal fora one-page Claude Code style guide for any team
CTA Breakdown

How they asked for the click.

38:07product
Check out the first link in the description and maybe join me in my early AI adopters community.

Soft, value-led. Framed as a community of learners with a course coming, paired with a free mega-guide so the ask does not read as purely commercial.

Storyboard

Visual structure at a glance.

cold open
hookcold open00:00
5 domains
promise5 domains01:35
agentic loop
valueagentic loop03:20
hub-and-spoke
valuehub-and-spoke05:32
subagent demo
valuesubagent demo07:50
prompts vs hooks
valueprompts vs hooks09:05
tool descriptions
valuetool descriptions12:00
tool overload
valuetool overload16:55
claude.md hierarchy
valueclaude.md hierarchy19:03
commands vs skills
valuecommands vs skills21:20
ci/cd pipelines
valueci/cd pipelines23:02
few-shot examples
valuefew-shot examples27:00
json with tool_use
valuejson with tool_use27:50
lost in the middle
valuelost in the middle30:10
human-in-the-loop
valuehuman-in-the-loop33:10
the 5 rules
valuethe 5 rules35:10
study resources
ctastudy resources37:10
sign-off
ctasign-off38:55
Frame Gallery

Visual moments.