Big Idea

The argument in one line.

Claude skills are not prompts. They are persistent SOPs that use progressive disclosure to protect context, route to cheaper models, and execute the same repeatable process with zero re-explanation on every run.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You use Claude Code daily and waste tokens re-explaining the same workflows in every session.
You want to automate a repeatable business process without context bloat eating your budget.
You are building multi-step agentic pipelines and need model routing to control cost.
You have heard of Claude skills but assumed they were just fancy system prompts.

SKIP IF…

You are looking for a quick-start template without the underlying conceptual model.
You already have a working, tested skill library built to Anthropic spec.

TL;DR

The full version, fast.

Claude skills are persistent markdown files that teach Claude a workflow once and run it reliably on demand. The core mechanism is progressive disclosure: the frontmatter loads every session for semantic matching, at a cost of around 100 tokens; the full SKILL.md body loads only when triggered; scripts and references load only when a step requires them. This keeps context lean, costs low, and output consistent. The video covers the full anatomy: description writing, invocation control, model routing to Sonnet subagents, degrees of freedom, validate-fix loops, MCP vs Python scripts, five design patterns, three-phase testing, and the A/B iteration method Anthropic recommends.

Chapters

Where the time goes.

00:00 – 01:42

01 · What Are Skills?

Skills as onboarding guides for Claude; persistent markdown files that load on demand; if you repeat a process, make it a skill.

01:42 – 03:05

02 · Progressive Disclosure

Three-tier loading: frontmatter triggers SKILL.md body on match which triggers linked files on demand.

03:05 – 05:26

03 · Skill Structure and Organization

Directory layout: SKILL.md SOP, scripts executed not loaded, references with examples and schemas, assets. The 97% context reduction example.

05:26 – 07:28

04 · Writing the Perfect Description

Max 1024 chars; third person only; answer WHAT and WHEN; include 3-5 trigger phrases. Bad vs good examples contrasted.

07:28 – 09:05

05 · The SKILL.md Body

Real-world research-lead skill walkthrough in VS Code; imperative writing style; frontmatter, model spec, allowed tools.

09:05 – 10:31

06 · Goal Format

Objective, inputs required, execution steps as numbered pipeline nodes. Specific tool calls with expected output per step.

10:31 – 12:55

07 · Model Routing and Subagents

model: sonnet, context: fork; Opus orchestrates, Sonnet executes; cost implications; allowed-tools scoping per skill.

12:55 – 14:07

08 · Invocation Control

By default both the user and Claude can invoke; disable-model-invocation: true for human-in-the-loop on destructive ops; user-invocable: false for background knowledge.

14:07 – 15:59

09 · Writing the SKILL.md Body - Instructions

Exact commands with full paths; expected output format per step; dependencies explicit. Vague vs specific imperative steps.

15:59 – 17:59

10 · Degrees of Freedom

High freedom for multiple valid approaches, medium for preferred pattern with variation OK, low freedom for exact commands do not modify.

17:59 – 19:05

11 · Feedback Loops

Bake validate-fix-repeat into the skill; max 3 rounds; report to user on persistent failure; skills evolve through iteration.

19:05 – 20:14

12 · MCP Tool References

Scripts for fixed deterministic pipelines with zero token overhead; MCP for judgment and external service interaction.

20:14 – 22:45

13 · Design Patterns Overview

Five patterns from Anthropic: Sequential, Iterative Refinement, Multi-MCP Coordination, Context-Aware Branching, Domain-Specific Intelligence.

22:45 – 25:21

14 · Testing Your Skills

Three tests in order: Trigger Test fresh session, Functional Test 4-5 runs subagent variants, Value Benchmark with vs without skill.

25:21 – 28:25

15 · Iterating with Claude A/B

Claude A designs, Claude B tests in fresh session. 7-step loop: complete, notice repeated context, create skill, review, test, bring failures back, refine.

28:25 – 31:20

16 · Practical Setup and Marketplace

VS Code plus Claude Code environment; dot-claude folder; Anthropic GitHub repo for skill creator; skillsmp.com marketplace.

31:20 – 36:05

17 · Live Demo: Converting a Goal to a Skill

Live conversion of Gamma slides goal file into SKILL.md using the skill creator; correct frontmatter, model routing, scripts output.

36:05 – 39:35

18 · Testing the Gamma Skill

Fresh session test; skill triggers correctly on natural language prompt; live generation of "The Importance of Salt" (11 slides) via the Gamma API.

Atomic Insights

Lines worth screenshotting.

Skills are not prompts. A prompt disappears after a session; a skill is persistent infrastructure that executes the same process every time without re-explanation.
The frontmatter description is the only thing the model reads every session. Get it wrong and your skill never fires, no matter how good the body is.
Write descriptions in third person (Processes emails) not first person (I process emails). Wrong POV breaks semantic discovery entirely.
A 3363-line Python execution layer returned 105 lines of context: a 97% reduction in token overhead by keeping scripts executed, not loaded.
Scripts are executed not loaded. They never enter the context window, which is the entire point of separating deterministic execution from probabilistic instruction.
Use disable-model-invocation: true for any skill that deploys to production, sends to external people, or handles sensitive data.
context: fork spawns an isolated subprocess so Opus handles orchestration while Sonnet handles execution. You pay Sonnet rates for most of the actual work.
High-freedom instructions are for judgment tasks; low-freedom instructions with exact commands are for fragile business logic where one variation breaks the pipeline.
The trigger test must use a brand-new session. Testing in the same window where you built the skill gives a false positive because context is already there.
Under-triggering means the description is too narrow; over-triggering means it is too vague. Both are description problems, not skill body problems.
A skill that does not improve consistency, quality, or speed over raw Claude output should not exist.
Use MCP when the model needs judgment or must interact with an external service; use Python scripts when steps are deterministic and repeatable.
The A/B iteration method where Claude A designs and Claude B tests in a fresh session is how you find description and instruction gaps invisible from inside the build context.
References are the pattern-matching layer: put examples of good output there so the skill matches proven patterns rather than inventing new ones each run.
The Agent Skills Open Standard means skills written today will be portable across Claude AI, Claude Code, and the API.

Takeaway

The system behind repeatable Claude workflows.

WHAT TO LEARN

A skill is not a better prompt. It is a workflow contract that executes consistently by keeping execution outside the context window.

01 · What Are Skills?

A skill is not a prompt. It is a persistent workflow that executes the same process reliably across every session without re-explanation.

02 · Progressive Disclosure

Progressive disclosure means the model loads only what it needs: frontmatter every session, the full SOP on match, scripts and references only during execution.

03 · Skill Structure

Scripts belong in a scripts folder and are executed, not loaded. A 3363-line execution layer can return 105 lines of useful output, a 97% reduction in what the model processes.

04 · Writing the Perfect Description

The frontmatter description must be in third person, under 1024 characters, answer WHAT and WHEN, and include three to five trigger phrases. Get any of these wrong and the skill never fires.

05 · The SKILL.md Body

Every execution step should include the exact command, the expected output format, and the dependencies. A repeatable workflow should leave Claude no room for ambiguity.

06 · Model Routing

context: fork spawns a cheaper subagent for execution while the orchestrating session handles coordination. Most of the actual work runs at Sonnet rates, not Opus rates.

07 · Invocation Control

Any skill that writes to production or sends to external parties should require explicit human invocation. Autonomous model triggering is a risk, not a feature, for consequential operations.

08 · Degrees of Freedom

Match constraint level to task fragility: judgment work gets high freedom; deterministic business logic gets exact commands with explicit instruction not to modify them.

09 · Feedback Loops

Baking a validate-fix-repeat loop into the skill body turns probabilistic output into a quality gate. Cap at three rounds and escalate to the user rather than looping indefinitely.

10 · MCP vs Scripts

Use Python scripts for fixed deterministic steps with no token overhead; use MCP when the model needs judgment or must interact with an external service on the user's behalf.

11 · Design Patterns

Most real-world skills combine two or three of the five patterns: Sequential, Iterative Refinement, Multi-MCP Coordination, Context-Aware Branching, and Domain-Specific Intelligence.

12 · Testing Your Skills

Run three tests in order: a trigger test in a fresh session, a functional test across four to five different inputs, and a value benchmark comparing output with and without the skill.

13 · Claude A/B Iteration

The A/B method where one instance designs and a separate fresh instance tests surfaces description and instruction gaps invisible when building and testing inside the same session.
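
The validate-fix-repeat loop from the Feedback Loops takeaway can be sketched in Python. This is a minimal illustration under stated assumptions, not code from the video: the function names, the toy validator, and the toy fixer are all hypothetical. The only details taken from the source are the cap of three rounds and the rule that persistent failures go back to the user.

```python
from typing import Callable

MAX_ROUNDS = 3  # cap taken from the source: stop after three rounds


def validate_fix_loop(
    draft: str,
    validate: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
) -> tuple[str, list[str]]:
    """Validate, fix, and repeat for at most MAX_ROUNDS rounds.

    Returns the final draft plus any errors that are still unresolved,
    which the caller should report to the user instead of looping on.
    """
    errors = validate(draft)
    rounds = 0
    while errors and rounds < MAX_ROUNDS:
        draft = fix(draft, errors)
        errors = validate(draft)
        rounds += 1
    return draft, errors


# Toy validator and fixer, purely illustrative (hypothetical rules).
def validate(text: str) -> list[str]:
    return [] if text.startswith("Hi") else ["missing greeting"]


def fix(text: str, errors: list[str]) -> str:
    return "Hi, " + text


final, remaining = validate_fix_loop("quick question about your pipeline", validate, fix)
print(final)
print(remaining)
```

An empty list in the second return value means the draft passed; a non-empty list after three rounds is the signal to escalate to the user.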

Glossary

Terms worth knowing.

Progressive Disclosure: A three-tier loading system where only the skill frontmatter loads every session, the full SKILL.md body loads when triggered, and linked files load only when a specific step needs them.
Frontmatter: The YAML header at the top of a SKILL.md file containing name, description, model, context, and allowed-tools. It is the only part loaded into every session.
context: fork: A SKILL.md directive that spawns the skill as an isolated subprocess running a specified cheaper model, separate from the main orchestrating session.
disable-model-invocation: A frontmatter flag that prevents the model from triggering a skill autonomously, requiring explicit human invocation. Used for destructive operations.
Agent Skills Open Standard: A cross-provider specification that makes skills portable across Claude AI, Claude Code, and third-party agents, preventing ecosystem lock-in.
Degrees of Freedom: The level of constraint given to the model within a skill: high (multiple valid approaches), medium (preferred pattern with allowed variation), or low (exact commands, do not modify).
Value Benchmark: The third testing phase: comparing output with the skill against output without it. A skill that does not measurably improve consistency, quality, or speed is worse than no skill.

Resources

Things they pointed at.

28:25 · link · Anthropic skills GitHub repo

30:20 · link · skillsmp.com

35:00 · tool · Gamma API

20:50 · tool · HeyReach

10:30 · tool · Perplexity

10:30 · tool · Airtable

Quotables

Lines you could clip.

11:11

“3,363 lines of Python just returns us 105 lines of output. There is no bloat, just answers to each of the steps in this process.”

Concrete number that makes the efficiency case visceral. Suggested use: TikTok hook.

16:00

“If I handed this to someone, would they know what to do with this?”

Simple heuristic that applies to any workflow documentation. Suggested use: IG reel cold open.

24:50

“A bad skill is worse than no skill.”

Contrarian, quotable, forces the value benchmark mindset. Suggested use: newsletter pull-quote.

18:25

“You can get it to rewrite its own skill while it is learning. Your skills are evolving.”

Unexpected framing: skills as living documents, not static configs. Suggested use: Twitter/X thread opener.

The Script

Word for word.

Claude skills are making waves, but most people have no idea what they actually are. They think it's just another prompting trick, but it's really not. Skills are actually portable workflow packages that teach Claude your processes so you can write them once and run them forever.

In this video, I'll break down the complete Anthropic guide on how to write the best skills, when to use them, and how to test them properly. So let's get into it. So the first and most obvious question is: what are skills?

You can think of skills like an onboarding guide for Claude. Now Claude is really smart already, but it doesn't know your specific workflows. It doesn't know your tools, your standards, your business context, and things like that.

So a skill teaches Claude something once, and then it can use it over and over again for the exact same thing. And that's really powerful because if you can build a really good skill up front, that means that your repeatable process is gonna be good every single time it runs. But we need to dive much further into this because it actually solves a much larger problem.

So if we look over here at what skills are, they are persistent markdown files with instructions. And for those of you who don't know, markdown is just a file type. It's easily readable in plain English, so nothing overly complex here.

These files load on demand, not every single session, which is really important because it's part of why this is such a smart solution. It's portable across Claude AI, Claude Code, APIs, and things like that, and it's part of the Agent Skills open standard. So that's where all the providers are getting together, and they're all gonna be able to use these skills interchangeably so that they aren't locked into each other's ecosystems, which would obviously make things a lot more difficult.

So the TLDR is that if you are doing the exact same process over and over again, you should probably turn that into a skill. But let's dive deeper into that first. So this brings us neatly onto progressive disclosure, which is something that you need to understand because it is the core innovation behind why we're doing all of this.

So it's a three tier loading system that revolutionizes how Claude actually accesses information. So instead of reading everything up front, it loads it in specific pieces. If we look over here, the first thing that it does is load the front matter.

So this is the name and the description of the skill only. We would use this every single time it loads into a session. The token cost here is only around 100, and this is really important because you can think of this like a catalog.

Say you ask Claude to do something for you. It's going to go and search the front matter of your skills to see which one suits your request. We'll get into a practical demonstration of this very soon.

The second part of progressive disclosure is the SKILL.md body, and this is where it loads the full instructions. So after it's decided, hey, this front matter over here seems to match what the user needs, we're then gonna go and load the full body, and it uses this when it matches that request.

And the token cost here is usually hundreds if you've written something quite well. It can obviously be higher if you've written something terrible, but the point is that you can see here, through this progressive nature, we're no longer just loading everything upfront. We first just load the little bit into every session.

Then if it matches something, we load the full SKILL.md body, which has the full instructions of what the skill is gonna do. And then it also has any linked files that come along with this, where it loads any references, scripts, templates, things like that. And these are only loaded again if it needs it as part of a step in this workflow that it's doing.

So the token cost here is zero until it's needed. Alrighty. So now that we understand one of the problems that it solves, the second thing that we need to look at is the skill structure and the organization.
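
The three-tier loading described above can be modeled in a few lines of Python. This is a rough sketch, not how Claude actually implements skill loading: the `Skill` and `Session` classes, the `research-lead` name spelling, and the linked file name are assumptions made for illustration. It only shows the ordering: frontmatter first, body on match, linked files on demand.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str   # tier 1: always present in the session
    body: str          # tier 2: loaded only when the skill is matched
    linked_files: dict[str, str] = field(default_factory=dict)  # tier 3


class Session:
    """Tracks which pieces of each skill have entered the context."""

    def __init__(self, skills: list[Skill]) -> None:
        self.skills = {s.name: s for s in skills}
        # Tier 1: only name + description are loaded up front.
        self.context = [f"{s.name}: {s.description}" for s in skills]

    def trigger(self, name: str) -> None:
        # Tier 2: the full body enters context once the skill is matched.
        self.context.append(self.skills[name].body)

    def read_linked(self, name: str, filename: str) -> None:
        # Tier 3: a linked file enters context only when a step asks for it.
        self.context.append(self.skills[name].linked_files[filename])


skill = Skill(
    name="research-lead",  # hypothetical spelling of the skill name
    description="Transforms a LinkedIn URL into a research package.",
    body="Step 1 ... Step 2 ...",
    linked_files={"references/output-structures.md": "JSON structures ..."},
)
session = Session([skill])
startup_items = len(session.context)   # frontmatter only
session.trigger("research-lead")
after_match = len(session.context)     # frontmatter + body
session.read_linked("research-lead", "references/output-structures.md")
after_step = len(session.context)      # + one linked file
print(startup_items, after_match, after_step)
```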

So every single skill has a directory with the SKILL.md as the entry point. So understanding what goes where is critical because you need to understand how to build this efficiently. So if we take a look at an example from my environment, it lives in your skills/ folder and then whatever the skill name is.

In this case, we are looking at something called research lead, which is actually one of my workflows. So the main file over here is my SKILL.md, and this is the SOP. You always wanna try and keep this under 500 lines just for efficiency.

The second thing that we have here are any scripts that you want. So as part of my workflow, I have some scripts that run such as scrape_linkedin.py, research with Perplexity, analyze with OpenAI, whatever. I've got some scripts as part of this research lead workflow here.

And the important thing to note here is that these are executed. They are not loaded into the context window. So again, you can see we have that separation over here, which doesn't bloat the context window of the AI.

That is the whole purpose of having a structure like this because we want the probabilistic nature of AI, but we want the deterministic function of tools, which is scripts in this case. More importantly, we don't want to waste context on whatever it is that we're doing because then it leads to those problems that I mentioned earlier.

So running these as scripts helps solve that problem because these are executed locally by a Python script, which is just a programming language, and the results are then shot back up into Claude when it needs them. So instead of 500 lines of trash, we now just have a 20-word result, whatever it is that these scripts do.
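
To make "executed, not loaded" concrete, here is a small Python sketch. It is an illustration under assumptions rather than the author's setup: the inline `SCRIPT_SOURCE` stands in for a long script such as scrape_linkedin.py, and `run_script` is a hypothetical helper. The point is that only the short stdout result would need to reach the model, not the script source.

```python
import subprocess
import sys

# Stand-in for a long local script (hypothetical body): it does its work
# in a child process and prints only a short result.
SCRIPT_SOURCE = """
rows = [n * n for n in range(500)]   # imagine hundreds of lines of work here
print(f"processed {len(rows)} rows")
"""


def run_script(source: str) -> str:
    """Execute the script in a child process and return only its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


result = run_script(SCRIPT_SOURCE)
print(result)
```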

Then as part of this, you also have references, and you can think of this as a good place to stash your examples. So again, if we take my research lead as an example here, part of that workflow, I would want to go out and DM a bunch of people. So in the references section is where I would put in a few examples of what a good DM looks like.

So as part of the skill, it knows to check the references folder and then say, hey, this is what good looks like. I'm gonna match this exactly. Because remember, AI is a pattern matching machine.

So giving it examples is a really amazing thing to do. And then you can also have assets and these can be templates or other assets that the AI might need for the skill in order to achieve its goal. And again, the important thing to remember here is that everything is loaded only when it's needed, so we protect that precious context window.
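
The directory layout described above (SKILL.md plus scripts, references, and assets) can be scaffolded with a short Python helper. This is a sketch under assumptions: `scaffold_skill` is a hypothetical function, the skill name spelling is illustrative, and the placeholder frontmatter is deliberately incomplete.

```python
from pathlib import Path
import tempfile


def scaffold_skill(root: Path, name: str) -> Path:
    """Create an empty skill directory with the four parts described above."""
    skill_dir = root / name
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    # SKILL.md is the entry point; this placeholder text is illustrative only.
    (skill_dir / "SKILL.md").write_text(f"---\nname: {name}\n---\n", encoding="utf-8")
    return skill_dir


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(Path(tmp), "research-lead")
    layout = sorted(p.name for p in created.iterdir())
    print(layout)
```

Not every skill needs all three subfolders; empty ones can simply be left out.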

Then we need to take a look at writing the perfect description. Now remember, the description is what goes in the front matter that gets loaded into every single session. The reason this is so important, like I mentioned earlier, is that if Claude cannot see the skill, it's never gonna know how to use it or fire it, in which case it won't get used.

So one of the first things you need to understand is use a max of 1,024 characters. You wanna keep it concise but comprehensive enough for accurate matching.

Very important. Third person only. So "Processes emails", not "I process emails".

It's the wrong POV, and that will totally break the discovery of the skill that you're writing. You need to make sure that you answer the what and the when. So what does it do and when should Claude use it?

And then finally on this, just include three to five trigger phrases. So specific words or phrases that should activate the skill. So let's take a look at some bad examples over here.

"Helps with projects." That's far too vague, and with AI, you wanna be as specific as possible. The way I like to think about it is if you are speaking to another human, if you are giving them this little minimal instruction here, would they understand what it is that you're talking about?

If not, you probably need to revisit this thing. If you don't know how to write this, it means you don't understand the problem well enough to describe it, in which case you need to go and address that first. But there are other types of bad examples.

So "I can help you process these emails", and you see here we've written it in the wrong POV, so it breaks the discovery because it's not in third person. And then finally, "Implements sophisticated data pipelines", and that's just consultant word salad. There's absolutely no reason to use overly complex language if it doesn't need it.

Keep it simple. Always. As simple as possible, as specific as possible so that anyone or anything that reads it would know exactly what it's for.

Let's see what good examples look like. Manages Linear sprint planning, including task creation and status tracking. Used when the user mentions sprint, backlog, or tickets.

Pretty straightforward. Processes Gmail inbox to identify high risk emails and deliver an executive briefing to Slack. Run with forward slash email digest.

So you can see these are really good examples because they tell the system exactly what to do or when to use something. So it can pick that up as part of the front matter that we mentioned earlier. Because you have to remember, Claude is using semantic matching here, meaning that it's matching based on meaning.

It's not using regex or something like that. So "create ticket" matches "log some tasks", but exact trigger words still give it the highest hit rate. So always keep that in mind.
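
The description rules above (1,024 characters max, third person, what and when, three to five trigger phrases) can be turned into a rough lint check. The Python below is a heuristic sketch, not an Anthropic tool: `check_description`, the first-person opener list, and the "use when" cue list are all assumptions, and real discovery is semantic rather than keyword-based.

```python
MAX_DESCRIPTION_CHARS = 1024
FIRST_PERSON_OPENERS = ("i ", "i'm ", "i'll ", "we ", "my ")  # assumed heuristic
WHEN_CUES = ("use when", "used when", "run with")             # assumed heuristic


def check_description(description: str, trigger_phrases: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    text = description.strip()
    lowered = text.lower()
    if len(text) > MAX_DESCRIPTION_CHARS:
        problems.append("longer than 1024 characters")
    if lowered.startswith(FIRST_PERSON_OPENERS):
        problems.append("not written in third person")
    # Crude proxy for "answers WHEN": look for an explicit usage cue.
    if not any(cue in lowered for cue in WHEN_CUES):
        problems.append("does not say when to use the skill")
    found = [p for p in trigger_phrases if p.lower() in lowered]
    if len(found) < 3:
        problems.append("fewer than three trigger phrases present")
    return problems


good = (
    "Manages Linear sprint planning, including task creation and status "
    "tracking. Used when the user mentions sprint, backlog, or tickets."
)
bad = "I can help you process these emails"

good_problems = check_description(good, ["sprint", "backlog", "tickets"])
bad_problems = check_description(bad, ["email", "inbox", "digest"])
print(good_problems)
print(bad_problems)
```

A check like this only catches mechanical mistakes; whether the skill actually fires still has to be confirmed with a trigger test in a fresh session.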

Why don't we take a closer look at what SKILL.md looks like in the real world? So we have our structure over here. We have .claude and then /skills, and in this case, research lead.

That's one of my skills. So our SKILL.md file, you can think of it like the SOP, and these are the step-by-step instructions that Claude is gonna follow in order to achieve its goal. So if we come on over here to my research lead skill and click on the SKILL.md, we can see some of the things we've been speaking about already.

So up at the top over here, we have our front matter, the name. We have the description, transform a LinkedIn URL into a complete research package with personalized outreach, and a little bit more info there.

Then underneath that, we have the model, and this is something that's really important to understand. Because what you can do with this instead of getting Opus down here to run everything in your environment or always use Opus for your skills, you can specify the model. So I could start the process by chatting in my little box at the bottom here, but then it can spawn it off as a subprocess with Sonnet so you save a lot of money because not all of your tasks need Opus.

So that's something important to keep in mind. Then we also have allowed tools, and this is an important concept as well because you don't need to load everything into every single skill that you have running. You probably have a ton of MCP servers.

You might have some scripts, stuff like that. The point is that locking this down to only use the thing that it needs is again far more efficient. In this case, for researching a lead, we have a bunch of scripts that live in this folder over here, and it uses these scripts in order to achieve our goal.

So first, let's have a look a little bit more at what a goal looks like. Now one of the things that people get wrong is that they think skills just replace prompts, and that's kind of half true, but also not really because a skill is more like an instruction manual or SOP as I've mentioned earlier.

And Anthropic says that you can write these in two ways. One, you can write it imperatively, which is where we're telling this thing what to do step by step.

The other way is to take more of a prompt structured approach like we used to do. Both of those will work, but this one is the preferred method. And then they have a caveat that they say at the end of all of that, and it pretty much just reads that as long as the skill does exactly what you want it to do every single time, that's what you should be aiming for.

So the structure is important, especially this front matter part, but what you do after that is mostly up to you to choose how it gets you to where you wanna go. For me, the goal format always makes the most sense because I'm coming off of a framework that I made myself as part of some of my other videos you can check out on the channel now if you haven't seen them already.

Point is that to me, having it laid out in this way, it's not only clear for me, but it's also very clear for the AI to understand exactly what we're trying to achieve with the skill. We're just trying to achieve a goal here. In this case, transform a LinkedIn URL into a complete research package with personalized outreach content constrained to relevant personalization only.

So relevant relates to a problem they're likely facing that we can solve, and I just give a little bit more information. And then over here, we have the inputs required. So I just give the AI a little bit of knowledge about what inputs might be required for this, and the orchestrator will then run through everything.

So I list the execution steps as part of this pipeline. Step one, scrape LinkedIn profile. Step two, research company and person via Perplexity.

Step three, run AI analysis, so on and so forth. So it's literally a list of how we achieve this goal, and you can think of it like nodes in an n8n workflow, to be honest. It's pretty much the same thing.

But as part of these steps, I'm also giving it very specific instructions on what tools to use, and they relate to these over here. So my scripts are just these tools.

And these will then go out and run locally, which is a very important distinction because we don't need to bloat the context. We don't need AI to do any of this work when we can use programming language to do it for free, and then just deliver us the results.

So if we take a look at this, these scripts, this is just the execution layer. All of this gets executed locally, and it's not loaded, which is really important.

So 3,363 lines of Python just returns us 105 lines of output with context that the AI actually needs.

There is no bloat, just answers to each of the steps in this process in order to get its goal achieved. So that's a 97% reduction, which is really handy.
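
The 97% figure follows directly from the two line counts quoted above. A quick check in Python, with the caveat that this compares line counts, which is only a proxy for tokens:

```python
script_lines = 3363   # lines of Python executed locally
output_lines = 105    # lines returned into the context window

reduction = 1 - output_lines / script_lines
print(f"{reduction:.1%}")  # prints 96.9%, which rounds to the quoted 97%
```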

And again, this is just part of separating concerns. There are tons of frameworks behind this already. I had one, and I'm busy turning it into a version two along with using this, because this is way more efficient with some of the other tips and tricks that I've been using in my other videos.

But then we also have references and if we come over here, we can see one of the references I've got in here are output structures. So it's just JSON structures for each analysis type in the research lead pipeline. So this just gives the AI a little bit of structure around how I want things done.

But then equally as part of this, I could also have the definition of what a good direct message looks like. Stuff like that. I would put it in there so that this can just go and match patterns when it needs to to deliver the same thing repeatedly and accurately every time.

Then finally, we have assets and these are just templates or images and config files that might be part of the system. In this case, I don't really have any as part of my research lead system yet, but that's because it achieves it with everything else that I've already got in it. The point is though that not everything needs all of these.

You don't need to fill them out. It's interchangeable. You might not use any scripts if there is MCP available and MCP makes sense to use.

It would then use that as a tool. I use scripts because for this specific workflow, it makes complete sense to do everything locally. I don't need to use any tokens or anything like that for this specific workflow.

But now that you have an understanding of what the structure looks like, we need to look at a few other things behind skills. Next up, we need to look at invocation control, and this is important to understand because it basically decides who should be able to trigger this specific skill. So first up, we have default behavior.

So you can invoke it? Yes. Claude can invoke it?

Yes. You use this for most of your skills. Let Claude find them and use them easily.

That's the default behavior and for the most part, most people are gonna wanna be using that kind of thing. Next up, we have disable model invocation equals true, and this means that you can invoke it, but Claude cannot invoke it. You'd wanna use this for destructive operations or something where you want a human in the loop.

Because remember, you don't just want the system going out there and YOLO-ing every single thing, especially if it needs your approval to get in front of other people, send something to someone, or deal with sensitive information. Those are a few use cases. But then there's also user invocable equals false, and that means that you can't do anything, but Claude can do it.

There are a few use cases when you'd want to have this. Me, personally, I would never have that, but mostly useful for background knowledge that Claude should know, but the users can't directly trigger. So if we have a look at what that might look like in our front matter, we would have name, deploy, description, deploy the application to production, disable model invocation, true.

And in that sense, Claude wouldn't be able to do a single thing because you don't want Claude just YOLO-ing things into production, but the user can say yes. So as part of a workflow, that's probably something I would want to set especially if I have things that are starting to travel into production. We need that human review before we go in there.
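
The three invocation modes described above reduce to two frontmatter flags. The Python sketch below models that matrix; `who_can_invoke` is a hypothetical helper, the frontmatter is represented as a plain dict rather than parsed YAML, and the `house-style` skill name is invented for the background-knowledge case.

```python
def who_can_invoke(frontmatter: dict) -> dict:
    """Map the two frontmatter flags to who may trigger the skill."""
    return {
        # The user is blocked only when user-invocable is explicitly false.
        "user": frontmatter.get("user-invocable", True),
        # Claude is blocked only when disable-model-invocation is true.
        "claude": not frontmatter.get("disable-model-invocation", False),
    }


default_skill = {"name": "research-lead"}
deploy_skill = {
    "name": "deploy",
    "description": "Deploy the application to production",
    "disable-model-invocation": True,
}
background_skill = {"name": "house-style", "user-invocable": False}  # hypothetical

default_access = who_can_invoke(default_skill)
deploy_access = who_can_invoke(deploy_skill)
background_access = who_can_invoke(background_skill)
print(default_access)
print(deploy_access)
print(background_access)
```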

And then I also wanna look at model routing and sub agents. If you remember earlier when I touched on the skills, I told you about being able to run Sonnet. So for my research lead, we kick it off with whatever I'm talking to in that chat box.

Usually, that's Opus. Then it will spawn our Sonnet model, which is much cheaper, to run a specific workflow. The context here is set to fork, which is an isolated subprocess, and then we just give it our allowed tools.

So we have the user asking, it reads the front matter, it matches the skill, it then spawns Sonnet, and it executes and returns whatever I need from my research lead process. Pretty straightforward and obviously it's up to you to decide or work with the AI in order to decide which model makes sense to use for what. You can use Haiku for the things that are far less complex.

Maybe you wanna keep using Opus for the things that need a lot of judgment and a lot of reasoning behind them. Point is you can toggle with all of those and you can play with them as part of your workflows. Then I just wanna touch on the SKILL.md body again.
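
For reference, the routing fields mentioned above (model, context, allowed tools) sit together in the frontmatter. The Python sketch below renders such a block as text. It is illustrative only: `render_frontmatter` is a hypothetical helper, the skill name spelling is assumed, and the `allowed-tools` value is an assumed example, since the video does not list the exact tools.

```python
def render_frontmatter(fields: dict) -> str:
    """Render a flat dict as a YAML-style frontmatter block."""
    lines = ["---"]
    for key, value in fields.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


frontmatter = render_frontmatter({
    "name": "research-lead",  # hypothetical spelling of the skill name
    "description": (
        "Transforms a LinkedIn URL into a complete research package "
        "with personalized outreach."
    ),
    "model": "sonnet",        # execution runs on the cheaper model
    "context": "fork",        # isolated subprocess, per the source
    "allowed-tools": "Bash, Read",  # assumed tool list, not from the video
})
print(frontmatter)
```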

So we had a look at how I structure mine already using the imperative form as an SOP, but you need to understand what actually makes this good. So exact commands with full paths, expected output format for each step, dependencies listed explicitly, things like that.

Claude knows exactly what to run and what to expect back. It's really helpful when we give it examples or when we're being very specific because it needs no room for ambiguity. So like I've got over here, step one, do this.

Use this thing in order to achieve that. This is what the output would look like. Step two, exact same thing over and over again for the specific workflow that I'm doing here.

But what makes an instruction bad is if you say something like research the lead and find out about their company, then write some DMs. That is insanely vague and you could get a hundred different results based on that. So doing something like this is not gonna get you where you wanna go, especially as a repeated process.

This wouldn't work if you gave it to a human. If you gave these instructions to a human, they would go and do something, but it's probably not gonna be the thing that you actually wanted them to do. So again, when you are writing these things or getting AI to write them for you, be very specific and think, if I handed this to someone, would they know what to do with this?

If we look at this over here, if I handed this thing to someone, they would understand exactly what is happening here and exactly what I would want them to do. So that's the way that you wanna look at writing your skill bodies. Then we have degrees of freedom, and this is essentially just telling Claude the level of freedom that it has.

So from loose guidance to the exact commands that we have. So high freedom means that multiple approaches are valid. Context determines the best one.

An example of this would be review the code, check for bugs, edge cases, readability, things like that. So it has high freedom to do that. For medium freedom, that's where a preferred pattern exists, but some variation is okay.

So an example of this would be generate a report using this template, customize sections as needed. And then low freedom, these are operations that are fragile where consistency is critical. So a specific sequence must be followed every single time in the exact same way.

And that for me is where I mostly use my Python scripts. So I say to the AI as part of the skill body, run python3 scripts/migrate.py, whatever. Do not modify the command.

I'm giving it very specific instructions about what to run here. It doesn't really have a lot of freedom or choice in the matter of how it does this task because I'm being very specific. And that's really important for business logic and building systems.

You don't want to give this thing high freedom if it needs to achieve a very specific command. That makes absolutely no sense. So you need to factor that into your skills as well.

How much freedom do you wanna give this thing as part of what it's trying to do. One thing I like to do is to also have feedback loops. So generate, validate, fix, repeat.

AI does that automatically, but you can bake these into your skills and you can give them an extra nudge just to put any of your own custom rules in there as part of this generate, validate, fix iteration loop that it runs through. So if we look at an example from my research lead, we have validate the output. So we run python3 scripts/validate.py output.json.

If validation fails, read the error message carefully, fix the specific issues, and run validation again. Only proceed when validation returns valid equals true. Maximum three rounds; if still failing, report to the user.
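The loop just described can be sketched in a few lines. This is an illustration under assumptions, not the author's validate.py: the required fields, the `validate` and `fix_one` stand-ins, and the result dictionary are invented for the example. In the real skill, Claude does the fixing and the script only validates.

```python
from typing import Callable

REQUIRED_FIELDS = ("name", "company", "dm")  # hypothetical schema


def validate(lead: dict) -> list[str]:
    """Return one error message per missing field (empty list means valid)."""
    return [f"missing field: {key}" for key in REQUIRED_FIELDS if not lead.get(key)]


def fix_one(lead: dict, errors: list[str]) -> dict:
    """Stand-in for the model's fix step: repair only the first reported issue."""
    patched = dict(lead)
    field = errors[0].removeprefix("missing field: ")
    patched[field] = "TODO"
    return patched


def validate_fix_loop(output: dict,
                      validate: Callable[[dict], list[str]],
                      fix: Callable[[dict, list[str]], dict],
                      max_rounds: int = 3) -> dict:
    """Validate, fix, and re-validate, giving up after max_rounds fix rounds."""
    errors = validate(output)
    rounds = 0
    while errors and rounds < max_rounds:
        output = fix(output, errors)
        errors = validate(output)
        rounds += 1
    # valid is False here only if errors remain after the final round,
    # which is the "report to the user" case.
    return {"valid": not errors, "rounds": rounds, "errors": errors, "output": output}


result = validate_fix_loop({"name": "Ada"}, validate, fix_one)
print(result["valid"], result["rounds"])
```

Capping the rounds matters: without the limit, a validator the fixer can never satisfy would loop forever instead of escalating.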

So, again, you can customize this to whatever the hell it is that you want, but feedback loops are very important because it learns every single time that it runs through this iteration, and you can also get it to update some of your other SOPs or your CLAUDE.md for anything specific that it might learn.

The point here is that your skills are evolving. The goals mostly stay the same, but when you're testing and you're iterating through your initial builds, that is when you are learning and evolving the system, so you can get it to rewrite its own skill while it's learning. The point here is that when we go through this little loop over here of generate output, validate with script, fix issues, repeat until valid, we're trying to get to a very good definition of done, something that matches the quality that we need for the system we are building.

So then next up, we obviously need to talk about MCP because that forms part of it. In your environment, there are generally gonna be two types of tools. One of them being MCP, which then has a whole world of tools.

But then we also already mentioned scripts, which in this case are also tools. Point is that you don't want to use MCP for every single thing out there just because it exists. If we need to do something with Supabase or Vercel or something like that, Notion, whatever, there are tons of times where using MCP makes sense.

But there are also times where using it is just gonna add latency or context bloat and things like that into your system, and that's where using something like a standard Python script makes much more sense. So I would say use scripts specifically if you have a fixed pipeline, the same steps every single time, something repeatable that you need to do over and over again in the exact same way that doesn't require any form of AI judgment.

You're gonna wanna use a script for that because then there is absolutely no token overhead like I mentioned earlier. It's literally just the script running locally, doing its thing, whatever it needs to do. And then I would use MCP when I want some judgment behind it or if the AI needs to interact with something specifically on my behalf, that's when you would wanna use it.

When you have those external services like Vercel and Supabase, when it needs to go out into the wild and reach into some of these fancy tools and do work on your behalf, MCP makes sense for that. For me, I'm always gonna default to using scripts for deterministic stuff, and I will get MCP involved when I need it to do work on my behalf with those external systems.

Then we need to take a look at a few design patterns that Anthropic put forward. So the first one is sequential, and this is where steps run in a strict order. This is like that n8n node flow that you can think of.

And that's what I've done with my research lead because everything can be done locally. I don't really need MCP or anything like that. It's mostly just using Python scripts here along with a couple of APIs into other systems.

But the point is everything runs sequentially. So we research the lead, we scrape a profile, we research that, we analyze everything that we've done, and then we write the DMs, we store them, and then send them using HeyReach. So that is a very sequential thing, and most business operations function that way.
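Here is a minimal sketch of the sequential pattern, where each step consumes the previous step's output in a fixed order. The three step functions are placeholders for the real scripts and API calls, and `run_sequential` is a hypothetical helper written for this example.

```python
from typing import Callable

Step = Callable[[dict], dict]


def scrape_profile(state: dict) -> dict:
    # Placeholder for the real profile scrape.
    return {**state, "profile": f"profile of {state['lead']}"}


def analyze(state: dict) -> dict:
    # Placeholder for the analysis step; depends on the scraped profile.
    return {**state, "angle": f"angle based on {state['profile']}"}


def write_dm(state: dict) -> dict:
    # Placeholder for DM writing; depends on the analysis.
    return {**state, "dm": f"Hi {state['lead']}, re: {state['angle']}"}


PIPELINE: list[tuple[str, Step]] = [
    ("scrape_profile", scrape_profile),
    ("analyze", analyze),
    ("write_dm", write_dm),
]


def run_sequential(steps: list[tuple[str, Step]], state: dict) -> tuple[dict, list[str]]:
    """Run every step in the listed order and record which ones completed."""
    completed: list[str] = []
    for name, step in steps:
        state = step(state)
        completed.append(name)
    return state, completed


final_state, completed = run_sequential(PIPELINE, {"lead": "Ada"})
print(completed)
print(final_state["dm"])
```

Because each step reads keys written by the one before it, reordering the list would raise a `KeyError`, which is the strict ordering the pattern relies on.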

We want the steps to run in this specific order. Then the other one is iterative refinement, where the output quality needs validate, fix, repeat loops. So an example here would be for building a website.

So generate HTML, validate, fix, revalidate until we get to whatever our definition of done looks like. The third one is multi MCP coordination, and that's where the workflow spans multiple external services.

So an example here, if we were building an app, we might use Supabase for the database. We would use Vercel for deploying it. We might use Slack to notify a bunch of people.

We might use Firecrawl to do web scraping as part of that. Whole bunch of different things, but they're all running via MCP and the AI is coordinating all of that stuff for you. If you haven't seen my ChatGPT versus Claude video, it's on the screen now.

And I use this exact sort of thing in there. We use MCP for everything. Then the fourth one is context aware branching.

And this is where you have the same input with different execution paths. So an example of this would be to process a file. If it's a CSV, take this data path.

If it's a PDF, go down that path. Very straightforward. It's just context aware in the way that the process will run based on its initial point.
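The file-type example can be sketched as a dispatch table keyed on the extension. The two handler functions and their return strings are placeholders invented for this illustration.

```python
from pathlib import Path


def process_csv(path: Path) -> str:
    # Placeholder for the CSV data path.
    return f"csv pipeline: {path.name}"


def process_pdf(path: Path) -> str:
    # Placeholder for the PDF data path.
    return f"pdf pipeline: {path.name}"


# Same entry point, different execution path per file type.
HANDLERS = {".csv": process_csv, ".pdf": process_pdf}


def process_file(filename: str) -> str:
    path = Path(filename)
    handler = HANDLERS.get(path.suffix.lower())
    if handler is None:
        raise ValueError(f"no execution path for {filename!r}")
    return handler(path)


print(process_file("leads.csv"))
print(process_file("contract.PDF"))
```

Failing loudly on an unknown extension keeps the skill from silently guessing a path for input it was never designed for.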

Then pattern five, we have domain specific intelligence. And that's where we have embedded business rules or compliance, audit trails, things like that. So an example as part of my research lead workflow that I've got, I have a relevance filter and it discards anything that looks like theater.

So if it finds someone's LinkedIn profile that talks a bunch of stuff that isn't really relevant, it filters all of that out. It only focuses on keeping things that will help write its DM effectively because I've baked in some of my own domain specific intelligence when it comes to sales and AI and things like that.

So I've told the AI what to look for and it is now intelligent in the way that it does that. So these five design patterns, they're probably the most common things. And realistically, for most of the work that you're gonna be doing for clients, it could include many of these different things, if not all of them, as part of their system.

Just keep in mind something on MCP again. Only use it when you absolutely need to use it for your client systems, because even though they kinda solved the token bloat problem with lazy loading, it still exists to some extent, especially if you're building a massive system.

But then, obviously, we need to test the skills that we're writing. We can't just go out there and do a bunch of stuff and expect it to work without running any tests. They need to be performed in this order because each of them reveals a different aspect of the quality or their effectiveness behind them.

So test one is the trigger test. You open a brand new session, not an existing one because there's context bias.

Obviously, if you've been building the skill with the agent inside there and you had to ask it a question, it already has some context awareness to it. So the best thing to do is to use natural language inside a new window. So if I flip back here, say I was using this one to build me my initial skill, I would just come over here, open up a new window and I would say, hey, let's go and run lead research.

Now as part of that, we're testing whether this thing can semantically match what I've written in my front matter, which is that little sentence up top that Claude uses to identify a skill. And we're doing that in a new window because this has no context awareness of anything that I've spoken about with regards to research lead up until now.

Whereas if I did it in the window in which I built that skill, it obviously has all the context as part of that conversation up to the point where it might have cut something off. But the point being that context is still gonna be in there, which is why we want to use a new window to see firstly if it can actually trigger the workflow which validates our front matter is working as it should.

So if we have under triggering, that means that the skill never fires. So we need to broaden the description in our front matter describing what this thing is or does. But then there's also over triggering where it fires for everything, and you need to narrow the description.
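One way to keep track of trigger tests is to log, for each fresh-session prompt, whether the skill should have fired and whether it did. The sketch below is a hypothetical bookkeeping helper, not part of any Anthropic tooling; the advice strings simply restate the rule above.

```python
def diagnose_triggering(results: list[tuple[bool, bool]]) -> list[str]:
    """Each result is (should_trigger, did_trigger) for one fresh-session prompt."""
    missed = sum(1 for should, did in results if should and not did)
    false_fires = sum(1 for should, did in results if not should and did)
    advice = []
    if missed:
        advice.append("under-triggering: broaden the description")
    if false_fires:
        advice.append("over-triggering: narrow the description")
    return advice or ["description looks balanced"]


# Illustrative log: one relevant prompt was missed, one unrelated prompt was ignored.
print(diagnose_triggering([(True, False), (False, False)]))
```

Including a few prompts that should not trigger the skill is what makes over-triggering visible at all.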

So, again, it's about finding the right balance when you're writing your front matter. Test two is the functional test, and that's pretty straightforward. We're just running the skill four to five times with different inputs.

Does behavior stay consistent when we're doing that? Try it with sub agents as well. That's very important to do.

Again, if we're not just using our main Opus agent and we switch the model to Sonnet, get it to do that and see how it responds. And then the third part of testing is the value benchmark. The hardest question here is, is this skill actually helping?

So compare Claude's output with the skill versus without it. If the skill doesn't improve the consistency, the quality, or the speed of whatever you're doing, it probably doesn't need to be a skill. Again, it should be something that is repeatable, but it needs to be adding that value to you.
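A value benchmark can be as simple as scoring a handful of runs with and without the skill and comparing them. The sketch below assumes you rate each output yourself on some numeric scale; the scores and the `benchmark` helper are made up for illustration.

```python
from statistics import mean, pstdev


def benchmark(with_skill: list[float], without_skill: list[float]) -> dict:
    """Compare average quality and run-to-run spread for the two conditions."""
    quality_gain = mean(with_skill) - mean(without_skill)
    more_consistent = pstdev(with_skill) < pstdev(without_skill)
    return {
        "quality_gain": quality_gain,
        "more_consistent": more_consistent,
        # Keep the skill only if it helps on at least one axis.
        "keep_skill": quality_gain > 0 or more_consistent,
    }


# Hypothetical 1-10 ratings from three runs each.
report = benchmark(with_skill=[8, 8, 9], without_skill=[5, 8, 6])
print(report)
```

Speed could be added as a third axis in the same way by timing each run.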

So if it's not making your life faster or more efficient, does it really need to be a skill or do you need to readdress whatever it is that you're looking at? And then finally on this, we need to look at what Anthropic recommends for iterating with Claude. This is part of their A/B building and testing. So they recommend that you use two instances working in tandem, and it makes complete sense, as you'll soon find out.

So Claude A is the designer. It helps you write and refine the SKILL.md. It understands the agent instructions, and it works with your domain expertise to craft effective skills.

Claude B is the tester. So you use the skill in a fresh session, like we just discussed, on real tasks, and this reveals where the instructions might be too vague or incomplete and things like that. So the workflow for this might be: complete a task with Claude A.

Use normal prompting to accomplish a workflow. Notice repeated context. So you identify what instructions you provide repeatedly, and that's probably gonna be your skill.

If you have a workflow that you do over and over again, that can also be a skill. Things like that. Then for step three, just ask Claude A to create a skill.

So again, if we're using my research lead analogy over here, this was something that I obviously did over and over again to find leads. So initially, it started as an n8n workflow, then I moved it into an agentic format using my own gotcha framework.

And now I'm bringing it into skills because as you've seen, they're far more efficient, specifically for the context loading. But the point is that I went through my process step by step. How do I research leads?

How do I go out there and find them? And then I've built an automation around that. And in that same sense, we're now just doing that in a skill format.

So I would come in here and I would say, this is my typical working model. I go into this system. I take this action.

I do a bunch of these steps. And I would talk to Claude about it, describing my entire workflow.

And then at the end, just say, hey. Let's turn this into a skill. And that's what we're doing over here.

If you had walked through something with it or you had been working together, you can just turn that into a skill as well. Tons of ways that you can approach this. Then obviously, we need to review it.

We wanna make sure there's no bloat. We wanna make sure, as a first part of the process, that it is actually functioning and running at a basic level. And we do that by running tests with Claude B.

So fresh session, similar tasks, observe the behavior, see what it does. Then step six is to bring failures back to Claude A.

For example, Claude B forgot to filter test accounts. So we're feeding Claude A more information about where it went wrong when it initially built our skill. You just go back and forth like that until you get the system built.

It's refining with two agents, and it's probably the most efficient thing you can do right now, especially if you already have the context that you can feed agent A upfront. And at this point, I should probably give you some pointers on where you can get the skill creator and embed it in your own environments so that you can make this entire process a lot easier, because Anthropic already thought about this for you.

Alrighty. So let's have a very quick overview of how all of these pieces fit together from a practical perspective and how we can use Claude to actually help us arrange and organize all of these things. Now for the most part, I'm gonna be using an IDE, which is just this interface you see here.

In order to set all of this up, you can do it in the CLI, and again, this is an open framework, so it will probably work with most of the applications that you're using nowadays as well. So I'm using VS Code as my IDE, but again, any IDE will do just fine. So one thing that you could do to get everything set up for you is literally just come down here to Claude, ask it to research skills, and then get it to set up your environment according to what Anthropic says is best practice.

That's one way, and then this thing would just go ahead and do it. I'm not gonna take that approach because I use this system along with my CLAUDE.md that I'm busy migrating to a version two to attach with my framework. So I use this a little bit differently in that sense because I've got other moving parts that form part of my skills.

But for the most part, if you wanted a shortcut, just ask Claude to set this up for you. If you wanted to do it manually, everything lives in this .claude folder over here. This is currently at a project level.

And you see we've got our SKILL.md, which is our skill creator, its scripts, its references. So if you wanna get this meta skill creator that creates all of the skills for you, there are multiple ways you can do that.

One, you could come on over to your web browser and you can have a look at this main GitHub repo over here. This is from Anthropic and you can connect this directly to the IDE that we're currently working in or you could just reference this and again give this to your agent down here and say, hey, can you pull out the skill creator from here and add it as a skill to our environment?

And it would do that. But probably one of the easiest things for you to do is to come down here, do /plugins, go over to the marketplace tab and that repo link, just copy and paste it into here.

And you'll see you'll then have it added down here. In which case under plugins, your skills become browsable from whichever repo you paste in there. So you can also search for them if you don't feel like scrolling through this entire thing.

But if you scroll down, you will see a few skills in here including the skill creator. So this is the official one from Anthropic. And then once that's installed, it will be installed over here and you'll be able to use it and just come to this thing and say, hey, build me a skill based on this workflow whatever it is you're working on.

And it would then go ahead and build it based on the best practices outlined in this SKILL.md over here. So this is the meta skill creator. Very handy.

But then there's already tons and tons of existing skills out there from people who have done these workloads for you already over and over again and they're already tested. We can find those in the marketplace over here. So if you check out skillsmp.com, you'll be able to find an insane amount of agent skills.

So you can always come down here and try and find whatever it is that you're looking for. They give you a few examples. So skills about trading, data analysis related skills.

Let's have a look for sales and marketing. See what they've got for that. So you can see here there are a bunch of skills for this already.

And then all we would have to do is click here and this is a little markdown structure that you can see. So again, there's our name, there's the front matter description that we spoke about in the video, lead research assistant, this skill helps you identify and qualify potential leads for a business by analyzing product and service.

So there are tons of things you could do from here. You could just copy and paste this back into your chat window. Say, hey, turn this into a skill and it will go and do it for you.

Or there are other options on the right hand side here depending on what you're using. You can see it's got wget, which means it would just download it. You can also use npx, which is pretty handy if you didn't want to rebuild this thing yourself.

Point is, this is where you come to get existing skills. There are a ton of other places where you can get skills as well. But for me, I prefer to build my own because like I said, I work in a very specific way.

So now that I've got the very basic stuff out of the way, I'm gonna jump into my environment and I'm gonna show you how I would build skills and how it works within my environment. Okay. And we're in my environment now.

So let's pretend that you had a slide generator and you wanted to turn it into a repeatable workflow because you've nailed exactly how you want your slides done. Now, I've already got something like that in here.

I've got under my goals folder, I've got Gamma slides. You don't need to understand what goals is. It's just my framework that I was using historically before I moved across to skills.

So I'm just using this because it already, as you can see, is a repeatable workflow. I was already doing the exact same thing. Just inefficient because now that skills are so granular, you can actually do this type of thing a lot more efficiently, which is why I'm moving my framework across to half skills and then half of some of the other stuff that I've been doing in my videos if you've been watching me for a while now.

So anyway, what we're gonna do is we're going to get our agents to turn these gamma slides into a skill. Shouldn't be that hard because we've already got the format nailed over here. We were already doing that kind of thing.

But for the most part, it's going to use our skill creator to put it into the right format. So all I'm gonna do is I'm gonna come over here and I'm gonna tell this thing to do it.

I want to create a new skill based on one of our goals for Gamma slides. So can you take Gamma slides and turn them into the new SKILL.md format?

Place it in the gotcha v two folder. So this thing's now going to try and figure out what it is that I just said to it. It's probably gonna read our goals folder.

In your case, if you were writing a new skill, you might not already have something as nailed down as this. But again, you would understand the workflow that you've been building or the idea that you want to have. And then you would start to articulate that to the AI and work back and forth with it in order to build some kind of repeatable workflow or an application, a system, whatever it is that you're building.

And ultimately, you would get to a point where you would have something structured like a goal like this. I'm just gonna bypass permissions. Specifically, what I do with my framework is my CLAUDE.md is tailored all around this.

So it understands exactly how to build these goals and how to separate them with tools and things like that. I don't wanna go into that in this video.

I have deep dives in other videos on my channel. You can check one on the screen now. Point is though, I'm gonna be keeping that exact same structure just migrating it across to skills and I'll put on a new video and deep dive into how all of that will go together.

So I'm not gonna go deep into that in this video. Okay. So this thing is allegedly done.

Here's what I created, the Gamma Slide skill. So if we go on over to Gotcha v two, now yours would probably be living up here. Mine is just like this because I'm using this as a test environment while I migrate frameworks.

But if we come down here, you can see it used the skill creator to create our Gamma slides. So if we have a look at SKILL.md, the name is Gamma slides.

The description is to generate presentation slide decks from markdown content using the Gamma API. Used when the user says create slides, make presentation, build a deck, generate a slide deck, so on and so forth. It's given it the Sonnet model to do that and it wants to run it as a subprocess, which is very cool.

And then it's limited its tools as well because we only need one tool in order to create that and that is probably gonna be a script that connects to the Gamma API itself and then pushes in some of the parameters that we want. So we can have a look over here.

The objective is to generate a presentation deck blah blah blah. The inputs required, so we need to have content or a content idea, a title, the slide count, and then it lists that we need the Gamma API key, which lives in my .env file down here.

And then all it does is create the execution steps. So prepare markdown content which creates the clean file that we create locally here.

Now we create it locally so I can do editing here before I push it up to Gamma and then use Gamma AI to create it. Otherwise, we would burn a ton of credits between two systems and that's just silly. So we get things working really nicely in this environment, which is cheaper, and then push it up to Gamma, and Gamma's AI will make it for us.

So I'm not gonna go through this whole file, but you get the point. It is a step by step framework on exactly how to do it. We tell it exactly what it needs, the text amount, whether we're using images or no images and things like that.

It all gets baked into this skill or standard operating procedure. And then in order to help us achieve this goal, we have our scripts which is how we connect to the environment and this is just the tool that it uses.

It's the Gamma API. You can see down here, this is just a Python script. It costs nothing to run.

It doesn't need any context. It does all of the work. Don't need to use MCP for this because the AI inside Gamma will already take care of all of the building for us.

So we don't need our subscription to do any of that lifting. No need for MCP. And it really is that easy, guys.

Now, obviously, you would be working with the AI if you wanted to. I imagine if you started with a fresh environment, you wouldn't have any of this stuff. But the point is you could just come down here and articulate your thoughts or your problems with the AI and work through whatever it is that you're trying to turn into a skill.

It's basically just understanding your problem well enough and then having solved it manually and then coming in here and trying to refine that into a system. And Claude will go ahead and build that for you, especially if you're using my framework or someone else's framework.

It can help you do that even better and then you attach it to your skills like we did over here. But then obviously, all we've done here is create it. So we wouldn't want to test it with the exact same agent.

So we would open up another one and I would just come in here and I would say to this thing, use the Gamma skill to create me a slide deck on the importance of salt.

And then it's gonna go ahead and it's gonna do that. I'm gonna bypass permissions. Let me understand how the Gamma skill works by reading the relevant files. So you can see there, it's reading our skills, Gamma slides, figuring out how things work.

Now I'll create the slide content and generate the deck. Let me first ensure the temp directory exists which it probably does. After it does that, it's gonna go and create our local markdown files.

After that, it's then just gonna use the Gamma API, send it up to Gamma, and Gamma is gonna generate our slide deck, and we'll see how that thing turns out in just a few seconds. But before we do that, we should understand that obviously, if we were building something more complex, there would be back and forth between this one, which is Claude A, and this one, which is Claude B, and if you remember, that's what we were speaking about when we were building and testing these things.

We want to have two agents talking. Now, this thing immediately found out what I was talking about. So that means that we nailed our front matter.

That means that that little section at the top was very clear and that a brand new agent can pick that thing up, which is the most important part. And then it was just able to go and do exactly what it needs to do. It wrote our Gamma content.

Content is ready. Now let me generate the presentation via the Gamma API.

And so you can see how it's using the skill and in this case, the tool, which is a script in order to achieve its goal. We could extend this out even further. So if you want it to be better practice, I would provide it a definition of what good looks like, and that would live in the references folder, which it didn't create here because I didn't care enough to add it for this example.

But the point is that I would add references in here, and then I would put in two or three MD files showing it exactly what a definition of good looks like for my slide structure so that I would always get that structure every single time. We're just matching patterns here. But I think it still did a pretty good job.

Let's have a look at what it turned out. Look at this, the importance of salt. From ancient civilizations to modern industry, salt has shaped human history, biology, and culture in profound ways.

This essential mineral touches every aspect of our lives. You get the point. This is the exact thing that you saw earlier.

This is the same color theme and all of that that I use for every single one of my slides. So all of this is nailed from a functional perspective. So we have officially created a repeatable system.

And anytime I wanna come into my agent, all I need to do as you've now seen is come down here and speak to it. And that's the whole point of skills to make all of these repeatable systems much easier to use for us. So hopefully, I've demystified it just a little bit.

I will have another deep dive into this when I launch v two of my framework that uses skills as part of it, and we'll cover more things in there as well, including plugins. So if you have any comments, leave them below. I do have a community that just launched. If you guys wanna check that out, it is open now.

Otherwise, check out the videos on the screen. They're definitely gonna help you on your journey. Thanks very much for watching.

See you on the next one.

The Hook

The bait, then the rug-pull.

Most people discover Claude skills and immediately misfile them as a fancier prompt. They are not. A skill is persistent infrastructure: a workflow written once that executes consistently forever, without spending tokens on re-explanation. This breakdown starts where most tutorials skip: the architecture that makes skills efficient, not just possible.

Frameworks