Modern Creator
Leveling Up with Eric Siu · YouTube

How Skills, Evals, and Loops Clone Your A-Players Reliably

A 13-minute inside look at the Skills-Evals-Loops system one agency uses to train non-engineers to do the work of four to ten people.

Posted: 3 days ago
Format: Tutorial, educational
Views: 605
Likes: 32
Big Idea

The argument in one line.

Replacing vague process mandates with a compounding system of discrete reusable skills, measurable evals, and chained agent loops is how an organization reliably mints A-players instead of just hiring them.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • A founder or operator who has tried to build SOPs but finds nobody uses them six months later.
  • A team lead exploring how to use Claude Code, Codex, or agent loops in day-to-day workflows beyond individual prompts.
  • Anyone managing a services team who wants to raise revenue per employee without simply cutting headcount.
  • A non-engineer curious whether AI tooling is accessible outside of the dev team — the apprenticeship cohort explicitly includes SEO, sales, social media, and interns.
SKIP IF…
  • You are looking for product strategy or growth-channel breakdowns — the content is entirely internal-ops and team-building.
  • You want a deep technical walkthrough of how to build Claude Code skills from scratch; this stays at the concept and system level.
TL;DR

The full version, fast.

Most teams try to create processes that no one actually follows. This video proposes replacing that with three interlocking primitives: a Skill (a single reusable job packaged for AI), an Eval (a written definition of what good output looks like with pass/fail criteria), and a Loop (two or more skills chained to eliminate a recurring weekly bottleneck). The host runs a live six-week apprenticeship called Pod of One where every participant ships one to two real skills with evals and opens a pull request against the team Skills Dojo. The result is one person doing the work of four to ten roles, and those people teaching the next cohort.

Chapters

Where the time goes.

00:00-00:43

01 · The Three-Part Combo

Premise: more A-players means faster growth. Three primitives will get you there.

00:43-01:44

02 · Services as Software

The labor-hours-down / revenue-per-employee-up math shift when AI handles client work.

01:44-02:39

03 · The Skills Dojo

Live demo of the internal skill library: video caption generator, X long-form post writer, Instagram carousel, deck generator, and more.

02:39-04:38

04 · The Six-Week Apprenticeship

Pod of One: 1-hour weekly training plus 30-min office hours; outcome is 1-2 shipped skills with evals per person who can then train others.

04:38-06:07

05 · What a Skill Actually Is

One job. Reusable. Not a one-off prompt. Walkthrough of title-tag rewriter and X long-form post writer skills with humanizer checklist.

06:07-06:43

06 · Single Brain CTA

Mid-video pitch for singlebrain.com (managed marketing agents in Slack/Teams).

06:43-08:19

07 · Skills as the Intention Behind Process

Why skills beat SOPs: adoption is trackable, quality is scorable, coaching becomes data-driven.

08:19-09:07

08 · What an Eval Is

Eval equals definition of done. No skill merges without one; 2+ peer upvotes required.

09:07-10:14

09 · What a Good Eval Looks Like

Title-tag eval walkthrough: 56-char body hard check, keyword in first 40 chars, no banned superlatives, adversarial plus held-out cases.

10:14-11:54

10 · Chaining Skills Into Loops

Stage 1: one more skill. Stage 2: chain skills to clear a bottleneck. Stage 3: full loop with minimal hand-holding.

11:54-13:29

11 · The Founder's Invisible Hammer

Cross-functional cohort insight; call to skillify this video; founders have the leverage to make this mandatory.

Atomic Insights

Lines worth screenshotting.

  • A skill is a reusable package that does one clear job; it is not a one-off prompt and not a catch-all mega-prompt.
  • An eval is just a written definition of done with a few golden examples, clear pass/fail criteria, and a short rubric.
  • No skill merges into the shared library without an eval; that constraint is what prevents the library from becoming noise.
  • Traditional SOPs fail because no one knows whether they were followed or how well; a skill plus eval makes compliance checkable by a machine.
  • Two or more skills chained in sequence to kill a recurring weekly bottleneck is, by definition, an agent loop.
  • Revenue per employee is the lagging indicator; the leading indicator is how many accounts each person can handle and how many hours each account requires.
  • The apprenticeship cohort is deliberately cross-functional: engineers, SEO people, salespeople, social media managers, and interns all learn the same framework.
  • Teaching one other person the skill you built is a graduation requirement — that mechanism is what makes the training compound instead of stall.
  • A skills dojo with adoption tracking and a leaderboard turns invisible process compliance into a visible, coachable metric.
  • The same framework that runs a services agency applies to any business because the underlying constraint — throughput per person — is universal.
  • Adversarial test cases in an eval are what separate a robust skill from one that only passes easy examples.
  • Founders already hold an invisible hammer: they can mandate a six-week apprenticeship in a way a middle manager cannot, and that leverage is the honest prerequisite for this system working.
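The throughput claim above (fewer hours per account, more accounts per person, higher revenue per employee) can be sketched as simple arithmetic. This is a minimal illustration, not the host's model: the `Staffing` class, the monthly capacity, the hours, and the revenue figures are all hypothetical, chosen only so the "before" case lands on the five accounts per strategist mentioned in the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Staffing:
    """Hypothetical monthly staffing model for one strategist."""
    hours_per_account: float    # labor hours one account needs per month (assumed)
    capacity_hours: float       # hours one strategist can work per month (assumed)
    revenue_per_account: float  # monthly revenue from one account (assumed)

    def accounts_per_person(self) -> int:
        # Whole accounts that fit inside one person's capacity.
        return int(self.capacity_hours // self.hours_per_account)

    def revenue_per_employee(self) -> float:
        return self.accounts_per_person() * self.revenue_per_account


# Illustrative numbers only: skills cut the hours each account needs.
before = Staffing(hours_per_account=32, capacity_hours=160, revenue_per_account=5_000)
after = Staffing(hours_per_account=8, capacity_hours=160, revenue_per_account=5_000)

print(before.accounts_per_person(), before.revenue_per_employee())
print(after.accounts_per_person(), after.revenue_per_employee())
```

The point of the sketch is the direction of the relationship: hours per account is the lever, and accounts per person and revenue per employee follow from it.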
Takeaway

Three primitives that make AI stick across a whole team.

WHAT TO LEARN

A prompt that only one person knows how to run is not an organizational asset — packaging it as a testable, peer-reviewed skill is what turns individual capability into compounding throughput.

  • A skill does one job, is documented well enough for a colleague to run cold, and encodes a standard its author personally holds — not a mega-prompt, not a vague directive.
  • An eval is not a grading exercise; it is the written definition of what good output looks like, including at least one adversarial case designed to make the skill fail so you know where its edges are.
  • No skill should be considered shared until two peers have run it and confirmed they get the same quality output — the review gate is what prevents the library from filling with untested drafts.
  • Tracking which skills each team uses, and how often, turns invisible process compliance into a coachable, visible metric — something traditional SOPs can never produce.
  • Chaining two or three skills to remove a single recurring weekly bottleneck is the first practical step toward running an agent loop; starting with the bottleneck rather than with the technology is what makes it stick.
  • A cross-functional cohort — not just engineers — is the right audience for this kind of training; the skills that change throughput most for a services business often live in SEO, sales, and content, not in code.
  • Teaching one other person the skill you built is the graduation requirement that converts a training cohort into a self-replicating apprenticeship rather than a one-time workshop.
Glossary

Terms worth knowing.

Skill
A packaged, reusable Claude Code or Codex instruction set that does exactly one job. It is documented enough for a teammate to run it cold and encodes a standard the author personally upholds.
Eval
A written quality bar for a skill: a set of golden examples, a short rubric, and explicit pass/fail criteria. A skill cannot be merged into the shared library without one.
Loop
Two or more skills chained to run repeatedly and remove a single recurring bottleneck, the first practical form of an autonomous agent workflow.
Skills Dojo
The shared internal library where approved skills live, with a leaderboard tracking which teams and individuals are actually using them.
Pod of One
A six-week internal apprenticeship where one participant learns to build and use AI skills well enough to do the work of four to ten roles and then teach the skill to someone else.
Services as Software
A business model framing where a services firm uses AI to serve more clients per strategist at higher revenue per employee, approximating the economics of a software product.
Adversarial case
A deliberately malformed or edge-case input included in an eval to confirm the skill fails gracefully rather than producing a plausible-looking wrong answer.
Held-out case
A test example kept separate from the examples used to build the skill, used to verify the skill generalizes rather than just memorizing the training inputs.
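The merge rule described in these terms (no skill enters the Skills Dojo without an eval; the video mentions five starter cases including one adversarial and one held-out, the skill clearing its own eval, and two or more peer upvotes) could be expressed as a small gate. This is a sketch under assumptions: `EvalCase`, `SkillSubmission`, and `merge_blockers` are hypothetical names, and the team's real review process is not shown in the source.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    name: str
    kind: str     # "typical", "edge", "adversarial", or "held_out" (assumed labels)
    passed: bool  # did the skill's output meet the written pass/fail criteria?


@dataclass
class SkillSubmission:
    name: str
    cases: list[EvalCase] = field(default_factory=list)
    peer_upvotes: int = 0


def merge_blockers(sub: SkillSubmission, min_cases: int = 5, min_upvotes: int = 2) -> list[str]:
    """Return the reasons a skill cannot merge yet; an empty list means it can."""
    blockers = []
    if len(sub.cases) < min_cases:
        blockers.append(f"needs {min_cases} eval cases, has {len(sub.cases)}")
    kinds = {case.kind for case in sub.cases}
    for required in ("adversarial", "held_out"):
        if required not in kinds:
            blockers.append(f"missing {required} case")
    failed = [case.name for case in sub.cases if not case.passed]
    if failed:
        blockers.append("failing cases: " + ", ".join(failed))
    if sub.peer_upvotes < min_upvotes:
        blockers.append(f"needs {min_upvotes} peer upvotes, has {sub.peer_upvotes}")
    return blockers


cases = [
    EvalCase("typical-1", "typical", True),
    EvalCase("typical-2", "typical", True),
    EvalCase("edge-1", "edge", True),
    EvalCase("no-keyword", "adversarial", True),
    EvalCase("held-out-1", "held_out", True),
]
ready = SkillSubmission("title-tag-rewriter", cases, peer_upvotes=2)
print(merge_blockers(ready))
```

Returning a list of reasons rather than a bare true/false keeps the gate coachable: the author sees exactly what is missing before asking peers for another review.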
Quotables

Lines you could clip.

05:10
A skill is a unit, it's reusable, and it's not a one-off prompt.
One sentence that collapses the entire difference between prompt engineering and productized AI skills. (TikTok hook)
09:50
The eval is how you prove a skill is good without trusting the vibes.
Quotable definition that reframes a technical term for a non-technical audience. (IG reel cold open)
13:05
You have that invisible hammer where you can just make anything happen.
Strong closing image for founder-audience clips. (Newsletter pull-quote)
07:00
Nobody uses these processes. But when you have these skills out and your team's using them and you have a skills dojo, you could actually track which teams are actually using it.
Directly attacks the universal SOP failure mode with a concrete alternative. (TikTok hook)
The Script

Word for word.

00:00Alright. So I'm gonna show you the three part combo. So using skills, evals, and loops to reliably create more A-players in your organization.
00:09Because what happens? If you have more A-players in your organization, you're gonna reliably grow faster. Gonna hit your numbers also higher than you've ever imagined, and everyone's gonna have a better time.
00:19Right? The the the challenge is everyone's like, well, how do you get more A-players? So what I'm gonna show you real quick is how you wanna think about this.
00:25It doesn't matter what kind of business you're in, but in the context of this video, I'm gonna show you how services as software works and how this applies to really any business, and then I'm gonna show you how we use this three part combo to move a lot faster, and literally, you can steal this. What you can do is you can take the transcript from this video and then make this into a skill itself, and then figure out how you can deploy this for your company.
00:43So let's get into it. Okay. So first and foremost, what I wanna call out here is if you look at a typical services business, it would take a lot of time to let's say manage one client.
00:52Okay? So the labor hours per account would be a lot. Maybe eighty hours for an account.
00:56I'm just making things up right now. Right? And then you might be If you're talking about revenue per employee for an account, it might be not too high.
01:02This is like in the situation where this is prescale, negative 200 k, and maybe the accounts per strategist, maybe you can only staff Maybe for each account, you can only have five accounts per strategist per person. Right? That's basically what it is.
01:12The whole idea here is is when you think about business as an example is you want the number of hours you're spending per account to come down. You want the amount that each person can take on to go up, and that means your revenue per employee goes up, which means that you're able to do a lot more with less. That's not saying you should cut a bunch of people, that's just saying that ultimately when you have such a high revenue revenue per employee, your entire organization is operating a lot more efficiently.
01:37But the question is, how do you actually get there? And this three part combo is going to help you get there. Okay?
01:44And one of the things that I had talked about before was a skills dojo. Right? So in another video, I talked about the concept of having a skills dojo where you have all these skills like a video caption generator or like a x long form post writer, and this has been good for us.
01:56It's literally generated millions of views on x for us and it's driven a bunch of leads as well. And so you have a lot of these these skills that that are in your head that you wanna make available to other people, and we've made this available to our organization. And also, this is something that we've been starting to implement for our clients as well on the single brain side of things.
02:12Okay? So there's there's like an Instagram carousel skill, there's a deck generator, there's design.md files you can have to keep your designs consistent across the board for let's say you're doing ads, you're doing carousels, you're doing pdfs, whatever it is exactly, these are gonna help you scale a lot more.
02:27Right? So the question is, how do you start to democratize the skills internally? One, you should have a skills dojo, and I covered that in another video, but more importantly, what I didn't cover in the other video is is this piece.
02:36Okay? So it's this concept of we're doing this six week apprenticeship with with certain people on on the team right now. And the whole idea here is that, you know, we we have It's it's a six week program and now we're already at week three and we're teaching people the concept of of what is a skill exactly, what is an eval, and what is a loop, which I'm gonna cover briefly.
02:54And then you should you should screenshot parts of this and then steal it for yourself. Right? And plug it into your Claude, plug it into your your ChatGPT, plug it into one of your autonomous agents.
03:02That's fine. So we have this this pod of one program which is a six weeks apprenticeship ran by yours truly, and by the end every person can build and use Claude Code and Codex skills, and they have one to two real skills shipped to the skills dojo with evals. Again, that skills dojo is something that I showed you a little earlier and has the confidence to train other people on this.
03:19One person can basically now do the work of four, five, or even 10 jobs, right, by leveraging AI. And so what's happening right now is I'm spending my time training this group of people. I'm doing a one hour training every Wednesday.
03:30I'm having each of these individuals connect with me for office hours. These are thirty minute office hours, and so they're they're all at different levels and they're all working on different things, but the key thing here is that I'm training them. Right?
03:39Just I'm meeting them where they're at and I'm training And so the idea here is once we finish off this cohort, then they're gonna be able to train other people on this, and this this is why we call this apprenticeship. So this is this is Jedi training right now, and then eventually they're gonna become Jedi masters on their own.
03:51They're gonna be able train other people. Right? It's also important that if you put something together like this, you have to define what the definition of success is, what the definition of of done is.
04:00Right? And so we wanna make sure that at the end of this, each person has taught one of these skills to one other person, and these merged skills, all of them have to have an eval, which I'll talk about in a second, and then you wanna have one to two real skills shipped per person. Right?
04:13So anything that, again, is is manual right now that you're figuring out how to make into this make into this skill, that's good. So you don't want this to be just be a bunch of theater where people are watching you and then people just go back to doing their work. You can't accept that.
04:23So a skill is a a unit, it's reusable, and it's not a one off prompt. Right? So that's why we're not just talking about prompt engineering here.
04:29We want people to reuse this stuff. So you have the skill first, and then the dojo will compound this, and the eval is the bar. So let's first talk about what a skill is at a high level.
04:37Okay? The whole idea here is that when you have a skill I'm gonna scroll to, let's say, the bottom over here. When when you think about a skill, let's say you have like a like a like a title tag writer.
04:47Right? So a skill focuses on just one ideally one thing. You're not just saying, hey, help me do keyword research.
04:53Help me do backlink building. Help me do Reddit as well all in one skill. You want it to be separated.
04:58That way it's easy for the machines to parse. So you might have a title tag rewriter over here. Okay?
05:03So given a URL, a target keyword, we write the title tag to team standard, keyword front loaded, title body, fifty sixty characters, optional brand suffix, counted separately, no clickbait. Right? This is a It's a tiny job and this is easy.
05:15It's a very simple idea here to show. And then discovery call recap. Okay.
05:20Or you can have like a hook generator over here. Or like the skills that I showed a little earlier. I'll I'll I'll kinda move back over here.
05:26You look at the again, x long form post writer's case. So this write writes post for x in your founder's CEO's authentic voice. Every post should feel like a person wrote it, not a content team, not a bot.
05:36And there's a bunch of things in here. Don't use em dashes. Don't use it's this, not not that type language.
05:41Right? Build a bunch of ASCII diagrams in there so it has a bunch of these diagrams that you can do. Make sure this is 1,500 to two two thousand words or so, and then we'll just continue to to repeat it over time.
05:49And you can see here's here's some system flows over here. So there's examples in these skills. There's feedback loops, input format, output reference.
05:55Like we have a lot of these things. Here's a checklist. Here's a pattern checklist over here.
05:59Here's a humanizer score as well. So you can combine different skills with it. So we wanted to we wanted to make it human sounding.
06:04That way we wouldn't have to just keep going back and forth with these. By the way, if you're enjoying this video around skills, evals, and loops right now, then you have to check out single brain because this is where we tie everything together and we build managed marketing agents that live inside of Slack and Microsoft teams that ultimately help our clients grow a lot faster.
06:21So a lot of the concepts that you're seeing right now, maybe you're like, oh, this is overwhelming. I don't wanna handle it. These are all the things that we're building, and these are the loops that compound for our customers, and they're just able to move a lot faster.
06:31And we find that every single relationship that we're having here, it's very collaborative. Hey, can you do this? Can you actually add this?
06:36Can you do this over here? Because they once they see it, they can't unsee it anymore, and we want you to be able to see the same thing too. So check out singlebrain, brain, singlebrain.com, and we'll see you on the other side.
06:45The way I think about skills is this is to me the intention behind creating processes. Because a lot of people, you know, back in the day, when I say back in the like a year or two ago, everyone's like, oh, you gotta create process. You gotta create process.
06:55The problem with that is nobody uses these processes. But when you have these skills out and your team's using them and you have a skills dojo, you could actually track which teams are actually using it, which individuals are actually using it, and then who's scoring high, who isn't scoring high, and you can figure out how to coach those people, and you can also figure out how to reward the people that are very engaged with this.
07:12Right? That is what I would say. I mean, ultimately, you know, with with this, when I'm doing the training, let's say later this week or tomorrow actually, by the end of the hour, every participant has a scaffolded draft skill for their own role and opens a new pull request against the skills dojo repo.
07:26The skill does one clear job, document it, it's documented for a peer to run it, and encodes a standard that the participant personally holds. Right? So one draft skill, one job, 100% room shows their their back skill alive in five minutes.
07:37And so even creating this program requires you to yap to your Claude Code or your Codex to draft this out for you. But it's it's really important.
07:45You ask it to ask you clarifying questions. And so if you don't spend the time planning this, you're planning to fail. Right?
07:51So this is what's gonna happen tomorrow. This is the run of the show. Okay?
07:54So this is helpful for me because we we I'm spending so much time in terms of on the product side, on moving the company forward, on recruiting, all these different things. I'm kinda jumping around all the time, so this allows me to just kinda drop in there and do it. But I'm also doing this for you right now too, so this makes it reusable.
08:08Right? So anyway, this is a it shows kinda what the definition of done looks like, and then we can probably go through the the skills repo, but this is what a skill is first and foremost.
08:17But you can't just create a skill on its own because if you don't do an eval, an eval actually just to me, it's just a definition of done. It's what success looks like.
08:26If you don't have a good eval, so the eval is the bar written down, a few golden examples, clear pass/fail criteria, a short rubric. If you don't do that, your skill's gonna be in trouble because it doesn't know what good looks like. And so no skill merges into the dojo without one, build the eval, prove the skill passes it, get peer upvotes, and merge.
08:44Okay? So then next week when I when I when I go on this, we're gonna have five starter cases over here, include one adversarial, one one held out.
08:52And so I'll explain what that means in a second, and then skill clears its own eval before merge, and then two plus peers upvotes merge it into the dojo. And then then it's live and it's merged into the dojo. Right?
09:01So then we'll go through this next week, like what does a good eval look like? What does a bad good eval look like? And honestly, could just have the e the the AI write it.
09:08But to continue on the the title tag one, so like what what is a good eval for the title tags could look like? Well, the eval set is URLs plus keyword, exact title you accept, include one keyword, edge case, and and one no keyword adversarial case. Okay.
09:20So hard check would be you want the title body to be 56 characters, keyword in first 40 characters, no banned superlatives, and any miss equals fail.
09:29Okay. So again, you're trying to be specific here. When you have the definition of what success looks like, it makes it easy for the machine to run it through that to decide if when it ran the skill, did it actually follow what that definition looks like.
09:42Okay? So the definition of done here is once we finish this, you have you have you know, a peer has reviewed it and upvoted it, and so we want people to start to get used to the dojo.
09:51Right? But the whole idea here is that again, you can't have a process without defining what the definition of success looks like. That's all an eval is.
09:57That's all a skill is. Skill is a process. Eval is just a definition of done.
10:01And you don't need to scare yourself into like, you know, what what does an eval mean? That's such a big word. Right?
10:05It's a scary word, but I would just say that, again, the eval is how you prove a skill is good without trusting the vibes. That's ultimately what it is. That's what I would say.
10:15Now the next step is, okay, let's let's say you figure out how to do a skill. Let's say you figure out how to do an eval here. Now you have to figure out how you can chain these things together.
10:22So if you chain and run your skills to kill your number one weekly bottleneck, for example, it might be doing client reporting as an example, or it might be sales follow ups as an example, or it might be doing sales discovery as an example. If you are able to move from one skill to running skills repeatedly and chaining them, to removing your single weekly your biggest weekly bottleneck, then you now have an intro to agent loops, right, which is what the founder of Claude Code or the founder of Open Claw have been talking about.
10:48These are north star examples of what running agent loops look like, and what ends up happening is once you chain these skills together, you can just run these loops inside of Claude Code or Codex. You know, some people are like, oh, is it just a slash goal or is it just a slash loop? And so to me, it's just Yeah.
11:01It it is kind of that, but you need to have the right skills for it. Right? And again, that that's why you need to have good definitions of what success looks like.
11:07You know, I think that the number one thing is you need to figure out, like, we do this, you you have to name your number one weekly bottleneck, and then figure out what skills you can chain into it, and then we can figure out what does before and after look like. And so one example might be from my side, we have all these skills for coming up with ideas.
11:22So we have my content ingester which will pull from my content last week, and then it'll have my my hook generator as well. There's also like my dot charge generator as well. There's all these different things that if you If on their own, they're already pretty powerful, but when you combine them one after another, it becomes even more powerful.
11:37Right? And so that's what it is ultimately. You wanna show that you're able to What good looks like here is that you run your chain repeatedly to clear bottleneck, but then more importantly, you need to make sure that you can chain two or three more skills.
11:49And then not only that, ideally this can run with minimal hand holding and other people on the team are using it. Right? That's when the the the organization really starts to compound, and then we move on to to graduation over here.
11:59But what I wanna call out too here is in this meeting, it's not just one group of people. It's not just engineers. It's You got engineers, you have SEO people in here, you have sales people in here, you have social media managers in here, you have interns in here.
12:10It's it's anybody that has a good relationship with change and is very adaptable and is very curious, they're gonna do a good job here.
12:18Right? And so, again, there's a couple things that I showed here. I mean, you got you got the you got the skills dojo.
12:22We have all these templates over here. By the way, like, I even have a section here on on how to write a good skill, how to write a good eval.
12:27So like, it says over here, what an eval is in plain language. Again, forget the word eval for a second. You already do this every day.
12:34When you read a draft and think, no, that headline is weak or yes, ship it, you're running an eval in your head. That's all it is. Right?
12:39We don't need to oh, just because Silicon Valley is using all these these acronyms doesn't mean we need to be afraid of them. Right? How to write a good skill.
12:45So I think I would encourage you to have something like this, and you can screenshot this, you you you can make a transcript from this, but you should skillify this video or this podcast that you're listening to right now, and then just keep keep it going. Right?
12:58But if you're not driving this from the top right now, you're listening to this, you're you're you're a founder in your organization, you have that power. That is your fault because the founder, you have that invisible hammer where you can just make anything happen.
13:09Right? And so that's why I think this is so important. This is the three part combo that's going to help you create more predictability, more reliability when it comes to minting A-players on your team.
13:19There's gonna be a lot more predictability with that. So that being said, hope you enjoyed this video and we'll catch you in the next one.
The Hook

The bait, then the rug-pull.

The title makes a promise most management content cannot keep: not inspiration, but a working system. In thirteen minutes the host opens the actual docs his team ships against (lesson plans, eval rubrics, a live skills library with a leaderboard) and hands you the whole structure to steal.

Frameworks

Named ideas worth stealing.

00:00 · list

Skills / Evals / Loops

  1. Skill — one reusable job packaged for AI
  2. Eval — written definition of done with pass/fail criteria
  3. Loop — chained skills that kill a recurring bottleneck

The three-part system for creating organizational A-players through AI; each layer depends on the previous one.

Steal for: Any team wanting to move from ad-hoc prompting to a governed, compounding AI skill library
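The Loop layer, skills chained so one output feeds the next, can be sketched as plain function composition. This is only an illustration: real skills in the video are instruction sets run by Claude Code or Codex, whereas `chain`, `ingest_content`, and `generate_hook` here are hypothetical stand-in functions loosely modeled on the content-ingester and hook-generator example from the transcript.

```python
from typing import Callable

# A skill, for this sketch, is any function that takes text and returns text.
Skill = Callable[[str], str]


def chain(*skills: Skill) -> Skill:
    """Compose skills left to right into a single callable loop step."""
    def run(payload: str) -> str:
        for skill in skills:
            payload = skill(payload)  # each skill consumes the previous output
        return payload
    return run


# Hypothetical stand-ins; a real skill would call an AI agent here.
def ingest_content(topic: str) -> str:
    return f"notes on {topic}"


def generate_hook(notes: str) -> str:
    return f"Hook: {notes}"


weekly_loop = chain(ingest_content, generate_hook)
print(weekly_loop("client reporting"))
```

Keeping each skill to one job is what makes this composition possible: a catch-all mega-prompt has no clean output to hand to the next step.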
02:39 · model

Pod of One Apprenticeship

Six-week program: week 1 skill draft, week 3 skill shipped with eval and PR, week 4 eval deepened, week 5 loop built, graduation requires teaching one other person. Outcome target: one person does the work of 4-10.

Steal for: Founders who want to roll out AI tooling across a mixed-skill team without relying on engineers
09:07 · list

Eval Definition of Good

  1. Starter eval set with typical, edge, and adversarial cases
  2. Hard checks (measurable pass/fail, not vibes)
  3. A short rubric for soft criteria
  4. 2+ peer upvotes before merge
  5. Confirm peers can run the skill and get the same result

The five properties that separate a trustworthy eval from a rubber-stamp.

Steal for: Anyone building an internal AI skill library who needs a merge gate that is not just a gut check
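The "hard checks" item can be made concrete with the title-tag example from the video: body length, keyword near the front, no banned superlatives, and any miss equals fail. The sketch below is an assumption-heavy illustration, not the team's eval. The 50 to 60 character range follows the skill description (the eval walkthrough mentions 56), the banned-word list is hypothetical, and "keyword in first 40 characters" is read here as the whole keyword ending within the first 40 characters.

```python
# Hypothetical banned list; the source does not say which superlatives are banned.
BANNED_SUPERLATIVES = {"best", "ultimate", "amazing"}


def check_title_body(
    title_body: str,
    keyword: str,
    min_len: int = 50,         # assumed from "50 to 60 characters" in the skill description
    max_len: int = 60,
    keyword_window: int = 40,  # keyword must end within this many characters (assumed reading)
) -> list[str]:
    """Return every hard-check failure; an empty list means the title passes."""
    failures = []
    if not (min_len <= len(title_body) <= max_len):
        failures.append(f"length {len(title_body)} outside {min_len}-{max_len}")
    pos = title_body.lower().find(keyword.lower())
    if pos == -1 or pos + len(keyword) > keyword_window:
        failures.append(f"keyword not within first {keyword_window} characters")
    words = {word.strip(".,!?:;").lower() for word in title_body.split()}
    banned = sorted(words & BANNED_SUPERLATIVES)
    if banned:
        failures.append("banned superlatives: " + ", ".join(banned))
    return failures


print(check_title_body("Title Tag Rewriting: A Practical Guide for SEO Teams", "title tag"))
print(check_title_body("Best Shoes", "title tag"))
```

Because each rule is a measurable pass/fail, a machine can run the check on every output, which is the sense in which an eval replaces "trusting the vibes." Soft criteria such as voice would still need the short rubric.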
CTA Breakdown

How they asked for the click.

VERBAL ASK
06:07 · product
Check out Single Brain because this is where we tie everything together and we build managed marketing agents that live inside of Slack and Microsoft Teams.

Mid-video organic mention while walking through a screen share of the skill library; it feels like a natural pivot to the paid offer rather than a hard break.

Storyboard

Visual structure at a glance.

00:00 · hook · open
00:43 · premise · math shift
01:44 · demo · skills dojo
02:39 · value · pod of one
04:38 · value · skill demo
08:19 · value · eval intro
10:14 · value · loops
06:07 · cta · CTA
13:05 · cta · close