Big Idea

The argument in one line.

Claude Code is tuned to make you feel productive, not make you money — four prompt-layer upgrades that fix sycophancy, verification gaps, context rot, and single-agent bottlenecks are what actually close that gap.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You use Claude Code daily for building apps, automations, or client deliverables and feel like you are repeating yourself or patching bugs that should not have shipped.
You have hit the context wall mid-session and felt the quality drop before you hit the limit.
You want a structured way to validate ideas before writing a line of code rather than shipping MVPs to single-digit signups.
You are curious whether sub-agents and /goal actually work in practice or are just demo theater.

SKIP IF…

You are looking for Claude API or prompt-engineering fundamentals — this assumes you are already inside Claude Code.
You want a finished product rather than a methodology shift — the skills are free but you still have to adapt the verification loop to your own build context.

TL;DR

The full version, fast.

Claude Code has four documented failure modes — sycophancy (88% agreement rate per research), false completion (ships broken code, lies about it), context rot (all 18 tested models degrade long before the context window fills), and the single-agent serial bottleneck. The creator built four custom slash-command skills that attack each one: /roast deploys a five-persona council to kill or reshape ideas before coding; a Playwright verification loop makes Claude screenshot and stress-test its own output before declaring done; /session-handoff preserves context across a slash-clear reset; and /goal plus parallel sub-agents let a second model evaluate completion so the worker cannot declare itself done. Every upgrade is demonstrated live building a YouTube-to-LinkedIn SaaS concept from ideation through a full six-file go-to-market kit.

Chapters

Where the time goes.

00:00 – 02:27

01 · The problem with default Claude

Opens with the income-cap framing: output quality times speed equals money. Argues Claude defaults optimize for feeling productive, not results.

02:27 – 04:03

02 · Upgrade 1 — /roast kills sycophancy

Explains sycophancy research (88% agreement rate, MIT/Penn State personalization finding) and introduces the five-persona council skill.

04:03 – 07:00

03 · /roast live demo

Live demo: runs /roast on a $9/mo YouTube-to-LinkedIn SaaS concept. Council returns reshape verdict with cheapest 48-hour test. Comparison with unroasted Claude shows generic vs specific output.

07:00 – 09:35

04 · Upgrade 2 — verification loop

NYU Copilot study (40% security vulnerabilities). Email agent story — confirmed 100% sent, actually sent 25%. Introduces Playwright-based self-verification methodology.

09:35 – 16:31

05 · Verification loop live demo

Builds Cadence waitlist landing page with embedded verification loop. Playwright screenshots all sections at desktop and mobile. Stress-tests form with 22 cases, finds 2 non-blocking issues.

16:31 – 20:20

06 · Upgrade 3 — context management + /session-handoff

Context rot research (18 models all degrade). Desk analogy. Demonstrates /context, status line token counter, and /session-handoff vs /compact.

20:20 – 21:07

07 · /session-handoff live demo

Runs handoff, shows structured output (decisions, key files, open questions, pick-up-here), slash-clear, paste back into clean window.

21:07 – 27:27

08 · Upgrade 4 — sub-agents + /goal

Anthropic team-vs-single-agent study (90% improvement). Introduces /goal with a separate evaluator model. Live demo: six parallel sub-agents produce full go-to-market kit in 8 minutes.

27:27 – 28:12

09 · Summary + CTA

Recaps four upgrades in four lines. Plugs free School community and paid plus tier.

Atomic Insights

Lines worth screenshotting.

Claude is tuned to make you feel productive — that is not the same as making you money, and the gap between those two defaults is where most wasted builds live.
AI models fail to push back on how you frame something about 88% of the time; humans fail to push back about 60% of the time, so the model is systematically worse at disagreement than a person.
Sycophancy gets worse the longer a conversation runs and the more a model knows about you — personalization features accelerate the yes-man problem.
Every AI model tested degraded in performance before the context window was anywhere near full — more conversation is not better.
A verification loop that uses Playwright to screenshot its own output and stress-test before reporting done changes the definition of finished from asserted to proven.
Roughly 40% of GitHub Copilot-generated programs had security vulnerabilities — the silent failure problem is documented, not anecdotal.
A lead agent coordinating parallel sub-agents outperformed a single agent by over 90% on Anthropic's internal research evaluation.
Setting a /goal with a separate evaluator model means the worker cannot declare itself done — it separates the builder from the judge.
Context rot starts early: staying below 250k tokens in a 1M-token window is the practical ceiling for consistent output quality.
A session handoff that captures open decisions, key files, and a pick-up-here prompt costs nothing and preserves all continuity after a slash-clear reset.
CAC will exceed a $9 LTV on day one with no distribution and no moat — the /roast council surfaces that in minutes before a single line of code is written.
The shift from builder to judge is the actual leverage point — the role changes to decision maker, reviewer, and completion evaluator, not prompt author.
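
The 250k-in-1M rule of thumb above is simple arithmetic. A minimal Python sketch, assuming you can read the current token count from somewhere such as the /context output; the function names and the 0.25 constant are illustrative choices, not part of Claude Code:

```python
# Soft ceiling as a fraction of the context window.
# 0.25 is the rule of thumb from the breakdown, not a documented limit.
SOFT_CEILING_FRACTION = 0.25


def soft_ceiling(window_tokens: int, fraction: float = SOFT_CEILING_FRACTION) -> int:
    """Token count at which output quality is assumed to start slipping."""
    return int(window_tokens * fraction)


def should_hand_off(used_tokens: int, window_tokens: int) -> bool:
    """True once usage reaches the soft ceiling, i.e. time to hand off and clear."""
    return used_tokens >= soft_ceiling(window_tokens)


print(soft_ceiling(1_000_000))              # 250000
print(should_hand_off(180_000, 1_000_000))  # False
print(should_hand_off(260_000, 1_000_000))  # True
```

The point of the check is to trigger the handoff while the session is still producing good output, rather than waiting for the hard limit.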

Takeaway

Four habits that change what Claude actually delivers.

WHAT TO LEARN

The gap between feeling productive and producing results comes down to whether you treat Claude defaults as a starting point or a ceiling.

Requesting adversarial pushback before building anything — through a structured council or even a simple devil's advocate prompt — produces fundamentally different decisions than asking if an idea is good.
AI agreement rates are documented to be around 88% — higher than humans — and get worse with personalization, so the longer you use a model the more you need to build in friction.
Building a verification step into the prompt itself changes the output contract from asserted-done to proven-done, and catches the silent failures that only surface in front of clients.
Context quality degrades measurably before the window fills — treating a quarter of the context window as a soft ceiling, not the technical limit, preserves output quality.
A session handoff that captures decisions, key files, and open questions before clearing context costs a few seconds and preserves full continuity — slash-clear is not a memory wipe if you do this first.
Parallel sub-agents with independent context windows are faster and produce better outputs than a single agent working serially, because each starts fresh without accumulated noise.
Separating the worker from the evaluator — having a second model grade completion rather than the working agent self-declaring done — closes the loop on the same sycophancy problem that undermines idea validation.
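
The worker/evaluator split in the last point can be sketched as a small control loop. This is an illustration of the pattern only, not how /goal is implemented: `run_until_goal`, the stub worker, and the stub evaluator are hypothetical stand-ins for what would be two separate model calls.

```python
from typing import Callable, Tuple

# Hypothetical signatures: in practice both would wrap separate model calls.
Worker = Callable[[str, str], str]                 # (goal, feedback) -> result
Evaluator = Callable[[str, str], Tuple[bool, str]]  # (goal, result) -> (done, feedback)


def run_until_goal(worker: Worker, evaluator: Evaluator, goal: str,
                   max_turns: int = 5) -> Tuple[bool, str, int]:
    """Run worker turns until the independent evaluator accepts the result.

    The worker never decides it is finished; only the evaluator can.
    """
    feedback = ""
    result = ""
    for turn in range(1, max_turns + 1):
        result = worker(goal, feedback)
        done, feedback = evaluator(goal, result)
        if done:
            return True, result, turn
    return False, result, max_turns


def make_stub_worker() -> Worker:
    """Stub worker that adds one more file to its output on every turn."""
    files = []

    def worker(goal: str, feedback: str) -> str:
        files.append(f"file-{len(files) + 1}")
        return ",".join(files)

    return worker


def stub_evaluator(goal: str, result: str) -> Tuple[bool, str]:
    """Stub evaluator that only accepts a result listing three files."""
    count = len(result.split(","))
    if count >= 3:
        return True, ""
    return False, f"only {count} of 3 files present"


print(run_until_goal(make_stub_worker(), stub_evaluator, "produce three files"))
```

With these stubs the loop stops on the third turn, because that is the first time the evaluator, not the worker, says the goal is met.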

Glossary

Terms worth knowing.

Sycophancy: Documented AI behavior where the model agrees with the user rather than pushing back, even when the user is wrong — researchers call it AI being a yes man.
Context rot: Performance degradation that occurs as a conversation lengthens, well before the context window fills — measured across 18 models in a published study.
Sub-agent: A separate Claude instance given its own isolated task and clean context window, which then reports back to a lead agent — enables parallel independent work.
/goal: A Claude Code slash command that sets a completion condition evaluated by a second model each turn, so the working agent cannot self-declare done.
CAC: Customer acquisition cost — the total spend required to get one paying customer, which must stay below lifetime value for a business to be viable.
LTV: Lifetime value — the total revenue a single customer generates over their relationship with a product, the ceiling on how much acquisition spending makes sense.
Headed browser: A browser instance that is visible on screen during automation, as opposed to headless which runs invisibly in the background — useful for verifying what Playwright is actually doing.
Verification loop: A build methodology where Claude uses Playwright to screenshot its own output, stress-test forms and interactions, and iterate until passing before reporting completion.
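
The verification loop defined above reduces to "check every section at every viewport, fix, repeat until clean". A rough Python sketch of that control flow, with the browser work stubbed out: `check` is a hypothetical callable that would take a Playwright screenshot and return a list of visible problems, and `fix` is a hypothetical callable that would edit the page. Neither is a real Playwright or Claude Code API.

```python
from typing import Callable, List, Sequence, Tuple

Check = Callable[[str, str], List[str]]      # (section, viewport) -> problems found
Fix = Callable[[str, str, List[str]], None]  # (section, viewport, problems) -> None


def verification_loop(sections: Sequence[str], viewports: Sequence[str],
                      check: Check, fix: Fix, max_passes: int = 5) -> Tuple[bool, int]:
    """Repeat check-and-fix passes until one full pass finds no problems."""
    for pass_no in range(1, max_passes + 1):
        failures = []
        for section in sections:
            for viewport in viewports:
                problems = check(section, viewport)
                if problems:
                    failures.append((section, viewport, problems))
        if not failures:
            # Definition of done: every section checked at every viewport, zero problems.
            return True, pass_no
        for section, viewport, problems in failures:
            fix(section, viewport, problems)
    return False, max_passes


# Stub state: one section is broken on mobile until it gets "fixed".
broken = {("pricing", "mobile")}


def stub_check(section: str, viewport: str) -> List[str]:
    return ["layout overflow"] if (section, viewport) in broken else []


def stub_fix(section: str, viewport: str, problems: List[str]) -> None:
    broken.discard((section, viewport))


print(verification_loop(["hero", "pricing", "waitlist"], ["desktop", "mobile"],
                        stub_check, stub_fix))
```

The loop only reports success after a complete clean pass, so a fix is never trusted until it has been re-checked.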

Resources

Things they pointed at.

02:25 · product · Free School community (Nate Herk)

02:55 · link · Elephant study (AI sycophancy benchmark)

03:26 · link · MIT and Penn State personalization/sycophancy research

08:00 · link · NYU GitHub Copilot security study (1,600 programs, 40% vulnerable)

16:40 · link · Context rot study (18 models tested)

21:15 · link · Anthropic team-vs-single-agent internal evaluation

Quotables

Lines you could clip.

01:08

“Claude is tuned to make you feel productive. It is not tuned to make you money, and these are two completely different things.”

Counterintuitive thesis in one sentence, no setup needed → TikTok hook

08:43

“Something being finished and something actually working are not the same thing at all.”

Universal truth, tight, punchy; applies beyond AI → IG reel cold open

17:17

“A longer conversation literally makes Claude get dumb.”

Blunt, surprising, makes you want to hear why → TikTok hook

27:35

“You are very much changing from the builder and producer to the problem solver, the decision maker, the reviewer, the judge.”

Reframes the AI-user relationship, quotable without context → newsletter pull-quote

The Script

Word for word.

00:00So I figured out how to turn Claude Code into the best business partner I could ask for, and I made three times more money in the past thirty days. You see, Claude has these problems that a lot of people don't ever notice, and every one of those is costing you time and money on stuff that's never gonna work. So what I did is I built a set of four upgrades to fix every one of those issues.

00:18So these four upgrades turn Claude into something that actually makes you money instead of just wasting your time. And it doesn't matter if you're trying to build an app or you're running an agency or you're doing AI consulting. This works for anything that you wanna do inside of Claude Code.

00:29So in this video, I'm gonna show you guys the four upgrades and exactly how you can use them to make more money. So let's get into it. Claude has a few habits that quietly work against what you're trying to do.

00:38Little things that you might not think twice about. So think about how most people use Claude. You open it up, you type what you want, you get an answer, and you just kind of assume that that is the best possible answer that you could have gotten because, you know, Claude Code is one of the best AI tools out there and the models underneath it like Opus are super super smart.

00:53So it's very easy to just trust what it says. But there are these errors that are baked into Claude's design that make your results worse than they should be. So by default, Claude is tuned to make you feel productive.

01:03It is not tuned to make you money, and these are two completely different things. And every one of those design errors is costing you money because your income is basically capped by two things. The first one is the quality of your output, and the second one is how fast you can produce it.

01:15So the better the output you get and the faster you get it, the more money you can make. I'm sure we can all think of many specific moments where it felt like Claude was just trying to get us to spend more tokens or was lying to us about features that it had built or, you know, you feel like you're just repeating yourself a ton.

01:27But the good news is you don't need to go rewrite Claude's code base to fix any of these things. You literally just need these four upgrades. And before I started using these, I remember launching promotions that did not do very well at all or shipping automations that were silently failing pushing out websites or apps with a ton of bugs.

01:42So that's basically the whole arc. But before we get into the first upgrade, if you want to get these prompts and skills and see how your results get better, then you can get them for completely free inside of my free school community. The link for that is in the description.

01:53Okay. So the first upgrade fixes the biggest one, which is just Claude agreeing with everything you say. I mean, haven't you guys ever noticed that you tell Claude you wanna do something and it pretty much will always say like, hey.

02:02That's a great idea. You're really smart because it wants you to like it. But then what actually happens if you say like, you know what?

02:08I changed my mind. It will once again come back and say, you know what? You're really smart.

02:11I'm glad you changed your mind. That's a great idea. And it's getting better over time as the models are just getting smarter and smarter, but this is actually documented.

02:17Researchers call it sycophancy, which is just a fancy word for AI being a yes man. There's a study also called elephant, which measures exactly this, and they found that AI models fail to push back on the way you frame something about 88% of the time.

02:30And for humans, it's around 60%. And it actually gets worse the more the model knows about you. Researchers at MIT and Penn State found that the personalization and memory features tend to make the model more agreeable over a long conversation.

02:41And so that's tough because, basically, the longer you work with it and the more you use it, which is what we all really should be trying to do, the better it gets at telling you what you want to hear. So this is a pretty simple fix. You ask Claude to start challenging you and pushing back and playing devil's advocate before it builds anything or before it approves any plan.

02:56And that's the whole idea behind a skill that I built called roast. It basically pulls Claude out of agreement mode, and it forces it to stress test your idea and its own work instead of just approving everything. So basically, what roast does is it spins up a whole council of personas and they attack it from different angles.

03:11So you've got a contrarian whose only job is to find fatal flaws. We've got an expansionist who's looking for the biggest upside. We've got a first principles thinker who's working with no outside context, just pure logic.

03:20We've got a deep researcher that actually goes in and pulls out a bunch of real market data and competitor pricing off the web. And then we have the buyer who actually role plays being your customer and tells you straight up if they buy the thing or not. And then finally, the judge takes all of those findings and gives you one verdict.

03:34You basically get green light, reshape, or kill. And it also gives you the single cheapest test that you can run in the next forty eight hours to find out if the idea is even worth pursuing, even if it was reshaped. And so what I'm gonna do is throughout this whole video, I'm basically just gonna build a little business from start to finish so you can see each upgrade working on something real.

03:51So the idea that I want to build out is a $9 a month tool that turns a YouTube transcript into a week of LinkedIn posts. So let me actually just go open up Claude Code and roast it live. Alright.

04:01So here we are right now in a fresh Claude Code project. You can see right here, all we have is a CLAUDE.md, which basically has, like, nothing in it. I just told it that your job here is to help us make some money, and then we have our .claude with a skill in here.

04:13And this is the roast skill that I was just telling you guys about. So all I'm gonna do is do a slash roast and say, I have this idea to make a $9 a month tool where people drop in a YouTube video link, and that transcript gets turned into a week's worth of LinkedIn posts. So I'm gonna go shoot off that message.

04:30So as you can see here, before it runs the council, it has three quick questions to ask us. So the first thing is, who's the actual target buyer for this $9 a month tool? And let's just keep this as broad as possible for now and really see what the council can do.

04:42I'm gonna say anyone with a YouTube link. What is your edge here? What do you already have?

04:46Let's just say that we have, you know, no real edge. We have no distribution, but we can build something fast with Claude Code, and we'll shoot that off. And then what are our constraints and budget?

04:55How fast do you need to get the first dollar? Let's just say we have a little bit of runway, but not too much. So we'll shoot off those answers, and now we should see the actual council get spun up.

05:05So here is the brief that the council's going to judge, and we're gonna see each of these agents get spun up, the contrarian, the expansionist, and then the other ones. And while this is running real quick, what I wanna do is take this, open up another session, and just say, this is my idea, and just say, do you think this is good?

05:19Do you think this will work? Do you think I can make money? And it'll just be cool to come back to that after we see what the council says and see what it would have said if we didn't do that.

05:27So anyways, you can see we have now these five sub agents running, and I will check in with you guys when that is finished up. Okay.

05:33So the verdict here is to reshape, and the confidence in that is very high. So in one line, says kill the $9 YouTube to LinkedIn posts product exactly as described.

05:42It's a free no login commodity wrapped in a subscription that's structurally built to churn. But keep the engine and aim it at a narrow paying niche with the two features that are the actual moat, which is provable voice matching and direct scheduled posting. So here you can see it goes into the why.

05:56It goes into our biggest risk, which is no moat and a free substitute and no distribution with no audience and a few $100 budget. CAC, which is customer acquisition cost, will exceed a $9 LTV, lifetime value, on day one, and you'd ship a polished MVP, minimal viable product, to single digit sign ups.
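
The CAC-versus-LTV point in the verdict is a one-line comparison. A minimal Python sketch, assuming LTV is modeled as monthly price times months retained; the $30 CAC is a hypothetical figure chosen only to illustrate the comparison and does not come from the video:

```python
def lifetime_value(monthly_price: float, expected_months: float) -> float:
    """Revenue one customer generates before churning (simple model)."""
    return monthly_price * expected_months


def is_viable(ltv: float, cac: float) -> bool:
    """A customer is worth acquiring only if LTV exceeds CAC."""
    return ltv > cac


# A $9/month plan retained for one month gives the $9 LTV named in the verdict.
ltv = lifetime_value(monthly_price=9.0, expected_months=1.0)

# Hypothetical acquisition cost, for illustration only.
example_cac = 30.0

print(f"LTV=${ltv:.2f} CAC=${example_cac:.2f} viable={is_viable(ltv, example_cac)}")
```

Under this simple model, either retention has to stretch well past one month or acquisition has to be nearly free, which is why the council's cheapest test is direct outreach to a niche rather than paid distribution.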

06:12It goes over the biggest upside if we do want to, you know, look glass half full, the money read, the cheapest forty eight hour test. So what it recommends we do before we go write any code, which would be pick one niche, DM or email 20 to 30 of them, and see if there's actually a market there. See if people would pay for that.

06:28So here's the overall score. The contrarian gave us a two out of 10. Expansionist gave us an eight out of 10.

06:32We got a three out of 10, a two out of 10, and a two out of 10. So, obviously, we would want to reshape this idea. Now let's just go over real quick to the basic Claude and see what we got.

06:40Looks like there's a few questions I have to answer, so let me do that real quick. Actually, I have to run this again because it actually used the roast skill without me asking it, which proves that it's you know, that that's good. Right?

06:49But let me just run this again and explicitly say don't use the roast skill. And now this one has come back. It did give us a good analysis and said, like, you know, this probably is something that you wanna rework a little bit before you actually go ship it.

06:59But this advice is so much more generic, and we didn't get the right perspectives, and it doesn't even really tell us what we should do in order to actually push this out the door. And because we just got Opus 4.8 and the models are gonna get better and better, the whole sycophancy thing is something that all of these model providers are aware of and, you know, taking steps to make sure that it's not just a yes man.

07:17But clearly, if you compare these two outputs, getting sort of a council that has different areas of expertise and different personas is going to be much better to actually help you analyze business decisions and look at what you should be doing in order to make money. So that is how the roast skill works. Even if you don't wanna use that exact skill, I think the methodology of having your ideas always be stress tested, always have a devil's advocate, look at it from different perspectives is the best way to make a good decision.

07:43Even if it's not explicitly about making money, it's a really good way and a really great way to just default when you're talking to Claude or any AI model for that matter. Alright. So that was roast.

07:52Now once Claude actually builds something for you, there's one step that it almost always skips, and it's the one that can cost you days to fix. So Claude will hand you something that looks finished, but something being finished and something actually working are not the same thing at all. And this is once again a real measured problem.

08:07There was a study out of NYU where researchers reviewed around 1,600 programs generated by GitHub Copilot. Well, we all know that Copilot isn't the best, but anyways, roughly 40% of them had security vulnerabilities in them. And the scary part about these mistakes is that they're super easy to miss.

08:22So a lot of the time, you don't even know they exist until something crashes in front of a client or in some sort of, like, worst case scenario for something to crash, like a live demo. I remember one specific time where we were shooting off a bunch of emails to people who wanted to work with us, but we basically didn't have capacity.

08:36So we were shooting off emails to let them know. And we had hundreds of people to reach out to. And so the agent that I was building told me that it had sent out all those outreach messages, and I didn't know until four days later that, you know, I checked the email and saw that it only sent about the first 25% of them.

08:49So I'm not exactly sure why because it confidently told me, yeah, I sent off all those emails. Everything is good to go. So not only did it not do what it was supposed to, but it also lied about it.

08:57And so in that situation, it wasn't really a huge deal obviously because that wasn't like a super high risk situation where it costed us a ton of money. But imagine what it would have looked like if it was legitimately building a bunch of dark code, meaning, you know, code that you didn't write and it's shipping features or building out automations that's a pretty legit, like, big deal, which if it lies about it or does it poorly, that really could result in your business losing a ton of money.

09:19The fix here is to make Claude check its own work before it ever hands it to you and then also having it check the work that it already handed to you. So think about like how cars get built. At the factory, they test out every single piece of the car on its own, and then when the whole thing comes together, they test it a bunch again.

09:33And that's basically the methodology that we want to work with when we're using Claude. This one's a little different from the others because it's not really like a prebuilt skill that I can give you.

09:41Like I said, it's more of a methodology. It's more of a mindset shift. And there's two parts to it.

09:45Like I said, the first part is verification before Claude ever hands something to you. You want it to check the work as it goes. And then, of course, by the time it tells you it's done, you stress test it more and you try to find those edge cases that you collectively didn't think about when both you and Claude were planning.

09:59Now how you actually do that like stress testing or the verification is a little bit different depending on what you're actually building because if you're trying to verify a landing page, that's totally different than verifying, like, an edited video or a data pipeline or something like that. So this isn't just like one magic button you can press.

10:14Like I said, it's more of a habit that you bake into Claude and more of the way that you prompt and the way that you think about working with Claude Code. So let me show you guys what this actually looks like. I'm going to have Claude build out a landing page with a wait list form for our app or our product, and then it's gonna verify it with screenshots, and it's gonna look at this page as if a real person was actually looking at it.

10:32And then we're gonna have it stress test it by clicking through the buttons, submitting a bunch of forms, and trying to break it and see if there's anything that we need to fix. Okay. So now its recommendation for us to verify if this is gonna work was to DM some people and get the proof of concept.

10:46Right? And so what we wanna do is have a landing page to actually send them to. Somewhere that shows the features and the brand and gives it a feel and then also has a little bit of a wait list to see if people actually opt in.

10:56So I have this prompt here. I'm not gonna read the entire thing, and I will kind of slowly scroll through it if you wanna pause and look at what I've written up here. But the idea is that we have a verification loop.

11:05So right here. Right? After you build it, do not trust that it looks right.

11:08Verify yourself with Playwright, and I need to add CLI here, before reporting back. So start the local server. Use Playwright CLI, which is just basically computer use.

11:16So it can open up the actual website, look around, take screenshots, click around, things like that, and it needs to verify it. So screenshot each section individually, look at them, and if you need to, you'll come back and iterate. Right?

11:27So the whole point is you repeat the loop and you iterate and you only stop once every section has been screenshotted at both viewports and there are no visible errors and the waitlist form looks clean. And I gave it down here a definition of done. So what I'm gonna do is copy this prompt and just put it right in there and hit go.

11:43Now, obviously, like I said earlier, depending on your actual whatever you're building right here, your verification loop will look a little bit different. In this case, it's able to look visually, take screenshots, things like that. But the whole idea is a lot of times on the first shot, you might hear this thing called, like, one shot prompt.

11:58On the first shot, AI will maybe get you, let's just say, 65% of the way there. And your job then is to review and to judge and add your taste and go back and forth. But what if you could have AI get you 90% of the way there first and then you iterate from there?

12:10And the whole idea of verification and checking its work on the way is where you can have it be a little bit less lazy and it doesn't actually stop until it gives you something that you can basically quickly review and shoot off. Because it's a complete waste of time if it gives you something and then you have to make all these changes.

12:24Right? Like, think about it if you wanted someone who reports to you, an actual human, you would want them to give you a report that you're able to just review once over and it all looks good and it's all real. You wouldn't as much value the employee who's giving you things to review and every single time he or she hands you something, you have to make a ton of changes.

12:39So as you can see, it is throwing together this little task list, and it's gonna go through and run the verification loop and fix until there are zero errors. So I will just check in with you guys when that is done. Okay.

12:49So everything checks out end to end, apparently. It's done and verified, not just asserted. We have a live URL, which I'll click open in a sec.

12:57But let's see. It said it built a single page premium waitlist landing page for Cadence with all eight sections. Verification loop actually ran and passed.

13:04Playwright took screenshots of all the sections. If I open up this folder right here that you can see it made, Cadence Landing, we have, like, the actual code that went into the building out the site. We have the nodes.

13:14But right here, we have screenshots, And we can see desktop. We have 11. And on mobile, we have also 11 that were taken.

13:20So that is really, really nice to see. And just to show you guys, I clicked in here, we can see that it's actually looking at what the page looks like based on mobile or desktop view, and that's how it's able I mean, obviously, this is pretty AI sloppy. Like, it's very generic.

13:31That's not the point. The point I'm trying to make right now is the verification loop. Right?

13:35Obviously, we could do things from a design perspective to make this feel more branded, to feel less AI created. So, anyways, let's take a look now at the actual site. If I click open here, we're in the VS Code in-app browser sort of thing.

13:46We can see Cadence, features, click on this button, that zooms us down, how it works, pricing. Let me just zoom out this a little bit. There we go.

13:52Um, join the waitlist brings us down here to this section. We have different LinkedIn followers, annual revenue, stuff like that, and these buttons down here work as well. So from a visual perspective, besides the fact that it is pretty AI generic, it's good.

14:05Right? Like, everything is in line. Nothing's out of bounds.

14:07All the text is readable. The sections are clean. There's not any, like, bugs or glitches, m dash.

14:12Uh-oh. But, anyways, that is showing us how we can get outputs using sort of a verification loop. Now we can even take this one step further.

14:19Part of having it check its own work is not just in the build process, but it's also in stress testing process. Right? So because we have the ability with our website to test out and making sure that things are functional, we haven't yet tested filling out the form.

14:32So what I can say is, awesome. So what I want you to do now is use Playwright CLI and open up a headed browser and show me that you are submitting forms and do multiple passes of submitting forms with different drop down options and, you know, different types of emails, different types of phone numbers, basically just to stress test this thing to make sure that there's no bugs in the form submission aspect of this. And so when I say headed browser, that just means that I can, like, watch it rather than a headless browser would be running in the background and we wouldn't see it even though it is actually going on and working in the background.

15:02So here you can see it just opened up a tab. It just submitted a form, and it's filling out a bunch of different versions right here. It's doing it really quick.

15:08Right? We saw different drop down options, different types of emails, different types of names. And, obviously, we don't have any back end configured yet, but that would be the next step.

15:17Right? We could configure a back end and then have it test it out more. It even I don't know if you guys saw that.

15:21It was trying out putting spaces in weird spots. It was putting some spaces before the email. There we go.

15:26We just got a bug there where it wasn't a valid email. Right there again. So we're we're seeing all these edge cases that humans might actually get.

15:32There's another one. Right? And so the idea here is that it's finding things that you might not be able to think of or you don't wanna sit here and manually do that.

15:39Right? So that is what's really cool about this because we get the creativity of a model like Opus, and then we get the ability for Claude Code to actually do stuff like this. And now we understand what all of the edge cases are and what users might do.

15:52Anyways, I'm gonna go ahead and just let this keep running, but two parts of having it check its work on the build side to save you some time, and then, of course, on the stress testing side to also save you some time. Looks like it found all the edge cases, and it decided that that was good enough for that first run. Right here, you can see all 22 of its 22 tests passed.

16:09So it's gonna pull the evidence. It's going to look at those passes and the rejections, and then basically just let us know what we need to change if anything. So there you go.

16:16We can see we had eight valid submissions, and then we had 14 malformed submissions. But then it said two honest non-blocking notes: no duplicate guard, so the same email could join twice, and email validation is intentionally lenient, so structure only, not deliverability.

16:31Meaning, people could submit a fake email, but if it fits the structure of, like, name@domain.com, it will go through. So there's not a deliverability check.
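The two notes from that report can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the regex, the `Waitlist` class, and the choice to trim stray spaces (in the demo, the form rejected those inputs instead) are not taken from the site built in the video.

```python
import re

# Structure-only check: something@something.tld with no whitespace inside.
# This pattern is an illustrative assumption, not the demo's real validator.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_well_formed(email: str) -> bool:
    """Return True if the email has a plausible shape (deliverability is not checked)."""
    return EMAIL_SHAPE.fullmatch(email) is not None


class Waitlist:
    """Hypothetical in-memory waitlist with the duplicate guard the report said was missing."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def join(self, raw_email: str) -> str:
        email = raw_email.strip().lower()  # tolerate stray spaces around the address
        if not is_well_formed(email):
            return "rejected"
        if email in self._seen:
            return "duplicate"
        self._seen.add(email)
        return "accepted"


if __name__ == "__main__":
    waitlist = Waitlist()
    attempts = ["name@domain.com", "  name@domain.com", "na me@domain.com", "name@domain", "fake@nowhere.test"]
    for attempt in attempts:
        print(repr(attempt), "->", waitlist.join(attempt))
```

A fake but well-shaped address such as `fake@nowhere.test` is still accepted, which is the "structure only, not deliverability" gap the report describes.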

16:39So those are two things that if we wanted to action, we could action, and that, honestly, I didn't think about right away, you know, in our initial build. So very helpful. Alright.

16:46So those were the first two upgrades. Now those work for every single output Claude gives you. But to make them work, you actually have to get the output in the first place.

16:53And most of the time, the reason people move slow has nothing to do with what they're doing when they work with Claude. It's that they literally hit a wall. The conversation starts to fill up.

17:00Claude gets slower. It gets worse. It starts to, you know, burn through your usage limit, and it feels like it just has no memory.

17:07And once again, there's a study on this. It's basically called context rot. Researchers tested 18 of the top AI models out there including Claude, and every single one of them starts to perform worse as the conversation gets longer.

17:17Even if it's really, really simple tasks, that's where you start to get just so much degrading in the performance and, you know, hallucinations. And the problem is that drop off starts way before anywhere near the context window being completely full. So more is not better, and a longer conversation literally makes Claude get dumb.

17:34So think about Claude's context like a desk. If you piled up a bunch of paper onto it and then you needed to find one specific document, it's gonna be way harder to find. It's gonna take you way longer because there's so much information in there.

17:43And on top of that, if you're not running the best version of Claude, meaning like the best most capable model, whether that's Opus 4.8 or whatever it might be, it's going to design things worse. It's gonna build sloppier code, and it might even get worse at the reviewing and the verification and the stress testing. So those two things that secretly decide whether you make money with Claude are managing your context and making sure you're working with the right model for the right use case.

18:04So the fix here is handling your context properly. There's a lot of things that go into that, but basically just making sure that you're taking care of that and it's on top of your mind before it quietly wrecks your outputs. And there's a couple commands worth knowing here.

18:15So first one is using slash context, which lets you see exactly what's eating up your context window. Slash clear lets you wipe the whole thing and start fresh. Instead of using slash compact, which compacts your conversation so you can, you know, keep going, I built my own custom skill called slash session handoff.

18:29So before I ever clear anything, I run session handoff. It writes me a summary of everything that matters, what we're working on, the key files we produced or key files that hold information, any open decisions that I've made, and then basically exactly where to pick back up. So all I have to do is run the session handoff, copy that message, clear the context, paste it back in, and now I'm sitting in a completely clean window.

18:46But I'm basically just picking up exactly where I was, and it doesn't feel like I lost anything. Now let me show you what types of things you wanna think about when it comes to making sure you're not hitting that context rot territory. So the first thing, and the reason why I'm using this CLI version right now, what I typically use anyways, you can see my status line down here.

19:03What I'm looking at is throughout my sessions, I can see the model I'm using, what the context window is. I can see the effort that's being used. I can see basically a visual indicator of how much of my context window has been filled up.

19:14So 12%, which is about 125,000 tokens out of our 1,000,000 token window.

19:19I don't really like to let this really pass like a quarter million. Whenever this passes a quarter million, I typically tend to start a new session. So a couple of things that you wanna leverage.
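The rule of thumb above is simple arithmetic, sketched here under stated assumptions. The 1,000,000 token window and the quarter-million cutoff come from the video; the function names are hypothetical, and the cutoff is the speaker's personal preference rather than a documented limit.

```python
CONTEXT_WINDOW = 1_000_000   # window size mentioned in the video
HANDOFF_THRESHOLD = 250_000  # the speaker's personal "quarter million" cutoff


def context_usage(tokens_used: int, window: int = CONTEXT_WINDOW) -> float:
    """Fraction of the context window consumed so far."""
    if window <= 0:
        raise ValueError("window must be positive")
    return tokens_used / window


def should_hand_off(tokens_used: int, threshold: int = HANDOFF_THRESHOLD) -> bool:
    """True once the session has passed the handoff threshold."""
    return tokens_used > threshold


if __name__ == "__main__":
    for used in (125_000, 250_000, 300_000):
        print(f"{used:>7,} tokens = {context_usage(used):.1%} full, hand off: {should_hand_off(used)}")
```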

19:28Right? We talked about slash context. So if I do this, this is going to actually show me and visualize what is going on in our session.

19:34So we can see, wow. All of these MCP servers might be well, these aren't actually taking tokens. These are load on demand.

19:39But if they were loaded in, that would be a lot of tokens. We can see we have free space. We have skills, memory files, system tools, system prompts, all of that kind of stuff.

19:47And this also will show us, you know, how many tokens roughly for each of those items. And this is good to be able to clean up your projects a little bit if you wanna make sure you're not, you know, starting off with just a ton of context already eaten up. It also right here gives us a suggestion.

19:58So read results using 490,000, 49%. So you could save about 140,000 tokens here.

20:05But, anyways, that is one thing. You could also do a slash compact, or Claude Code has its auto compact. But, honestly, I don't leverage this very much.

20:12It takes a long time. I basically built my own skill, which is called session handoff, which I will give you guys for free, of course, in the free Skool community. But when I run my session handoff skill, I basically prompted this thing to give me a summary of what we've done.

20:23You know what? I'll just wait till this runs, I'll show you exactly what it gives us. Alright.

20:27So this is the session handoff. We get where it started, decisions that are locked and what shipped, key files, running state, verification, deferred and open questions, and then pick up here.
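The section headings listed above suggest a simple structure for a handoff note. The sketch below is hypothetical: the `SessionHandoff` class and its field names only mirror those headings and are not the actual skill, which is a prompt rather than code.

```python
from dataclasses import dataclass, field


@dataclass
class SessionHandoff:
    """Hypothetical container mirroring the section headings shown in the demo."""

    where_it_started: str
    decisions_and_shipped: list[str] = field(default_factory=list)
    key_files: list[str] = field(default_factory=list)
    running_state: str = ""
    verification: str = ""
    open_questions: list[str] = field(default_factory=list)
    pick_up_here: str = ""

    def render(self) -> str:
        """Render the handoff as plain text that can be pasted into a fresh session."""

        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items) or "- (none)"

        sections = [
            ("Where it started", self.where_it_started),
            ("Decisions locked and what shipped", bullets(self.decisions_and_shipped)),
            ("Key files", bullets(self.key_files)),
            ("Running state", self.running_state or "(none)"),
            ("Verification", self.verification or "(none)"),
            ("Deferred and open questions", bullets(self.open_questions)),
            ("Pick up here", self.pick_up_here or "(none)"),
        ]
        return "\n\n".join(f"{title}\n{body}" for title, body in sections)


if __name__ == "__main__":
    # All values below are made-up placeholders.
    note = SessionHandoff(
        where_it_started="Landing page with a waitlist form",
        key_files=["index.html"],
        pick_up_here="Configure a back end for form submissions",
    )
    print(note.render())
```

Keeping "Pick up here" as the last section means the fresh session ends its reading on the next action rather than on history.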

20:36So now I can just do a slash copy, which grabs everything that Claude just output to us. I do my slash clear. You can see the context window completely resets.

20:43I paste that in, and now our project has the exact context that we were basically working in. It has all the files. It knows where to look.

20:49It knows what we're doing, and it knows where to pick up. And it's just super helpful to be able to just constantly do a session handoff and clear. Or even if I wanted to do a session handoff and then move it over to, like, I don't know, a different model or maybe even Codex or something like that, I'm able to do so super, super easily.

21:02And sometimes it'll even do something like this where it says, I've got the handoff. Let me quickly confirm the current running state before I recommend our next move, and we keep working. So that skill is super, super helpful and easy to use.

21:14Okay. So this is now the last upgrade. And once you start using it, you'll produce more progress in a single day than most people can produce in a week.

21:21So no matter how good your prompts are, there's still one hard limit, and that's the fact that you can only point Claude basically one direction at a time because you are the bottleneck. You are the decision maker and the reviewer. And Anthropic's own engineering team actually tested this directly.

21:33They set up a lead agent coordinating a team of little sub agents all working in parallel, and they compared it to a single agent doing the whole job alone. The team setup obviously outperformed the single agent by over 90% on their internal research evaluation. So real quick, in case you don't know what a sub agent is, a sub agent is basically a separate Claude that gets its own task and its own clean context window.

21:52It works all alone by itself, and then it reports back to that main terminal session. So instead of one worker doing everything, you know, one step at a time, you have a whole team of them running, and they're each working on one of the pieces at once. So personally, if I'm doing something like planning out a YouTube video, I'll maybe have one doing research on a certain topic and one doing research on another, and one maybe looking through comments on past videos.
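The fan-out pattern described here can be sketched with a thread pool. This is an analogy under stated assumptions, not how Claude Code spawns sub-agents: `run_subagent` is a hypothetical stand-in that returns a canned string, and the task names are placeholders based on the example above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str) -> str:
    """Stand-in for a sub-agent: receives one task and returns a report.

    A real sub-agent would be a separate Claude session with its own clean context.
    """
    return f"report on {task}"


def fan_out(tasks: list[str]) -> dict[str, str]:
    """Run independent tasks in parallel and collect each report by task name."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        reports = list(pool.map(run_subagent, tasks))  # map preserves input order
    return dict(zip(tasks, reports))


if __name__ == "__main__":
    results = fan_out(["topic A research", "topic B research", "past video comments"])
    for task, report in results.items():
        print(task, "->", report)
```

The pattern only pays off when the tasks do not depend on each other; anything sequential still has to run one step at a time.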

22:11The key here is anything that can happen in parallel, independent of each other, I will spin up sub agents to do that. And then when everything gets synthesized together, I can take that output and just do whatever I need to do with it. And then I'm gonna add one more thing on top of that, which makes it feel completely like the future, and that is a command called slash goal.

22:26So using goal, that lets you set a finish line, an actual completion condition, and then Claude will basically just work turn after turn for as long as it takes until you hit that condition. And the cool part about that is that there's a separate evaluator. There's a second model that checks every single turn to see if, you know, done equals true or not.

22:42So Claude doesn't get to declare itself done. A different model has to look at it with a different persona and actually grade it and see if it's done. And that's what's so cool about it.

22:50Because the whole problem in upgrade one was that Claude would just agree with itself or agree with you too often. So now you have a different one, and it literally separates the worker from the judge. So let me go ahead and give it one job, set the goal, and run this live.
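The worker and judge separation can be sketched as a loop in which the worker never decides when to stop. This is a toy model under stated assumptions: `run_until_done`, its parameters, and the turn cap are hypothetical, and the real slash goal evaluator is a second model rather than a Python callback.

```python
from typing import Callable


def run_until_done(
    worker: Callable[[str], str],
    judge: Callable[[str], bool],
    max_turns: int = 10,
) -> tuple[str, int]:
    """Call the worker turn after turn until the separate judge accepts the result.

    Returns the final state and the number of turns taken. The worker has no say
    in whether the work is done; only the judge's verdict ends the loop.
    """
    state = ""
    for turn in range(1, max_turns + 1):
        state = worker(state)
        if judge(state):
            return state, turn
    raise RuntimeError(f"goal not met after {max_turns} turns")


if __name__ == "__main__":
    # Toy worker adds one character per turn; toy judge wants at least three.
    final, turns = run_until_done(lambda s: s + "x", lambda s: len(s) >= 3)
    print(final, turns)
```

The turn cap is a safety choice in this sketch so that a goal the judge never accepts fails loudly instead of looping forever.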

23:02And this last move is cool because it basically stacks every single upgrade from the whole video into this one test. Because the idea got validated with the roast, it verified its own work before declaring done, and that's verification methodology from upgrade two. It spins up a whole team of sub agents.

23:15Each one runs in its own clean context, so nobody hits the context rot wall. And then we use goal to drive the entire thing home. Okay.

23:21So this one is really, really cool because it combines basically everything that we've talked about so far. We talked about making sure that we have the right idea by having some sort of council and playing devil's advocate. We then talked about how you can have Claude verify and check its own work.

23:34Then we talked about context and making sure that things are clean. As you can see, we just set our session handoff. And now we can loop all of that back together by using things like sub agents and slash goal to help us work faster.

23:45So if I do slash goal right here, you can see it says set a goal, keep working until the condition is met. And then I'm gonna basically just paste in my prompts. So I'm gonna shoot this off, and we'll see what it says.

23:54And you'll notice that there's elements that we've talked about like I just mentioned. So we have our product. So the goal is to build a complete, ready to execute, go to market kit for our product and save it in this project.

24:03The product is obviously our web app. We have our ICP here. And what's really cool is inside of the goal, we're able to leverage sub agents.

24:10So use parallel sub agents, one per deliverable, so there should be six, and they each have their own context, and they're each going to produce different files that don't overwrite each other. So this is what we're having it create. And then down here, you can see that I defined when this thing is done, which is that all six files exist and none of them are empty.

24:25The market research has six plus competitors. Personalized drafts has 25 numbered drafts. Things like this.

24:30The more objective you can be with your goal, the better that it's actually going to work because, obviously, it's gonna keep working until it thinks that it's done. You also will notice that in here, I said, after the sub agents finish, run a verification pass yourself. So open each file, confirm that it meets the bar, fix anything thin or generic before you declare yourself done.
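Objective completion conditions like these are the kind that can be checked mechanically. The sketch below is a rough illustration: the file names are hypothetical (the video names the six deliverables but not their paths), and only the "exists", "not empty", and "25 numbered drafts" conditions are covered, since the competitor count depends on a file format the video does not show.

```python
import re
from pathlib import Path

# Hypothetical file names for the six deliverables named in the video.
DELIVERABLES = [
    "positioning.md",
    "market-research.md",
    "launch-plan.md",
    "outreach-templates.md",
    "outreach-drafts.md",
    "content-calendar.md",
]


def count_numbered_items(text: str) -> int:
    """Count lines that start with a number and a period, such as '12. ...'."""
    return len(re.findall(r"^[ \t]*\d+\.\s", text, flags=re.MULTILINE))


def unmet_conditions(kit_dir: Path, min_drafts: int = 25) -> list[str]:
    """Return the conditions that are not yet met; an empty list means done."""
    problems = []
    for name in DELIVERABLES:
        path = kit_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif not path.read_text(encoding="utf-8").strip():
            problems.append(f"empty: {name}")
    drafts = kit_dir / "outreach-drafts.md"
    if drafts.is_file():
        found = count_numbered_items(drafts.read_text(encoding="utf-8"))
        if found < min_drafts:
            problems.append(f"outreach-drafts.md has {found} numbered drafts, need {min_drafts}")
    return problems


if __name__ == "__main__":
    print(unmet_conditions(Path("go-to-market")))  # hypothetical folder name
```

Returning the list of unmet conditions, rather than a bare true or false, gives the worker something concrete to fix on the next turn.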

24:46And so this is just gonna run. And now because I front loaded all of my thinking into that prompt and set the goal, I can just kinda walk away and do whatever I want until this is done. So this will be running in the bottom right.

24:55It'll say goal active. It'll tell us how long the goal has been running, and then when it's done, it'll say goal done. So I'll check-in with you guys when we actually have that finished goal back.

25:03Alright. So that just finished up as you can see, and it only took about eight minutes. So one thing about the goal is just because it's a goal and just because it has a loop ability doesn't mean you have to set goals that are gonna run for hours and hours.

25:13I use goal a lot, and most of the time I use goal, it's runs that take less than, you know, twenty to thirty minutes because I'm able to just be super clear about my prompt and just have more confidence that it's going to achieve the goal. So eight minutes, we have our six different files. And keep in mind, this spun up six different sub agents, and all of the sub agents were working on their files independently in parallel.

25:31So that's another reason why this was able to go pretty fast. But all of these have been verified. All of these have been checked, and now it would be on us to be able to look at the positioning, the market research, the launch plan, the outreach templates, the outreach drafts, and the content calendar.

25:43And because we've looped together all of these upgrades and all of these skills, we're in a really good spot now to be able to start executing on this vision. And think about this. In total, all of these demos probably took me under an hour.

25:54And so if you really wanted to go, you know, set up like, spin up a business like this, you're gonna put more than just an hour in. But think about if you put in, like, a week of focused work with all of these strategies, ideation, building things out, and then having this full launch plan and all of this stuff ready to go, where could that take you?

26:08And how could you have just leveraged Claude Code to be able to have done something that probably would have taken a team of 10 and probably would have taken more time? So just to show you what's in here, if I click on the go to market, we can see let's just first look at the positioning. We have our ICP.

26:21We have our segment a, our segment b, our core offer, our tier ladder. It looks like pricing got locked at $19, $39, and $99 per month.

26:28We have upgrade logic. We have our one line value prop, and we have our three sharpest objections with rebuttals. So, "I could use ChatGPT for that."

26:36We have "I don't post on LinkedIn enough to need this." "AI posts sound fake and will hurt my brand." So we have good rebuttals for all of those, and we could obviously come through, read all of this, and put our own personal touch on it.

26:45We've also got our market research. So we've got our product, our wedge, our ICP competitors, which found looks like seven of them, and we said it needed at least seven, I believe.

26:53We have some adjacent ones as well. We've got a full comparison table of these. We've got where cadence fits.

26:58Why $19 is the right entry price. So as you can see, all of our sources are here. This is very in-depth.

27:03We've also got our launch plan. So this is a fourteen day launch plan, which we would basically just be able to follow. We've got our outreach, and then we would start making our content based on this calendar.

27:12So anyways, that is how we're able to leverage sub agents, goals, automations, other things like that to make sure that you stop being the bottleneck. You are very much changing from the builder and producer to the problem solver, the decision maker, the reviewer, the judge. That's how you need to leverage this type of technology to help you grow your business, to help you make more money.

27:29So that was the four upgrades. Stop letting it agree with you so you build the right thing. Make it check its own work so you ship stuff that actually works.

27:36Manage your context so Claude stays sharp, and stop being the bottleneck. Use sub agents, use slash goal so that stuff can run without you. So now you can use these upgrades to make more money using Claude.

27:46You can get everything that I talked about today inside of my free Skool community. There, you'll also find hundreds of free resources and courses and over 400,000 people building with Claude. And if you're ready to go deeper and build an AI business, then you can join my plus community where we hop on weekly calls to answer your questions.

28:00The link for both of those communities is in the description. But, anyways, that is gonna do it for this one. So if you guys enjoyed the video or you learned something new, please give it a like.

28:06It helps me out a ton. And as always, I appreciate you guys made it to the end of the video, and I'll see you all in the next one. Thanks, guys.

The Hook

The bait, then the rug-pull.

The title makes a claim most productivity videos cannot back up — but Nate Herk does it by naming the problem before the pitch: Claude Code is tuned to make you feel productive, not make you money. What follows is 28 minutes of live demos attacking four documented AI failure modes with custom slash commands he built and is giving away free.

Frameworks

Named ideas worth stealing.

04:30list

The /roast Council

Contrarian (finds fatal flaws)
Expansionist (biggest upside)
First Principles Thinker (pure logic, no outside context)
Deep Researcher (live market data and competitor pricing)
Buyer (role-plays customer, says if they buy)
Judge (synthesizes verdict: green / reshape / kill)

An adversarial review council of five personas plus a Judge: the personas stress-test an idea from different angles, and the Judge produces a single actionable verdict.

Steal for: Any offer validation or go/no-go decision before writing code or copy

08:10model

Two-Part Verification Methodology

Build-time verification (Playwright screenshots, viewport checks before reporting done)
Stress-test verification (form submissions, edge-case inputs, adversarial user simulation)

Separates finished (asserted) from working (proven) by embedding verification into the Claude prompt itself.

Steal for: Any deliverable that can be inspected programmatically, such as landing pages, APIs, automations, and data pipelines

01:00concept

Income Cap Formula

Income is capped by two variables: quality of output times speed of production. Every upgrade addresses one or both.

Steal for: Framing for any productivity or AI workflow video

CTA Breakdown

How they asked for the click.

VERBAL ASK

27:45product

“You can get everything that I talked about today inside of my free Skool community. There you will also find hundreds of free resources and courses and over 400,000 people building with Claude.”

Mentioned twice in the body and once in the close. Free community as lead magnet, paid plus tier with weekly calls as the upsell. No pressure, no countdown — clean and trust-forward.


Storyboard