Modern Creator
Eric Nowoslawski · YouTube

Claude Code + Karpathy Destroys Every Lead Gen Agency

An 11-minute demo of an AI outbound system that doubled reply rates by running self-improving campaign experiments, with the repo given away for free.

Posted: 3 days ago
Duration: about 11 minutes
Format: Tutorial, educational
Views: 5.2K
Likes: 230
Big Idea

The argument in one line.

The next era of outbound marketing is AI running hundreds of campaign experiments simultaneously and learning from reply data weekly, with humans approving campaigns rather than creating them.

Who This Is For

Read if. Skip if.

READ IF…
  • You run B2B cold email campaigns and want to improve reply rates through systematic experimentation rather than intuition.
  • You have an existing outbound stack (SmartLead, Instantly, Clay) and want an AI layer to run the testing and iteration cycle on top of it.
  • You are comfortable setting up a GitHub repo, connecting API keys, and approving campaign batches weekly.
  • You want to understand how to structure a TAM database so an AI agent can generate and test campaign hypotheses without human prompting.
SKIP IF…
  • You are doing B2C, social-first, or content-led acquisition. This system is strictly cold email outbound for B2B.
  • You want fully autonomous sends. User approval gates are hard-coded and non-negotiable in this repo.
  • You have no existing way to pull company data or build a TAM. The system requires a complete data layer before any experiments run.
TL;DR

The full version, fast.

The system applies the Karpathy AutoResearch loop to cold email: run experiment, score result, keep what works, discard what does not. Claude Code ingests your website, onboarding voice memos, and a pre-built TAM spreadsheet to generate campaign hypotheses, propose sample emails, and load approved contacts into SmartLead. Each week it reads reply data, identifies patterns by audience slice and copy style, and proposes the next round of tests. No contact is uploaded without user approval and a MillionVerifier pass, which keeps a human at the messaging and launch gates while the AI handles hypothesis generation and reply analysis.

Chapters

Where the time goes.

00:00-00:42

01 · Hook + results reveal

Bold claim: this system will destroy lead gen agencies, including his own. Backed by the headline number: automatic campaigns at roughly 2x the reply rate.

00:42-01:17

02 · SmartLead data breakdown

Screen share of SmartLead analytics showing 1.93x, 2.06x, 1.81x lift week-over-week across April 6-17. The fair comparison is replies per send, not raw volume.
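
The replies-per-send comparison can be reproduced with a few lines of arithmetic. This is a minimal sketch in Python, not code from the repo: the function names are illustrative assumptions, and the two rates are the figures quoted in the video for April 6-17.

```python
def replies_per_thousand(replies: int, sends: int) -> float:
    """Normalize raw reply counts so campaigns of different sizes compare fairly."""
    if sends <= 0:
        raise ValueError("sends must be positive")
    return 1000 * replies / sends


def lift(test_rate: float, baseline_rate: float) -> float:
    """Ratio of the test rate to the baseline rate (2.0 means double)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return test_rate / baseline_rate


# Rates quoted in the video (replies per 1,000 sends).
automatic_rate = 20.71
other_rate = 10.71

print(f"lift: {lift(automatic_rate, other_rate):.2f}x")  # lift: 1.93x
```

Dividing by sends first is what makes the comparison fair: a campaign that sent ten times as many emails would otherwise look better on raw reply volume alone.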

01:17-02:04

03 · Karpathy AutoResearch origin

Explains how Karpathy's self-improving ML experiment loop inspired the idea. Claude Code discarded the actual repo as unnecessary. Only the concept transferred.

02:04-03:21

04 · 8-step system overview

Screen share of the full onboarding flow: Enter Website, Auto Draft, Fill Gaps, Pull TAM, Review List, Create Experiments, Approve and Upload, Learn Every Week.

03:21-04:21

05 · Context building

System writes ICP, case study, value prop, and problem statement markdown files. Voice memo walk-and-dump is the fastest way to load context. Without good context nothing else works.
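
Because nothing downstream works without context, a pre-flight check that the four context files exist and are non-empty is a natural first step. The sketch below is an assumption about how such a check could look; the file names are hypothetical and the repo may name or organize them differently.

```python
import tempfile
from pathlib import Path

# Hypothetical file names for the four context documents described above.
REQUIRED_CONTEXT_FILES = (
    "icp.md",
    "case_studies.md",
    "value_prop.md",
    "problem_statement.md",
)


def missing_context(context_dir):
    """Return the required context files that are absent or empty."""
    root = Path(context_dir)
    missing = []
    for name in REQUIRED_CONTEXT_FILES:
        path = root / name
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            missing.append(name)
    return missing


# Demo: only the ICP file has been written so far.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "icp.md").write_text("B2B SaaS, 50-500 employees\n", encoding="utf-8")
    demo_missing = missing_context(tmp)

print(demo_missing)  # ['case_studies.md', 'value_prop.md', 'problem_statement.md']
```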

04:21-06:38

06 · TAM build

Screen share of a real enriched TAM spreadsheet covering sales motion, pricing tiers, headcount ratios, ad spend, CTA type. Pre-building the full TAM eliminates bad AI game-time decisions.
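
With the TAM enriched upfront, choosing an audience for an experiment becomes a filter over existing columns rather than a fresh data pull. The following is a minimal sketch under assumed column names (`sales_leaders`, `sales_ics`, and so on are hypothetical, loosely based on the signals shown in the spreadsheet), using placeholder `.example` domains.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class TamRow:
    """One company in the TAM; the fields are illustrative, not the repo's schema."""
    domain: str
    has_demo_cta: bool
    sales_leaders: int
    sales_ics: int
    runs_google_ads: bool


def slice_tam(rows: Iterable[TamRow], predicate: Callable[[TamRow], bool]) -> list[TamRow]:
    """Select the companies matching one experiment's targeting rule."""
    return [row for row in rows if predicate(row)]


def leader_without_team(row: TamRow) -> bool:
    """Targeting rule: at least one sales leader but no individual contributors."""
    return row.sales_leaders > 0 and row.sales_ics == 0


tam = [
    TamRow("alpha.example", True, 1, 0, False),
    TamRow("beta.example", True, 2, 14, True),
    TamRow("gamma.example", False, 0, 6, True),
]

audience = slice_tam(tam, leader_without_team)
print([row.domain for row in audience])  # ['alpha.example']
```

Because every signal is already a column, the size of a slice is known before the experiment is proposed, which is how the small-sample problem described in the video can be caught early.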

06:38-08:09

07 · Experiment creation and copywriting

The list is the message. AI proposes campaigns based on TAM signals and context, suggests a sample email, and presents 3-5 sample contacts so the human can write the message. User approves messaging before any upload.

08:09-08:40

08 · Campaign loading and weekly cadence

Everything loads into SmartLead and Instantly. A Codex automation runs every Friday to review past performance and propose the next round. User still approves before anything sends.

08:40-09:44

09 · Hard gates

CTA is locked to prevent drift. MillionVerifier must pass every email. User approval is hard-coded. These gates are non-negotiable. TAM data lives in Supabase internally; end users get a local CSV file.

09:44-11:15

10 · Weekly learning loop and CTA

System reads results, finds patterns by title and industry, suggests next test slice, repeats. Free first campaign offer for B2B businesses doing over $3M in revenue.
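
Finding patterns by title and industry amounts to grouping sends by those two fields and comparing positive reply rates. This is a hedged sketch of that grouping; the record layout (`title`, `industry`, `positive_reply`) is an assumption for illustration, not the repo's data model.

```python
from collections import defaultdict


def positive_reply_rates(sends):
    """Map each (title, industry) slice to its positive reply rate."""
    counts = defaultdict(lambda: [0, 0])  # slice -> [sent, positive replies]
    for send in sends:
        key = (send["title"], send["industry"])
        counts[key][0] += 1
        counts[key][1] += 1 if send["positive_reply"] else 0
    return {key: positive / sent for key, (sent, positive) in counts.items()}


def best_slice(rates):
    """Return the slice with the highest positive reply rate."""
    return max(rates, key=rates.get)


last_week = [
    {"title": "VP Sales", "industry": "SaaS", "positive_reply": True},
    {"title": "VP Sales", "industry": "SaaS", "positive_reply": False},
    {"title": "CMO", "industry": "SaaS", "positive_reply": False},
    {"title": "CMO", "industry": "Fintech", "positive_reply": False},
]

rates = positive_reply_rates(last_week)
print(best_slice(rates))  # ('VP Sales', 'SaaS')
```

With real data, a slice with only a handful of sends would need a minimum sample size before its rate is trusted; that threshold is left out of the sketch.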

Atomic Insights

Lines worth screenshotting.

  • The AI-powered outbound loop is the product. The emails themselves are nearly irrelevant compared to the experiment structure that tests and learns from them.
  • The Karpathy AutoResearch repo was discarded by Claude Code as unnecessary. Only the concept of autonomous iterative experimentation transferred, not the code.
  • Letting an AI make game-time decisions about which companies to target produces small, unrepresentative samples. Pre-building the full TAM eliminates that failure mode.
  • Context quality is the ceiling of system quality. ICP files, case studies, value prop docs, and problem statements must exist before any campaign logic runs.
  • The list is the message. Which companies you select and the signals used to find them often drive more reply-rate variance than the email copy itself.
  • Locking the CTA as a hard constraint prevents AI from generating campaigns that accidentally give away the product or drift from the core offer.
  • MillionVerifier as a hard gate before SmartLead upload is a deliverability protection that cannot be skipped. Unverified emails poison domain reputation.
  • The weekly review loop compounds. Each round of experiments makes the next round smarter, widening the moat against competitors every seven days without additional human investment.
  • Human approval at the messaging and launch steps is the trust mechanism that makes autonomous campaign generation safe to run in production. Removing it is how systems fail at scale.
  • Automatic campaigns achieved 20.71 replies per 1,000 sends versus 10.71 for manually managed campaigns, a 1.93x lift overall, and the lift stayed between 1.81x and 2.06x in the SmartLead breakdown.
  • GPT-5 nano, run through the batch API, handles TAM data processing, not a premium model. Cost efficiency at the data enrichment step matters when processing thousands of companies.
  • Recording a voice memo while walking and dumping everything you know about your dream customer, then giving that transcript to Claude Code, is the fastest way to build the context layer.
Takeaway

The experiment loop is the product, not the email.

WHAT TO LEARN

The leverage in modern outbound is systematic experimentation at machine pace. The copy is nearly irrelevant compared to the loop that tests and learns from it.

  • AI-powered outbound works by running experiments against your reply data, not by generating better copy. The loop is the product, not the prose.
  • Letting an AI make game-time decisions about which companies to target produces small, unrepresentative samples. Pre-building a complete TAM with every useful signal eliminates that failure mode.
  • Context quality is the ceiling of system quality. ICP files, case studies, value prop docs, and problem statements must be built before any campaign logic runs, or the experiments optimize against the wrong target.
  • Locking the CTA as a hard constraint prevents AI from generating campaigns that drift off-brand or accidentally give away the product. The constraint is a feature, not a limitation.
  • User approval gates at the messaging and launch steps are the trust mechanism that makes autonomous campaign generation safe to run in production. Removing them is how systems fail at scale.
  • The weekly review loop compounds. Each round of experiments makes the next smarter, so every seven days the moat widens against competitors who lack this layer, without additional human investment.
  • The list is the message. Which companies you select and what signals you used to find them often drive more reply-rate variance than the email itself, which is why TAM quality is the first investment.
Glossary

Terms worth knowing.

TAM file
A pre-built spreadsheet of every company you might want to target, enriched with every relevant signal upfront including sales motion, pricing, headcount ratios, ad spend, and CTA type, so the AI never has to fetch data on the fly during experiment runs.
AutoResearch
An open-source repository by Andrej Karpathy that trains a small local ML model by running self-improving experiments every five minutes, measuring what works, and iterating. The name and concept were borrowed here; the actual code was discarded.
ICP
Ideal Customer Profile. A markdown file describing the exact type of company and buyer persona you want to reach, used as primary context for the AI when generating campaign hypotheses.
Hard gate
A non-negotiable checkpoint in the workflow that blocks campaign upload unless a specific condition is met: CTA must match the locked objective, email must pass verification, and the user must explicitly approve.
MillionVerifier
An email validation service used as a pre-upload gate to confirm that each contact email is deliverable before it enters a SmartLead campaign.
SmartLead / Instantly
Cold email sending platforms that manage campaign delivery, inbox rotation, and reply tracking. They are the execution layer this system loads approved campaigns into.
Experiment ledger
A running log of every campaign hypothesis tested, its results, and which contacts were already reached, used weekly by the AI to avoid repeating tests and build the next round on what was learned.
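
The experiment ledger described above can be pictured as a small record of hypotheses plus a set of contacts already reached. The class below is a minimal sketch under that assumption; its name, fields, and methods are hypothetical and not taken from the repo.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentLedger:
    """Illustrative ledger: tested hypotheses plus every contact already reached."""
    entries: list = field(default_factory=list)
    contacted: set = field(default_factory=set)

    def new_contacts(self, emails):
        """Return normalized emails not yet reached, de-duplicated, in input order."""
        fresh = []
        for email in emails:
            key = email.strip().lower()
            if key not in self.contacted and key not in fresh:
                fresh.append(key)
        return fresh

    def record(self, hypothesis, emails, positive_replies):
        """Log one tested hypothesis and mark its contacts as reached."""
        reached = self.new_contacts(emails)
        self.contacted.update(reached)
        self.entries.append(
            {"hypothesis": hypothesis, "sent": len(reached), "positive_replies": positive_replies}
        )


ledger = ExperimentLedger()
ledger.record("Demo button but no sales ICs", ["a@alpha.example", "b@beta.example"], positive_replies=1)
print(ledger.new_contacts(["A@alpha.example", "c@gamma.example"]))  # ['c@gamma.example']
```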
Resources

Things they pointed at.

08:09 · tool · Instantly
08:09 · tool · Codex
04:21 · tool · Clay
06:38 · tool · RapidAPI
06:38 · tool · Apify
06:38 · tool · Prospeo
06:38 · tool · Blitz API
06:38 · tool · html-to-text
09:44 · tool · Supabase
Quotables

Lines you could clip.

00:05
Claude Code plus Karpathy are about to destroy every lead gen agency in the next twelve months, including ours.
Self-aware founder calling his own industry's death. Instant credibility and curiosity hook. Suggested use: TikTok hook.
01:37
Claude Code actually threw away the Karpathy repository because it was not actually necessary to use that. It is just simply the idea.
Surprising reversal. The famous repo was irrelevant; only the concept mattered. Suggested use: IG reel cold open.
02:16
Without really good context, none of this is gonna matter.
Punchy one-liner that reframes the whole AI outbound conversation from tooling to data quality. Suggested use: newsletter pull-quote.
06:56
A lot of you might have heard the saying, the list is the message.
Classic direct response principle applied to AI outbound; it bridges old-school and new-school thinking. Suggested use: newsletter pull-quote.
08:44
I personally think this is going to usher in the next era of marketing where AI is running these experiments faster than a human would ever be able to.
Bold macro prediction from someone with live data backing it up. Suggested use: IG reel cold open.
The Script

Word for word.

00:00Claude Code plus Karpathy are about to destroy every lead gen agency in the next twelve months, including ours, and I'm gonna show you exactly what I mean. We have over 50 customers generating anywhere from 200 to 300 positive responses per day.
00:12And we had an enterprise customer that we presented the strategy to, and it doubled their reply rates without us writing a single email or manually uploading any leads to any of their campaigns. So in this video, I'm gonna go over what we did, and by the end, I'm gonna give away the repository so you can install this for your own business as well too.
00:29But before we get into the actual meat and potatoes of everything that we did, I thought we would first go over the results. I used the SmartLead API to pull the reply rates, and as you can see, between April sixth to seventeenth, our automatic campaigns produced 20.71 replies per 1,000, and the other campaigns produced 10.71 replies per 1,000.
00:47And then it stayed strong week over week as well too. So our automatic campaigns were getting this reply rate, and then our other campaigns were getting just around a 1% reply rate. While into the second week when we were still running, it still kept that pace as well.
01:00So it wasn't just a little bit of a fluke. So now that you've seen the result, let's talk about what we actually did here. Andrej Karpathy is a software engineer that used to work at OpenAI, and from time to time, he comes out with these amazing ideas and he shares them as open source repositories.
01:14He created a repository he called AutoResearch that he was using in order to train a small local model in order to get some kind of computation right that he wanted to be able to do personally, and it basically ran an experiment every five minutes to self improve itself. And it would run experiments, figure out what worked, and then keep progressing forward.
01:32Basically, giving you machine learning capabilities straight from your MacBook. As soon as I heard about this, I thought, oh my gosh, you know what? The number that we wanna optimize for is positive reply rate.
01:42We should install this as well too. And so then, we presented this to this enterprise customer who was totally game. And what was funny is when we actually set it up with Claude Code, Claude Code actually threw away Karpathy's repository because it wasn't actually necessary to use that.
01:53It's just simply the idea of what we're trying to achieve. That's the most important part. And so what we've done is we've created a system that can read through all of the past campaigns that you've sent with the copywriting and what were the leads and who were the companies that we were targeting.
02:08And then it can look at who negatively responded and who positively responded and which campaigns had the highest positive response rate so that could keep doing more of the thing that's working and less of the things that aren't working, and doesn't require a human to intervene and come up with all of the new ideas. And so if I were to give you the repository right now, you would actually be shocked at how much of the auto research part it actually is, and it's actually all about getting context about your company.
02:34Because without really good context, none of this is gonna matter. And so this visualization is just based off of all the skills that we have baked into the repository and goes over what we need. So when you start this auto research skill, you're first gonna enter your website, and it's gonna just learn everything about your company and your offer that it automatically can.
02:52So we'll enter their website in step one. In step two, you'll do your auto onboarding draft. And then for step three, you're gonna fill in some of the gaps.
02:59And so I would highly recommend you go for a walk and you start recording a voice memo and just dump as much context about your dream outbound campaigns and who your target customer is. Take that transcript and then give that transcript to Codex or Claude Code or wherever you're gonna run this so it has an unbelievably great amount of context about your company.
03:17What it's then gonna do is it's gonna create an ICP markdown file. It's gonna create a case study markdown file.
03:23It's gonna start getting a value proposition markdown file, a problem statement markdown file, and start filling in those things so it now has the context about your company, what your customers care about, how you help them, all those other things. After the context, we have also found that pulling your entire TAM is incredibly important in order to run this process because one mistake that we made when we were first running this is we were letting Claude try to make game time decisions about, hey.
03:51I wanna run an experiment on companies that recently released a new product. Let me go find companies that recently founded a new product, and let's run a campaign to that. And then it just would make decisions that weren't really that great, and it wouldn't get good coverage.
04:05And then there weren't actually that many companies that launched new products, and the sample size was small, and it just didn't work. So now what we do is when we're taking customers through this process internally, we build the entire TAM with every conceivable data point or qualitative signal that you would wanna know about these companies.
04:24We just get it all done upfront so that when Claude Code or Codex is reviewing all of the data for the experiments that it can run, it doesn't make any game time decisions. All the data is just done, and it's super easy with Claude Code and Codex to keep this data updated on a monthly basis so that everything is fresh and new.
04:38And here's an example we made for another company. And as you can see, what this company really cared about is they wanted to target b to b SaaS companies, but the main thing that they wanted to know is, as you can see, this is all just data that you'd be able to get from Clay and their derived data points. Then you see we have this custom data right here where we have their demo booking CTA, the contact sales CTA, the soft match, and the match text and all these things.
05:00This company, it's really important for them to know, does this company have a talk to sales button, a get a demo button, meet with somebody button? What is their sales motion? This is the most important thing for this customer.
05:11The second most important thing is figuring out what is this company's pricing. What's the lowest pricing that they list on the website, and what's the highest pricing that they list on the website. Do they have enterprise?
05:21Do they have a free trial? And we got all of that data done right here. The other thing that we did is we don't necessarily have a use case for this right now, but what if one day we wanted to run an experiment where we wanted to know, okay, what about companies that had a sales leader but no sales individual contributors?
05:35Or there was a lot of sales individual contributors, but there was no sales leader? Or there's a lot of marketing leaders but no individual contributors? Or the ratio of all of them.
05:44So we even got how many sales leaders there are, how many sales individual contributors there are, how many leaders there are on the marketing team, do they have a CRO, do they have a VP of sales, do they have a director of sales, do they have a CMO, do they have a VP of marketing, do they have a go to market engineer, are they running meta ads, are they running Google ads, do they have case studies listed on their website, what case studies do they mention, what's their offer summary.
06:04We have all of this data already put in here. And I just go over all of this to kinda just inundate you with what's possible that what you wanna do when you set this up is just get any data point you could reasonably wanna know about your TAM done inside of Claude Code. To shortcut this, I think most of the data that we used was with RapidAPI, Apify, Prospeo, Blitz API, and then that website crawling that you saw, that was all done with an open source library called html-to-text to be able to find the keywords about the book a meeting and all those other things.
06:33And then we use the batch API to do some AI processing with GPT-5 nano. The Apify MCP was also great for finding the ads data and things like that.
06:42All of these things are all an easy Google search or just ask Claude Code, hey. The data that I wanna know is this. What's a good data provider for that?
06:50And Claude Code is gonna point you in the right direction for sure. Now we've got everything. We have your social proof, your context, the companies in your TAM, what do they look like, what are the different data points, everything, and then that's when Claude or Codex starts creating the first round of experiments for us to run.
07:04A lot of you might have heard the saying, the list is the message. It will create campaigns based on, alright, what are the cool data points that we can find here that can uncover some pain for us? What are some just overall campaigns that share value propositions that we should send?
07:17And so it uses that underlying data with your context and then suggests the campaigns for you. After you approve the experiments that you want it to run, it will even say, hey. Based on my knowledge, this is an email that I would send.
07:28I would say I spent no time trying to perfect this email writing thing because when we give this away for free to everybody, you all are gonna have your own opinion on what a good cold email looks like. So it's gonna give you a suggestion of what to send based on the experiment and based on the context, but it will also present you with three to five contacts with all of the information about the contacts and the companies so that you can write the message, and then you don't have to blame me for saying that the messaging isn't good.
07:52Just write the message based on if this is a VP of sales and they have a demo button and they don't have a sales team, this is what I would say. And then you can build your own messaging skill into the repository as well too.
08:03So now we have all the ingredients for an outbound campaign. We have our company list. We have our contact list.
08:08We have the context about the sales offer that we're doing, and then we have the experiments, and we have the copywriting. Then it loads everything into SmartLead and everything into Instantly to be able to run the first batch of campaigns.
08:19What I then suggest you do is you use Codex, which has an automation button, and I would run an automation every Friday to use this repository to review the past performance and help come up with the next round of campaigns that we should be trying. And then approve that copywriting, and then push it all in. You could.
08:37I am not even gonna attempt to release this. Internally, we have it automatically making the campaigns and then sending them. I am not even gonna attempt to release this for free and make that the promise.
08:47You still will have to see the next round of campaigns. You'll see the insights from what you learned in the last week, and then you'll be able to make decisions on what campaigns you're gonna run next week. It's not set to automatically run.
08:57I'm sure you can just prompt it to automatically run on your side as well, but I just don't wanna make that that kind of promise. A couple of things that I urge you not to change about the repository is you wanna make sure that the CTA is locked. I have it in there that one fear we had was when we set this up for a customer, we would accidentally come up with a campaign where it would just give away the software for free.
09:18We locked the CTA. So if you say, I wanna book meetings, then it's gonna be locked to booking meetings. If you say, you wanna give out a lead magnet or you have some kind of offer of a lead magnet, it's gonna get locked to that.
09:28It might come up with different ways to say the CTA, but it's gonna get locked onto that CTA, and I think that's the most important part. We already have it in there that unless MillionVerifier has approved the email, it cannot be put into SmartLead. I showed you guys that we put all of the customers' companies and their TAM file into Supabase.
09:44For you, it'll locally just run inside of a CSV file so that you don't have to go crazy and put it on the cloud or anything. And then we hard code it in that it has user approval as well. And then it'll learn every week.
09:55I would just say, hey, Codex or Claude Code, schedule this so that it runs every week with a routine on Claude Code or an automation on Codex, and then you'll get the suggested next round of tests, and then you'd be able to approve those and see the copywriting and everything, and then upload everything to SmartLead with all the contacts because it even will keep track of who you reached out to and who you haven't so that you can keep running these new experiments and all this.
10:17I personally think that this is going to usher in the next era of marketing where AI is kind of running these experiments faster than a human would be able to. The previous round of outbound that we were just in was kind of this world of, okay, now we can have as much data as possible. Let's see all the crazy things that we can do with it.
10:32But now we're gonna give all the context and all the data to AI, and AI will be able to run way more variations than you'll ever be able to run by yourself. And I personally think that this is the future not only of outbound marketing, paid media, organic. We're already seeing it being done by other people as well too.
10:47If you are a b to b business owner doing over $3,000,000 in revenue and you have a sales team that doesn't have enough leads and you don't wanna touch this and you want a company to install it for you, we always launch our first campaign for somebody for free. There's a link below in the video description where you can apply to launch your first campaign with us for free.
11:03We'll use our infrastructure. We'll pull the list. We'll do the copywriting.
11:06We'll do everything for you and validate that we can get you leads before you ever sign a contract with us. If you're interested in that, the link is down below in the video description, and I hope to see you there. Thanks for watching.
The Hook

The bait, then the rug-pull.

A lead gen agency founder walks into his own execution. In a video whose title is his industry's death sentence, he demonstrates the system, built with Claude Code and modeled on Karpathy's AutoResearch loop, that doubled one enterprise client's reply rates without a human writing a single email.

Frameworks

Named ideas worth stealing.

01:17 · concept

AutoResearch applied to outbound

Karpathy's self-improving ML experiment loop applied to cold email: run experiment, score result, keep what works, discard what does not. The actual code was discarded; only the idea transferred.

Steal for: Any repeatable marketing experiment where you have a measurable success metric and want to systematically learn what drives it
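
The loop named above (run experiment, score result, keep what works, discard what does not) can be sketched in a few lines. This is an illustrative outline only, not Karpathy's code and not the repo's implementation; `run_round`, the variant names, and the observed rates are all hypothetical.

```python
def run_round(variants, score, baseline):
    """Score each variant once; keep those that beat the baseline, discard the rest."""
    kept, discarded = {}, {}
    for variant in variants:
        result = score(variant)
        if result > baseline:
            kept[variant] = result
        else:
            discarded[variant] = result
    # The best kept result becomes the bar the next round has to clear.
    new_baseline = max(kept.values(), default=baseline)
    return kept, discarded, new_baseline


# Hypothetical positive reply rates observed for three campaign variants.
observed = {"pricing-angle": 0.021, "case-study-angle": 0.008, "demo-cta-angle": 0.015}

kept, discarded, next_baseline = run_round(observed, observed.get, baseline=0.010)
print(sorted(kept))   # ['demo-cta-angle', 'pricing-angle']
print(next_baseline)  # 0.021
```

In the outbound setting the scoring step is slow, since it waits on a week of reply data, which is why the cadence here is weekly rather than every five minutes.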
02:04 · list

8-Step Auto Outbound Onboarding

  1. Enter Your Website
  2. Auto Onboarding Draft
  3. Fill the Gaps via voice memo
  4. Pull the TAM List
  5. Review the List
  6. Create Experiments
  7. Approve and Upload
  8. Learn Every Week

The end-to-end flow from zero context to running campaigns, designed so a non-engineer can operate it after initial setup with Claude Code or Codex.

Steal for: Structuring any AI agent workflow that needs human-in-the-loop at approval points while automating the analysis and creation work
08:40 · list

Hard Gates Before Campaign Upload

  1. CTA locked to approved objective
  2. Email must pass MillionVerifier
  3. No duplicate outreach via used_state flag
  4. User explicitly approves messaging and audience

Non-negotiable checkpoints that prevent autonomous campaign systems from sending bad emails, making off-brand offers, or burning domain reputation.

Steal for: Any autonomous AI system where one bad output could damage customer relationships or brand trust
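
The four gates listed above can be expressed as a single check that returns every reason an upload is blocked. The sketch below is an assumption about the shape of such a check: `verified` stands in for a MillionVerifier result and `used` for the used_state flag, no real API is called, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    email: str
    verified: bool  # stand-in for a passed email verification result
    used: bool      # stand-in for the used_state flag (already contacted)


def gate_failures(contact, campaign_cta, locked_cta, user_approved):
    """Return the names of every hard gate this upload would violate."""
    failures = []
    if campaign_cta != locked_cta:
        failures.append("cta_mismatch")
    if not contact.verified:
        failures.append("email_unverified")
    if contact.used:
        failures.append("already_contacted")
    if not user_approved:
        failures.append("not_approved")
    return failures


def can_upload(contact, campaign_cta, locked_cta, user_approved):
    """An upload is allowed only when no gate fails."""
    return not gate_failures(contact, campaign_cta, locked_cta, user_approved)


fresh = Contact("vp@alpha.example", verified=True, used=False)
print(can_upload(fresh, "book_meeting", "book_meeting", user_approved=True))  # True
print(gate_failures(fresh, "free_trial", "book_meeting", user_approved=False))
# ['cta_mismatch', 'not_approved']
```

Returning the full list of failures rather than stopping at the first one gives the reviewer a complete picture of why a batch was held back.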
CTA Breakdown

How they asked for the click.

VERBAL ASK
09:44 · product
If you are a B2B business doing over three million in revenue and have a sales team that does not have enough leads, we always launch our first campaign for somebody for free.

Qualified lead gen offer at the end after full value-first demo. Repository giveaway pre-qualifies DIY technical users while the free campaign offer converts enterprise viewers who want it installed.

Storyboard

Visual structure at a glance.

00:00 · hook · Title card
00:12 · hook · Host intro
00:38 · proof · SmartLead results
01:17 · context · Karpathy AutoResearch
02:41 · framework · 8-step flow diagram
03:27 · value · Context files
05:05 · value · TAM spreadsheet
07:00 · value · Experiment creation
08:04 · summary · We Have Everything slide
09:40 · guardrails · Hard gates slide
10:37 · cta · Host closing