Modern Creator
AI Stack Engineer · YouTube

This Free File Makes Claude Code 10x Cleaner (Karpathy Skills)

A 9-minute breakdown of the CLAUDE.md file that fixes the four most expensive AI coding agent failure modes.

Posted
1 month ago
Duration
Format
Tutorial
educational
Views
16.4K
480 likes
Chapters

Where the time goes.

00:00 - 01:15

01 · Karpathy thread + the four problems

Opens with Karpathy workflow flip (80% manual to 80% agent). Frames story around failures. Names four agent failure modes: silent assumptions, overengineering, scope creep, no verification.

01:15 - 03:35

02 · Excalidraw: problems mapped to cost

Three-column diagram: Your Request / What the Agent Does / What You Get. Walks each failure mode with concrete cost examples (400-line OAuth, 200-line date formatter, 40-line diff, untested validation).

03:35 - 04:11

03 · The four principles (GitHub README)

Introduces the Karpathy Skills repo. Maps each principle to the problem it solves: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution.

04:11 - 05:31

04 · Installation walkthrough

Two paths: Claude Code plugin (global, recommended) via /plugin marketplace add + /plugin install; and per-project curl with append support for existing CLAUDE.md.

05:31 - 07:32

05 · Live demo: ecommerce dashboard

Builds dashboard with guidelines active. Agent asks 3 clarifying questions. Output: 1 file, 120 lines. Without guidelines: 6-8 files, 500+ lines, unasked-for features.

07:32 - 08:07

06 · VS Code: clean diff

Shows actual code output. Every changed line traces to what was asked. No renamed variables, no reformatted comments, no drive-by refactors.

08:07 - 09:04

07 · Tradeoff + close

Guidelines bias toward caution, not speed. For trivial tasks they are overkill. For nontrivial work where wrong assumptions cost hours, they are the fix.

Takeaway

Your CLAUDE.md is a behavioral contract, not a hint.

Steal this for JoeFlow / MCN+ / any Claude Code project

The agent does not need better prompts — it needs explicit rules about what NOT to do.

  • Install the plugin route (global, 10 seconds): /plugin marketplace add forrestchang/andrej-karpathy-skills then /plugin install andrej-karpathy-skills@karpathy-skills
  • Or append to your existing CLAUDE.md with the curl append command — your rules stay on top, Karpathy principles go at the bottom
  • The four failure modes (silent assumptions, overengineering, scope creep, no verification) are the ones that kill review time — name them in your own CLAUDE.md
  • The Excalidraw diagram format (Your Request / What the Agent Does / What You Get) is a reusable content frame for any AI tool breakdown video
  • The tradeoff is honest and worth saying out loud: these rules slow down trivial tasks, so scope them to nontrivial work
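
The append route from the second bullet can be sketched in Python. This is a minimal illustration under stated assumptions, not the command the repo documents: `append_guidelines` and `demo` are hypothetical helpers, and the guideline text is a placeholder rather than the real contents of the Karpathy Skills file.

```python
import tempfile
from pathlib import Path


def append_guidelines(claude_md: Path, guidelines: str) -> str:
    """Append guideline text to CLAUDE.md, keeping existing rules on top.

    Hypothetical helper for illustration; the repo itself documents a
    curl-based append command instead.
    """
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if existing.strip():
        # One blank line separates the project's own rules from the additions.
        combined = existing.rstrip("\n") + "\n\n" + guidelines.strip("\n") + "\n"
    else:
        combined = guidelines.strip("\n") + "\n"
    claude_md.write_text(combined, encoding="utf-8")
    return combined


def demo() -> str:
    """Run the append against a throwaway file and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "CLAUDE.md"
        path.write_text("# Project rules\nUse pnpm, not npm.\n", encoding="utf-8")
        # Placeholder text, NOT the real contents of the Karpathy Skills file.
        return append_guidelines(
            path, "# Karpathy guidelines (placeholder)\nAsk, do not guess."
        )


if __name__ == "__main__":
    print(demo())
```

The design choice mirrors the bullet: project rules are never rewritten, only extended, so removing the guidelines later means deleting the trailing block.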
Resources Mentioned

Things they pointed at.

00:00 · link · Karpathy X thread on coding workflow flip
Quotables

Lines you could clip.

08:28
Coding agents are capable, but they behave badly. They make silent mistakes that look correct on the surface. They build too much when you need too little. They touch things they should not touch, and they do not verify their own work. This file corrects those patterns in about 50 lines of markdown.
Self-contained thesis, no setup needed, lands hard · TikTok hook or IG reel cold open
07:28
Every line that changed traces back to what I asked for. There are no surprise edits, no renamed variables in other files, no reformatted comments, no drive-by refactors.
Concrete payoff after the demo, shows rather than tells · Newsletter pull-quote
00:48
And then you end up reviewing a giant pull request that solves a problem you never actually had.
Tight punchline, universal pain point for any Claude Code user · TikTok hook
The Script

Word for word.

00:00 Andrej Karpathy posted a long thread on X about how his coding workflow completely flipped. He went from 80% manual coding to 80% agent coding in just a few weeks, and he was pretty honest about it.
00:14 He said he is now mostly programming in English, telling the model what to write in plain words. But the interesting part was not the productivity gains. It was the failures.
00:25 He laid out clearly how coding agents keep messing up in ways that are not obvious. They are not writing broken syntax anymore. The mistakes are deeper.
00:35 They make assumptions about what you meant and keep going without checking. They pick one interpretation out of three possible ones and commit to it silently. They do not ask you to clarify.
00:46 They just act confident and move forward. And then you end up reviewing a giant pull request that solves a problem you never actually had. So that thread got a lot of attention, and a developer named Forrest Chang took the core ideas from that post and turned them into a single file, a file called CLAUDE.md.
01:05 It is a set of behavioral guidelines for Claude Code. The repo is called Andrej Karpathy Skills, and it is on GitHub right now. It has more than 26 thousand stars, and honestly, it is one of the most practical things I have seen in this space recently.
01:20 Not because it does anything fancy, but because it targets the exact problems that waste the most time when you are working with AI coding agents every day. Okay.
01:31 So before I show you how to install it and use it, I want to explain why this matters. Because if you have not run into these problems yet, you probably will soon. And if you already have, then you know exactly what I am about to describe.
01:44 The first problem is silent assumptions. You ask your agent to add user authentication. There are 10 different ways to interpret that.
01:53 Session based? Token based? OAuth?
01:55 The agent does not know. But instead of asking you which direction to go, it just picks one and starts building. And it picks the most complex version.
02:05 So twenty minutes later, you have a 400 line auth system with OAuth, refresh tokens, and role based access control. All you needed was basic email and password for a prototype.
02:16 The model guessed instead of asking. The second problem is overengineering. You ask for a simple function that formats a date string.
02:25 You get back a configurable date formatting utility class with six methods, a builder pattern, and error handling for edge cases that will never happen. The model writes 200 lines when 30 would do. It is trained on massive code bases where abstraction is valued, so it defaults to that style even when the task is small.
02:45 The third problem is scope creep in edits. You ask the agent to fix a bug in one function, it fixes the bug. But it also reformats the file, renames variables, cleans up comments that were not part of the task, and refactors an adjacent function that was fine.
03:01 Now your diff is 40 lines instead of four, and you have to review every change to make sure nothing broke. The fourth problem is the lack of verification. You tell the agent to add form validation.
03:14 It adds the validation code and says done, but did it actually test it? Did it check if it handles empty strings, special characters, values that are too long?
03:23 Usually no. It just writes the code and moves on. There is no verification step.
03:29 There is no success criteria. It does what you asked in the most literal way and calls it finished. So those are the four core problems, and the Karpathy Skills repo addresses each one with a matching principle.
03:42 The first principle is called think before coding. It basically tells the agent to stop and surface any ambiguity before writing a single line. If there are multiple ways to interpret a request, list them.
03:54 If something is unclear, ask. Do not guess. The second principle is simplicity first.
04:00 Write the minimum code needed to solve the problem. No speculative features. No abstractions for things that are only used once.
04:08 Alright. So now let me show you how to actually set this up. I am going to go through the installation, and then we are going to build a small project with it so you can see the difference.
04:18 So I am on my screen right now. There are two ways to install this. The first way is the recommended one, and that is the Claude Code plugin route.
04:27 You open Claude Code and run slash plugin marketplace add Karpathy Skills. That adds the marketplace. Then you run slash plugin install skills. And that is it.
04:38 Now the guidelines are installed as a plugin, which means they apply across all your projects automatically. You do not have to copy any file into each project folder. It just works everywhere.
04:50 The second way is the per project route. This is for when you only want the guidelines inside one specific project. You open your terminal, go to your project folder, and run this curl command that downloads the file straight into your project root.
05:04 Now if you already have a CLAUDE.md file and you do not want to lose your existing rules, you can append instead of overwrite. The repo shows the command for that.
05:14 You run echo to add a blank line and then curl the file content and pipe it into your existing CLAUDE.md. Your project rules stay on top and the Karpathy guidelines get added at the bottom.
05:27 Either way takes about ten seconds. Okay. So now that it is installed, let me actually build something with it so you can see how the agent behaves differently.
05:35 I am going to ask Claude Code to build a simple ecommerce dashboard. Nothing crazy. Just a front end page that shows total revenue, number of orders, top selling products, and maybe a recent orders table.
05:49 The kind of thing you would want to glance at to see how your store is doing. So I type my prompt, build an ecommerce dashboard page that shows total revenue, order count, top products, and a recent orders table, use React and Tailwind, keep it simple, and now watch what happens. With the Karpathy guidelines active, the first thing the agent does is ask me questions.
06:10 It wants to know if the data should come from a real API or if hard coded sample data is fine for now. It asks if I want the dashboard to be responsive or desktop only. It asks if I need any filtering or date range selector or if this is just a static snapshot view.
06:26 This is exactly the behavior we want. It is not guessing. It is asking.
06:31 I tell it sample data is fine, desktop only for now, and no filters needed. Just the basic overview. And then it builds exactly that, nothing more.
06:41 The output is clean. One file, maybe 120 lines of code.
06:45 Four stat cards at the top, a simple table for recent orders, and a short list of top products. No router setup, no state management library, no API service layer, no authentication wrapper, no dark mode toggle.
06:59 Just the dashboard I asked for. And if you have ever asked an agent to build a dashboard without these guidelines, you know what usually happens. You get six to eight files, maybe 500 lines, with a full component tree, context providers, a mock API with fetch hooks, loading skeletons, pagination, and a sidebar navigation for pages that do not even exist yet.
07:21 For a prototype, for something you just needed to check a layout idea, that is the difference these guidelines make.
07:28 And here is the other thing I noticed. The diff is exactly what I expected. Every line that changed traces back to what I asked for.
07:35 There are no surprise edits, no renamed variables in other files, no reformatted comments, no drive-by refactors. Just the code I asked for, and nothing else. That is the surgical changes principle doing its work.
07:49 You spend way less time reviewing because there is less noise in the output. And now there is a tradeoff the repo is honest about. These guidelines bias toward caution over speed.
08:00 For simple stuff like fixing a typo, you do not need the full rigor. The guidelines are meant for nontrivial work. The kind where wrong assumptions cost you hours and overengineering means throwing away the output.
08:14 And I think that is what makes this repo different from a lot of other stuff in the AI coding space right now. It is not trying to impress you, it is solving a very specific, very real problem.
08:25 Coding agents are capable, but they behave badly. They make silent mistakes that look correct on the surface. They build too much when you need too little.
08:33 They touch things they should not touch, and they do not verify their own work. This file corrects those patterns in about 50 lines of markdown. If you are using Claude Code regularly, this is worth trying.
08:46 It takes ten seconds to install, and the worst case is you delete the file. But once you see the difference in your diffs and code reviews, you will want to keep it. Alright, so that's it from the video, and I hope you enjoyed it.
08:58 If you did, please like this video and subscribe to the channel, and I'll see you in the next video.
The Hook

The bait, then the rug-pull.

Andrej Karpathy posted a thread saying he now programs mostly in English — and then catalogued exactly how that breaks. Not broken syntax. Something worse: agents that guess silently, build too much, touch things they should not, and call it done without checking. One developer turned those observations into a single file. This is a breakdown of that file.

Frameworks

Named ideas worth stealing.

03:35 · list

The Four Karpathy Principles

  1. Think Before Coding — surface ambiguity first, ask not guess
  2. Simplicity First — minimum code to solve the problem, no speculative features
  3. Surgical Changes — touch only what the request requires
  4. Goal-Driven Execution — define success criteria, loop until met

Four behavioral rules for AI coding agents derived from Karpathy's X thread, packaged as a CLAUDE.md file by Forrest Chang.

Steal for: Directly applicable to any Claude Code workflow. Also useful framing for a JoeFlow or CLAUDE.md content piece.
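
Principles 2 and 4 can be made concrete with a small sketch built on the video's date-formatter example. The code below is an illustration only: `format_date`, `check_success_criteria`, and the chosen output format are assumptions, not anything taken from the repo or the video. It shows one short function instead of a configurable utility class, with the success criteria written down as assertions that can be rerun until they pass.

```python
from datetime import datetime


def format_date(iso_date: str) -> str:
    """Turn '2024-03-05' into '05/03/2024'. One function, no config, no builder."""
    parsed = datetime.strptime(iso_date, "%Y-%m-%d")
    return parsed.strftime("%d/%m/%Y")


def check_success_criteria() -> None:
    """Success criteria stated up front, so 'done' means these all pass."""
    assert format_date("2024-03-05") == "05/03/2024"
    assert format_date("1999-12-31") == "31/12/1999"
    # Malformed input should fail loudly rather than return a guess.
    try:
        format_date("not-a-date")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for malformed input")


if __name__ == "__main__":
    check_success_criteria()
    print(format_date("2024-03-05"))
```

If the task later needs a second format, that is the point to add a parameter, not before.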
01:40 · list

The Four Agent Failure Modes

  1. Silent assumptions — picks one interpretation and commits without asking
  2. Overengineering — writes 200 lines when 30 would do
  3. Scope creep in edits — reformats, renames, refactors beyond the task
  4. No verification — says done without checking edge cases

Problem taxonomy from Karpathy's thread, named and made concrete with cost estimates.

Steal for: Perfect for a short-form breakdown or a things-Claude-does-wrong hook video.
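
The scope-creep failure mode is easy to measure. The sketch below is a hypothetical illustration using Python's standard `difflib`: `changed_line_count` and the three sample snippets are invented for this example and do not come from the video. It counts added plus removed lines for a surgical fix and for an edit that also renames variables.

```python
import difflib

BEFORE = """def total(prices):
    result = 0
    for p in prices:
        result -= p
    return result
"""

# Surgical fix: only the buggy operator changes.
SURGICAL = """def total(prices):
    result = 0
    for p in prices:
        result += p
    return result
"""

# Scope creep: same fix, plus renames nobody asked for.
CREEP = """def total(price_list):
    running_sum = 0
    for price in price_list:
        running_sum += price
    return running_sum
"""


def changed_line_count(before: str, after: str) -> int:
    """Count added plus removed lines between two versions of a file."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


if __name__ == "__main__":
    print("surgical:", changed_line_count(BEFORE, SURGICAL))
    print("creep:", changed_line_count(BEFORE, CREEP))
```

Both edits fix the same bug, but the second one gives the reviewer five times as many changed lines to check.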
CTA Breakdown

How they asked for the click.

09:03 · subscribe
If you did, please like this video and subscribe to the channel, and I will see you in the next video.

Standard end-screen CTA, no mid-roll asks. Clean close after the argument lands.

Storyboard

Visual structure at a glance.

hook · karpathy thread · 00:01
promise · github repo intro · 01:04
value · excalidraw diagram · 02:05
value · four principles · 03:35
value · install demo · 04:25
value · agent asks questions · 06:10
value · clean dashboard output · 07:04
value · vs code clean diff · 07:32
cta · tradeoff + CTA · 08:27