Big Idea

The argument in one line.

Handoff documents let you spin up parallel agent sessions for out-of-scope work without diluting your primary session's context, keeping both conversations smart and focused.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

You're an AI engineer or developer who runs multiple parallel Claude sessions daily and wants to keep each context window sharp instead of bloated.
You're a programmer working on scoped tasks across several agents simultaneously who needs a lightweight way to hand off progress without losing critical context.
You're hitting the attention degradation wall around 120k tokens in long sessions and want a systematic method to split work across fresh agent instances.

SKIP IF…

You work exclusively in single, linear sessions and rarely need to spin up parallel agents — the overhead of handoff docs won't save you tokens.
You're new to Claude Code or agent-based workflows and haven't yet felt the performance drop from context window bloat — this addresses an advanced optimization problem.

TL;DR

The full version, fast.

The /handoff skill compresses the relevant slice of a current Claude Code session into a disposable markdown document so a fresh parallel agent can continue scoped work without polluting the original conversation. The mechanism exploits the fact that context windows have a smart zone roughly under 120k tokens and a dumb zone beyond it, where attention diffuses and output quality degrades. Unlike /compact, which clobbers the active session into one long sediment-layered thread, /handoff branches: the parent stays pure while a child agent owns the spinoff task. Two patterns work especially well: splitting off an out-of-scope refactor mid-grilling, and handing off to a prototype session that later returns its learnings to the planner as a second handoff.
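The mechanism above can be sketched in a few lines. This is only an illustration of the idea, not the skill itself (the real skill is an instruction file the agent follows): the function name `write_handoff`, its parameters, and the section headings are all hypothetical. It assumes the briefing has three parts (purpose, summary, pointers) and writes it to the OS temp directory so it stays disposable.

```python
import tempfile
from pathlib import Path


def write_handoff(purpose: str, summary: str, pointers: list[str]) -> Path:
    """Write a disposable handoff briefing to the OS temp directory.

    Hypothetical helper: the section layout is an assumption, not the
    format the actual /handoff skill produces.
    """
    lines = [
        "# Handoff",
        "",
        "## Purpose of the next session",
        purpose,
        "",
        "## Summary of relevant context",
        summary,
        "",
        "## Pointers (references, not copies)",
        *[f"- {p}" for p in pointers],
        "",
    ]
    # A uniquely named file in the temp dir, kept after close (delete=False),
    # so parallel handoffs never overwrite each other.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".md", prefix="handoff-", delete=False, encoding="utf-8"
    ) as fh:
        fh.write("\n".join(lines))
        return Path(fh.name)
```

Calling `write_handoff("File a GitHub issue for the API split", "...", ["docs/plan.md"])` returns the path of the new file, which can then be pasted or passed into a fresh session in any agent.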

Chapters

Where the time goes.

00:00 – 01:30

01 · Discovering the handoff skill

Hook: too-simple-to-be-a-skill pattern becomes essential. Introduces the skills repo (97.8k stars), explains handoff concept.

01:30 – 03:20

02 · Course CTA + context window primer

Soft pitch for AI Coding for Real Engineers cohort (June 1). Transitions into explaining context windows and the Smart Zone / Dumb Zone model.

03:20 – 04:49

03 · The Dumb Zone explained

Animated context-bar diagram. Attention quality degrades past ~120k tokens despite 1M context advertised.

04:49 – 06:10

04 · /compact vs /handoff -- the core distinction

/compact resets to smart zone but clobbers session state. /handoff preserves the current session and spawns a clean parallel session.

06:10 – 08:20

05 · The /compact sediment problem

Repeated compaction builds up sediment layers from prior sessions. Useful for debugging long-runners, but inefficient for parallel work.

08:20 – 10:14

06 · Why handoff exists -- the split-session problem

Discovers a refactoring opportunity mid-session. Neither extending nor compacting works. Handoff keeps the current session pure.

10:14 – 12:20

07 · Pattern 1: Handoff during grilling for scope isolation

Live demo: Sandcastle grilling session. Spots an API split. Handoff with stated reason sharpens the grilling (Q2 collapses) and files a GitHub issue via a parallel agent.

12:20 – 15:00

08 · Pattern 2: Grilling to Prototype and Back

Grilling hits an unknown (tldraw SDK). Handoff spawns a 169k-token prototype session. Prototype returns a handoff doc back to the parent. DIY sub-agent loop.

15:00 – 16:30

09 · Cross-agent handoff (Claude Code to Codex, Copilot)

Because handoff is just a markdown file, it works across any agent. Enables adversarial review workflows.

16:30 – 18:30

10 · Skill design decisions read live

Reads SKILL.md line by line: suggest skills, use pointers not duplication, save to OS temp dir, redact sensitive info, tailor to stated purpose.

18:30 – 19:04

11 · Wrap-up and course CTA

Recap: handoff is essential. Cohort outro.
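Chapter 10 lists redaction among the skill's rules. In the skill this is an instruction to the agent, but the same idea can be applied as a belt-and-braces pass over the finished file. The sketch below is an assumption-heavy illustration: the function name `redact` and both patterns are hypothetical, and real secret formats vary far more than two regexes can cover.

```python
import re

# Illustrative patterns only (assumptions, not the skill's rules):
# 1) "api_key=...", "password: ...", "token=...", "secret: ..." style pairs
# 2) anything shaped like an email address
PATTERNS = [
    (
        re.compile(r"(?i)\b(api[_-]?key|password|token|secret)\b(\s*[:=]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
    (
        re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
        "[REDACTED_EMAIL]",
    ),
]


def redact(text: str) -> str:
    """Replace obvious secrets and email addresses before a handoff is saved."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

A pass like this catches only the obvious cases; it does not replace reading the handoff document before passing it to another agent.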

Atomic Insights

Lines worth screenshotting.

There is a smart zone and a dumb zone in a Claude Code context window — early in the session the agent is sharper, and it gets measurably worse as tokens accumulate.
As a context window fills toward 800,000+ tokens, agent performance keeps degrading because the attention relationships between tokens become too diffuse to focus effectively; the speaker already notices the drop at around 120k tokens.
The /handoff skill compresses the current session into a markdown file and saves it to the system temp directory so a fresh parallel agent can continue the work without touching the original session.
/handoff beats /compact for parallel work because compact rewrites the existing session in place, while handoff leaves that session untouched and gives a fresh agent a compressed briefing to start from.
A skills repo with nearly 100,000 stars was built by packaging instincts and coding practices into reusable single-file skills — the simpler the skill, the more it gets used.
The /handoff skill was considered too simple to bother shipping, and turned out to be one its author used a lot.
Keeping the original session clean while dispatching scope-limited work to a fresh agent is the pattern that makes parallel coding sessions practical instead of chaotic.
Context efficiency is not about using fewer tokens — it is about keeping the agent in the smart zone of the context window as long as possible by managing what accumulates there.
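The budgeting idea in these insights can be sketched as a simple check. Everything here is an assumption for illustration: the 120k threshold is the speaker's personal rule of thumb rather than a documented model limit, and the names `context_zone` and `should_hand_off` are hypothetical.

```python
# Speaker's rule of thumb from the video, not a documented model limit.
SMART_ZONE_LIMIT = 120_000


def context_zone(tokens_used: int, smart_limit: int = SMART_ZONE_LIMIT) -> str:
    """Classify a session as 'smart' or 'dumb' by how many tokens it holds."""
    if tokens_used < 0:
        raise ValueError("tokens_used must be non-negative")
    return "smart" if tokens_used < smart_limit else "dumb"


def should_hand_off(
    tokens_used: int,
    estimated_side_task_tokens: int,
    smart_limit: int = SMART_ZONE_LIMIT,
) -> bool:
    """True when taking on the side task here would leave the smart zone."""
    return tokens_used + estimated_side_task_tokens >= smart_limit
```

For example, a session at 90k tokens facing a side task estimated at 40k would be a handoff candidate under this rule, while a 10k session with a 20k side task would not.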

Takeaway

The skill-as-content flywheel.

Steal the workflow

Matt ships a real tool he uses daily, makes a 12-minute video about it, and sells a cohort -- each step feeds the next.

Ship a real tool you use every day, not a tutorial about someone else's tool.
The /handoff pattern maps directly to JoeFlow Sessions -- DIY sub-agent is the same mental model as morning batch launcher.
The Smart Zone / Dumb Zone metaphor is steal-worthy for any content about AI productivity.
Demonstrated patterns beat explained patterns -- every demo here uses real session footage, not slides.
The state-the-reason-for-handoff rule sharpens the current session by declaring scope -- worth building into JoeFlow UX.

Glossary

Terms worth knowing.

Skill: A reusable instruction packaged for an AI coding agent so it can perform a specific task or workflow on demand, invoked by name within a session.
Context window: The total span of tokens an AI model can consider at once, including the conversation history, file contents, and tool outputs accumulated during a session.
Token: The basic unit of text an AI model processes, roughly a word or word fragment, used to measure both input size and pricing.
Harness: The application or interface that wraps an AI model and adds tools, file access, and session management, such as Claude Code, Codex, or Copilot CLI.
Claude Code: Anthropic's command-line coding agent that runs Claude models with tool access for reading files, editing code, and executing shell commands inside a project.
Attention: The mechanism a language model uses to weigh relationships between tokens when generating output; performance degrades as more tokens compete for focus.
Compact: A built-in command that summarizes a long conversation into a condensed recap and starts a fresh session seeded with that summary, freeing up context space.
Auto compact buffer: An automatic safeguard that triggers compaction when a session approaches the end of its context window, preventing the agent from running out of room mid-task.
Handoff: A pattern where one agent session writes a compressed markdown briefing so a separate fresh session can pick up a specific scoped task without inheriting the full prior context.
Session: A single continuous conversation with a coding agent, bounded by its own context window and history, distinct from other parallel or sequential conversations.
Sub agent: A secondary agent spawned by a parent agent to handle a focused task in its own context window, returning a compressed result to the parent when done.
Grilling: A planning technique where an agent interrogates the user with successive questions to surface requirements and edge cases before any code is written.
Grill with docs: A variant of the grilling pattern where the agent reads provided documentation first, then asks targeted questions informed by what it learned.
PRD: Product requirements document — a written specification that describes what a feature should do, who it serves, and what success looks like.
AFK agent: An away-from-keyboard agent, meaning a coding agent left to run autonomously on a long task while the user steps away.
tldraw SDK: A developer toolkit for embedding an infinite-canvas drawing and diagramming surface into web applications.
Codex: OpenAI's coding agent that competes with Claude Code, used from the terminal to read, write, and run code with model assistance.
Copilot CLI: GitHub's command-line coding assistant that brings Copilot's model-driven help into the terminal for shell and code tasks.
PII: Personally identifiable information — data such as names, emails, or addresses that can identify a real person and should be kept out of shared artifacts.
Adversarial review: Having a second agent or model critique the first agent's output to catch flaws, blind spots, or weak reasoning the original missed.

Resources

Things they pointed at.

00:16 · tool · mattpocock/skills (GitHub)

01:15 · product · AI Coding for Real Engineers cohort

10:22 · product · Sandcastle (software factory)

12:30 · tool · tldraw SDK

Quotables

Lines you could clip.

02:15

“There is actually a smart zone and a dumb zone in these context windows.”

Memorable coined term, instantly relatable to anyone who has noticed agent quality degrade → TikTok hook

02:58

“Around by the 120k token mark, I start to feel like I am in the dumb zone.”

Concrete number + personal confession -- clippable calibration moment → IG reel cold open

08:53

“What I really wanted to do was just say, okay, I want to complete this other thing in a separate session and keep my current session pure.”

Clean statement of the core insight -- session purity as a concept → Newsletter pull-quote

15:06

“It is almost like you have done a kind of DIY sub agent where you are able to use a context window for one specific task, compress your learnings from that task, and pass it back to the parent.”

Clear, quotable explanation of an advanced workflow pattern → IG reel cold open

The Script

Word for word.

00:00A few weeks ago, I noticed myself doing something with agents that I thought was very clever, but I thought it was just too simple to require a skill. For those who don't know, I'm constantly thinking about skills.

00:13I'm constantly thinking about how to package my instincts and coding practices into reusable skills. And this has meant my skills repo has almost 100,000 stars at the time of recording.

00:23The skill that I started to think about was a handoff skill. And the theory was that this skill would take the context window of the current session and compress it down into a markdown file that could be handed off to another session. And so a couple of weeks ago, I shipped this.

00:38It's inside skills, inside productivity, and it's inside handoff here. And it's a very, very simple skill.

00:45It says to write a handoff document summarizing the current conversation so a fresh agent can continue the work. Save it to the temporary directory of the user's operating system, not the current workspace. I put this into my skills folder as an experiment to see how much I would use it, and it turns out I used it a lot.

01:02In this video, I'm gonna show you a deep dive of the skill, kind of why I designed it, what is the point of it, how it compares to built in tools in some of these harnesses like compact, and also how you can get the most out of it to make the most of your grilling sessions. And if you dig the kind of stuff I've been showing you, then you will love the course that I've put together which is AI coding for real engineers.

01:24A two week cohort for folks who want to use AI coding tools for shipping quality code, not slop. It starts on June 1. We're doing a discount right now.

01:33Get into the link below so you can check out. Let's start first of all by explaining why I made this skill and how it differs from compaction which you may have heard of before. When we're inside a session like this, uh, coding session, we essentially as we, you know, converse with the agent, as it does tool calls, as it makes file edits, then this context window is gonna be filled up and filled up with more and more stuff in it.

01:57More and more tokens will fill up the context window. Now in the harness I use, Claude Code, the context window is huge. Right?

02:03You get 1,000,000 tokens worth of context window. But there is actually a smart zone and a dumb zone in these context windows. Early on in the context window, you are gonna get much better performance from the agent because the attention relationships are not so strained there.

02:21Because there's much fewer tokens to calculate, fewer attention relationships between those tokens, then the agent's attention isn't so diffuse.

02:30In other words, it's better able to focus when there's less content in there. This means that as your conversation develops, you're going to get dumber and dumber and dumber responses from the agent all the way up to going up to, you know, 800,000 tokens, which personally I've never been in because around by the 120k token mark, I start to feel like I'm in the dumb zone.

02:52So this means, yes, that even though Anthropic advertises a ton of context window on these models, really for, you know, proper smart tasks, you've only got about 120k to work with, which means you need to budget really efficiently and you need to be aware of your context window at all times.

03:08So the question then becomes, what do you do when you're starting to hit up against this dumb zone? How do you recover your conversation? How do you continue the conversation beyond the dumb zone while staying smart?

03:20And the answer to that is compact. What compact does is it will take a large conversation like this and summarize it. So you go essentially from near to the dumb zone to all the way into the smart zone here.

03:34And there's even sometimes an auto compact buffer depending on what harness you're using and whether you've got it turned on, which means that when you're near to the end of the context window, let's say deep in the dumb zone, the auto compact buffer will kick in and automatically summarize your conversation inside a new session.

03:51This summary usually looks like the files reference, so just a list of files that have been referenced. The things that you said in the conversation are usually included, and the general tone of the conversation as well.

04:03This is then included as a little nugget at the start of the new session. And as you build up context in the new session, then you're continually referencing the old session. This means as you continue to compact and compact, you're gonna end up with this kind of sediment of different layers here from previous conversations.

04:20And this can be a little bit inefficient, but it's also a decent way if you want to do certain types of sessions where you just need to barrel on on the same problem again and again and again. It can be really useful for debugging actually because you can compact all of the other options that you've tried and then continue to try different things, hit the barrier, and then compact again to just save your state essentially.

04:44So it's a way of doing a long running session, but it's only really one session. So I continue to find compact a really really useful tool for creating these long single sessions.

04:55But what I started to notice was I wanted to do other things with compact. I wanted to compact into another session. For instance, let's say I was in one session here and while I was in this session, I noticed a little refactoring opportunity.

05:10Something that was totally out of bounds, out of scope from my current session, but I knew I would need to get there eventually. So what were my choices? I could extend my current session, but then I would end up with this sort of like diluted context where I was half working on one thing, half working on the other, and I would definitely hit the dumb zone.

05:27Right? So I probably wouldn't be able to finish my initial goal. I could compact, but then I would clobber all of the progress that I'd made in my current session.

05:37Right? What I really wanted to do was just say, okay, I wanna complete this other thing in a separate session and keep my current session pure. In other words, this was what I wanted.

05:47I wanted to essentially take the context or take just the slice that pertain to this extra bug fix, hand it off to another session, and then these two could just run independently. And so for a while, what I was doing was saying, okay, take the stuff in my current session. I want to fix this particular bug.

06:04Write me a handoff.md document so that I can then just pass that into another agent. And it turned out I was doing this so freaking often that I just decided, okay, I need a skill for this. I most often use hand off while I'm grilling here.

06:18Here, I'm inside a grilling session that I did for planning some future features for Sandcastle, which is my sort of software factory. And what you can see here is that I'm kind of answering some questions.

06:28I'm only in q two of this grilling session, so not a long one. And I say here, I think in future, we may want to move the iterations and the completion signal onto a separate API. In fact, let's hand off that task to a separate agent.

06:41You can see here that when I'm defining hand off, what I'm saying, I'm saying the reason why I'm handing off and exactly what should be in that document. This does two things.

06:51First of all, it actually sharpens the current grilling session I'm on. So it says that given that constraint q two collapses. So it doesn't actually like, it helps my current grilling session because I'm saying that's out of scope.

07:03We'll pick that up somewhere else. It then goes and creates a markdown file just here with the focus for the next session, file a GitHub issue, and eventually design for splitting iterations and the completion signal into a separate API. And then later, I just pass this into another agent in order to create the issue.

07:20Simple. Another pattern that I really strongly recommend is handing off during a grilling session to prototype. When you're grilling, when the agent is asking you questions from a grill me or grill with docs, which are more of my skills, you will often find there are two categories of questions you need to answer.

07:37There are the kind of known unknowns, the ones that the agent can ask you about. And then there's stuff that you really need to see in code or need to see prototyped.

07:46This can be really true with like UI prototypes or complicated bits of logic that you're not quite sure how to deal with yet. So in this grilling session, we're down to question 13 actually, and we've got a sort of final, uh, resolution from the agent.

08:00And then we can see I say hand off to prototype the difficult bits here. The window communication, the tldraw SDK integration, which was something I was building at the time. It creates the hand off and then I go and implement the prototype on that branch.

08:14So in the prototype session, this ended up being a huge session. So a 169 k tokens.

08:19So way bigger than would have fit inside the grilling. And what I did was I created this prototype of the UI and the kind of interaction that I wanted to see. And then I said, okay, let's hand this off back to the grilling session that spawned this.

08:34Take all of the learnings from the prototype, anything that's not directly captured in the prototype itself or that's non obvious, give me a hand off document that I can pass back to the planner. This is actually a really common pattern that I'm using here where you have the initial session where you do some work, you hand off to another session.

08:51That session then creates another hand off document and then passes it back to the original session. It's almost like you've done a kind of DIY sub agent where you're able to use a context window for one specific task, compress your learnings from that task, and pass it back to the parent. Then I was able to finish the grilling session and create some proper PRDs and issues with the prototype in there.

09:14So it's incredibly rich pattern for actually getting what you need out of AFK agents and using prototypes.

09:22It's very very cool. It's worth saying too that the thing that's cool about just using like a markdown document here and not relying on kind of native agent stuff is that you can have this first session be Claude Code, but you can just pass this to another agent. Right?

09:37You can pass it to Codex or pass it to, you know, Copilot CLI, whatever you're using. So if you want to do any kind of adversarial review or any kind of, you know, interaction between different coding agents, this is a very simple way to do it.

09:51We should also just read through the final bits of the skill here just so you understand the reasoning behind everything. The theory here is include a suggested skill section in the documents which suggest skills that the agent should invoke. I added this because sometimes it would I use skills to kinda define the flavor of that session.

10:10And so having a suggested skill section means that you can kind of just paste the hand off document into the new session. It will invoke the skills needed like grill with docs or diagnose or prototype or something. And then you're kind of good to go.

10:24So you don't need to think about the skills that you need to use in the next session. It's pretty handy. Another one is do not duplicate content already captured in other artifacts.

10:32I would often find these hand off documents just got really big, and they were just duplicating stuff that was already present either in other markdown files or in resources like GitHub issues or things like that. So it's basically saying just use pointers instead of, um, you know, repeating everything that's in the documents.

10:48I also really strongly believe that you should save these hand off files to the temporary directory of the user's OS. In other words, these hand off files are disposable. They are not something to be kept around for a long time to rot in your code base as documentation.

11:02Another one is redact any sensitive information, API keys, passwords, or PII. This is, you know, pretty essential. You don't want these floating around in markdown files in just random places.

11:13And finally, if the user passed arguments, in other words, what the next session will be used for, treat those as a description as to what the next session will focus on and tailor the doc accordingly. I think of this as essential for handoff because in order to write a decent document, the agent needs to know what the next agent session is going to focus on.

11:30Every time I use handoff, I always describe the purpose, the reason that we're handing off because I just can't see how you would write a good handoff document otherwise. And of course, dictation makes this really easy because I just blast it out and then we're good to go.

11:44So there we go. That's hand off. This is an essential skill in my toolkit that you know just like a lot of my other skills didn't exist but a few weeks ago.

11:52If you've been enjoying my skills then you should check out the cohort course. It is an absolute banger. We had about 2,500 people take it last time and I'm expecting, you know, a decent whack this time too.

12:02Other than that, thank you so much for watching. My bookshelf behind me is filling up with new coding books that I'm gonna be reading over the next couple of I'm thinking about maybe making a sort of what's on my bookshelf video of recommended books. And I don't know, if you like that, then maybe give us a like and a comment or let me know what you want to see next.

12:20Either way, thanks for watching and I'll see you very soon.

The Hook

The bait, then the rug-pull.

Matt Pocock had been copy-pasting handoff notes between Claude Code sessions for weeks before he admitted it was a pattern worth formalising. The result -- a five-line skill that compresses context into a markdown file and hands it off to a fresh agent -- turned out to be the most-used tool in his workflow.

Frameworks