§ 01 · The Hook

The bait, then the rug-pull.

Brad opens with a claim that doubles as a threat to every expensive AI video tool on the market: for free, with no proprietary video model, Claude can now watch anything. Before you've hit play, Claude's already an expert on what's in it.

§ · Stated Promise

What the video promised.

stated at 00:39

“I'll walk you through exactly how it all works, the use case that completely changed how I consume content, and how to set this up in your own Claude Code in under five minutes.”

delivered at 07:42

§ · Chapters

Where the time goes.

00:00 – 00:52

01 · Cold open

Problem stated: other transcript tools only read words and miss half the video. Promise: how it works, the life-changing use case, and a 5-minute setup.

00:52 – 02:43

02 · Watch videos in minutes — live demo

Side-by-side screen recording: 45-minute Sam Altman YC lecture ingested in under 2 minutes. Claude returns structured speaker summary, queryable in terminal.

02:43 – 03:43

03 · Setup

GitHub link (free), install commands, automatic dependency install, API auth on a free-tier transcription service.

03:43 – 05:18

04 · Under the hood

Core insight: a video is just two things — frames and a transcript. yt-dlp + FFmpeg do the heavy lifting locally. No MCP, no third-party wrapper, no cloud service.

05:18 – 06:30

05 · The cost math

Frame scaling table: 1 min = 60 frames / $0.70; 1 hr = 100 frames / $1.62 (capped). YouTube captions are free; Groq Whisper free tier covers everything else.

06:30 – 07:03

06 · Analyze video hooks

Use case #1: content research — paste a winning video URL, ask Claude to break down the hook. Replaces 10 min/video of manual scrubbing.

07:03 – 07:42

07 · Debug screen recordings

Use case #2: developer QA — drop in a 30-second screen recording of a UI bug; Claude pinpoints the exact frame the state change happens.

07:42 – 08:36

08 · Content intelligence / second brain

Use case #3: Obsidian second brain — Claude auto-watches competitor videos and feeds structured notes in. Compounds over time.

§ · Storyboard

Visual structure at a glance.

hook · open · 00:00

promise · promise · 00:39

value · live demo · 00:52

value · framework · 03:43

value · cost math · 05:18

value · second brain · 07:42

cta · CTA · 08:25

§ · Frameworks

Named ideas worth stealing.

03:43 · concept

A video is just two things

Frames
Transcript

Instead of paying for an expensive multimodal video model, decompose any video into the two things Claude already reads natively — screenshots and timestamped text. Feed both together.

Steal for: Any explainer on giving LLMs multimodal context; the "decompose the expensive thing" framing is a teachable pattern
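As a rough illustration of "feed both together", the sketch below lines up each extracted frame with the transcript segment being spoken at that moment. It is not taken from the video: the `Segment` type, the function names, and the sample data are all hypothetical, and it assumes frames and transcript segments both carry timestamps in seconds.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    text: str


def pair_frames_with_transcript(frame_times, segments):
    """Attach to each frame time the transcript segment active at that time."""
    segments = sorted(segments, key=lambda s: s.start)
    starts = [s.start for s in segments]
    paired = []
    for t in frame_times:
        # Index of the last segment that starts at or before time t.
        i = bisect_right(starts, t) - 1
        text = segments[i].text if i >= 0 else ""
        paired.append((t, text))
    return paired


def format_timestamp(seconds):
    """Render seconds as MM:SS, the timestamp style used in these notes."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Hypothetical sample data, for illustration only.
segments = [
    Segment(0.0, "Welcome to the lecture."),
    Segment(12.5, "Let's start with the first idea."),
    Segment(40.0, "Here is the diagram."),
]
frame_times = [0.0, 15.0, 30.0, 45.0]

for t, text in pair_frames_with_transcript(frame_times, segments):
    print(f"[{format_timestamp(t)}] {text}")
```

Each printed line could sit next to its screenshot in the prompt, so the model sees what was on screen and what was being said at the same timestamp.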

04:40 · concept

Battle-tested tools, not new wrappers

yt-dlp (universal video downloader)
FFmpeg (frame + audio extraction)

Brad explicitly contrasts his use of decade-old, rock-solid CLI tools against MCPs and third-party wrappers. Trust signal: millions of developers, no vendor risk.

Steal for: Positioning own-your-stack tools against SaaS wrappers; direct language for MCN+ messaging
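To make the two-tool pipeline concrete, here is a minimal sketch of the command lines such a setup might assemble. It only builds and prints the commands; it does not run them. The video does not show its exact invocations, so the flag choices, the helper names, the `work/` paths, and the example URL are assumptions for illustration.

```python
import shlex


def download_cmd(url, out_dir):
    """yt-dlp: download the video and request captions where they exist."""
    return [
        "yt-dlp",
        "--write-subs",        # uploaded captions, if any
        "--write-auto-subs",   # auto-generated captions, if any
        "--sub-langs", "en",
        "-o", f"{out_dir}/video.%(ext)s",  # output filename template
        url,
    ]


def frames_cmd(video_path, out_dir, seconds_per_frame):
    """FFmpeg: keep one frame every `seconds_per_frame` seconds (an integer)."""
    return [
        "ffmpeg", "-i", video_path,
        "-vf", f"fps=1/{seconds_per_frame}",
        f"{out_dir}/frame_%04d.jpg",
    ]


def audio_cmd(video_path, out_path):
    """FFmpeg: drop the video stream and write mono 16 kHz audio for transcription."""
    return ["ffmpeg", "-i", video_path, "-vn", "-ac", "1", "-ar", "16000", out_path]


# Hypothetical URL and paths; the real file extension depends on the download.
commands = [
    download_cmd("https://example.com/watch?v=VIDEO_ID", "work"),
    frames_cmd("work/video.mp4", "work/frames", 18),
    audio_cmd("work/video.mp4", "work/audio.wav"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute it, provided yt-dlp and FFmpeg are installed and the output directories exist. The audio step would only matter when no captions are available and the audio has to go to a transcription service.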

05:18 · model

Frame cap cost scaling

1 min -> 60 frames / $0.70
10 min -> 80 frames / $0.82
30 min -> 100 frames / $0.95
1 hr -> 100 frames / $1.62

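Capping frames at 100 from 30 minutes onward means the frame count stops growing with video length: per the table, a 1-hour video costs $1.62 against $0.95 for 30 minutes. That sub-linear growth is a key objection killer for "this will torch my token budget."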

Steal for: Cost math slides for any AI tool pitch; the framing of "surprisingly cheap" deserves its own section in any tutorial
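The scaling table can be turned into a small lookup to see what the cap does to frame spacing and cost per minute. The four data points come from the table; everything else is an assumption for illustration. In particular, the video gives only those four points, so the linear interpolation between them and the clamping below 1 minute are guesses, and the function names are hypothetical.

```python
# (duration in minutes, frames, cost in USD), as listed in the table.
TABLE = [
    (1, 60, 0.70),
    (10, 80, 0.82),
    (30, 100, 0.95),
    (60, 100, 1.62),
]


def frame_budget(minutes):
    """Frame count for a video length.

    Assumption: linear interpolation between the listed points, clamped
    to the first point below 1 minute and to the cap beyond 1 hour.
    """
    points = [(m, f) for m, f, _ in TABLE]
    if minutes <= points[0][0]:
        return points[0][1]
    for (m0, f0), (m1, f1) in zip(points, points[1:]):
        if minutes <= m1:
            frac = (minutes - m0) / (m1 - m0)
            return round(f0 + frac * (f1 - f0))
    return points[-1][1]  # capped


def seconds_between_frames(minutes, frames):
    """Average spacing between sampled frames, in seconds."""
    return minutes * 60 / frames


for minutes, frames, cost in TABLE:
    spacing = seconds_between_frames(minutes, frames)
    per_minute = cost / minutes
    print(f"{minutes:>3} min: {frames} frames, one every {spacing:g} s, ${per_minute:.3f} per minute")
```

On the listed points, the spacing between frames widens from 1 second for a 1-minute video to 36 seconds for a 1-hour video, while the cost per minute of video falls from $0.70 to roughly $0.03.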

§ · Quotables