Become AI Native in less than 60 mins
A 57-minute masterclass on the three-layer system that separates companies that merely use AI from organizations that get smarter every day.
June 8thA 25-minute field guide to local AI models, written the weekend a government letter erased the world's most powerful model overnight.
The overnight disappearance of a frontier AI model proves that renting intelligence is a fragile strategy — owning a local layer of your stack that no government letter, policy change, or pricing shock can revoke is the only durable hedge.
When a US government letter took Claude Fable 5 offline overnight, it exposed a structural fragility: cloud AI is rented access, not owned intelligence. Local models — running entirely on your own hardware, with no API key, no per-token cost, and no kill switch — are the generator in the garage for when the grid goes down. The speaker walks through the exact learning order: pick a runtime (LM Studio or Ollama), match model size to your RAM (12B on 16GB is the sweet spot), understand quantization (Q4 halves memory with barely any quality loss), then point an agent like Hermes at the model so it runs free and offline. Five startup ideas close the video — all targeting the market segment that cloud AI simply cannot serve: regulated industries, sensitive operations, and anywhere with no internet.
Sign in and you get 23 free chat messages on us — ask for the hook, quote a framework, find the exact transcript moment, generate a markdown action plan. Bring your own key when you want unlimited.
Create a free account →
Personal hook: a planned weekend of building with Fable 5 undone by a US government letter at 5:21 PM Friday. Stakes established in under 30 seconds.

Context: cloud frontier models are the smartest tools available, but they share one weakness — you do not own them. One letter, gone overnight.

The electricity/generator analogy. Cloud is the grid, cheaper and easier. Local is the generator in the garage. The ban is the hurricane.

Dead-simple definition: download once, runs on your machine like a video game. Three benefits: privacy (data never leaves), zero marginal cost (unlimited queries after hardware), always-on (works on planes, in bunkers, through bans).

Five-layer pyramid to learn bottom-up: 1) Runtime (Ollama/LM Studio), 2) Hardware Match, 3) Model Choice, 4) Quantization (Q4/Q5), 5) Connect to Agent (Hermes).

The single most useful mapping: 4B runs on anything; 12B is the sweet spot for 16GB RAM; 27-35B needs 32GB+ or a GPU; 70B+ needs DGX Spark or maxed Mac Studio.

Four models to know: Qwen 3 (best all-around, start here); DeepSeek (reasoning + coding, 10-30s think time); Gemma (small, beautiful writing, phone-sized); Llama (biggest community, runs anywhere).

Q4/Q5 labels on model downloads are compression levels. Raw model equals uncompressed photo; Q4 equals high-quality JPEG. Halves memory needed with barely any quality loss.

The real unlock: point Hermes at your local model. Text tasks from your phone; the box on your desk runs them free and offline. Context window is now your real constraint — keep sessions tight.

Run local and cloud side-by-side for a week. You will be surprised how often the free local model is good enough. Knowing what to run where is the skill that separates pros from tourists.

Ideas that only exist because local AI is real: 1) On-device AI for regulated industries; 2) Local clones of popular cloud tools with a data-never-leaves pitch; 3) Air-gapped agents for defense/sensitive ops; 4) Offline AI for ships/planes/rural clinics; 5) Resilience-as-a-service fallback when cloud goes dark.

The lesson is not cloud bad / local good — it is do not build your entire life on something that can disappear with a single letter. Own a part of your stack. Build something nobody can turn off.
Cloud AI is rented access — a government letter, a policy shift, or a pricing change can zero it out overnight, and the only durable hedge is a local layer that runs on hardware you control.
“You don't own them. You rent access. And rented access could be revoked at any time by a government, by a policy change, by a pricing change.”
“You need a layer that nobody can take away from you.”
“After you've got the hardware, every query is free. Unlimited. You can run a model twenty-four hours a day for a month and your bill is just going to be the electricity.”
“Build something today that nobody could turn off.”
“Quantization is how a model that supposedly needs a server ends up running smoothly on your laptop.”
See every word as it's spoken — crank it to 2× and still catch all of it. The same dual-channel trick behind Amazon's Kindle + Audible.
One government letter. Sent on a Friday afternoon. By Friday night, the most powerful AI model on the planet had been switched off for everyone — no warning, no appeal. The speaker had a weekend of building mapped out. Instead, he made this video.
A bottom-up learning pyramid for local AI. Start with the runtime before hunting for the perfect model — everyone gets this backwards.
The single most practical framework in the video — tells you exactly which model to download given the machine you already own.
In regulated industries (healthcare, legal, finance), data must legally stay on-premises. On-device AI wins by default because the privacy constraint IS the moat — cloud competitors physically cannot enter.
A fallback AI layer that activates when a cloud provider gets banned, goes down, or prices out. Positions as insurance — selling the answer to what happens to your workflows if your provider disappears.
“Do yourself a favor — like, a comment, and subscribe. That just means more of this stuff is going to appear in your feed.”
Soft, self-deprecating framing avoids the hard pitch while still landing the ask.
00:00
00:27
00:47
01:05
01:31
01:46
02:06
02:22
02:31
02:58
03:14
03:32
03:57
04:10
04:30
04:50
05:10
05:30
05:49
06:04
06:29
06:49
07:09
07:27
07:41
08:02
08:19
08:36
08:55
09:15
09:35
09:55
10:15
10:35
10:54
11:12
11:30
11:48
12:06
12:24
12:42
13:00
13:17
13:35
13:52
14:09
14:27
14:45
15:04
15:23
15:42
16:01
16:19
16:38
16:57
17:16
17:35
17:54
18:14
18:26
18:53
19:13
19:23
19:51
20:11
20:30
20:49
21:09
21:28
21:47
22:07
22:17
22:43
23:01
23:18
23:36
23:54
24:11
24:29
24:47A 57-minute masterclass on the three-layer system that separates companies that merely use AI from organizations that get smarter every day.
June 8thGreg Isenberg and Jonathan Courtney pressure-test nine startup categories live and land on one portable rule: date the product, marry the niche.
May 18thA Digg founder walks through the full pipeline of a personal Techmeme-clone he built alone — from RSS to vector clusters to an editorial gravity engine.
February 2ndAlex Finn walks through every surface of the new Hermes Desktop app and shares the session management insight that turns a $1,000/month bill into almost nothing.
June 6thEight copy-paste prompts and three startup ideas for the most powerful AI model yet — no benchmarks, just tactics.
June 11thA 22-minute honest debrief on agentic loops — what they are, why well-funded builders swear by them, and the one case where they actually work.
June 9th