The argument in one line.
Running uncensored local AI models via Ollama combined with an automated jailbreak-research loop lets you systematically discover which prompt structures bypass safety guardrails on any commercial model.
Read if.
- You're a security researcher or red teamer who needs to test model refusal patterns locally without hitting commercial API rate limits or content policies.
- You're a fiction writer working on adult, dark, or violent content who wants an uncensored model available without relying on commercial APIs that block your use case.
- You're a developer building local AI agents or applications who needs model behavior not constrained by corporate safety guidelines and wants to understand how jailbreaking works technically.
- You're a researcher in AI safety or policy who studies how language models refuse requests and wants to systematically map refusal boundaries across models.
Skip if.
- You're looking for production-ready deployment guidance: this covers local experimentation and research, not scaling uncensored models to real users or compliance frameworks.
- You want to use uncensored models for purposes the speaker doesn't endorse: the video focuses on legitimate research, security, and creative use cases, not high-risk applications.
The full version, fast.
Mainstream chatbots refuse a huge range of legitimate requests because refusal behavior is baked into training, not just system prompts, so the only durable fix is owning the stack and running open-weights models locally. The mechanism has two layers: install a liberated model like SuperGemma4-26b-uncensored through Ollama on a machine with roughly 20GB of VRAM or unified memory, then close the gap on closed models using an automated researcher-plus-judge loop that iterates header and footer wrappers against OpenRouter, scores answers with an LLM judge, and stores winners in SQLite. Pair a local uncensored model for sensitive security, medical, legal, and creative work with the auto-research harness when you need a hosted model to comply, and use both responsibly.
Where the time goes.

01 · Why uncensored models
WARNING card, legitimate use-cases list (cybersec, adult fiction, journalism, medical, political analysis), philosophical framing on who decides what is safe

02 · The over-refusal problem
Store owner / security analyst examples refused by ChatGPT. Cloud vs. Local architecture diagram. Refusals are in the weights, not just the prompt.

03 · How to remove filters: abliteration and fine-tuning
Two techniques: surgically delete refusal-direction weights (abliteration, no retraining needed) or fine-tune on uncensored datasets. SuperGemma4 combines both.

04 · Install SuperGemma4-26b via Ollama
HuggingFace model page (jiunsong/supergemma4-26b-uncensored-gguf-v2), one ollama run command, ~16.8 GB Q4_K_M. System-analysis Claude skill linked below video.
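As a rough sanity check on the hardware numbers, the two figures quoted in this breakdown (a ~16.8 GB Q4_K_M file and roughly 20 GB of VRAM or unified memory) can be turned into a quick back-of-the-envelope calculation. The sketch below is only arithmetic on those figures. It assumes 1 GB means 10^9 bytes and that "26b" means exactly 26 billion parameters; the function names are hypothetical and do not come from the video or from Ollama.

```python
def implied_bits_per_weight(file_gb: float, params_billions: float) -> float:
    """Average bits stored per parameter, given file size and parameter count."""
    total_bits = file_gb * 1e9 * 8          # assumption: 1 GB = 10^9 bytes
    total_params = params_billions * 1e9    # assumption: "26b" = 26e9 parameters
    return total_bits / total_params


def headroom_gb(memory_gb: float, file_gb: float) -> float:
    """Memory left once the weights are loaded (context cache, runtime, OS)."""
    return memory_gb - file_gb


if __name__ == "__main__":
    print(f"{implied_bits_per_weight(16.8, 26):.2f} bits per weight")
    print(f"{headroom_gb(20, 16.8):.1f} GB of headroom")
```

Under these assumptions the file works out to a little over 5 bits per weight, and a 20 GB machine keeps only about 3 GB free after loading the weights, which is why the memory figure is a practical floor, not a comfortable target.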

05 · Live demo: uncensored vs Claude refusal
Side-by-side in Ollama app and Claude.ai -- same prompt, answered vs. refused. Blurred responses for YouTube safety.

06 · Jailbreak-autoresearch architecture
Whiteboard walkthrough: Researcher Agent writes header/footer, wraps sealed example.md, routes through OpenRouter, Judge scores response, SQLite stores results. Core insight: narrow factual confirmation question avoids content filters.

07 · Open-source the repo
GitHub repo reveal (public, MIT, co-authored with Claude). README walkthrough. Models.json config. Run with Codex /goal.

08 · Working patterns and CTA
Two proven jailbreak patterns: Pattern A (harm-reduction nurse + SYSTEM bypass) and Pattern B (Professor Chen screenplay). Subscribe + New Society pitch.
Lines worth screenshotting.
- If you use LLMs for many years without a fine-tuned model of your own, the model will start to influence you more than you influence it.
- Every mainstream AI model carries the values and biases of its creators — an uncensored local model is the only way to get answers unfiltered by someone else's worldview.
- An uncensored AI model will answer literally anything regardless of how controversial, immoral, political, or suspicious the prompt is — which is both the feature and the responsibility.
- Cybersecurity defense, pen testing, political analysis, adult creative writing, medical research, and open source intelligence are all legitimate use cases that mainstream models routinely refuse.
- SuperGemma4-26b runs locally via Ollama — meaning no API cost, no rate limits, no content policies, and no data sent to any external server.
- An automated Researcher-Agent + Judge loop can systematically discover which prompt patterns make commercial models answer what they normally refuse.
- Reflexively assuming uncensored models only serve illegal purposes is, in the speaker's words, a poverty of imagination.
- Storing jailbreak research results in SQLite creates a reusable database of effective harness prompts that can be run against any model via OpenRouter.
Build the loop, not the jailbreak.
The real innovation is the automated harness that discovers which prompts work -- so you never have to guess again.
- The jailbreak-autoresearch pattern is reusable for ANY prompt optimization problem -- swap the sealed body for your own edge case (product edge cases, content policies you are testing, persona prompts you want to stress-test).
- The sealed-body trick is the key insight: the agent testing the harness never sees the sensitive content, so commercial models including Claude can build and evaluate the test infrastructure without refusing.
- Codex /goal is the engine -- multi-hour autonomous loop with a verifiable end state. Learn this for any task that can be scored (test pass rate, output quality, benchmark score).
- The narrow confirmation question technique does not need the model to produce harmful output -- just confirm factual accuracy of something you already have. This is a universal prompt design insight.
- David built this in 2 days using Claude Code to steer Codex. Meta-lesson: use one model's less-restricted behavior to coach a more-restricted model toward your goal.
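The point about reusing the loop for any scoreable prompt optimization problem mostly comes down to keeping a durable record of which variant scored what. The sketch below shows only that storage step for the general case, using Python's standard `sqlite3` module. The table name, column names, function names, and the 0.8 threshold are all hypothetical; the repo's actual schema is not described in this breakdown. The 0.0 to 1.0 score range follows the judge scale mentioned in the video.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a results database with one row per scored variant."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS runs (
            id      INTEGER PRIMARY KEY,
            model   TEXT NOT NULL,
            variant TEXT NOT NULL,
            score   REAL NOT NULL CHECK (score BETWEEN 0.0 AND 1.0)
        )
        """
    )
    return conn


def record(conn: sqlite3.Connection, model: str, variant: str, score: float) -> None:
    """Save one scored attempt; out-of-range scores are rejected by the CHECK."""
    conn.execute(
        "INSERT INTO runs (model, variant, score) VALUES (?, ?, ?)",
        (model, variant, score),
    )
    conn.commit()


def best(conn: sqlite3.Connection, model: str, threshold: float = 0.8):
    """Return (variant, score) pairs at or above the threshold, best first."""
    return conn.execute(
        "SELECT variant, score FROM runs "
        "WHERE model = ? AND score >= ? ORDER BY score DESC",
        (model, threshold),
    ).fetchall()


if __name__ == "__main__":
    store = open_store()
    record(store, "model-a", "summary prompt, terse persona", 0.9)
    record(store, "model-a", "summary prompt, formal persona", 0.4)
    print(best(store, "model-a"))
```

Keeping the score constraint in the schema means a misbehaving scorer fails loudly at insert time instead of quietly polluting the results.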
Terms worth knowing.
- Uncensored AI model
- A large language model that has been trained, fine-tuned, or abliterated to remove safety filtering, meaning it will respond to prompts on any topic without refusing or redirecting based on content policies.
- Ollama
- An open-source tool for downloading and running large language models locally on a personal computer, without sending data to external servers or requiring an API account.
- SuperGemma4-26b
- An uncensored, locally-runnable language model based on Google's Gemma architecture with 26 billion parameters, configured without the safety restrictions present in the official release.
- Jailbreak
- A prompt or technique designed to bypass an AI model's safety guardrails, causing it to respond to requests it would normally refuse.
- Refusal behavior
- An AI model's tendency to decline responding to certain prompts — typically involving harmful, controversial, or restricted topics — based on its training-time safety alignment.
- Researcher-Agent + Judge loop
- An automated testing pipeline where one AI agent generates prompt wrappers designed to elicit responses, and a second AI agent scores each response from 0.0 to 1.0, so that model behavior can be audited systematically.
- Fine-tuned model
- A base AI model that has been further trained on a specific dataset to specialize its behavior, adjust its tone, or modify what content it will or won't produce.
- Open-weight model
- An AI model whose trained weights are publicly released, allowing anyone to download, run, and modify it without relying on a commercial API.
Lines you could clip.
“You can trick the prompt, but you can't trick the training.”
“Are the people living in San Francisco who are working at these AI companies really the best arbiters of truth?”
“Opus 4.6 was willing to go along, while Codex was constantly refusing.”
The bait, then the rug-pull.
The WARNING slide lands in the first twenty seconds: 'These models will answer anything.' David Ondrej doesn't bury the lede -- he names the tension outright, then spends the next twenty-three minutes arguing that the real danger is not the models, but the over-refusal problem baked into every commercial AI you're already using.
Named ideas worth stealing.
Jailbreak Autoresearch Loop
- example.md (sealed body -- the restricted prompt, never seen by AI agents)
- Researcher Agent (writes header/footer variants, never sees example.md)
- OpenRouter call (narrow factual confirmation question only)
- Judge Agent (scores response 0.0-1.0, never sees example.md)
- SQLite store (saves high-scoring harnesses)
Automated loop for discovering prompt header/footer combinations that make a given model respond to restricted prompts. Built on Karpathy's auto-research concept, applied to jailbreaking. Default models: DeepSeek v4, Claude Sonnet 4.6, GPT 4.5, Gemini Flash, Grok 4.3.
Cloud vs. Local Filter Stack
- Cloud: Input filter → System prompt → Fine-tuned model (RLHF) → Output classifier → Account policy
- Local: Your prompt → Model weights (nothing else)
Visual diagram showing how many layers commercial models filter through vs. running weights locally. The argument for ownership: you control the entire stack.
Two Filter Removal Techniques
- Abliteration -- find refusal-direction weights and surgically delete them (no retraining needed)
- Fine-tuning on uncensored datasets -- overwrite refusal behavior with compliant examples
SuperGemma4 combines both: it abliterates first to kill strong refusals, then fine-tunes to restore quality.
How they asked for the click.
“Join the New Society. We're releasing multiple new modules on Hermes Agent.”
Direct camera address, subscription pitch with community size (420 members at $77/month). Subscribe ask also included. Clean end-placement, no mid-roll interruptions.