Big Idea

The argument in one line.

Hermes is not a smarter chatbot but a personal AI operating system that compounds its usefulness the longer you use it — and understanding its 21 building blocks is the difference between occasional help and continuous leverage.

Who This Is For

Read if. Skip if.

READ IF…

You have heard of Hermes or open-source AI agents but could not figure out where to start or what the terminology means.
You are paying for multiple AI subscriptions and want to know whether a self-hosted agent could route the same work more cheaply.
You want to move from typing questions into a chat window to having an AI that takes actions, remembers context across sessions, and runs scheduled tasks while you sleep.
You already use Claude Code or ChatGPT and want to understand how a persistent mobile-first agent complements those tools rather than replacing them.

SKIP IF…

You want an enterprise or team deployment guide — this is entirely solo-builder focused.
You are looking for a code-first deep dive into Hermes internals; the video stays deliberately conceptual and demo-based throughout.

TL;DR

The full version, fast.

Hermes is an AI agent that uses tools to take real actions rather than only generate text. It runs on your own machine and learns your context over time through three layered memory systems. The video covers 21 concepts in ascending complexity: choosing which model to wire in, how MCP servers standardize tool connections, how sub-agents parallelize work, and how cron heartbeats keep tasks running while you sleep. The critical operational insight is that 73 percent of every API request is fixed system-prompt overhead, making session hygiene the single biggest cost lever.

Chapters

Where the time goes.

00:00 – 01:21

01 · Agent Not A Chatbot

The foundational distinction: a chatbot creates a plan, an agent executes it. Live demo shows Hermes booking a flight and rendering results as HTML.

01:21 – 04:14

02 · Hermes vs Other Tools

Four-analogy framework: Hermes is your dog (companion), Claude Code is your contractor (project-scoped), OpenClaw is your roommate, Antigravity is your IDE buddy. Use Hermes on mobile and Claude Code at your desk.

04:14 – 05:48

03 · One Brain, 22 Mouths

Same Hermes agent accessible from Telegram, Discord, WhatsApp, browser OS, or any of 22+ platforms. One intelligence routed through many interfaces.

05:48 – 09:12

04 · Where Hermes Lives

Three deployment options: laptop (free), VPS (~$5/mo), serverless (cents/month). Start local, never pay until you must.

09:12 – 11:33

05 · OAuth vs API Key

OAuth = one-click sign-in, revocable. API key = secret string: keep it out of chat logs and rotate it if exposed. Claude can be connected only with an API key.

11:33 – 14:20

06 · Pick The Right Model

Model-agnostic multi-brain strategy: Claude/Opus for reasoning, GPT for generalist volume, Grok for search and Twitter, DeepSeek for free high-volume tasks. OpenRouter as the single hub.

14:20 – 16:02

07 · Run It Locally Private

Local model via Ollama = 100% offline, private, no costs. Limited by laptop RAM. Paste your Mac specs into Hermes to ask which quantized model you can run.

16:02 – 18:17

08 · Memory System Explained

Three memory layers: memory.md (persistent facts), SQLite full-text search of every session, Obsidian integration. Two weeks of use before it feels magical.

18:17 – 19:34

09 · Build Your soul.md

The character bible that gives the agent a consistent persona, values, tone, and communication style. What makes your Hermes different from everyone else's.

19:34 – 22:33

10 · Connect Your Entire World

~70 built-in integrations. Demo: Granola meeting-notes MCP connected via OAuth and API key to answer questions about past meetings from anywhere.

22:33 – 23:13

11 · Control Your Computer

Hermes can operate a real browser via Chrome DevTools Protocol — real cursor movements, not screenshots. Native computer use.

23:13 – 25:36

12 · MCPs Simply Explained

API = raw wiring, MCP = instruction manual around the wiring. A universal remote that tells the AI every button each app has — token-efficient and reliable.

25:36 – 27:28

13 · Build Hermes Muscle Memory

Skills and Pantheon personas: assign specialist models to specialist tasks. Sub-agents inherit fresh context per task and report back.

27:28 – 31:49

14 · 6 Power Commands

Six slash commands: /q (queue next prompt), /background (run in parallel), /kanban (task board), /reset (clear session), /compress (summarize context), /model (swap brain).

31:49 – 32:49

15 · Safety And Least Access

Principle of least privilege: only grant minimum access the task needs. Never paste API keys into chat. No send-email permission until error rate is trusted.

32:49 – 33:53

16 · Goals vs Super Goals

Goal function: a 20-turn North Star; Hermes keeps working until it decides it is done. Super goals: structured multi-step project plans with human and agent tasks, a progress bar, and a dashboard.

33:53 – 35:35

17 · Spin Up Sub Agents

Parallel sub-agents each get fresh context, work simultaneously, report back. Tuesday-morning work done by Tuesday lunch. Hermes co-founder runs 12 parallel agents daily.

35:35 – 38:03

18 · Heartbeat And Cron Jobs

Heartbeat = zombie watchdog that restarts crashed agents. Cron = scheduled tasks written in natural language (no cron syntax required). Daily morning briefing demo.

38:03 – 38:19

19 · Token Cost Secrets

73% of every request is fixed overhead. One user spent 4M tokens asking about the weather by mistake. Use cheap model for volume, expensive for hard calls.

38:19 – 40:31

20 · Build An Operating System

Hermes as the hub: Pantheon personas, Obsidian memory vault, terminal/gateway, tool workshop. GitHub backup daily. The everything system for your AI life.

40:31 – 41:21

21 · Connect Hermes To Claude

Hermes (business brain, long-term memory) + Claude Code (precision builder) share context so neither operates in isolation.

Atomic Insights

Lines worth screenshotting.

A chatbot creates a plan; an agent executes it — that one distinction changes everything about how you prompt and what you expect.
Hermes is designed to live with you: the more sessions you run through it, the more accurately it models your preferences, habits, and goals.
Start local on your laptop for free — never pay for a VPS until you have outgrown your laptop, because most people never do.
73 percent of every AI API request is fixed system-prompt overhead, which means session length is a direct cost multiplier, not just a UX issue.
OAuth gives you a revocable button; an API key gives you a secret string — never paste either into a chat that logs its history.
Claude requires an API key and charges per token; Grok and ChatGPT can authenticate via OAuth using existing paid subscriptions.
The right model for volume is different from the right model for reasoning — treating them as interchangeable is the fastest way to overspend.
MCP servers are instruction manuals wrapped around API wiring — they let the agent know every button an app has without burning tokens discovering it.
One session, one goal, then compress or reset: the longer a context window grows, the more every new query costs and the lower the quality of the responses.
Parallel sub-agents each get fresh context and work simultaneously, turning hours of sequential research into a single synchronized handoff.
A heartbeat watchdog pings every few seconds and automatically restarts crashed sub-agents so no task is silently dropped.
Cron jobs in Hermes require no cron syntax — you write a natural-language schedule and the agent handles the timing itself.
The soul.md character bible is what makes your Hermes agent different from everyone else's: without it the agent has no voice, no values, no consistent way of addressing you.
Connecting Hermes to Claude Code means neither tool operates in isolation — shared context compounds both.
Principle of least access applied to AI agents: only grant the minimum permissions a specific task requires, then revoke when done.

Takeaway

The 21 concepts that turn Hermes from confusing to essential.

WHAT TO LEARN

Hermes is not a better chatbot — it is a personal operating system that gets smarter the longer you use it, and these 21 concepts are the mental models that unlock the difference between occasional help and daily leverage.

01 · Agent Not A Chatbot

An agent differs from a chatbot in one fundamental way: it has tools and takes real actions rather than just generating plans, which means the gap between telling you and doing it for you is the whole product.

02 · Hermes vs Other Tools

Hermes is optimized to compound over time with you — mobile, persistent, personal — while Claude Code is optimized for precision on a specific project; using both is not redundant, it is additive.

03 · One Brain, 22 Mouths

The same agent intelligence is accessible from 22 or more platforms simultaneously — your entry point does not determine what the agent knows or can do.

04 · Where Hermes Lives

Start running Hermes on your own laptop for free; only move to a VPS when uptime genuinely matters to your workflow, because most use cases never require it.

05 · OAuth vs API Key

OAuth and API keys are the two ways you will connect most services; treat API keys as secrets that live in environment variables, never in chat logs that get indexed and stored.
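
A minimal sketch of the environment-variable habit described above. The variable name OPENROUTER_API_KEY and both helper functions are illustrative assumptions, not Hermes settings or APIs.

```python
import os


def load_api_key(var_name: str = "OPENROUTER_API_KEY") -> str:
    """Read a provider key from the environment and fail loudly if it is absent.

    The default variable name is an assumption chosen for this example.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} in your shell before starting the agent")
    return key


def redact(key: str, visible: int = 4) -> str:
    """Return a log-safe form of the key that shows only its last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging redact(key) instead of the key itself keeps the secret out of any transcript or log that later gets indexed.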

06 · Pick The Right Model

Model selection is not brand loyalty — it is task matching: use a reasoning model for hard problems, a cheap volume model for loops, and a free model where accuracy is not critical.
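
Task matching can be pictured as a small lookup table. The sketch below is an assumption about how such routing might look: the task categories and model labels are placeholders, not real model identifiers or Hermes configuration.

```python
# Hypothetical task categories mapped to placeholder model labels.
ROUTES = {
    "reasoning": "reasoning-model",      # hard problems
    "volume": "cheap-volume-model",      # loops and bulk work
    "low_stakes": "free-model",          # accuracy is not critical
}


def pick_model(task_kind: str, default: str = "cheap-volume-model") -> str:
    """Return the model label for a task kind, falling back to the cheap default."""
    return ROUTES.get(task_kind, default)
```

Defaulting unknown task kinds to the cheap model means the expensive model is only used when a task is explicitly marked as needing it.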

07 · Run It Locally Private

Running a local model via Ollama means zero API cost and full privacy; the tradeoff is raw capability, which shrinks with every billion parameters you cannot fit in your laptop RAM.
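
The RAM limit can be estimated with back-of-the-envelope arithmetic: the weights take roughly parameters times bits per weight divided by eight bytes. The sketch below assumes that rule of thumb; the 70 percent headroom factor is an illustrative guess that leaves room for the operating system and the model's working memory, not a measured figure.

```python
def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_ram(params_billions: float, bits_per_weight: int,
                ram_gb: float, headroom: float = 0.7) -> bool:
    """Rough check: do the weights fit in the assumed usable share of RAM?"""
    return weights_gb(params_billions, bits_per_weight) <= ram_gb * headroom
```

Under these assumptions a 7-billion-parameter model quantized to 4 bits needs about 3.5 GB for weights and fits on a 16 GB laptop, while a 70-billion-parameter model at the same quantization needs about 35 GB and does not.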

08 · Memory System Explained

Hermes memory compounds across three layers — a plain-text facts file, full-text search of every session, and optional Obsidian integration — and takes roughly two weeks of regular use to feel genuinely magical.
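
To make the second layer concrete, here is a sketch of full-text search over session text using SQLite's FTS5 extension from Python's standard library. The table name, column names, and sample rows are assumptions for illustration; the source does not describe the schema Hermes actually uses, and the code needs a SQLite build that includes FTS5.

```python
import sqlite3


def build_index(messages):
    """Create an in-memory FTS5 index from (session_id, body) pairs."""
    conn = sqlite3.connect(":memory:")
    # session_id is stored but not tokenized; only body is searchable.
    conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id UNINDEXED, body)")
    conn.executemany("INSERT INTO sessions VALUES (?, ?)", messages)
    return conn


def search(conn, query, limit=5):
    """Return (session_id, body) rows matching the query, best match first."""
    rows = conn.execute(
        "SELECT session_id, body FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

A call such as search(conn, "flight") returns every stored session whose text mentions that word, which is the kind of recall the memory layer relies on.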

09 · Build Your soul.md

The soul.md character bible is what separates your agent from a generic chatbot; without explicit values, communication style, and goals written into it, the agent has no consistent voice.

10 · Connect Your Entire World

Connecting external tools via MCP or API gives the agent access to your real world — meetings, emails, files — which is what makes responses accurate to your actual situation, not generic.

11 · Control Your Computer

Computer use via Chrome DevTools Protocol means the agent moves the real cursor and clicks real buttons — not screenshots — which makes it usable for tasks you cannot stay at your desk to complete.

12 · MCPs Simply Explained

MCP servers are more token-efficient than raw API calls because the agent already knows every button the app has — it does not need to discover capabilities mid-task.
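
The "every button" idea corresponds to the tool listing an MCP server publishes: each tool has a name, a description, and a JSON Schema for its arguments. The sketch below uses that shape with a hypothetical search_meetings tool; the tool itself and the helper function are invented for illustration.

```python
# A tool entry in the shape MCP servers use when listing their tools:
# a name, a human-readable description, and a JSON Schema for the arguments.
SEARCH_MEETINGS_TOOL = {
    "name": "search_meetings",  # hypothetical tool name
    "description": "Search past meeting notes by keyword.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer"},
        },
        "required": ["query"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list:
    """List the required argument names that a planned call has left out."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]
```

Because the schema is known up front, a malformed call can be caught before any request is sent, rather than discovered through trial and error mid-task.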

13 · Build Hermes Muscle Memory

Consistent feedback and purpose-built skills let Hermes delegate to the right model for each sub-task — the Pantheon persona system is how you build a specialist team without hiring anyone.

14 · 6 Power Commands

One session, one goal, then compress or reset; context windows that grow unchecked cost more with every turn and produce progressively worse results.
The six slash commands are what separate users who run Hermes manually from users who run it efficiently.

15 · Safety And Least Access

The principle of least access applied to agents means granting only what the current task needs and revoking it after; even the host does not give his agent permission to send emails yet.
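
One way to picture least access is a deny-by-default allowlist that is granted per task and revoked afterwards. The class and the action names below (calendar.read, email.send) are hypothetical and do not reflect how Hermes stores permissions.

```python
class Grant:
    """Deny-by-default permission set for one agent task."""

    def __init__(self):
        self.allowed = set()

    def allow(self, action: str) -> None:
        """Grant a single named action."""
        self.allowed.add(action)

    def revoke(self, action: str) -> None:
        """Remove an action; revoking something never granted is a no-op."""
        self.allowed.discard(action)

    def is_allowed(self, action: str) -> bool:
        """Anything not explicitly granted is refused."""
        return action in self.allowed

    def require(self, action: str) -> None:
        """Raise before the agent attempts an action it was never granted."""
        if not self.is_allowed(action):
            raise PermissionError(f"agent is not permitted to perform {action!r}")
```

With this shape, sending email stays impossible until someone deliberately calls allow("email.send"), which mirrors the host's choice to withhold that permission.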

16 · Goals vs Super Goals

The goal function turns a freeform chat session into a directed project with a defined end state — super goals extend this to multi-week plans with mixed human and agent tasks tracked on a dashboard.

17 · Spin Up Sub Agents

Parallel sub-agents, each with fresh context, can cut multi-hour sequential research into a single synchronized handoff; treating the agent as a manager, not a single worker, is the power-user shift.
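
The manager pattern can be sketched with asyncio: each sub-agent starts from a context containing only its own task, all of them run concurrently, and the parent collects the results in one handoff. The function names are invented for this example and the sleep call stands in for real model and tool calls.

```python
import asyncio


async def run_sub_agent(task: str) -> dict:
    """Simulate one sub-agent working from a fresh, task-only context."""
    context = [task]            # nothing inherited from the parent session
    await asyncio.sleep(0.01)   # stand-in for model and tool calls
    return {"task": task, "context_size": len(context), "result": f"summary of {task}"}


async def fan_out(tasks):
    """Run every sub-agent concurrently and gather results in task order."""
    return await asyncio.gather(*(run_sub_agent(task) for task in tasks))


def research(tasks):
    """Synchronous entry point for the parent agent."""
    return asyncio.run(fan_out(tasks))
```

Because the sub-agents wait concurrently, total wall-clock time is close to that of the slowest task rather than the sum of all of them.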

18 · Heartbeat And Cron Jobs

Cron jobs and heartbeats together give your agent a 24/7 presence: the heartbeat silently restarts crashed agents, and cron runs tasks you schedule in natural language, with no cron syntax.
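
A heartbeat watchdog reduces to two operations: record when each agent last checked in, and restart any agent that has been silent for too long. The sketch below is a generic version of that pattern, not Hermes code; the class name, the 10-second timeout, and the restart callback are assumptions.

```python
import time


class Heartbeat:
    """Track agent check-ins and restart any agent that goes silent."""

    def __init__(self, timeout_s: float = 10.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock       # injectable so the logic can be tested
        self.last_seen = {}

    def ping(self, agent_id: str) -> None:
        """Record that an agent is alive right now."""
        self.last_seen[agent_id] = self.clock()

    def stale(self) -> list:
        """Return the agents whose last ping is older than the timeout."""
        now = self.clock()
        return [a for a, seen in self.last_seen.items() if now - seen > self.timeout_s]

    def sweep(self, restart) -> list:
        """Restart every stale agent and reset its timer; return who was restarted."""
        restarted = self.stale()
        for agent_id in restarted:
            restart(agent_id)
            self.ping(agent_id)
        return restarted
```

Calling sweep on a short interval is what keeps a crashed sub-agent from being silently dropped.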

19 · Token Cost Secrets

73 percent of every API request is fixed system-prompt overhead — which means session hygiene is the single biggest cost lever available to you.
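
A worked example shows why session hygiene matters under a simple cost model in which every request resends the fixed overhead plus the whole conversation so far. All token counts below are invented for illustration; only the 73 percent share comes from the video.

```python
def overhead_share(overhead_tokens: int, variable_tokens: int) -> float:
    """Fraction of one request taken up by the fixed system-prompt overhead."""
    return overhead_tokens / (overhead_tokens + variable_tokens)


def session_input_tokens(turns: int, overhead_tokens: int, tokens_per_turn: int) -> int:
    """Input tokens billed over a whole session.

    Request n resends the fixed overhead plus n turns of conversation,
    so the history term grows with every turn.
    """
    return sum(overhead_tokens + n * tokens_per_turn for n in range(1, turns + 1))


# Hypothetical figures: 8,000 overhead tokens and 500 tokens added per turn.
one_long_session = session_input_tokens(40, 8000, 500)         # 730,000 tokens
four_short_sessions = 4 * session_input_tokens(10, 8000, 500)  # 430,000 tokens
```

Under these assumed numbers the same 40 turns cost about 41 percent fewer input tokens when split into four sessions that are reset between goals. A request with 7,300 overhead tokens and 2,700 variable tokens has an overhead share of 0.73, the proportion the video cites.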

20 · Build An Operating System

An AI operating system is not complexity for its own sake; it is the infrastructure that prevents your agent from starting from zero every session.

21 · Connect Hermes To Claude

Connecting Hermes long-term memory and mobile reach to Claude Code precision building means neither tool operates in isolation — shared context makes both more accurate.

Glossary

Terms worth knowing.

Agent: An AI system that uses external tools to take real-world actions such as searching, booking, or writing files, rather than only generating text responses.
Hermes: An open-source AI agent framework that runs on your own machine, connects to 22 or more messaging platforms, and accumulates memory about you over time.
OAuth: Open Authorization, a one-click sign-in flow that grants an app access to your account without exposing a raw key; access can be revoked from a dashboard button.
API key: A secret string of characters that authenticates your requests to an AI provider; must be kept out of chat logs and rotated if exposed.
MCP (Model Context Protocol): An open protocol that gives AI agents a standard way to discover and call the actions a tool or app supports; the video presents this as faster and cheaper than working from raw API calls.
OpenRouter: A routing layer that gives a single API connection access to hundreds of AI models from different providers, with per-key monthly spending caps.
VPS (Virtual Private Server): A rented server that keeps Hermes running around the clock even when your laptop is off; typically costs five to ten dollars per month.
soul.md: A markdown file that defines an AI agent's persona, values, communication style, and goals — the character bible that makes the agent behave consistently as yours.
Sub-agent: A child agent spun up by the main Hermes instance with its own fresh context window; multiple sub-agents run in parallel and report results back to the parent.
Heartbeat: A watchdog process that pings sub-agents every few seconds and automatically reclaims and restarts any that have crashed or gone silent.
Cron job: A scheduled recurring task; in Hermes these are written in plain English rather than traditional cron syntax.
Context window: The total amount of text an AI model can process in a single request; as a conversation grows, every new message resends the entire history, increasing cost and reducing quality.
Ollama: A tool that downloads and runs open-source AI models locally on your machine, enabling fully private offline inference without API costs.
Pantheon: A Hermes UI feature that lets you create and manage named AI personas, each with its own system prompt and assigned model, for different task types.
Principle of least access: A security practice of granting an AI agent only the minimum permissions a specific task requires, reducing the blast radius of any error or misuse.

Resources

Things they pointed at.

01:21 · product · Hermes AI Agent (open source)

11:33 · tool · OpenRouter

14:20 · tool · Ollama

19:34 · product · Granola (meeting notes)

23:13 · tool · Zapier MCP

16:02 · tool · Obsidian

25:36 · link · agentskills.im

38:19 · tool · GitHub

Quotables