Modern Creator
Nick Puru | AI Automation · YouTube

Hermes Agent's Biggest Update Yet

Nine updates to the open source AI agent that lives on your computer -- from persistent goals to a self-cleaning skill library.

Posted
yesterday
Duration
Format
Tutorial
educational
Views
3.9K
113 likes
Big Idea

The argument in one line.

Hermes Agent now solves the three things that made every local AI agent impractical: it remembers your goals across sessions, multitasks without blocking, and lets you swap models mid-conversation so you stop paying premium prices for commodity tasks.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • You already run Hermes Agent and want to know which of the recent updates are worth setting up first.
  • You are paying Anthropic or OpenAI API costs for routine tasks and want a concrete way to cut that spend.
  • You manage a content production workflow and want an agent that can run multi-step jobs in parallel while you film.
  • You want computer-use automation that works with whichever vision model you already subscribe to, not just Claude.
SKIP IF…
  • You have not set up Hermes Agent at all -- the video assumes familiarity with Telegram-based agent setup.
  • You are looking for a first-time introduction to local AI agents rather than a feature update tour.
TL;DR

The full version, fast.

Hermes Agent's latest release ships nine features that address persistent agent drift, memory loss, and cost bloat. The two structural upgrades are slash goal (a pinned multi-turn objective backed by a judge model that watches progress) and full session recall (cross-session memory with zero setup). Parallel execution comes via slash background, raw-idea-to-subtask routing via an auto Kanban, and cost control via slash model which lets you drop to a cheaper model mid-thread without losing context. For coding work, routing tasks to Codex CLI means Anthropic tokens are untouched for line-by-line generation. The host's ranked picks: slash goal, the curator, slash model, and native Codex.
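To make the slash background idea concrete, here is a minimal Python sketch of the pattern described above: several jobs run concurrently, each gets its own task ID, and the foreground stays free. It is not Hermes code. `run_job`, `BackgroundRunner`, and the `bg-N` ID format are hypothetical names made up for illustration, and the jobs only sleep instead of calling a model.

```python
import asyncio
import itertools


async def run_job(prompt: str, seconds: float) -> str:
    """Stand-in for a long-running agent job (hypothetical; it only sleeps)."""
    await asyncio.sleep(seconds)
    return f"done: {prompt}"


class BackgroundRunner:
    """Hands out task IDs and tracks jobs without blocking the caller."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.tasks: dict[str, asyncio.Task] = {}

    def submit(self, prompt: str, seconds: float) -> str:
        task_id = f"bg-{next(self._ids)}"  # ID format is an assumption
        self.tasks[task_id] = asyncio.create_task(run_job(prompt, seconds))
        return task_id

    async def result(self, task_id: str) -> str:
        return await self.tasks[task_id]


async def demo() -> dict[str, str]:
    runner = BackgroundRunner()
    ids = [
        runner.submit("rank sponsor emails", 0.03),
        runner.submit("summarize model releases", 0.02),
        runner.submit("research creator videos", 0.01),
    ]
    # The "foreground" keeps working while the three jobs run concurrently.
    foreground_reply = "still chatting"
    results = {task_id: await runner.result(task_id) for task_id in ids}
    results["foreground"] = foreground_reply
    return results


if __name__ == "__main__":
    for key, value in asyncio.run(demo()).items():
        print(key, "->", value)
```

The task ID is what lets you ask about one specific job later, which is the same role it plays in the breakdown above.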

Chapters

Where the time goes.

00:00 - 02:14

01 · Hook + proof claims

Personal usage proof, stakes the 9-update promise, no developer knowledge required

02:14 - 03:35

02 · Webinar sponsor block

June 3 live webinar pitch for AI agency offers

03:35 - 06:53

03 · #1: Slash goal

Pinned multi-turn objectives, judge model, Ralph loop, slash sub-goal

06:53 - 07:35

04 · Goal quality caveat

Vague goals break the judge model -- specificity is the lever

07:35 - 09:32

05 · #2: Memory upgrade

Full session recall, cross-session cache, tool call indexing, zero setup

09:32 - 12:22

06 · #3: Slash background

Five concurrent background tasks, task IDs, foreground chat stays live

12:22 - 15:01

07 · #4: Auto Kanban

Triage to Specifier to subtasks to parallel sub-agents, orchestration=auto, demo with filming day brief

15:01 - 19:00

08 · #5: Computer use (any vision model)

Previously Claude-only; now GPT-5, Gemini, Grok Vision -- ClickUp navigation demo, remote task marking from phone

19:00 - 20:33

09 · #6: The Curator

Auto-running 7-day skill pruning agent, ranked skill list, zero config

20:33 - 24:50

10 · #7: Native video generation

Text-to-video via Grok or Fal.ai natively in Telegram, robot bartender demo

24:50 - 28:10

11 · #8: Slash model

Mid-conversation model swap, preserve full context, auto-selection by task complexity

28:10 - 31:04

12 · #9: Codex as a worker

Route coding to ChatGPT/Codex CLI -- Opus plans, Codex builds, Anthropic API untouched, landing page demo

31:04 - 22:55

13 · Top 4 + CTA

Host ranks slash goal, curator, slash model, native Codex as the four most slept-on; community plug

Atomic Insights

Lines worth screenshotting.

  • Most AI agents lose your original goal by message 10 -- Hermes slash goal pins the target and spins up a separate judge model to check whether each response is still moving toward it.
  • The judge model in slash goal does no work itself -- its only job is to watch the primary agent and flag drift, acting as an independent reviewer on every turn.
  • Session recall costs nothing to set up -- once you update Hermes, your full conversation history, tool call log, and cross-session cache are indexed and searchable by default.
  • Slash background lets you fire five concurrent research or inbox tasks and keep chatting in the foreground while they run, each with a unique task ID for later reference.
  • Computer use in Hermes now works with any vision-capable model -- GPT-5, Gemini, Grok Vision -- so you are not locked to Claude to drive your screen remotely.
  • The Curator runs silently every seven days, ranks your skills by usage frequency, and prunes dead ones without any manual intervention.
  • Native video generation inside Hermes removes the need for a separate AI video subscription -- text-to-video runs directly from chat using Grok or Fal.ai as the backend.
  • Slash model lets you drop to a cheaper model mid-conversation without losing a word of context -- most routine tasks do not need the most expensive model.
  • Instructing Hermes to auto-select the model tier based on task complexity means cost savings happen automatically, not just when you remember to manually downgrade.
  • Routing coding work to Codex CLI means the build runs on your ChatGPT subscription, not your Anthropic API budget -- Opus plans, Codex executes.
  • The quality of the goal spec determines the quality of the judge model's checks -- a vague goal like "build me an app" gives the judge nothing to check against; specificity is the lever.
  • The auto Kanban Specifier takes a one-line brief, produces a full spec, breaks it into subtasks, and dispatches sub-agents in parallel -- what used to take a full afternoon now finishes while filming one video.
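The pinned-goal-plus-judge loop from the insights above can be sketched in a few lines of Python. This is an illustration only, not how Hermes implements slash goal: `Goal`, `judge`, and `run_turns` are hypothetical names, and the judge here is a simple keyword check standing in for a second model call so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    text: str
    keywords: set[str]  # what the toy judge checks replies against
    drift_flags: list[int] = field(default_factory=list)


def judge(goal: Goal, reply: str) -> bool:
    """Toy judge: a reply is on target if it mentions any goal keyword.

    A real judge would be a separate model; keyword matching is an
    assumption made so the sketch needs no network access.
    """
    words = set(reply.lower().split())
    return bool(words & goal.keywords)


def run_turns(goal: Goal, replies: list[str]) -> list[str]:
    """Attach the pinned goal to every turn and record where drift occurs."""
    transcript = []
    for turn, reply in enumerate(replies, start=1):
        on_target = judge(goal, reply)
        if not on_target:
            goal.drift_flags.append(turn)
        status = "ok" if on_target else "DRIFT"
        transcript.append(f"[goal: {goal.text}] turn {turn}: {status}")
    return transcript


goal = Goal(
    text="draft script, three titles, thumbnail by Friday",
    keywords={"script", "titles", "thumbnail"},
)
log = run_turns(goal, [
    "here is the script outline",
    "let me reorganize your inbox instead",
    "three titles for the video",
])
print("\n".join(log))
print("drift on turns:", goal.drift_flags)
```

Note what happens with an empty `keywords` set: the toy judge has nothing to match and flags every turn, which is a crude version of the point about vague goals.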
Takeaway

Nine ways to stop babysitting your AI agent.

WHAT TO LEARN

The core failure mode of every local AI agent is drift -- losing the goal, losing the context, blocking you while it works -- and these nine updates address all three at once.

  • Pinning a goal with an explicit success criterion and a judge model is structurally different from just typing instructions -- the judge creates an independent review loop that catches when the agent wanders.
  • Cross-session memory recall costs nothing to configure; the value comes from being specific enough in your original requests that the indexed history is actually retrievable later.
  • Running parallel background tasks requires thinking in queues: fire multiple jobs simultaneously rather than waiting for each one to complete before starting the next.
  • A raw idea dropped into an auto-orchestrated Kanban only produces useful subtasks if the original brief is tight -- vague inputs produce vague specs, regardless of how sophisticated the Specifier agent is.
  • Model-switching mid-conversation makes economic sense only if you audit which tasks in your workflow actually require high-reasoning models versus which ones are cleanup, formatting, or lookup.
  • Routing code generation to a CLI worker rather than the primary agent changes the cost structure of building: planning tokens are cheap, but execution tokens on premium models add up fast across long builds.
  • A self-maintaining skill library compounds over time -- skills you use daily rise to the top, dead weight gets pruned, and the agent's response quality on frequent tasks improves without manual intervention.
  • Computer use gains most of its practical value not from obvious demos but from the edge case: completing a task on your machine while you are physically away from it.
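The audit suggested in the model-switching point above comes down to simple arithmetic, sketched below in Python. The tier names, the per-task costs, and the list of "routine" task kinds are all assumptions picked to make the math visible; they are not real prices and not how Hermes selects a model.

```python
from dataclasses import dataclass

# Hypothetical tiers and per-task costs, for illustration only.
COST_PER_TASK = {"premium": 0.50, "cheap": 0.02}

# Task kinds treated as routine (an assumption based on the examples
# the host gives: cleanup, formatting, lookups, simple summaries).
ROUTINE_KINDS = {"cleanup", "formatting", "lookup", "summary"}


@dataclass(frozen=True)
class Task:
    name: str
    kind: str


def pick_tier(task: Task) -> str:
    """Send routine kinds to the cheap tier, everything else to premium."""
    return "cheap" if task.kind in ROUTINE_KINDS else "premium"


def total_cost(tasks: list[Task], route: bool) -> float:
    """Cost of a workload with routing on, or with everything on premium."""
    tiers = [pick_tier(t) if route else "premium" for t in tasks]
    return round(sum(COST_PER_TASK[tier] for tier in tiers), 2)


workload = [
    Task("fix typos in notes", "cleanup"),
    Task("reformat table", "formatting"),
    Task("find last week's command", "lookup"),
    Task("plan landing page architecture", "planning"),
]

print("all premium:", total_cost(workload, route=False))
print("routed:     ", total_cost(workload, route=True))
```

Swap in your own task mix and real prices to see whether the downgrade is worth the effort for your workflow.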
Glossary

Terms worth knowing.

Ralph loop
Hermes term for a goal that stays pinned across every conversation turn until explicitly cleared -- the agent locks on target regardless of how many messages have passed.
Judge model
A second agent spawned by slash goal whose only function is to evaluate whether the primary agent's output is progressing toward the stated goal, flagging drift rather than doing content work itself.
Slash background
A Hermes command that dispatches a task to run asynchronously in the background, leaving the foreground chat fully interactive. Each background job gets a unique task ID.
Auto Kanban
Hermes's built-in project board with a Specifier agent that converts a raw triage idea into a full spec, decomposes it into subtasks, and routes each to a sub-agent when orchestration is set to auto.
Curator
A background maintenance agent built into Hermes that runs on a 7-day cycle, scores all skills by usage frequency, and prunes low-use ones to keep the skill library clean without user intervention.
Codex CLI
OpenAI's command-line coding agent that Hermes can use as a worker node, routing line-by-line code generation to the user's ChatGPT subscription rather than consuming Anthropic API tokens.
Slash model
A Hermes command that swaps the active language model mid-conversation to a cheaper tier, a different provider, or a locally-run model without losing any prior context in the thread.
Computer use
An agent capability that lets Hermes control the host machine's GUI by clicking, navigating, and interacting with desktop apps, driven by any vision-capable model the user has configured.
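As a rough picture of what the Curator entry describes (rank skills by usage, prune the low-use ones on a seven-day cycle), here is a small Python sketch. It is a guess at the general technique, not the Curator's actual logic: `curate`, the `min_uses` threshold, and the skill names are hypothetical, and the source does not say how Hermes decides a skill is dead.

```python
from datetime import date, timedelta


def curate(usage: dict[str, list[date]], today: date, window_days: int = 7,
           min_uses: int = 1) -> tuple[list[str], list[str]]:
    """Rank skills by uses inside the window; mark the rest for pruning.

    window_days=7 mirrors the seven-day cycle in the glossary entry;
    min_uses is an assumed threshold, not a documented setting.
    """
    cutoff = today - timedelta(days=window_days)
    counts = {
        skill: sum(1 for used in dates if used > cutoff)
        for skill, dates in usage.items()
    }
    # Most-used first; ties broken alphabetically for a stable order.
    ranked = sorted(counts, key=lambda s: (-counts[s], s))
    keep = [s for s in ranked if counts[s] >= min_uses]
    prune = [s for s in ranked if counts[s] < min_uses]
    return keep, prune


today = date(2025, 1, 15)  # fixed date so the example is reproducible
usage = {
    "morning-brief": [today - timedelta(days=d) for d in (0, 1, 2, 3)],
    "sponsor-ranker": [today - timedelta(days=d) for d in (1, 5)],
    "old-scraper": [today - timedelta(days=40)],
}
keep, prune = curate(usage, today)
print("keep: ", keep)
print("prune:", prune)
```

The daily-use skill lands at the top and the one untouched for 40 days is marked for pruning, matching the ranked list the host describes.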
Resources Mentioned

Things they pointed at.

Quotables

Lines you could clip.

07:35
The recall, the cache, the tool call history, all of it -- it is going to be on by default. It just works.
zero-friction feature promise, punchy close · TikTok hook
18:25
From my phone, I was able to just text Hermes and told it to mark it as done. I didn't have to pull up my laptop or anything like that.
concrete real-world use case, no setup context needed · IG reel cold open
26:05
Most of the stuff that you are doing, it's probably just clean up, or formatting, or quick look ups, or like simple summaries. Tasks that a much cheaper model could handle just as well.
calls out audience behavior directly, quantifiable implication · newsletter pull-quote
29:44
Opus, it should be your strategist. Codex, it should be your builder. And you don't even have to pick which one does what part.
clean two-line framework, no setup needed · TikTok hook
05:50
The goal that you write is what makes or actually breaks it. Something as simple as build me an application -- that is far too fundamental, far too broad.
common mistake plus direct correction format · IG reel cold open
The Script

Word for word.

00:00So this is Hermes agent, the open source AI agent that lives on your computer and runs your day for you. And in the last thirty days, it just had the biggest update run since launch. Now I have been running it nonstop on my own machine for the last month.
00:13I was able to pull a six week old sponsor conversation in under a second when I needed to follow-up. It ranked my whole sponsor inbox in the background while I was filming yesterday, and honestly, I have not even touched my email manually in the last two weeks, and half of what I'm about to show you in this video, it was not even possible thirty days ago.
00:32Now, here's why this actually matters, because Hermes, it is moving so fast right now that even people running it every single day, they are missing half of the updates. So by the end of this video, you're going to know the nine that actually move the needle, not just flashy demos. I'm going to be giving you guys the ones that quietly will save you hours and money, of course.
00:50So here's just a quick taste of what we're going to be getting into. There's one specific update that finally fixes the thing that every AI agent on the planet gets wrong, and there's one that actually lets you swap models mid conversation without losing a word of context. And the last one alone, it could save you a few $100 a month.
01:05So there's nine total, six more that I haven't even hinted at yet, and we're going to be getting through all of them right now. And by the way, you do not need to be a developer for any of this whatsoever. You don't need to know what FTS5 means.
01:18If you can type a slash command and copy and paste and read a Telegram message, every one of these is going to be working for you, but let's get into it. By the way, I'll also have all of this stuff available in my free Skool community. We've almost got 20,000 members, so make sure to check that out if you want all this stuff step by step and more news right away.
01:34Hey, really quick, I just wanna mention on June 3 at 7PM Eastern, I'm doing a free live webinar. And basically, what I'm gonna do is I'm just going to walk you through the four AI agency offers that are actually working right now that either my agency or my students are actively selling.
01:48Now the whole point of this session is you watch it, you figure out which one of these four is going to be fitting you, and then you just go land your first client. It's completely free. It's live.
01:57I'm not recording it, so if you don't show up, you completely miss it. And if you do show up live, I'm going to be giving you this thing that I made called the AI offer selection scorecard. It's basically how I would pick the right offer if I was starting from scratch today.
02:09So the link will be down below in the description. Just click it, grab your spot, and I'll see you on June 3, but let's get back into this. So this first update, it is slash goal, and in my opinion, this is the biggest one in the entire release.
02:20Now every agent that I've ever used, it had the same problem. So you give it a task, you have a few back and forth messages, and then by message, maybe 10, it's completely forgotten what you had asked for.
02:31Hermes, it just completely fixed this. So watch this. I'm gonna set a goal here for the week.
02:35So I'm going to say slash goal, draft a YouTube script covering the latest Hermes Agent updates, plus three title options for the video, and design the thumbnail by Friday. So I'm gonna run this off. We can see up at the top that we are going to be getting this pinned banner.
02:48So this is going to be showing up every time we have new messaging and everything, so it's not going to be losing that specific goal. But I'm just gonna type out a simple message like, can you first break down what the updates actually are? So we see within our output, we're going to get this sort of banner that's going to be pinned at the top of our conversation.
03:06In Telegram, it's a little bit wonky, and it's going to be more applicable to the actual terminal if you're going to be using Hermes in this fashion. But like most of you, you're going to be playing inside of different channels like Telegram, so that's where I'm going to be showcasing. But anyways, you can see this is where it's actually getting a little bit interesting, is that Hermes, it is breaking this into subtasks on its own.
03:28So it's going through scripts, and titles, and thumbnails, and actually spun up a separate judge model in the background, and the judge, it isn't the agent actually doing the work. It's actually a second agent whose only job is to watch the progress and decide if I'm actually getting closer to the goal, or if it's just wandering off from that goal.
03:46Now right up here, you could see this is where I was asking it to first just break down what the updates actually are. So I just took a pause in the middle of its output, in the middle of its, you know, goal run, and it was able to provide me with all that information, giving all the updates on, you know, what actually was released.
04:03And then up next, we can see I was simply typing out something like, okay. What's next?
04:10And it just picked right back up right into the middle of its goal. So all of this, this is effectively what they call the Ralph loop, so the Ralph Wiggum, as you guys may have remembered it.
04:20This is just where the agent locks on the target across every single turn until you actually clear it. If I ever want to see what the judge is actually doing, I could just type out slash goal status. And we could see, because it already finished up, no active goal.
04:33We can set one with the slash goal, but it did just finish that. If we wanna do something a little bit different, we could say slash sub goal, make the thumbnail dark mode with one face, no text. So when it's actually generating the thumbnail.
04:42It's going to add that specific tweak to it. We can run this off. I'm probably not gonna get any crazy looking thumbnail or anything like that.
04:48Now the reason that I'm leading with this one specifically is because I had three videos to ship last week, and I kept getting pulled into sponsor email threads. But what I was able to do is just set a goal once on, you know, Sunday night. By Wednesday, Hermes was still nudging me about which videos were left.
05:01I didn't have to keep on telling it. And one thing to know if you are new to all of this, the goal that you write is what makes or actually breaks it. So saying something as simple as, like, build me an application, that is far too fundamental.
05:13It's far too broad, and you're just not passing enough information. So in that case, like the judge, it has nothing to be checking against. But anyways, that is the first update for this command set, and you'll notice that Claude and OpenAI and all these other tools, OpenClaw, like they're releasing their own forms of slash goal.
05:29So this one, from what I have seen, is one of the best. Now number two, this is the memory upgrade. And honestly, this one is about three different things rolled into one.
05:38So your agent, it now is going to remember every single conversation that you have ever had with it. So it indexes every tool call it's ever run, so you can ask what command that it used, maybe last Thursday, and it will find it. And it also caches the whole thing across all the sessions, so you stop paying for the same context twice, which is a huge problem up till now.
05:57And the headline of all of this is the session recall. So let me show you what this actually looks like. So I'm gonna say pull our last conversation about our morning brief skill.
06:05So this is like way at the top of our conversation. We were talking about just setting up a morning brief skill, so let's run this off and see what we can get back with this. And this is being pretty brief, but anyways.
06:15So what it's doing right here, it's going through the session search, and it's just trying to recall the morning brief skill, morning whatever. So we're now getting our output back. It is pretty lengthy because it was a pretty lengthy conversation.
06:26Anyways, if we scroll all the way up to the top, we can see session search. It's doing a recall of the morning brief skill. It's trying to find exactly and, you know, just retrieving all the information.
06:37The session didn't find or return a stored prior thread, but the morning brief discussion is present in this current Telegram thread. I'm checking the live cron/scripts so I can pull the actual setup state, not just the chat recap. So it's going in pretty long depth, and it's able to find exactly what I had asked for in the conversation.
06:54We can see the output format that I was asking for. If we go down below, we can see what we noted.
07:00So the calendar plus Gmail, we're pulling real data, what is working, so everything that is set up, what was excluded, and some button logging. It's not fully wired, so this is exactly what I was talking about in that conversation.
07:11And we can get way more in-depth than this. Like I mentioned, this is pretty fundamental. But if we're going to get something across a different session, we can absolutely do so because the memory, it is now effectively buffed.
07:20And one of the best things about this is it quite literally needs zero setup. So you do not have to do anything as long as you are updated to the latest instance, then you are all good to go. So the recall, the cache, the tool call history, all of it, it is going to be on by default. It just works.
07:35Now number three, this is slash background. So this is the one that finally fixed any sort of multitasking. So to give you some context, most agents, when you give them a task, they're going to be pretty busy.
07:47So you can't ask them anything else until they're going to be done. But with slash background, your agent can be working on five things at once, and still chat with you in the foreground like nothing is happening. So it's going to be somewhat similar to the slash btw inside of Claude Code that you may be familiar with.
08:03So watch this. I'm gonna fire three background tasks. So I'm first going to say, slash background, read the last 20 sponsored emails in my Gmail, and rank them by deal size, brand fit, and response urgency.
08:12Now I'm going to say, slash background, check the latest releases from Anthropic, OpenAI, xAI, DeepSeek, and summarize what shipped in the last seven days. And now lastly, what I'm saying is research the last seven days of YouTube videos from the top AI creators. Pull the titles, topics, thumbnails, view counts, and then identify the strongest emerging content patterns, repeated angles, and breakout trends that we can use for new video ideas, prioritize actionable insights over raw data.
08:34Boom. Now everything, it is currently running. It doesn't look like we have gotten a response back from any of them thus far.
08:40Totally fine. But whilst all of this is going on, I mean, I can just keep chatting in this foreground. So I can say whatever I want, and it's going to have all of these still processing, still running for me, all in the background.
08:51Alright. So just like that, we are getting our first task. So it's going to rank all of our sponsors.
08:56We could see we have starting from the top, we have Base44, Asana, and then all these other people who are reaching out for sponsorships, so on and so forth. But whilst all this is going on, we have our other two that are currently running in the background as well.
09:09Now you'll notice that each one, they are going to get their own task ID, so this is just going to be used for referencing at different points. So if you ever forget or maybe some things are a little bit similar to one another, and you can't, you know, discern which one is which, you'll be able to identify them and reference them through this task ID if ever necessary.
09:27Alrighty. Now number four, this is going to be the auto Kanban. So Hermes, they have had this Kanban board for a while now, but what's new is you can drop a raw idea into Triage.
09:36And Hermes, it's going to be able to flesh it out into a full spec, break it into subtasks, and just assign them out to sub agents all on its own. To actually spin this up, we could just type Hermes dashboard. And then a few seconds later, it'll automatically populate the dashboard for you.
09:50And we can just navigate to the Kanban, and you'll notice there's going to be four different columns. So we have the triage, we have the to do, we have the scheduled, ready, and the thing that makes the magic happen, it is right here at the top. So the orchestration, it is currently set to auto.
10:06But watch this. I'm gonna drop one raw idea into the triage. So how you actually do this, if you just click on the triage, click this little plus button, you give it a rough idea, and we could say something if you could actually expound this expand this rather.
10:18Prep tomorrow's filming day. List every video I need to shoot. Draft a thumbnail concept for each one, and then write three title variants per video.
10:26We could provide it with some of the skills that we want to reference and utilize specifically just to make sure that's not going to mess anything up. But we're just going to click on create. And just like that, you could see it's now listed under this specific column, and we could click on open.
10:39We have all these different options, like specifying, decomposing, moving it to ready, like all the other columns. If we wanna move it into there, we can notify any of the home channels. And here, we have some of the dependencies and some of the children as well.
10:52Now automatically, it was just moved into the next column, so it's now moved into the to do. So the specifier inside of Hermes, it's grabbing that raw idea, and then fleshing it out into a real spec, and then it's just going to be breaking it apart. So now we can see some of the things that are in progress.
11:08So compile tomorrow's video shoot list. So it's broken that apart. It's now doing this first, and then it has some other tasks that it has to do later on, you know, just making sure that it's delegating and assigning everything by priority.
11:21So if it has to do this first, it's going to make sure to start that task first, and then go to the other task that it had already broken apart. So with this, we have tons of different sub agents, and they're all about to start working in parallel. Now my personal favorite use for this, it's video research, at least from what I've been using it thus far.
11:37So when I'm digging into a new topic for this channel, I drop a one line brief into this triage right here, something like just find everything that shipped in AI agents this week, pull the top 10 examples, and then rank them by the impact. And then Hermes, it's going to split all of them out into a bunch of different parallel research tasks, sends a sub agent at each individual one.
11:58By the time I'm back from filming, it's all going to be stacked up. It's going to be in this ready section all the way to wherever it is, ready right here.
12:06And then this right here, like, this used to take me a full afternoon, and now it's going to be done in the matter of time that it takes me to shoot just one video. But, yeah, with all this, make sure your orchestration is going to be set to automatic, and it's not going to be flicked onto manual.
12:20Very important. Alrighty. Now number five, this is going to be one of my favorites, computer use.
12:24So Hermes, they have had computer use before, but it only worked with Claude. And as of about v0.14, it works with every vision-capable model.
12:34So like GPT-5, which is what I'm using right now, Gemini, Grok Vision, literally all of them, which means whatever model you're already paying for, it can drive your screen for you. So how you actually set this up, just open up Hermes tools.
12:52And from here, it's going to automatically populate the tools section, and it'll automatically populate this little terminal tab right here. What you do is go to reconfigure an existing tool's provider or API key, and just make sure that computer use is actually enabled. So you can press space and make sure that everything's good to go, and then we can scroll down to click on done, and then we'll just back out of this.
13:13Alright. So now check this out. What I'm gonna say is use my browser to open up ClickUp and find today's tasks for me.
13:18So I actually had to restart my terminal and give it access and the proper permissions to do all of this, but right now it's asking me to log in. How would you like to proceed? So this is where I would have to provide it with the credentials.
13:28I'm just gonna click on one, and we'll do all of this manually. So now we can see I'm not touching anything. It's going to automatically open up ClickUp for me, and it should navigate everything.
13:36So just like that, it's automatically navigating to my to do list. Now it's clicking on each individual task, so this is the first task. Slowly but surely navigating to one of my other tasks, which is booking in for a doctor and the dentist.
13:50But anyways, it's just going to navigate through all of this just like a normal person would. Now the reason that this is a bit bigger than it actually sounds is because I had ClickUp open up at home, so my desk that I'm using right now, and I forgot to close out a task before I left for a meeting up with a friend. So from my phone, I was able to just text Hermes and told it to mark it as done.
14:09And just like that, I didn't have to pull up my laptop or anything like that. It was able to control my computer and handle everything for me, use computer use, and, you know, open up the necessary tabs, close them out, and do anything else for me. Alrighty.
14:22Number six. So Hermes, they have this background agent called the Curator, and it's hands down one of the smartest things that they have shipped this entire release. Nobody's really talking about this.
14:32So what it does is every seven days, it goes into your skills folder, and it cleans the whole thing up. So it's doing some sort of self maintenance. You literally do not have to do anything.
14:41So watch this. I'm gonna pull up what it did the other week. So I'm gonna say Hermes curator status, and just like that, we now have a ranked list of every skill that I have.
14:50So this is a brand new fresh Hermes instance, so we don't really have much. Well, actually, we only have one. But normally, you'll have the ones that you use every single day.
14:58It's going to be up at the top, and the ones at the bottom, they're going to be somewhat deadweight. So that's going to be stuff that, like, you totally forgot about. But now if we just check out down at the bottom, we can see that the curator, it is enabled.
15:08It's seeded, but it hasn't completed a real run yet, and it's now going to be running every seven days unless you manually preview something. So you don't really have to manage this. It's just gonna be running automatically.
15:17So for you and your use case, if you're using Hermes quite frequently, and you're using it over the course of weeks and months, in this case, it's just going to, you know, prune a bunch of the dead ones, and promote the other ones to the top all on its own, all while you're sleeping. Now number seven, this one is actually pretty awesome.
15:33This is the native video generation. So your Hermes agent, it can now go text to video or even photo to video all inside your telegram or in your Hermes agent all natively. There's no popping over to a separate website, no signing up for another AI tool, and, you know, having to worry about the 50 others that you're already paying for.
15:51So all those different ads that you see on YouTube and Instagram, the ones that are like, sign up for our crazy AI video tool, only $50, $90 a month, generate the most cinematic videos, whatever it may be. This is kind of the same thing, except it's just built straight into Hermes.
16:05So as long as you've got your Grok account hooked up, you can do it right from your chat. For me personally, I don't have Grok hooked up, so I'll show you better yet how to actually set all this up step by step. So I'm gonna say generate a five second clip of a robot bartender mixing a cocktail.
16:19Alright. Now number seven, this one's actually pretty awesome. This is the native video generation.
16:23So your Hermes agent, it can now go text to video or even photo to video all natively inside of Telegram or even inside of the Hermes terminal. So there's no popping over to any separate websites, no signing up for another AI tool on top of the 50 that you might already be paying for. So you might have seen, like, all those ads on YouTube, and the ones that are, like, sign up for our crazy AI video tool, only $97 a month.
16:45Yeah. This is kinda the same thing, except it's just built into Hermes. So as long as you do have some sort of provider, so either if you're using Grok or something like Fal, you just provide it with your API key.
16:56So I'm going to give it mine from Fal. I have my API key at the top, but what I'm also saying is generate a five second clip of a robot bartender mixing a cocktail, and we'll run this off. So while this is running, because it will take a couple of minutes or so, this is gonna be great for any b roll for videos, or any intro stings.
17:10Just little clips for social media, or maybe any animations for your thumbnails, you know, the stuff that used to either you had to film yourself or just pay an editor to make, or you can just pay another AI $12.30 bucks a month. But now you can just type a prompt inside of the chat you're already using, and it'll be able to spin it up extremely simple and very cheap.
17:27Alright. So we just got our video back. It took maybe about four minutes or so, but we have our five second clip natively inside of the chat, exactly what we're looking for.
17:34Now if you did want to use this through Grok, you would have to enable SuperGrok and have it connected, because it is going to be running straight off of Grok. To enable this yourself, you just have to open up Hermes tools once again, enable the video generation, and then sign in with your SuperGrok account once.
17:53That's all it takes. Takes literally like thirty seconds. After that, it'll work every time for you.
17:57You don't have to rely on different platforms or providers like fal.ai or Higgsfield or anything like that. Alright. Now number eight, this one is legitimately going to save you a lot of money if you are following my advice properly and using this practically.
18:10So with this, with slash model, you can just swap models mid conversation without losing a single word of context. You'll have the same chat, the same thread, just a different model answering, just like using Claude Code. Now the biggest reason this matters is because most of the time, you don't actually need Opus 4.7 or GPT 5.5 like I'm using right now, or whatever the most expensive model is.
18:29Most of the stuff that you are doing, it's probably just clean up, or formatting, or quick look ups, or like simple summaries. So tasks that a much cheaper model could handle just as well. So you're probably paying premium models for stuff that a smaller model could do for a fraction of the cost, or literally zero if you're running models locally.
18:48So watch this. I have GPT 5.5 running right now, and I'm just going to say slash model 5.4.
18:54And just like that, we were able to swap to GPT's lower and cheaper model, 5.4. We could also go to any other model, or to completely different providers. So if we have different fallback options connected inside of Hermes, maybe Anthropic, or OpenRouter, or DeepSeek, all these different model providers.
19:14We can just easily select slash model, and we can go to Anthropic, or we can go to something like DeepSeek, if I can spell properly, or even OpenRouter, so on and so forth.
19:25So this is where you can save a significant amount of money. And something else you can do is tell Hermes something as simple as, I want you to determine which model is going to be the best model for the job. So in this case, if we're doing a complex task that needs a higher model, then automatically do slash model and switch to 5.5 instead of relying on the lower tier models like 5.4.
19:50Let me just type this out. Hermes will automatically be able to set this up for us. Alright.
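Editor's note: the routing rule described above (routine work on the cheaper model, complex work on the premium one) can be sketched in a few lines of Python. This is only an illustration, not Hermes code. The task categories and the `pick_model` and `slash_command` helpers are hypothetical; the model labels and the slash model command are taken from the video as spoken.

```python
# Hypothetical sketch, not part of Hermes. The model labels mirror the ones
# named in the video; the task categories are illustrative assumptions.
ROUTINE_TASKS = {"cleanup", "formatting", "lookup", "summary"}

CHEAP_MODEL = "5.4"    # lower tier mentioned in the video
PREMIUM_MODEL = "5.5"  # higher tier mentioned in the video


def pick_model(task_type: str) -> str:
    """Return the cheaper model for routine work, the premium one otherwise."""
    if task_type.lower() in ROUTINE_TASKS:
        return CHEAP_MODEL
    return PREMIUM_MODEL


def slash_command(task_type: str) -> str:
    """Build the chat command a user would type to switch models."""
    return f"/model {pick_model(task_type)}"


if __name__ == "__main__":
    for task in ("formatting", "summary", "architecture review"):
        print(task, "->", slash_command(task))
```

Anything not listed as routine falls through to the premium model, so an unrecognized task never gets silently downgraded.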
19:54Now the last one, number nine, if you do any vibe coding at all, this is the one that's gonna save you the most money. So Hermes, right now, it can natively use Codex CLI as a worker, which means that Opus, it stays your main brain for the thinking and the planning, but the actual line by line coding, it gets handed off to Codex, and Codex runs on your ChatGPT subscription instead of your Anthropic API.
20:16It's a different bill, but the same workflow. So just a quick heads up. For this to work, you do need Codex CLI installed on your machine and signed into your ChatGPT account first.
20:26It takes about thirty seconds. I've already got mine set up, so I'm just gonna drop the prompt and the commands, and show you what actually happens. And the install steps are gonna be linked inside of our free Skool community if you need them, but you just have to install Codex through npm, and then simply type codex inside of the terminal.
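Editor's note: before asking Hermes to delegate, it may help to confirm the Codex CLI is actually reachable from your shell. The Python sketch below is a minimal preflight check. It assumes the executable is named `codex`, as typed in the video; the helper names are hypothetical, and it does not verify that you are signed in.

```python
import shutil
from typing import Optional


def codex_cli_path(binary: str = "codex") -> Optional[str]:
    """Return the full path to the CLI if it is on PATH, else None."""
    return shutil.which(binary)


def preflight_message(binary: str = "codex") -> str:
    """Describe whether the CLI was found, without running it."""
    path = codex_cli_path(binary)
    if path is None:
        return f"'{binary}' not found on PATH; install it and sign in first."
    return f"'{binary}' found at {path}"


if __name__ == "__main__":
    print(preflight_message())
```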
20:43So again, check that out inside of our Skool community. So I'm back inside of Hermes, just using Telegram, of course. And I'm gonna say use Codex to build a one page landing site for an AI boot camp in a single HTML file, dark theme, three pricing tiers, and use Tailwind.
20:57So now I'm just going to press enter. So now Hermes, it sees the use codex part, and it's going to route the work over. So it's going to just notice the skill view codex.
21:07Alright. So we just got this wrapped up, and let's now open this up and see what it actually populated just using Codex. So just like that, we now have our landing page.
21:16We have the dark theme pricing section. Okay. So we have 1,800, 3,200, and 9,500, the whole thing.
21:22And the part that matters the most, the entire build just ran on the ChatGPT side of my stack. So my Anthropic bill, it didn't even move a millimeter. Now the way that I think about this is you've got two specialists on call. Opus, it should be your strategist.
21:38Codex, it should be your builder. And you don't even have to pick which one does what part. You just say use Codex when you want Hermes to delegate the build, and it routes the work for you.
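Editor's note: the video shows only that the phrase "use Codex" in a prompt causes Hermes to hand the build off. How Hermes detects this internally is not shown. The Python sketch below is one hypothetical way to express a phrase-triggered handoff; the `choose_worker` helper and the worker labels are assumptions made for illustration.

```python
import re

# Hypothetical trigger pattern. The video only demonstrates that the phrase
# "use Codex" leads to delegation, not how the match is implemented.
_TRIGGER = re.compile(r"\buse codex\b", re.IGNORECASE)


def choose_worker(prompt: str) -> str:
    """Return 'codex' when the prompt asks for it, else 'planner'."""
    return "codex" if _TRIGGER.search(prompt) else "planner"


if __name__ == "__main__":
    print(choose_worker("Use Codex to build a one page landing site"))
    print(choose_worker("Plan the launch sequence for the boot camp"))
```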
21:46Anyways, guys, that is the nine. And if I had to pick the four that I think most people are actually sleeping on, it's gonna be the slash goal, the curator, the slash model, and the native Codex. So those four, they're going to quietly save you the most time and most money.
21:59So make sure you set those ones up first. And one last thing, if you want every single one of these features, all you need is the latest Hermes setup. So if you haven't updated in a while, just run Hermes update in your terminal. It takes literally a minute, and everything I just showed you in this video goes live.
22:14And the thing with all of this, I told you at the start that these updates, it quietly saves you a bunch of time and a bunch of money, but watching me do this, it's one thing, and actually wiring it up to your own machine, it is another. So I built a community where we break every one of these down step by step, and we have 18,000 other people already in there building the same stuff.
22:31It's the first link in the description. It's completely free. Come drop a comment when you are in, and tell me which of the nine you set up first, because I genuinely want to know which one lands for people.
22:39And if even one thing in this video saved you time, just hit subscribe, drop a comment. Really appreciate it. But every single week, I will be posting videos just like this and a bunch of other things.
22:50So make sure to check that out. Check out the links down below in the description. Thank you guys for watching.
22:54I'll see you in the next video.
The Hook

The bait, then the rug-pull.

Thirty days, nine updates, and half of what this video demonstrates was not technically possible a month ago. The host opens with personal proof -- a six-week-old sponsor email retrieved in under a second, an inbox ranked in the background while filming -- before committing to show the updates that actually move the needle rather than flashy demos.

Frameworks

Named ideas worth stealing.

06:53 concept

Ralph loop

An agent lock-on pattern where the stated goal persists pinned across every conversation turn until explicitly cleared, with a judge model checking progress each turn.

Steal for: any workflow requiring multi-session task tracking without drift
29:44 model

Opus plans, Codex builds

A two-specialist stack where the high-reasoning model handles architecture and planning, and the coding agent handles line-by-line execution on the user's ChatGPT subscription.

Steal for: any AI coding workflow to reduce Anthropic API spend
CTA Breakdown

How they asked for the click.

32:00 link
It's the first link in the description. It's completely free. Come drop a comment when you are in.

Soft community plug at end. Secondary CTA for a live webinar placed mid-video at minute 1:34 with higher pressure framing (not recorded, miss it if absent).

Storyboard

Visual structure at a glance.

hook · title card · 00:00
promise · school/community context · 00:21
value · slash goal demo in Telegram · 02:15
value · slash background 3 tasks running · 09:32
value · Hermes Kanban dashboard · 12:22
value · slash model swap demo · 24:50
value · Codex CLI worker · 29:44
cta · top 4 summary and CTA · 31:04