Modern Creator
Mark Kashef · YouTube

Claude Code + AWS Bedrock = Enterprise AI

A 36-minute blueprint for moving a personal AI agent stack into a locked-down, compliance-ready AWS environment — built over a month and nearly 10 million tokens.

Posted
yesterday
Duration
Format
Tutorial
educational
Views
1.2K
65 likes
Big Idea

The argument in one line.

The gap between a personal AI agent stack and an enterprise-ready one is not capability — it is a one-for-one swap of every off-the-shelf service for a native, auditable, kill-switchable equivalent inside a single AWS account.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • You are an agency owner or consultant who wants to pitch AI agent infrastructure to clients that have data-sovereignty or compliance requirements.
  • You already run a personal agent OS (Hermes, OpenClaw, or similar) and want to understand what it would take to make it enterprise-deployable.
  • You work in or alongside a company where IT will ask 'where does our data go?' — and you need a real answer.
  • You are a solo builder exploring AWS Bedrock and want a concrete architecture reference before touching the console.
  • You are in healthcare, finance, or another regulated industry and need a starting point for SOC 2 or HIPAA-adjacent AI infrastructure.
SKIP IF…
  • You want a quick no-code AI agent setup — this involves AWS CLI, IAM roles, DynamoDB, S3, and weeks of planning.
  • You have no corporate or client use case; the fixed AWS infrastructure costs make personal hobby builds expensive.
  • You are already deep in Azure or GCP — the concepts transfer but every specific service name will differ.
TL;DR

The full version, fast.

Personal agent stacks fail enterprise audits the moment IT asks where data goes. The solution is not a different product — it is mapping every component of an off-the-shelf stack to a native AWS equivalent: Bedrock replaces the model API, S3 replaces local files, IAM roles replace env secrets, DynamoDB replaces SQLite, Secrets Manager replaces .env files, and Bedrock Guardrails replace ad-hoc output filtering. Claude Code drives all of this from the CLI so you never need to navigate the AWS console directly. The result is a multi-agent platform with kill switches, write-once audit logs, cost caps, DLP scanning, and a compliance posture dashboard — deployable for clients or internal teams after roughly a month of planning-first iteration.

Free for members

Chat with this breakdown — free.

Sign in and you get 23 free chat messages on us — ask for the hook, quote a framework, find the exact transcript moment, generate a markdown action plan. Bring your own key when you want unlimited.

Create a free account →
Chapters

Where the time goes.

00:0002:05

01 · Live demo — Jarvis on the dashboard

Opens directly into the finished product: a custom enterprise dashboard with a multi-agent OS running on AWS. Jarvis delivers a live spoken briefing covering agent status, kill switches, compliance score, spend, and audit log.

02:0506:00

02 · Platform tour

Full walkthrough of every tab: overview (spend, agents, sessions), playground (multi-model comparison including Claude, Qwen, DeepSeek, GPT-4o), Slack and Telegram integrations, agent management, knowledge upload, audit log, kill switches, and compliance posture.

06:0009:30

03 · Course pitch + transition

Mid-video offer for a Claude Code living course with a new module weekly. Bridges back into the technical content.

09:3010:44

04 · Five enterprise variables

Names the five design pillars that differentiate enterprise from personal builds: Transparency, Simplicity, Scalability, Security, Cost. Frames why personal AI tools are inaccessible to enterprises — the moment IT asks 'where does our data go?' everything stops.

10:4414:42

05 · Service mapping — Hermes to AWS

The core conceptual section. Side-by-side comparison of Hermes agent components versus their AWS equivalents: hosted model API → Bedrock, local files → S3, one operator → IAM roles, SQLite → DynamoDB, .env secrets → Secrets Manager, no audit → CloudTrail + guardrails.

14:4217:45

06 · Driving AWS with Claude Code

Explains how to avoid the AWS console entirely: connect the AWS CLI, then use Claude Code or Codex to provision and configure all services in plain English. Emphasizes two weeks of plan-mode planning before any building.

17:4521:10

07 · Architecture deep-dive — request flow

Step-by-step walkthrough of what happens to a single message from Slack or Telegram: rate limit check → agent load → Bedrock kill-switch check → cost cap check → guardrail → tool dispatch → response DLP scan → audit log → propagation back to the surface.

21:1024:40

08 · Model flexibility and data plane

Explains that Bedrock is not locked to Claude — open-source models (Qwen, Titan) handle grunt work while Claude handles reasoning, cutting cost. All agent memory, system files, and folders live in encrypted S3 buckets that can be wiped in one click.

24:4028:00

09 · Security posture — six domains

Six-part security framework: kill switches, write-once audit logs, cost caps, Bedrock Guardrails, least-privilege IAM, and continuous credential leak scanning. Includes the DLP scan list (AWS access keys, credit cards, SSNs, Slack/Salesforce/GitHub tokens) and the compliance posture dashboard with SOC 2 and HIPAA readiness scores.

28:0030:55

10 · RAG and multimedia inside the account

Shows the knowledge upload flow: documents are ingested, chunked, embedded with Titan Embed, stored in S3, and queried by any agent. Also covers image generation via Nova models inside Bedrock — no external Gemini or OpenAI calls required.

30:5533:31

11 · Team access and Jarvis wiring

Multi-tenant team management: invite members, assign roles (admin/operator/viewer). Jarvis architecture: Nova Sonic handles voice and tab navigation, Claude handles reasoning, both behind kill switches. Jarvis is read-only by design — write operations require explicit unlock.

33:3135:58

12 · Closing — idea to packaged OS

Closes with a 'from idea to OS' diagram: idea → build by hand → package the skill → team taps in. Offers a free care package (blueprint, slide deck, prompts) and a premium course with the full repo.

Atomic Insights

Lines worth screenshotting.

  • The jump from personal to enterprise AI is not a capability problem — it is a data-residency and auditability problem.
  • Every component of a personal agent OS has a direct AWS native substitute: local files → S3, env secrets → Secrets Manager, SQLite → DynamoDB, model API → Bedrock.
  • You should plan for two weeks before writing a single line of infrastructure, then plan again before each subsequent build phase.
  • Claude Code connected to the AWS CLI means you can provision and configure cloud infrastructure entirely in plain English without touching the console.
  • A cost cap check before every agent turn is not optional on AWS — token burns map directly to real cash charges, not just API credits.
  • Write-once audit logs that persist for years are not a nice-to-have; in regulated industries they are a compliance requirement that personal stacks cannot satisfy.
  • Least-privilege IAM means every team member starts with the minimum access possible and gets more only when explicitly granted — the opposite of typical personal builds.
  • Bedrock Guardrails can scan both inbound prompts and outbound responses for PII, API keys, credit card numbers, and Slack tokens before they ever reach a user.
  • Open-source models like Qwen or Amazon Titan can handle routine grunt work inside Bedrock while Claude handles planning and reasoning — cutting inference costs significantly.
  • A 'kill switch' is not just a metaphor: the platform has 42 individual service toggles and a single panic button that shuts everything down in one click.
  • The Jarvis voice agent is deliberately read-only — it can navigate tabs, read live data, and explain it aloud, but cannot write or mutate state without an explicit permission unlock.
  • SOC 2 and HIPAA readiness scores against your live infrastructure are useful starting points, but a human cybersecurity professional must validate the gaps before any real compliance claim.
  • Multi-tenant access means the same platform can serve multiple team members with distinct roles — viewer, operator, admin — each scoped to what they need and nothing more.
  • Hosting a custom Hugging Face model inside your own AWS account costs money per hour but keeps inference inside the security perimeter with no external API calls.
Takeaway

How to move an AI agent stack into enterprise infrastructure

WHAT TO LEARN

The difference between a personal agent OS and an enterprise-ready one is not capability — it is a systematic substitution of every convenience-optimized component for an auditable, kill-switchable, compliance-adjacent AWS equivalent.

01Live demo — Jarvis on the dashboard
  • Starting a demo with the finished product speaking for itself — before any explanation — is a faster credibility signal than a slide deck introduction.
  • A spoken AI briefing that covers spend, agent status, compliance score, and audit state in under 90 seconds defines what 'mission control' means without requiring a lengthy definition.
02Platform tour
  • A multi-model playground inside a secured account lets you compare Claude, Qwen, DeepSeek, and GPT-4o on the same query while tracking the inference cost of each — cost visibility and model flexibility are not mutually exclusive.
  • Shared agent memory across Slack and Telegram means a user picks up the same conversation on any surface, which is the feature enterprise clients most often ask about before compliance.
04Five enterprise variables
  • Transparency, Simplicity, Scalability, Security, and Cost are the five levers that determine whether an AI build survives contact with an enterprise IT team — optimizing for any one at the expense of the others creates the next blocker.
  • The reason most AI tools are inaccessible to enterprise users is not price or capability — it is the inability to answer 'where does our data go?' with a verifiable technical answer.
05Service mapping — Hermes to AWS
  • Mapping each personal-stack component to an AWS native equivalent preserves the architecture you already understand while satisfying the data-residency and auditability requirements that block enterprise adoption.
  • S3 is not just file storage — it can also vectorize documents in place, making it both the storage layer and the RAG foundation without an additional service.
  • Secrets Manager ensures that no agent ever reads credentials from a file on disk — every API call retrieves the secret at runtime, which is the single change that closes the most common enterprise security objection.
06Driving AWS with Claude Code
  • The AWS CLI is the bridge that lets Claude Code provision and configure cloud infrastructure from a plain-English prompt — the goal is to use the console only when the CLI asks you to confirm a policy or credential.
  • Planning for two weeks before the first build sprint sounds slow but is the single most important factor in whether the security model holds — infrastructure decisions made under time pressure become technical debt in regulated environments.
07Architecture — request flow
  • A single chokepoint through which every agent turn must pass — rate limit, kill switch, cost cap, guardrail, tool dispatch, DLP scan, audit — means security is structural, not bolted on after the fact.
  • Checking the cost cap before executing a request (not after) is the difference between a predictable infrastructure bill and a surprise charge that closes the project.
08Model flexibility and data plane
  • Using Claude for reasoning and Qwen or Titan for grunt work inside the same Bedrock account gives you frontier-model quality where it matters while cutting per-turn inference cost on high-volume routine tasks.
  • All agent memory living in encrypted S3 buckets that can be wiped in one click is the architectural guarantee that makes data-deletion requests (GDPR, client offboarding) trivially fulfillable.
09Security posture — six domains
  • DLP scans that check outbound responses for AWS access keys, credit card numbers, Slack tokens, and GitHub PATs are the guardrail most personal stacks skip — and the one most likely to cause a breach.
  • A compliance posture dashboard with a remediation brief (a prompt you can send to a language model to close each gap) turns a static audit into an actionable improvement loop.
  • SOC 2 and HIPAA readiness scores are starting points, not certifications — a human cybersecurity professional must validate the gaps before any compliance claim can be made to a client.
10RAG and multimedia inside the account
  • Embedding documents with Titan Embed and storing the vectors in S3 keeps the entire RAG pipeline — ingestion, storage, retrieval — inside the security perimeter with no external embedding API calls.
  • Image generation via Nova models inside Bedrock trades some output quality against the guarantee that no prompt or image crosses an external API boundary — a meaningful trade for clients with data-sensitivity requirements.
11Team access and Jarvis wiring
  • Role-based access (viewer, operator, admin) scoped to what each team member actually needs is the organizational complement to least-privilege IAM — the same principle applied to the application layer.
  • Making a voice agent read-only by design (it can navigate and explain but cannot write) is the architectural decision that prevents a casual voice command from becoming a production incident.
Glossary

Terms worth knowing.

AWS Bedrock
Amazon's managed service for accessing large language models — including Claude, open-source models, and Amazon's own Titan — inside a private VPC, with no data leaving your account.
IAM Role
An AWS identity that defines exactly which services and resources a user or service can access. Enterprise deployments use least-privilege IAM so each actor gets only the permissions they need.
S3 (Simple Storage Service)
Amazon's file storage service, functionally equivalent to a private Google Drive or Dropbox. In this architecture it stores agent memory, uploaded documents, and vectorized embeddings.
DynamoDB
Amazon's managed NoSQL database. Used here for fast key-value storage of agent state, session history, and memory records.
Secrets Manager
An AWS service that encrypts and stores API keys and credentials in the cloud. Agents retrieve secrets at runtime instead of reading from a local .env file.
Bedrock Guardrails
AWS-native content filters applied to both prompts and model responses. Configured to block PII, API keys, and other sensitive data from entering or leaving the system.
Fargate
AWS's serverless container runtime. Used here as the orchestration layer that routes incoming requests from Slack, Telegram, or the dashboard to the correct backend microservice.
VPC (Virtual Private Cloud)
An isolated network inside AWS where your services run. Traffic between services in the same VPC never touches the public internet.
SOC 2
A security audit standard used by most commercial companies to demonstrate that their systems protect customer data. Compliance requires meeting criteria across security, availability, confidentiality, and privacy.
HIPAA
A US federal regulation governing the handling of protected health information. AI systems used in healthcare settings must demonstrate HIPAA-compatible data handling to avoid regulatory penalties.
DLP Scan
Data Loss Prevention scan — automated inspection of messages or files to detect and block sensitive information (credit card numbers, API keys, SSNs) before it is transmitted or stored.
Least Privilege IAM
The security principle that every user or service gets only the minimum permissions required to do its job, reducing the blast radius if any account is compromised.
Titan Embed
Amazon's in-house embedding model available on Bedrock. Used to vectorize documents so they can be stored in S3 and queried by agents using retrieval-augmented generation.
Kill Switch
A per-service toggle in the platform dashboard that instantly disables a specific AWS service or agent. A panic button shuts down the entire stack in one action.
Resources

Things they pointed at.

01:25productAWS Bedrock
02:50productTelegram
02:50productSlack
03:52productSalesforce MCP integration
09:40productHermes agent (open source)
09:40productOpenClaw agent (open source)
14:42toolAWS CLI
19:39productAmazon Titan
09:09productClaude Code Zero-to-Hero living course
Quotables

Lines you could clip.

01:25
Your SOC 2 and HIPAA readiness is scored against the live setup, so you always know exactly where you stand.
The demo agent speaks this line in the opening 90 seconds — hits the compliance pain point before the host says a word.IG reel cold open↗ Tweet quote
09:33
As soon as they ask one question — where does their data actually go, how long is it retained for — everything stops.
Crisp articulation of why enterprises can't participate in the 'AI party' — the exact blocker every agency owner hits.TikTok hook↗ Tweet quote
10:23
If you're solo and security is not top of mind, stick with the OpenClaws and Hermes. Everything beyond that is where the next portion of this will lie.
Rare moment of honest audience filtering — tells most viewers to leave, which makes the ones who stay feel they've earned access.Newsletter pull-quote↗ Tweet quote
23:50
Everything is meant to be switched off with one click of a button. You could completely nuke and delete everything in one shot because there is no copy on someone else's server.
The data-sovereignty pitch in one sentence — lands the core benefit of an owned-cloud stack versus a SaaS platform.TikTok hook↗ Tweet quote
34:50
I probably planned for two weeks before I started building and then every single time for the next phase, I would spend another couple weeks planning before the next iteration of building.
Counters the 'just vibe-code it' narrative with a concrete planning cadence backed by real results.Newsletter pull-quote↗ Tweet quote
The Script

Word for word.

Read-along

Don't just watch it. Burn it in.

See every word as it's spoken — crank it to 2× and still catch all of it. The same dual-channel trick behind Amazon's Kindle + Audible.

metaphoranalogystory
00:00Alright. So watch this. I'm gonna navigate to our custom dashboard here and hover over our new Jarvis agent and have her walk through a quick TLDR
00:09of what you're looking at. Welcome back, Mark. Let me catch you up on what your team has been doing.
00:14Research wrapped the competitor pricing, ops staged the v 1.2 rollout, and I pulled it all together. That is your whole operation at a glance, three agents working and all of it inside your own AWS account. Your kill switches are set the way you left them, and least privilege is holding across the board.
00:30Your SOC two and HIPAA readiness is scored against the live setup, so you always know exactly where you stand. Every response is checked before it leaves, so nothing sensitive has slipped out since you were last here. Spending is right in line with usual, and the daily cap is still well clear.
00:44And every action since you stepped away is logged, named, and reversible, so nothing here is a mystery. You are all caught up. Everything is calm and under control.
00:52So we looked at right now wasn't OpenClaw,
00:55wasn't Hermes agent, and it wasn't something you can just pull off the shelf. Because this is one of multiple agents running in what's called AWS Bedrock. For months now, anyone who's an individual or a very small company have been able to take advantage of all these open source frameworks and use them pretty much out of the box.
01:11But if you're a more mature company, a nonprofit, or even an entity that just has to stick to very strict guidelines, you've been pretty much on the sidelines for the past few months.
01:20You haven't been able to create your mission control for enterprise or even take things like Salesforce or Slack on the go securely through your mobile device. The goal of this video is to walk you through more than a month of work and almost 10,000,000 tokens to put together a platform on AWS Bedrock.
01:37And you can use the exact blueprint and how I'm gonna walk you through how I built this to emulate it into other environments like Azure, like GCP, and other cloud environments as well. And if you stick around till the end, I'm gonna show you what I built and how so you can emulate it for yourself, a client, or just be aware of what it might look like to build these kinds of products at a more enterprise level.
01:57So if you're ready to take your agentic workflow building to the next level and look at what the next league looks like, then let's dive in. Now before we get into the nitty gritty, it would be rude of me not to give you a quick tour of the entire platform as I have it. So really quick, we have this overview tab, and this tab gives us a breakdown as to how much we've spent because building on these kinds of platforms can be expensive, especially on the inference side.
02:19Then we have the list of agents that we have, recent sessions that we've engaged in, and we can engage in these sessions from things like Telegram where we can pop into my assistant right here and I can ask it as I did here to create an ASCII diagram as to how it's designed and I can even go into something like Slack where I can have the exact same conversation where that Telegram agent has the same shared brain, same shared memories, and most importantly, the same shared entire stack that is fully secure in my AWS environment.
02:49And just in case you're not familiar, AWS stands for Amazon Web Services. Just because I'm gonna keep saying it over and over again. Adding an agent is very straightforward.
02:56All I have to do is click through here, click on here, click on new agent, add an ID, add some form of template, add its role and responsibility, it's assigned model, then and I can set it up on the back end. And once we have these agents set up, you can always click into them, click on their system prompts, edit them from here, have a chat with them, go through your different sessions, your memories, connect them to things, like I said, a Slack app where you can actually create this out of the box very quickly as long as you have the manifest, or you can even ask Claude Code or Codex to make it for you.
03:25And you can easily connect it to something like Telegram. The hardest part is just understanding how you would go about assembling such a stack. Now the functionality of the chat and playground tabs are very similar.
03:36On chat, you could just have a conversation to test out and make sure that your agents are up to standard in the playground. A lot of people don't know that on these platforms like Amazon Web Services, like Azure, like GCP, you can have conversations with Claude, but also other open source models as well.
03:52So if you wanna use Claude, you'll always be able to because Anthropic and AWS have a partnership. But on top of that, they keep adding brand new models like Quinn, Moonshot, Meta, Mistral, some Amazon native models that we actually end up using for Jarvis because the entire goal is to also keep everything within the environment.
04:09No external calls to things like Gemini outside of your core servers. And of course, we have the beautiful Jarvis tab where you can not only have a conversation here directly with Jarvis and have it navigate through tabs as you saw in the demo and answer questions about your system, but if you feel like actually sending a request textually, you can and you can do what I showed you earlier, which is a routine where I created a routine that will go through and give you a demo of the platform with a voice over.
04:34You can also create a routine and tell it, I want you to go through the agent tab, then the usage tab, then the cost tab, and this is what I want you to break down when I click on that specific button. These next tabs are straightforward. You have just a record of all of your sessions you've had with your agents, your usage across the board in terms of spend.
04:51One thing to be very aware of is if you start building on these kinds of platforms, just the host infrastructure costs money, a fixed amount per month on top of the variable amount, the ad on top. So this isn't a substitute for a personal solo build of something like Hermes agent deployed on hosting or railway or whatever cloud provider you might prefer.
05:11Moving on to the cost tab, it'll give you a TLDR of your overall spend and your projected spend, and you could see exactly where your culprits are, where you might have recoverable costs because sometimes when you build on these platforms, there are tons of hidden costs if you don't know what you're doing. You could always scroll the bottom, get an AI generated report of exactly where the costs are sourced from.
05:31You could see what services are responsible for what cost, and then you can go back and forth with your language models of choice to see, do you need this service based on what you're trying to accomplish with your personal agents or your agent platform. Then we have the audit log of every single thing that's happened in the system, which is complemented by the ability to have kill switches.
05:48So across the board, when it comes to all the different services running, if you wanted god access to turn off any service for whatever reason, then you could do that from here. You also have a panic button that will stop the entire server and everything associated with one click of a button, and you have different ways to filter responses that I will get to in the video.
06:07And then for many people that are concerned with things like compliance, this is not Fort Knox, at least not the way I've built it myself. But you can use this structure or mental model to start getting on the path to building something that is SOC two compliant or HIPAA compliant for those in the medical care industry.
06:23And one thing to help that here is I have a full breakdown of what is partially meeting or not meeting those standards, and I even have what's called a remediation brief where you can click on this and it will walk you through a full breakdown of a prompt you could send to a language model to help you close the gap slowly but surely.
06:41But naturally, you will want an IT, a second set of eyes on this that actually is aware of security because all the devils in the details matter. Now when it comes to things like integrations with things like Hermes and OpenClaw, life is good.
06:53It's very easy. You're one click of a button or one prompt away. But when you work at a more high security standard, you have to worry about exactly how you're talking to these services.
07:03So things like Amazon have MCP server marketplaces of their own. So you could theoretically hook up your agents. You can ask something like, can you go through my recent Slack messages and just tell me the first name of the person that last sent me a message?
07:18And just tell me what that message was. So then I can send this over. And then within seconds, this will make a tool call to our Slack MCP that's securely going through our AWS server.
07:28And after that, we can ask a question of our Salesforce. So I can say, can you walk me through going into my Salesforce, the top 10 deals or opportunities that we currently have right now?
07:38Then I can send this over, and this should be able to talk to my Salesforce integration in real time, go through the MCP, and pull the information we need. And there we go.
07:48We have the hypothetical 10 opportunities by value with the name of the company, the close date, etcetera. Now for the sake of time, I won't go through the rest of these right now. We'll get to them in due time, but pretty much all of these have showed on this channel before for the personal build of Clawd Claw that I showed in the past, where you have a knowledge tab where you can drag and drop information right here, talk to your data, have your agents access said data to answer any question on it.
08:12You could set up missions for them, have memories, create a hive mind for all of their memories in one unit, then you can always have a group chat with them in one zone if you wanna test out or ask a question to all of your agents at once. And since this is made for more of a mature audience, you can always change things like the branding.
08:30I can switch this to be my agency theme, so my prompt advisors theme. I can switch the colors around as needed, etcetera. And by the way, I know this is a lot, but I'll be breaking down all these concepts a lot more clearly throughout the rest of the video, and there's a whole care package waiting for you absolutely free.
08:44But on top of that care package, if you like the way I teach and break down these more intermediate concepts that you don't really find on YouTube, then you'll wanna check out the first thing down below. You'll not only get access to additional training on this concept, but also all the other concepts related to workflows and Claude code in our Claude code living course.
09:02We add a brand new module every single week to stay on top of the cutting edge of agentic AI and agentic operating systems. So if you're interested in mastering it, then check out the first thing down below and maybe I'll see inside. Alright.
09:13Back to the video. Now that I've given you a tour and a preview of the finished product, I'm gonna go through the nitty gritty now, get into the weeds, and walk through the bedrock of the enterprise operating system. Because a lot of the concepts and paradigms are similar to a personal operating system, but the thing with enterprise is you wanna focus on five main variables.
09:32And those five are transparency, simplicity, scalability, security, and then cost.
09:38And when it comes to all these parameters, you do all this heavy lifting on the back end to make something unbelievably simple and easy to navigate on the front end. If you haven't worked for or with an enterprise client, it's important to remember that for the most part, a lot of this AI party that the majority of us have fun in and build and experiment with, they're not able to take advantage of.
09:59And that's because as soon as they ask one question, specifically the IT teams, where does their data actually go, how long is it retained for, all of these buzzkill questions. But in reality, these are questions that are fundamental to work in the real world.
10:14Everything stops. Everything's on pause, and you have to find ways to MacGyver solutions to get to the same result. So again, if you're a solo practitioner and security is not top of mind, you don't need this very high level of transparency, then stick with the OpenClaws, Hermes, or my build the ClawdClaw agents.
10:31Everything beyond that is where the next portion of this will lie. And just to help bridge the concept, let's say on the left hand side here, we have Hermes agent. And with Hermes, it's a beautifully built open source project, and kudos to the team for developing it.
10:45You have the ability to interact with your local files. You can have your local environment secrets just sitting in a file on your computer, which could never happen at an enterprise. And to help bridge the gap between the difference between a Hermes agent operating system and something like this is you can just imagine you're mapping each one of those services or functionalities to be a native service that lives within whatever environment you choose.
11:08If you chose Azure, then the exact same concept would play out. Whatever services or microservices live there where you can substitute one functionality for the next is what you'd select. So to make this more tactical, with Hermes, you have your hosted models or your models you're connecting to via API.
11:24You have your local files. You typically have one operator. You have a local database or something like Supabase where you're storing your information.
11:32You have the privilege of storing your files or passwords rather in an environment's secrets file. So this could be one small file on your computer, an enterprise could never. And then you pretty much have no audit, no rails unless you absolutely need to.
11:45You're just kind of experimenting and playing around because most likely your blast radius is pretty minimal. What you have to do to go from here to here is basically find a way to map each one of these services.
11:57So when it comes to the model, you have something called like I referred to before Amazon Bedrock. This is the playground where you have access to all of the language model in a secured VPC that you can access and you can do one API call at a time as you would with most consumer grade APIs.
12:13Then you have local files. You would map that to something that's called s three. S three, you can think of it as Google Drive equivalent where you can drag and drop all kinds of files.
12:23There's even files you haven't heard of before called Parquet files, but I won't get into that. But you can just think of it as a landing place for tons of files. And with s three, you can also vectorize a lot of these files in place.
12:37So not only can they store it, but they can also make it ready to be able to speak to different agents, different rag applications, etcetera. When it comes to Hermes, you have one operator.
12:48So that's just yourself, you turning on your laptop or your empire of Mac minis. In an enterprise, you would need something that's called I'm roles, and this is known as different things with different platforms. But an I'm role is basically a list of what kind of access that you have as an individual member or team member in a specific company.
13:08So what you can see, what you can download, what you can't download, etcetera. When it comes to things like databases in the Hermes world, you could use something like a SQLite or like I said, a Suba base or a Postgres database. You can also use that in things like AWS except you have way more variety.
13:23You can use these things called DynamoDB, KMS. You have many options for databases and it really comes down to a combination of cost, need, and how easy it is to actually manipulate and use these databases out of the box.
13:36Like I said, when it comes to secrets, can't just store it on a file. You can store it in what's called a secrets manager where it basically encrypts all the secrets and it lives in the cloud. So every time you wanna invoke a service or API, your agent will go and speak to the secrets manager, grab the credentials from there, execute the request, and then pretty much you're good to go from there.
13:55And in the world of AWS, you would conduct all kinds of guardrails. You'd have some audit in place to make sure they have the right files in the right place, and they have the right policies for the right people to make sure that no one can access or see something that they shouldn't. If you're less familiar, this is what AWS looks like on the front end.
14:12So you have a series of services here. You have all kinds you can access. And if you're just starting out, this is overwhelming.
14:19You don't know what to click, what not to click, what the word code build means, what code commit means, and where you should actually use them if at all. So the one key advantage here is that our goal is not necessarily to become experts at using the console, but using the console to get the credentials you need to interact with it from your language model of choice, whether it be Cloud Code, Codex, or whatever you want.
14:42So instead of looking at all of those different logos and icons, you can just connect what is called the AWS, the Amazon Web Services command line interface.
14:52It'll ask you for a series of credentials. And once it has those credentials, it can start building things on your behalf. Once in a while, it will push you and ask you to go and click on accept for things like credentials, policies, and updates.
15:05But for the most part, it can help take care of the heavy lifting, and you can speak to it in plain English. You have to create a free AWS account. Make sure you use the free tier and know that if you start building real infrastructure and put your credit card, it will charge it.
15:18So, again, only experiment with this if you have an intention for a corporate use case or client use case. Once you have the CLI connected, now you can rely on something like a Claude code to drive and build and set up these services, and you can go back and forth quite a bit in plan mode to understand.
15:35Here is exactly what we're trying to accomplish. Here's our dream vision. Here's our goal, if you will.
15:40What services would we need? And what is the best bang for the buck that we can get? So you can go back and forth, and it's not gonna be perfect, but it will give you a pretty good cheat sheet that I wish I had five, six years ago.
15:51If your eyes are glazing over already, I'm just gonna show you this next diagram that really shows the big picture in one shot. And beyond that, I'm gonna start splicing every single tab that I showed you and show exactly how it works underneath the hood. So this is pretty much the breakdown of how this works.
16:07We have the telegrams. We have the Slacks. We have the dashboards on your computer, and they all go through one central hub.
16:14This central hub basically is called a load balancer. And the whole point of a load balancer, actually many applications of it, but one of the main applications is it helps manage balance the load of different requests from different platforms.
16:27So theoretically, if I had five, ten different devices that are all accessing this platform and the agents at the same time, it would help with things like queuing requests to make sure that we don't have a bunch of concurrent requests requests happening at the same time that are shooting, maybe one is failing or timing out.
16:43So that is our little gateway here where all roads lead to here. Then we have what's called a Fargate Orchestrator. This is basically a way to navigate and decide what is the microservice.
16:55And when I say microservice, it is one of these services like a DynamoDB, a database, uh, s three, again, your Google Drive or Dropbox equivalent, your KMS where you have a lot of your credentials, your guardrails, or Amazon Bedrock for things like inference, sending requests to the models.
17:10So this will decide based on request from platform, where do we go next? So if it's something to be stored, if it's something like a memory to be stored, it might be sent to DynamoDB.
17:20It might be stored in s three. We might make a request to something like Claude from Bedrock, which gives us back a response and then gets propagated back to all the platforms of choice. So this is the stack in a nutshell.
17:32And now we can peel each layer of the onion to look at the next parts. So let's say we go through one turn, one message that we send from Slack or Telegram. What happens to it?
17:42So this is said message. We first check, are we hitting any form of rate limits? Then we load the agent that we're talking to.
17:49So in that case, that was the Telegram based main agent. Then we go and double check that we have access to Bedrock and to make sure that the kill switch, the toggle that gives us access to it is on. So we're able to access it from our account.
18:02Then we double check, are we past our cost cap before we even execute the request? Because we have a budget, and like I said, things like Amazon Web Services can burn, not just tokens, actual hard cash very quickly. So if you wanna be able to cost cap it, we wanna check if we have that in place.
18:18Then we go through a couple other layers. One could be a guardrail, which I'll get to more in a bit. And then we basically dispatch any tools we need.
18:25And once we get the response from Bedrock, what happens is we run a scan. And what we do in the scan is we double check is there any form of PII, company secrets, anything that is on our list of things we don't want propagated or responded to back to a user.
18:39If it passes that and passes the audit, then you get the response back. From the infrastructure perspective, you start with a telegram message. This gets sent to our load balancer, that balancer for requests.
18:50Then we verify that we have the right account, that has the right access, that has the right secrets. We then double check we have the right agent that we're talking to, then we conduct the agent turn that I just showed you above. And this is where all that magic happens in that funnel that we went through.
19:05Then if there is any form of conversation that used to happen with other agents, it will happen. It'll go through that scan once again to make sure all messages that are applied to to Telegram basically abide by your list of commandments of what it can or can't say, and that's pretty much it. Now the beauty of using something like Bedrock is that you're not fully dependent on Anthropix Cloud models.
19:26You can use them if you wish, but naturally since you're paying API costs not through some subsidized subscription, it's gonna be pricey.
19:33So you could use these models as your orchestrators, the planners, the deciders of what will happen and how it will happen. And then you can pass on the grunt work to open source like Quyen.
19:44It even has something like Titan where Titan is a native AWS based language model. It is not the best, but for a very menial work, go identify this, go tag all of these, it will do the job. And one last thing on this is that in case you don't want any of these models and you wanna take an additional model from the outside, let's say you found something on Hugging Face, you could theoretically host it on your AWS account.
20:08Now it will cost you per hour to host, but it's always an option that you have here that's way more turnkey than doing this yourself if you can't use something like an open router. And one thing to emphasize on the data plane side is that all of your agents' memories, system files, folders, etcetera, live in these s three buckets that are encrypted with your keys.
20:27So everything is meant to be switched off with one click of a button. You could completely nuke and delete everything in one shot because there is no copy on someone else's server. It all lives in here by design.
20:39So all of your inference, your API calls to models live in one place. All of your files and folders live in one place, and all of them live in the same ecosystem. Now this is less exciting, but when it comes to things like security, there are six main domains to think about, and this is not exhaustive.
20:54Like I said, what I built is not maximum grade security. There's definitely so much more that a cybersecurity expert would wanna tweak to make this perfect. So it's up to you to talk to your cybersecurity professional, IT person of choice to really see how this could work for you, your clients, etcetera.
21:11But we have the kill switches where depending on the type of service like we saw before, we can switch it on or off as needed. And then when it comes to any action on the platform, if you add any members, because this is a multi tenant platform. Multi tenant just means that you can have additional users kinda like a SaaS that can use the same platform in different ways.
21:32So every single action that happens on the platform is written into a log that does not delete, that basically persists for up to however long you wanna retain log actions.
21:42That could be three years, five years. It could depend on regulations in your country, etcetera. And those can be stored in databases, in s three buckets, etcetera.
21:51And then when you have things like cost, like we said, when we reach or surpass a certain cost cap, everything is automatically closed. We can't talk to our agents until we reach the cap for the next day, and we have a brand new quota. And then we have things like bedrock guardrails, and their entire point is to make sure that it catches any personal information before it goes in or out.
22:10Then we have concepts here that are little bit more advanced, but I'm gonna break it down very simply. So least privilege I am. This basically means that we're going to give each additional user the most minimum access possible to things like resources, documents.
22:25So if anyone needs more access, we can give it to them voluntarily, but we don't start off by giving additional users that access the platform and access and create these agents with everything they want out of the box. And then we have constant scanning for credential leaks across the entire platform as well. So if you go to the designated part of the platform that I've built out, you have a whole section that's called security posture.
22:48And what this means in English is how ready is your organization for all of the threats and issues that you might encounter in the day to day, and it will be at different levels of severity depending on what's at stake in the company. So this is where you could navigate and see what is toggled on, toggled off, and see whether or not it reaches your standards.
23:08When it comes to these rules that you assign to individual users, you have things like policies. Now when it comes to these policies like you can see here, they can be annoying to write. Naturally, you can use language models.
23:19So I have an AI policy drafter here. Now if you have no clue what a policy even is, then you can glance past this. I'm more so show this to show that there are many unsexy elements when it comes to building at the enterprise level.
23:31And finally, it comes to observability, there's something called CloudWatch. You can think of CloudWatch as this mega dashboard command center.
23:40We can see metrics, logs of events being fired in the system, and you can really customize it. Then you have an audit trail that acts as a maximum version of this audit log here. So going back to here, this is where we have the readiness assessment like I showed you initially.
23:55So you can go into this section on compliance right here, and then you have the scan that you can keep double checking. You can change the scan logic.
24:04Now when it comes to the underlying regulations, I just use CLOG to pull the latest version that it could find of SOC two and HIPAA. If you don't know what SOC two or HIPAA is, they are basically security requirements.
24:15SOC two is for the majority of types of companies. HIPAA is typically for health care and life science oriented companies. These regulations might differ by region, by country, by domain.
24:27So you wanna just double check that if you want to have AI help you make something quote unquote compliant, that you wanna battle test it, make sure you have the right information, and ideally the right personnel to have the human in the loop that's necessary to make sure that it's accurate. Now another element that we touched on was leak scanning.
24:44And when it comes to scanning messages, these are examples of things that you could be scanning for. So access keys to your Amazon Web Services, your credit card information naturally, any form of SSN numbers, Slack tokens, could be Salesforce tokens, could be GitHub PATs.
25:00So another of these six elements that we mentioned were DLP scans, and what we're scanning for are these as examples. So things like access keys, access keys to GitHub, credit card information, Slack tokens, Salesforce tokens, any form of API key header, anything that could be leaked and used against yourself or the company.
25:20Now switching gears a bit, when it comes to talking to your data, one of the most popular requests from any company that we've ever worked with since 2023. This is how that looks like. So you feed a document to the system.
25:33It is ingested, then it's embedded. Now typically, if you use things like Hermes and OpenClaw, you would just pick the cheapest API from whatever provider.
25:42So it could be OpenAI embedding small. You could use the Gemini models, and you'd use the typical consumer grade app. Now when it comes to here, again, we wanna optimize for having every single thing run within the system itself.
25:54So we can use something like Amazon's Titan embed, which is decent, to take the documents and vectorize them.
26:02Once they're vectorized, we store those vectors in our s three bucket, then we have the ability to have a conversation with it, and then we lock in our memory system that can take advantage of things like memories where haiku is used as a workhorse model because it's super cheap to look through the last few turns of our conversations to see, is there anything salient here that is worth remembering.
26:23And then that can be committed to the hive mind that we have in general. Then you go through this feedback loop of having back and forths, having those memories stored, and having those memories prioritized in recency, weight, and importance.
26:37So if we go into our knowledge tab here, all you'd have to do is this can take all kinds of files. You just go to browse, and then you can click on something like the prompt pack that I'm gonna make available to you to help you use something like AWS CLI to get started on this kind of build if you so choose.
26:53This will go through in real time, upload, extract, chunk, embed, and you could obviously mix and match how this is created and culminated. And then you get this file down here. And the goal here is that you can click on something like Sonnet, and you can just test asking a question.
27:11So what is in this file? And then you could say ask, and this should be able to go and send the request to Sonnet, double check against the prompt pack itself, and it comes back with a full breakdown of what's in here.
27:27And one detail here is that we're not necessarily doing reg for one singular agent. We could be doing it for shared agents where all of them will have access to all of the files that we upload. So this really comes down to preference, security, and all the other requirements.
27:41But if you wanna check how GPTOSS models and DeepSeek would all handle this exact same question, they can also compare this as well.
27:49So you can see now we live in an age where you can cannibalize a lot of platforms that have come out to do this very similar type of task, and you can do it in a way where you can even track cost of each and every inference that you send in. Now when it comes to managing multimedia files, we use something like Hermes or OpenClaw, it would be as easy as adding a Gemini skill to use the nano banana or if you want to use OpenAI, can use the GPT two image model to create images as you wish.
28:17But when it comes to this kind of platform, you'd ideally want to find a model that is native to Bedrock that you can use for the exact same purpose. And that's where that mapping concept from earlier becomes important.
28:28So in our case, we can now use Nova models not just for audio but also for images. So if I say something like create an image of a mango with a cross section breaking down each and every part of it and I paste that over here, and I send this through not only Slack, but I send it through my telegram as well, within a pretty short amount of time, it should go and be able to use the Nova models to securely make this image, propagate it back to us through the system.
28:58You can see right there, it is generating the image right here. And within five, ten seconds, we should return back an image that we can actually click and interact with. Now is this as sophisticated as Gemini models?
29:10No. But for most intensive purposes, it could be helpful to accomplish very similar results without having the headache of sending an external request where things can happen when that request is in transit. Now when it comes to managing your team, I've tried to make it as easy as possible for you to go to your members tab, click on invite member, just add the member, and then you can change the role to something like view only, then you can save that role.
29:36Then that member, when they receive an invite, it will come from a micro service in AWS that's called SCS. If you don't know what SCS is, if you ever heard of resend, which allows you to easily create a pipeline to send emails, email blasts, etcetera, resend is built on top of this micro service.
29:52They're basically making a margin by making it easy to use the service. So you would get an invite right here. This invite would culminate in a special link just for you.
30:01Once you log in, this will send you a magic link. And then once we have that magic link, it will allow us to see the platform from a different member's eyes. So the idea with access is that you have one company command center with a series of admins or a single admin and maybe have your operators go and access and use the agents, whereas a viewer or an analyst can only look at the overall numbers.
30:24You could build out different dashboards to show different outputs from your other platforms. You could have an operator role that could use the agents and access the data, but not have an understanding of what's going on when it comes to cost capping. So you'd be able to designate different roles, responsibilities, and access as needed.
30:41Now on top of everything else you wanna think about, when it comes to that beautiful Jarvis feature, even the way that works needs to be very well engineered. So when it comes to Jarvis, it's great to have it go through and navigate through tabs as you wish and click through and explain exactly what's happening on that tab.
30:58We wanna make sure that you can't just turn a voice command into something destructive that destroys data, that ruins some form of back end structure for a security requirement. So you design Jarvis to be read only and explain only.
31:12So what we use behind the scenes for the audio is the Nova model from AWS, and then we use Claude for reasoning. You could swap this out again for any open source model that's on the platform.
31:23If there is data that needs to be read so for example, if I click on our beautiful Jarvis agent and just speak to it and say, hey. Can you break down exactly what these costs are doing? It should be able to look at that page, was really hard to set up, scrape all information from the back end, quickly understand exactly what's happening, and explain it to me in plain English while holding a conversation.
31:43So if I click on it, it will take the risk here of a demo. Good to see you, Mark. What can I help you with?
31:49Yeah. Can you go on the cost tab real quick?
31:52Sure thing, Mark. I'm on the cost page now.
31:55Okay. Can you just explain to me where our projections are coming from? But explain it to me in one sentence.
32:01The projections are based on the current daily spend rate and your active tiers cost structure. But why is it so high? Or is this high, or is this average for my kind of build?
32:10Your spend is actually well below the weekly average, and the daily cap is still far out of reach. Okay. So you could see right there if I just click on this and we close our conversation, I was even able to build analytics in real time just looking through all the tabs.
32:25So it's really important that you overengineer the explainability portion, the fact that it has some social cues as to how to communicate the information, but it's meant to be a read only function.
32:37So if we go a little bit deeper here. So it's meant to be a sci fi feature. We can actually make it a tangible way to navigate and explain exactly what's happening on the platform, kinda like a glorified walkthrough.
32:48So it will walkthrough and read for the most part. If you want it to write, it will be blocked unless you really enable that and you wanna make sure exactly what you're enabling it for. And when it comes to accessibility, it's important to have some form of walkthrough.
33:02So in the platform itself, I created a full tour where you can click tab by tab and see exactly what each thing correlates to, how it works, how it relates to each other. So having this level of accessibility for a platform, assuming that you wanna go on this journey or tell someone else to go on this journey, is important because this is a very iterative process.
33:22It'll take a lot of work, a lot of security screen, and a lot of innovation to make sure that you can use AI in a practical way, but also in a novel way as well. Now this video could naturally be three to four hours, and I still wouldn't be done explaining how I built every single feature, everything that had to plan out, how I planned it out, and how I did a lot of the context engineering to make this viable.
33:44But the core idea I wanted to convey is that one, the whole world of Open Claws and Hermes are amazing and cute for personal usage, but when it comes to actually dealing with clients and enterprises at a much higher level, you have to have different conversations, you have to have different builds, and they can't participate in a lot of the AI party without having to think through and engineer these kinds of solutions.
34:08And the bright side is, although I absolutely have prerequisite knowledge to do this at a higher level, even as a non technical person, I'm not telling you you could do this in a very short amount of time, but you could have a shot at actually drafting the infrastructure, tinkering, and seeing if you can get up a v zero or a v one, they could hand off the rest of your team.
34:27But even as a nontechnical person, when you hook up these very intelligent models to something like the AWS command line interface, you have all these different options and all the micro services that Claude or Codex could go through and really understand what this service is for, how you could use it, and really comes down to supreme planning.
34:47I probably planned for two weeks before I started building and then every single time for the next phase, I would spend another couple weeks planning before the next iteration of building. When it comes to the enterprise OS system, it looks similar to a personal system, but there are many more wires in between.
35:04They have to make sure are there, many more bridges, and a lot more thoughtfulness as to what's happening to the data, where are your requests being sent, and how can you interact with language models in a responsible way. Now like I promised, I have a series of care packages that I'm making available to you for free including a blueprint of this entire system, a copy of this slide deck, a breakdown of a series of prompts that will not build this for you.
35:27This has taken a lot of time, but will at least set you on the right path and the mindset on how you could begin tackling this if you want to try this in your company, in a company you work with, work for, and basically give you the building blocks to get started. So to grab those resources, make sure to check the second link down below and you'll find all the care package you need.
35:46But if you want my exact build and the subsequent improved versions thereafter and eventually hopefully a version of this for Azure since a lot of people are in public sector that use Azure, then you wanna check out the first thing down below. So as a premium or annual member, you'll get access to our ClawdClaw OS system that we've been improving for four to five months.
36:04I now have a whole team on this that has been making iterations ever since I released that video. And on top of that, you'll get access to this repo and the subsequent versions thereafter. So if you wanna master Agendik OS at different leagues, then that will be your best resource for you.
36:17And to the rest of you, if you found this novel and helpful, the one huge thing you can do for me is leave a like on the video, a comment if you so choose, and I'll see you in the next one.
The Hook

The bait, then the rug-pull.

The demo opens before the introduction does: a custom dashboard, a live AI agent called Jarvis, and a spoken briefing delivered in real time — 'Your SOC 2 and HIPAA readiness is scored against the live setup.' Only after the product speaks for itself does the creator explain what it took to build it.

Frameworks

Named ideas worth stealing.

09:40list

Five Enterprise AI Variables

  1. Transparency
  2. Simplicity
  3. Scalability
  4. Security
  5. Cost

The five design constraints that separate a personal agent OS from one that can survive IT scrutiny at an enterprise or regulated-industry client.

Steal forScoping any AI project for a corporate or nonprofit client — use these five as your requirements checklist before architecture planning.
10:44model

Hermes-to-AWS Service Mapping

  1. Hosted model API → Bedrock
  2. Local files → S3
  3. One operator → IAM roles
  4. Local DB → DynamoDB
  5. .env secrets → Secrets Manager
  6. No audit → CloudTrail + Guardrails

A one-for-one substitution table that maps every component of an off-the-shelf personal agent stack to its enterprise AWS equivalent without changing the underlying architecture.

Steal forClient conversations — show this table to explain why the upgrade is structural, not cosmetic.
24:40list

Six Security Domains

  1. Kill switches
  2. Write-once audit logs
  3. Cost caps
  4. Bedrock Guardrails
  5. Least-privilege IAM
  6. Credential leak scanning

The six security layers the creator implemented, each targeting a distinct threat surface in a multi-tenant AI agent platform.

Steal forSecurity review checklist for any AI infrastructure project — present this as 'here is what enterprise-grade actually means in practice.'
17:45model

Agent Turn Chokepoint

  1. Rate limit check
  2. Load agent
  3. Bedrock kill-switch check
  4. Cost cap check
  5. Guardrail
  6. Tool dispatch
  7. Converse loop
  8. DLP scan
  9. Audit

The sequential gate-and-check flow that every message passes through before a response is returned — ensuring no turn can bypass cost, security, or compliance controls.

Steal forDesigning the request pipeline for any production AI agent — this is the chokepoint pattern that keeps a single bad turn from causing a cascading compliance failure.
33:31model

From Idea to Packaged OS

  1. The Idea
  2. Build it by hand
  3. Package the skill
  4. Your team taps in

Four-stage arc from concept to team-deployable product, emphasizing that packaging and handoff are distinct build phases that must be planned for upfront.

Steal forFraming client engagements — the goal is not just to build but to package so the client team can operate independently.
CTA Breakdown

How they asked for the click.

VERBAL ASK
34:20product
Check out the first thing down below — you'll get access to this repo and the subsequent versions thereafter.

Dual offer: free care package (blueprint, slide deck, prompts) linked second; premium course with full repo linked first. The free offer seeds trust while the paid offer captures the most motivated viewers.

MENTIONED ON CAMERA
FROM THE DESCRIPTION
PRIMARY CTAWhere the creator wants you to go next.
OTHER LINKSAlso linked in the description.
Storyboard

Visual structure at a glance.

open
hookopen00:00
platform tour
valueplatform tour02:05
five variables
valuefive variables09:40
service mapping
valueservice mapping10:44
drive with Claude Code
valuedrive with Claude Code14:42
chokepoint flow
valuechokepoint flow17:45
security domains
valuesecurity domains21:00
RAG and media
valueRAG and media28:01
Jarvis wiring
valueJarvis wiring30:55
idea to OS
ctaidea to OS33:31
Frame Gallery

Visual moments.

Watch next

More from this channel + related breakdowns.

Chat about this