Modern Creator
Mark Kashef · YouTube

How to Use /goal to Build a Self-Improving OS

A 10-minute live walkthrough of five agentic scenarios where /goal cleans, sharpens, revives, forges, and maintains your Claude Code OS by itself.

Posted
1 weeks ago
Duration
Format
Tutorial
educational
Views
10.6K
325 likes
Big Idea

The argument in one line.

You can use /goal to automatically optimize, audit, and maintain your agentic OS by delegating non-technical cleanup tasks like skill consolidation, rule contradiction detection, dormant project revival, and scheduled maintenance to Claude.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • A builder with an existing Claude Code OS or agentic workspace who has accumulated 20+ skills, rules, or agent files and wants to automatically consolidate without manual triage.
  • Someone running multiple side projects or experiments in Claude Code who needs a systematic way to revive, categorize, and prioritize dormant work without starting from scratch.
  • A developer who understands prompt engineering and agentic loops and wants to see /goal applied beyond typical automation tasks like code migration or web scraping.
SKIP IF…
  • You've never built or deployed a Claude Code OS or agentic system — this assumes you already have a workspace with accumulated files, rules, and skills to optimize.
  • You're looking for a general introduction to /goal or want to understand the mechanics in depth — this is a five-use-case walkthrough, not a technical deep dive on how /goal's dual-agent architecture works.
  • Your workspace is already lean and well-organized — the core value here is cleanup, consolidation, and revival at scale, which won't apply if you maintain minimal, curated systems.
TL;DR

The full version, fast.

The /goal slash command in Claude Code and Codex runs an agent in a loop with a separate judge model verifying completion, and you can point it at your own agentic operating system instead of formulaic build tasks. The mechanism is simple: write an objective under 4,000 characters, optionally pair it with a rubric file for scoring criteria, and let the dual-agent loop iterate until the judge confirms terminal state. Five practical applications cover the lifecycle: clean bloated skill folders down to essentials, sharpen individual skills against your own rubric, revive dormant half-built projects, forge new skills by mining recurring patterns from session transcripts, and maintain the system on autopilot by chaining /loop with /goal so cleanup runs every thirty minutes against a maintenance log.

Members feature

Chat with this breakdown.

Modern Creator members can chat with any breakdown — ask for the hook, quote a framework, find the exact transcript moment. Unlocks at T2: refer 3 friends + add your own API key.

Create a free account →
Chapters

Where the time goes.

00:0000:56

01 · Hook + Problem framing

Reframes /goal as an OS maintenance tool. Lists the three familiar agentic OS problems: bloated skills, expanding CLAUDE.md, contradicting rules.

00:5602:05

02 · What /goal is and how it works

Goal input (<=4000 chars), primary worker agent, judge on a different model checks terminal state. Excalidraw WORKER to JUDGE to DONE? diagram shown.

02:0503:05

03 · Demo 1: CLEAN

47 skills + 7 rules in a sandbox folder. /goal audits, consolidates, archives. Result: 47 to 17 skills, 7 to 4 rules, 3 contradictions resolved. 2m 50s.

03:0503:23

04 · Mid-roll CTA

Early adopters community pitch.

03:2304:47

05 · Demo 2: SHARPEN

rubric.md with 5 criteria. /goal simulates thumbnail-prompt-builder, scores vs rubric, rewrites SKILL.md, logs to ITERATION_LOG.md. Key: write your own rubric or AI grades itself easy.

04:4706:01

06 · Demo 3: REVIVE

22 projects folder. /goal checks git log, attempts to run each, fixes deps, rewrites README, logs verdict to ALIVE.md. 18 stubs archived, 4 kept. 2m 16s.

06:0107:52

07 · Demo 4: FORGE

50 JSONL session transcripts. /goal finds 3 recurring patterns with no skill: Excalidraw 8x, content audit 7x, LinkedIn 6x. Creates all 3 SKILL.md + SMOKE_TEST.md files.

07:5210:13

08 · Demo 5: MAINTAIN + three autonomous modes

/loop + /goal combo. Three autonomous trigger types explained. Live demo: cron created, drifted-skill archived, contradiction resolved, MAINTENANCE_LOG.md written, loop complete on turn 1.

10:1310:53

09 · Wrap + CTAs

Prompts in description link 2. Community link 1. Claude Code Zero-to-Hero course upsell.

Atomic Insights

Lines worth screenshotting.

  • The /goal command runs a judge agent from a separate language model alongside the primary agent — every iteration is reviewed by a built-in devil's advocate.
  • Pointing /goal at your own workspace and asking it to optimize itself is the meta-use: the system improving the system that runs the system.
  • A CLEAN goal reduced 47 skills to 17 and 7 rule files to 4 in under three minutes — archiving removals rather than deleting them for review.
  • A SHARPEN goal with a pre-written rubric forces Claude to score its own skill output against your specific criteria before accepting the revision.
  • If you let AI write its own evaluation criteria, it goes easier on itself — providing the rubric prevents this gaming of the goal loop.
  • A REVIVE goal sent through 22 dormant projects, filtered out hello-world placeholder repos, and triaged which ones deserved resurrection.
  • A FORGE goal reads all JSONL session transcript files and mines which prompting patterns are reusable enough to be codified as new skills.
  • The /loop command combined with /goal creates a scheduled maintenance cycle — the OS cleans and sharpens itself on a recurring timer.
  • A /goal for non-technical tasks — optimizing markdown files, triaging projects, cleaning rule contradictions — is just as valid as using it for code migration.
  • Every time the primary agent thinks it has finished, the judge agent re-evaluates whether the terminal condition has actually been met.
  • A goal that takes three minutes to complete and removes 30 redundant skills represents hours of manual cleanup work eliminated.
  • Session transcripts stored as JSONL files are a raw archive of prompting intelligence — /goal can surface the patterns worth preserving before they are lost.
Takeaway

Your skills folder maintains itself now.

JoeFlow / MCN playbook

/goal is not a task runner — it is a quality loop you can point at your own system.

  • Write rubric.md before running /goal on any content-generation skill. AI grades itself easy without it.
  • Run CLEAN on ~/.claude/skills right now — if it has grown past ~20 skills, /goal can halve it in under 3 minutes.
  • The FORGE pattern applies directly to JoeFlow: point /goal at ~/.claude/projects/**/*.jsonl and let it find which patterns deserve skills.
  • The MAINTAIN + /loop combo is the unlock for MCN: scheduled health checks on skills and CLAUDE.md drift without manual review.
  • The worker/judge architecture means you can trust destructive operations (archiving, rewriting) you would not trust a single-pass agent with.
Glossary

Terms worth knowing.

/goal
A slash command in Claude Code and Codex that runs an agent in a loop against a stated objective, with a separate judge model verifying completion before exiting.
Slash command
A built-in shortcut typed with a leading slash that triggers a predefined workflow or behavior inside an AI coding tool like Claude Code or Codex.
Codex
OpenAI's coding agent product that runs tasks against a codebase from the command line, comparable to Claude Code.
Claude Code
Anthropic's command-line coding agent that reads, writes, and executes against a local codebase using natural-language instructions.
Agentic OS
A personal operating layer of skills, rules, and configuration files that an AI agent uses to act on your behalf across tasks.
Skill
A reusable, named instruction bundle that teaches an agent how to perform a specific task, stored as a markdown file the agent loads on demand.
CLAUDE.md
A markdown file Claude Code reads at session start that holds project or user instructions, conventions, and standing rules for the agent to follow.
AGENT.md
A markdown configuration file similar to CLAUDE.md used to give an AI agent persistent context, rules, and behavior guidelines.
Judge model
A separate language model that audits the primary agent's output against the stated goal, acting as a devil's advocate before the task is marked complete.
Terminal state
The point at which an agent loop decides its objective is satisfied and stops iterating.
Objective function
A clearly stated goal an agent optimizes toward, used here as the success criterion the /goal loop checks against each iteration.
Rubric
A predefined scoring guide listing criteria the agent must meet, used to evaluate outputs objectively instead of letting the model grade itself loosely.
Sub-agent
A secondary agent spawned by a primary agent to handle a focused subtask, such as simulating a skill's output or scoring it.
Iteration log
A running record of each loop pass an agent takes, capturing its reasoning, actions, and revisions across attempts at a goal.
Hello world project
A throwaway starter project with only minimal placeholder code, used as a stand-in for dormant or abandoned work with no real substance.
JSONL
A file format where each line is a separate JSON object, used by Claude Code to store full session transcripts for later parsing.
Session transcript
The complete back-and-forth log of a user-agent conversation saved to disk, including every prompt, response, and tool call.
/loop
A Claude Code slash command that re-runs a given task on a recurring interval, such as every thirty minutes, so it executes autonomously over time.
Hooks
Configurable triggers in Claude Code that automatically run a command on specific events, like session start, stop, or after a tool use.
Cron job
A scheduled task on a computer that runs a command automatically at fixed intervals, used here to keep a /goal routine firing on a timer.
Resources Mentioned

Things they pointed at.

03:10productClaude Code Zero-to-Hero course
Quotables

Lines you could clip.

02:05
We're giving the agent a mirror. We're telling it to go through all of the assets to see how it could best optimize itself. The target of the goal is optimizing the very system trying to achieve the goal.
Self-referential concept stated cleanly in two sentences — ideal standalone clipTikTok hook↗ Tweet quote
04:14
It will go easier on itself to try to increase the chances that it accomplishes the goal.
Counterintuitive AI insight — high share potentialIG reel cold open↗ Tweet quote
09:37
Way easier than having to remind yourself to do it every day.
Clean payoff line for the MAINTAIN use-case, no setup needednewsletter pull-quote↗ Tweet quote
The Script

Word for word.

metaphor
00:00So there's this newer slash command called slash goal, which is now native to both Codex and Cloud Code. Most people are using it exclusively for technical tasks, like migrating code bases or running batch tests.
00:12But the truth is you can use slash goal for pretty much anything, including optimizing your existing AgenTic operating systems. And whether you've been building your AgenTic OS for a while or you're just getting started, you will hit the familiar problems over and over again.
00:26The skill folder that keeps on growing, the claud m d's or agent m d files that keep expanding, and the rules that start contradicting each other. You mean to clean it up, but you never had the time to. So what if you didn't have to carve out time to do all the cleanup yourself?
00:39What if you could point slash goal at a folder or your entire computer and give it a series of nontechnical tasks that it executes to perfection? And what if you could bring every dormant side project you've ever had back to life? If you know how to use slash goal properly, these will no longer be hypotheticals.
00:56So in this video, I'm gonna quickly walk you through what slash goal is and how it works, and then apply it to five real agentic scenarios. One to clean, one to sharpen, one to revive, one to forge, and one to maintain. Just seeing me go through all of them should spark a lot of ideas.
01:11Let's get into it. Now just in case you don't know what slash goal is or does, I'll quickly walk you through it. So you start off by giving the AI a goal or objective function in 4,000 characters or less.
01:22Once it has this, it goes in a loop and it has a judge side by side. And most people don't know that this judge actually operates out of a different language model. So technically, you do have a devil's advocate looking at the work of the primary agent until it gets to its terminal state.
01:37So basically, every time the agent thinks it's done, it has another agent looking over its shoulders to confirm, has the condition been met? If it clears, the goal is done and is set to be accomplished, and these can take anywhere between a few minutes all the way up to an hour to complete depending on the complexity. Now most demos of this slash goal use it on something very formulaic, something like go and build me a snake game while I sleep or go and scrape all of these websites until you have a perfect CSV.
02:05In our case, we're giving the agent a mirror. We're telling it to go through all of the assets, the markdown files, the rules, the agent MDs to see how it could best optimize itself. So the target of the goal is optimizing the very system trying to achieve the goal.
02:19The overarching goal of the demos I'm about to walk through are taking a messy workspace and bringing order to it. So if we pop into a terminal here, you'll see that we have a hypothetical folder with 47 skills as well as seven different rule files.
02:32If we scroll down, you'll see that the goal is set here. If it wasn't set, it would tell you that you're over the character limit, and I think due to the specificity of our prompt, it actually completed this in under three minutes. So it took two minutes and fifty seconds, and the result is it took 47 skills and made them into 17.
02:4930 were archived, which is good because we wanted to store anything it removed just in case we disagreed with it. Then it took seven rules and made it into four. It found three rule contradictions, and then it said whatever removed from the ClotMD.
03:03And by the way, if it's your goal to go infinitely deeper on things like Cloud Code, AgenTix systems, and AgenTix OS systems for your business, then you wanna make sure you check out the first link down below for my early adopters community. My primary focus in there, just like on YouTube, is giving you all the magic without the hype.
03:20Maybe I'll see you inside, and let's get back to the video. So we covered the clean use case, but what about sharpening? Let's say we have a series of files, one of which is the rubric dot m d file, where you create evaluation criteria ahead of time before you even run slash goal.
03:35So let's say that this is all the criteria that I use, which is actually pretty similar to creating thumbnails, especially the little clawed mascot themed thumbnails. I can paste a goal like this.
03:46So I will paste this in, and then it reads the following. Go through this skill dot m d file for each input.
03:54Go through our test inputs so we can basically give it simulation criteria, or we can ask it to create its own simulation criteria. And then simulate the skill's output, then score it against the rubric. And then you can go even deeper here.
04:06You could say, go and use a series of sub agents to go simulate and run the skill, go through the outputs, and come back to me and tell me how well it does based on this criteria. But the one thing you accomplish by creating the rubric yourself is you can guarantee that the goal will be accomplished against your specific criteria, not some easy made up one by the AI itself.
04:28Because if you've tested using AI before, usually, it will go easier on itself to try to increase the chances that it accomplishes the goal. And you'll see it started an iteration log so we can take a look at all of the chain of thought and the rationale that it's using, and then it's rewriting the Skill MD to enforce all of the five rubric criteria.
04:47So let's say you work in a specific domain where all of your skills should always have some form of standard or specific set of outputs. You can make sure that all of the skills in your arsenal always abide by that by constantly optimizing them. Couple minutes, we have the fix and the optimized version of the skill ready to go.
05:05Let's say you're one of the many users in the world that have a series of projects, a plethora of half built bridges. You start a project, you work it to 70%, but you've never taken it all the way or at least not to a point where it's production ready.
05:19Our goal here is to try to revive as many of our existing 22 projects. So you can then send this prompt. We could do goal and then send this over.
05:29So the prompt reads as follows. You're essentially asking Cloud Code to go through every project and every subfolder to use, test, and see what existing Git commits, tests, Python functions exist to see how it could revive or resurrect any of these projects and if they deserve to be resurrected to begin with.
05:48Now in my case, I purposely created a series of useless projects so it could catch them as dormant hello world projects. And what it does is remove them and keeps the ones that are actually useful. Now we move from reviving to forging.
06:02And one interesting use case you can use slash goal for is going through all the transcripts between you and Claude code in every single session and having it pull out which prompting patterns deserve to be something like a skill. Because all these transcripts are stored in what are called JSONL files, which are basically fancy JSON files.
06:21And I personally even have a skill that's called slash convo review where I ask questions to go through all the conversations through all the folders if I can't remember which folder I worked on and a specific project. But if you wanna find the nuggets that deserve to be skills, then you can write something like this where we'll do slash goal and we'll paste this specific command.
06:41And this is basically asking it to go through all the transcripts in your folder. In your case, you'll want it to go through the tilde, little squiggly line, Cloud Code folder because it will go through all the session transcripts globally, and then it will go through back and forth to look between user and assistant.
06:58Assistant is Cloud Code, user is you, and it will try to find and extract three recurring prompts that deserve to be a skill and then naturally we're going to ask it to create said skills. And you'll see in our hypothetical world it found three recurring patterns that don't seem to have skills that should.
07:17One of them is the Excalidraw doodle canvases. Those are the images that you see on those canvases I show you in most videos. Now I actually do have a skill that I'm hiding from this folder, but for all intents and purposes, it can't see it on purpose.
07:30Number two is auditing content for patterns, obviously, my YouTube content. And number three, it doesn't see my LinkedIn skill that I have in a completely separate folder. So it's seeing a series of transcripts about LinkedIn posts generated from my transcripts with no associated skill.
07:45Within a few minutes, we have all three skills ready to go, and you can start testing them and seeing if it hits the mark. Now for this last use case, I'm gonna throw you a curveball. Because when you use Cloud Code, there are three main ways that you can have it run autonomously without you being at your computer.
08:01Them is to use slash loop, which will execute some task every x amount of time. It could be every five minutes, every thirty minutes, every hour. The next one is the one we've spent this entire video on, which is slash goal.
08:12And the very last one is the tried and true hooks where you can make it run on a specific event or at the very end of a session. So hypothetically, instead of creating something new or leaning things down right now, what if we wanted a regular maintenance of our infrastructure? You could theoretically go through the remaining skills we have where we went from 40 to 17, and we want this cleanup done on a regular basis.
08:36So we could write a compound prompt like this where we do slash, loop, and in this case, we put the time interval every thirty minutes. This could be every hour, every ninety minutes. It might be overkill, but it's more so to show you that you can combine them together.
08:50And then you could do slash goal. Then after that, we paste the rest of the prompt. So you could tell it to archive any skill that hasn't been used in the last thirty days and constantly check for different criteria.
09:02You can make this whatever you want, but the core idea is this sets it up to run-in the background so long as your session is open, and it can keep refreshing things, keep checking are your rules relevant given all the work that you've done? Is your CloudMD as optimized as it can be? And then we ask it to maintain all of the changes or all of the proposed changes in a maintenance log file.
09:25So it creates the cron job to run on your computer, and it has the scheduled goal to run every thirty minutes. And naturally, can use slash goal and slash loop for whatever you want. But when it comes to your Genetic OS, using it as a way to maintain your infrastructure or at least constantly auditing if your skills, CloudMDs, and any associated Agenic OS assets are optimized is way easier than having to remind yourself to do it every day.
09:51And here's an example of a hypothetical finding where it found one stale skill called drifted skill, which I actually don't know about, last used March 31, which is forty five days old and one contradiction. So imagine you have this system running and it keeps looping, it keeps writing this maintenance log, and then you can start analyzing your maintenance log to see how well is it maintaining my infrastructure.
10:13And then it goes through, and you can see it's looping here. It will keep running in this terminal, but you get the idea. So, hopefully, this gives you a good glimpse into how powerful slash goal can be, not just to all of these other tasks like build me a million dollar startup, don't make mistakes, but to actual practical use cases for your Agenic OS systems.
10:31If you wanted access to the prompts that I showed you so you could use them or take derivatives of them for your own use case, I'll make them available to you in the second link in the description below. And last but not least, if you wanna go infinitely deeper on things like Clawd Code, AgenTek OS systems, and looking at all the plumbing that needs to happen to run it perfectly, then you wanna check out the first link, and maybe I'll see you in my early adopters community.
The Hook

The bait, then the rug-pull.

Everyone has seen the build-me-a-snake-game demo. Mark Kashef opens by asking a different question: what if you pointed /goal at the system doing the building? Five live demos later, your skills folder, CLAUDE.md, and rules files have reorganised themselves — and you did not touch a single file.

Frameworks

Named ideas worth stealing.

01:07model

The /goal Worker-Judge Loop

  1. Give goal (<=4000 chars)
  2. Primary agent executes
  3. Judge on different model checks condition
  4. NO loops back, YES exits to terminal state

Two different models in a loop: one does the work, one audits it. Runs until judge says condition is met.

Steal forAny JoeFlow or MCN task with a clear success criterion: quality gates, rubric-scored outputs, batch cleanup jobs
00:56list

The Five OS Maintenance Modes

  1. CLEAN: consolidate bloated skills/rules
  2. SHARPEN: iterate skill against a rubric
  3. REVIVE: resurrect dormant projects
  4. FORGE: mine transcripts for missing skills
  5. MAINTAIN: schedule recurring health checks

Complete taxonomy of what /goal can do to your agentic OS. Each mode has a distinct objective shape and a different folder as the target.

Steal forJoeFlow skills system, MCN prompt library, any project with a growing skills or commands folder
07:57list

Three Autonomous Trigger Types

  1. /loop: fires on time interval
  2. /goal: fires when objective met
  3. hooks: fires on event or session end

Three ways Claude Code runs without you. Can be combined: /loop 30m /goal <prompt> runs a goal on a cron schedule.

Steal forJoeFlow session automation — /loop for recurring checks, /goal for complex one-shots, hooks for post-session cleanup
03:27concept

Write Your Own Rubric Before Running /goal

If you let the AI define success criteria, it sets an easy bar. Define rubric.md yourself with specific pass/fail criteria per criterion. Goal only clears when YOUR criteria are met.

Steal forAny content quality gate: thumbnail prompts, email hooks, sales copy review, skill output scoring
CTA Breakdown

How they asked for the click.

10:13link
If you wanted access to the prompts that I showed you so you could use them or take derivatives of them for your own use case, I'll make them available to you in the second link in the description below.

Dual CTA: prompts in description (high intent, immediate value) + community link (recurring engagement). Course upsell shown visually without hard pitch. Well-executed.

Storyboard

Visual structure at a glance.

open
hookopen00:00
problem
hookproblem00:19
/goal mechanism
promise/goal mechanism01:07
demo 1: CLEAN
valuedemo 1: CLEAN02:05
demo 2: SHARPEN
valuedemo 2: SHARPEN03:26
demo 3: REVIVE
valuedemo 3: REVIVE05:05
demo 4: FORGE
valuedemo 4: FORGE06:02
demo 5: MAINTAIN
valuedemo 5: MAINTAIN07:57
wrap + CTA
ctawrap + CTA10:13
Frame Gallery

Visual moments.