Big Idea

The argument in one line.

Giving AI coding agents direct access to the simulator, browser, and telemetry tools they need to verify their own fixes eliminates the manual feedback loop that makes solo development slow.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…

A solo iOS or web developer already using Claude Code or Cursor who still manually pastes logs, screenshots, or error messages back into the chat.
A developer who wants to walk away from a running agent session and come back to a finished fix rather than babysitting every step.
Someone paying for AI coding tools and unsure whether Opus max thinking is worth it over Sonnet medium thinking.
A solo developer shipping many PRs a day with no code reviewer and looking for an automated safety net.

SKIP IF…

You are new to AI coding tools — the video explicitly skips basics like plan mode, debug mode, and dictation.
You are not building native iOS or web apps; Xcode Build MCP is iOS-specific.

TL;DR

The full version, fast.

The highest-leverage change in this workflow is closing the verification loop: instead of fixing a bug and then checking it by hand, you give Claude Code an iOS simulator via Xcode Build MCP or a Chrome instance via claude --chrome so it can test its own changes and keep looping until they pass. The same logic applies to production debugging: connecting Sentry, Supabase, and Axiom as CLIs or MCPs lets the agent investigate a crash report in 3 minutes instead of 45. For code review, Greptile auto-grades every PR, and Claude Code is told to iterate until the PR scores 5/5. The sessions run in cMux (low RAM, project sidebar, notifications), and the model setting is Opus 4.7 at max thinking for most tasks, with GPT-5.5 extra-high reserved for complex multi-file bugs where Opus tends to fix one thing and break two others.

Chapters

Where the time goes.

00:00 – 00:58

01 · Context and model split

Sets up the video as an advanced update. Declares 70/30 split: Opus 4.7 in Claude Code vs GPT-5.5 in Cursor.

00:58 – 02:00

02 · Automated testing — iOS (Xcode Build MCP)

Xcode Build MCP lets Claude Code control the iOS simulator: build, tap, screenshot, read logs. Keeps Xcode closed 90% of the time.

02:00 – 04:02

03 · Automated testing — web (claude --chrome)

claude --chrome gives Claude a live Chrome instance. Cursor has a built-in browser. Old manual loop diagram vs. new autonomous loop.

04:02 – 06:26

04 · MCPs and CLIs for production debugging

Sentry CLI, Supabase MCP, Axiom CLI all wired into Claude Code. Crash investigation drops from 45 minutes to 3. CLI preferred over MCP for token efficiency.

06:26 – 08:24

05 · Greptile for automated code review

Sponsored. Every PR auto-reviewed and scored 1-5. Claude Code loops until 5/5. Beat Cursor BugBot across 60 real PRs.

08:24 – 10:01

06 · Remote control — code from your phone

/remote-control syncs session to the Claude mobile app. Boris (Claude Code creator) called this his top hack. Config tip: auto-start on launch.

10:01 – 11:23

07 · cMux as terminal

Project sidebar, per-instance notifications, runs 20 Claude Code instances on 64GB M4 Max without issue vs. 5 instances choking in Cursor.

11:23 – 12:53

08 · Claude Code settings: model and effort

Opus 4.7 + max thinking on $200/mo plan. Flag to auto-start max thinking. Most people complaining about quality are on Sonnet medium.

12:53 – 13:33

09 · No-flicker mode

CLAUDE_CODE_NO_FLICKER=1 beta flag: pinned input bar, clickable cursor position.

13:33 – 15:15

10 · When to switch to Cursor + GPT-5.5

Complex multi-file bugs where Opus fixes one thing and breaks two others. GPT-5.5 extra-high at 1M context wins here. Burns Cursor Ultra credits fast.

15:15 – 15:38

11 · Why not Codex

Has the $100/mo plan but prefers Cursor UX. Will revisit if Codex iOS app ships.

15:38 – 17:06

12 · Recap — what to steal

Four-point summary: automated testing, MCPs/CLIs, Greptile, /config settings. Ends with dog cameo.
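
The model split described in chapters 01, 08, and 10 amounts to a simple routing rule. The sketch below is one hedged way to write it down; the Task type, the pick_model function, and the returned labels are hypothetical names for illustration, and the 50-file, multi-repo threshold is taken from the summary rather than from any tool's documentation.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Rough size of a change; both fields are illustrative assumptions."""
    files_touched: int
    repos_touched: int


def pick_model(task: Task) -> str:
    """Return a label for the setup the video's rule of thumb would choose."""
    # Large bugs spanning 50+ files and more than one repo go to the
    # long-context setup; everything else stays on the default.
    if task.files_touched >= 50 and task.repos_touched > 1:
        return "Cursor + GPT-5.5 extra-high"
    return "Claude Code + Opus 4.7 max thinking"


if __name__ == "__main__":
    print(pick_model(Task(files_touched=3, repos_touched=1)))
    print(pick_model(Task(files_touched=80, repos_touched=3)))
```

The rule is deliberately coarse: it encodes the default-plus-escape-hatch habit, not a measured benchmark.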

Atomic Insights

Lines worth screenshotting.

Giving Claude Code a running iOS simulator via Xcode Build MCP eliminates the need to open Xcode at all for 90% of development tasks.
Running claude --chrome hands Claude Code a live browser so it can test, screenshot, and read the console without any copy-pasting from the developer.
Connecting Sentry, Supabase, and Axiom as CLIs or MCPs compresses a 45-minute manual crash investigation into a 3-minute autonomous one.
CLIs are preferable to MCPs when both exist for the same service because they consume fewer tokens and agents navigate them more reliably.
Telling Claude Code to wait for Greptile to finish reviewing, fix every real comment, and loop until it scores 5/5 automates the entire post-PR QA cycle.
Over half the developers who complain Claude Code is not smart enough are running Sonnet at medium thinking, not Opus at max thinking.
cMux runs 20 parallel Claude Code instances on a 64GB M4 Max without issue; the same machine struggles with 5 instances inside Cursor.
CLAUDE_CODE_NO_FLICKER=1 pins the input bar to the bottom and makes the cursor position clickable.
GPT-5.5 extra-high with a 1M context window outperforms Opus max thinking on bugs that touch 50+ files across multiple repos.
At $400/month in AI tooling, the spend pays off when the value of the time saved exceeds the cost; for a full-time indie developer, it does.
Remote control via /remote-control was the top workflow hack recommended by the creator of Claude Code himself.
Setting a config flag to auto-start remote control on every session means continuity from desktop to phone is always on.
The Greptile vs. BugBot comparison ran across 60 real PRs over six weeks — not a synthetic benchmark.
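
The $400/month claim above can be checked with back-of-the-envelope arithmetic. The sketch below assumes a hypothetical hourly value of $100 for the developer's time, which is not a figure from the video, and uses the 45-minute to 3-minute crash investigation from the list (42 minutes saved each time).

```python
import math


def breakeven_hours(monthly_cost: float, hourly_value: float) -> float:
    """Hours that must be saved per month for the tooling to pay for itself."""
    if hourly_value <= 0:
        raise ValueError("hourly_value must be positive")
    return monthly_cost / hourly_value


def investigations_to_breakeven(
    monthly_cost: float, hourly_value: float, minutes_saved_each: float
) -> int:
    """Whole number of investigations whose saved time covers the monthly cost."""
    minutes_needed = breakeven_hours(monthly_cost, hourly_value) * 60
    return math.ceil(minutes_needed / minutes_saved_each)


if __name__ == "__main__":
    HOURLY_VALUE = 100.0  # assumption, not from the source
    print(breakeven_hours(400.0, HOURLY_VALUE))
    print(investigations_to_breakeven(400.0, HOURLY_VALUE, 45 - 3))
```

Under that assumed rate, the tooling needs to save about four hours a month, or roughly six crash investigations, to cover its cost.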

Takeaway

Close the loop before you walk away.

WHAT TO LEARN

AI coding agents stall not because they lack intelligence but because they cannot see whether their fix worked, so they stop and wait for a human who is not there.

Xcode Build MCP gives an AI agent the ability to build an iOS app, run the simulator, tap the UI, take screenshots, and read crash logs without the developer opening Xcode or pasting anything.
Running claude --chrome hands the agent a live browser session so it can test web changes, read console errors, and iterate without any manual log transfer.
Connecting Sentry, Supabase, and backend log tools as CLIs or MCPs compresses a multi-service crash investigation from 45 minutes of manual tab-switching to a 3-minute autonomous query.
When both a CLI and an MCP exist for the same service, the CLI is preferable: it uses fewer context window tokens and agents navigate it more reliably.
Automated PR review closes the quality loop the same way: tell the agent to open a PR, wait for the review score, fix every real comment, and loop until it passes.
Most complaints that an AI coding tool is not smart enough turn out to be a model and effort setting problem: Sonnet at medium thinking and Opus at max thinking are not comparable tools.
For bugs that touch 50+ files across multiple repos with a dozen interacting edge cases, a model with a 1M context window is structurally better suited than one with a smaller window.
Running 10-20 parallel AI coding sessions requires a terminal built for that use case — one with per-session notifications and a low memory footprint — not a general-purpose IDE terminal.
Remote session continuity matters most when a session takes 20-30 minutes and the developer cannot stay at their desk for the duration.
The /config settings in Claude Code are not defaults worth accepting: model choice, thinking level, remote control auto-start, and rendering mode all have better values than what ships out of the box.
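
The closed loop described in the points above (fix, verify, feed the result back, repeat) can be sketched in a few lines. This is a minimal illustration, not how Claude Code or Greptile work internally: loop_until_verified, apply_fix, and verify are hypothetical names, and verify stands in for whatever check the agent can run itself, such as a simulator test or a review score of 5/5.

```python
from typing import Callable, Tuple


def loop_until_verified(
    apply_fix: Callable[[str], None],
    verify: Callable[[], Tuple[bool, str]],
    max_rounds: int = 5,
) -> int:
    """Alternate fix and verify; return the number of rounds that were needed."""
    feedback = "initial bug report"
    for round_number in range(1, max_rounds + 1):
        apply_fix(feedback)          # the agent edits code based on the latest feedback
        passed, feedback = verify()  # the agent checks its own work, no human involved
        if passed:
            return round_number
    raise RuntimeError(f"not verified after {max_rounds} rounds: {feedback}")


if __name__ == "__main__":
    attempts = []

    def fake_fix(feedback: str) -> None:
        attempts.append(feedback)

    def fake_verify() -> Tuple[bool, str]:
        # Stand-in check that only passes on the third attempt.
        if len(attempts) >= 3:
            return True, "all checks passed"
        return False, f"check failed after attempt {len(attempts)}"

    print(loop_until_verified(fake_fix, fake_verify))  # prints 3
```

The max_rounds cap is a design choice added here so an unattended session fails loudly instead of looping forever.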

Glossary

Terms worth knowing.

Xcode Build MCP: A free MCP server made by Sentry that gives Claude Code the ability to build iOS apps, run the iOS simulator, take screenshots, tap UI elements, and read logs without Xcode being open.
cMux: A terminal application for running multiple Claude Code sessions simultaneously, with a project sidebar, per-instance notifications, and a much lower memory footprint than running Claude Code inside Cursor.
No-flicker mode: A beta rendering mode for Claude Code (enabled via CLAUDE_CODE_NO_FLICKER=1) that pins the input bar to the bottom of the terminal and makes the text cursor position clickable.
Greptile: An AI-powered code review service that automatically reviews pull requests, surfaces issues, and assigns a 1-5 quality score when connected to a GitHub repository.
Max thinking: The highest reasoning effort setting available in Claude Code. The default when starting a new session is medium thinking, not max.
Remote control: A Claude Code feature (/remote-control) that syncs an active terminal session to the Claude mobile app, letting the developer continue or monitor a running session from their phone.
Axiom: A log management and analytics platform used here as a backend logging service, accessible via CLI so Claude Code can query server logs during debugging.
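
As a small illustration of the no-flicker entry above, the sketch below builds the command and environment for launching Claude Code with CLAUDE_CODE_NO_FLICKER=1 set. The helper name claude_launch_spec is hypothetical, and combining the variable with the --chrome flag is an assumption; the sketch only prepares the launch and does not run anything.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def claude_launch_spec(
    base_env: Optional[Mapping[str, str]] = None,
    use_chrome: bool = False,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for starting Claude Code with no-flicker mode on."""
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["CLAUDE_CODE_NO_FLICKER"] = "1"  # beta flag named in the source

    argv = ["claude"]
    if use_chrome:
        argv.append("--chrome")  # flag named in the source for a live browser
    return argv, env


if __name__ == "__main__":
    argv, env = claude_launch_spec(use_chrome=True)
    print(argv, env["CLAUDE_CODE_NO_FLICKER"])
```

The returned pair can be handed to subprocess.run(argv, env=env) on a machine where the claude binary is installed; exporting the variable in a shell profile achieves the same effect.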

Resources

Things they pointed at.

01:38 · tool · Xcode Build MCP

03:00 · tool · claude --chrome

04:02 · tool · Sentry

04:02 · tool · Supabase

04:02 · tool · Axiom

06:26 · tool · Greptile

10:01 · tool · cMux

00:57 · link · How I Run 6 Coding Agents at Once (My Actual Workflow)

07:15 · link · Boop (open-sourced AI agent on Claude Agent SDK)

Quotables