Modern Creator
Nick Saraev · YouTube

GLM-5.2 is Basically Opus (For 1/5 the Price)

A 14-minute benchmark rebellion: seven live side-by-side demos, one OpenRouter API key, and a four-path procurement map that makes Opus 4.8 look expensive.

Posted: 2 days ago
Duration:
Format: Tutorial, educational
Views: 20.5K
Likes: 771
Big Idea

The argument in one line.

When benchmarks are saturated, model selection comes down to taste and cost — and GLM-5.2 wins both against Opus 4.8 for creative-coding workloads at roughly one-fifth the price.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • You are paying for Claude Max or Opus API calls and wondering if you are overpaying for creative-coding tasks.
  • You use Claude Code as your primary agentic harness and want to swap the underlying model without losing the workflow.
  • You are already on OpenRouter or curious about per-token pricing vs flat-monthly AI coding plans.
  • You care about model sovereignty and want to run a capable model locally or through providers that cannot unilaterally pull access.
  • You build web UIs, dashboards, 3D scenes, or interactive explainers and want the model with the highest visual taste.
SKIP IF…
  • You need deep reasoning, long multi-step planning, or complex code architecture — the taste comparison here is specifically visual/front-end generation.
  • Self-hosting is your goal and your Mac has less than 256 GB of unified memory.
  • OpenRouter's routing abstraction makes you uncomfortable and you prefer dealing with one provider directly.
TL;DR

The full version, fast.

GLM-5.2 is a Chinese open-weights model that scores similarly to Claude Opus 4.8 on saturated benchmarks but produces noticeably better visual output (cleaner layouts, better font choices, higher-quality 3D scenes) for roughly one-fifth the API cost. Setup takes five steps: get an OpenRouter key, save it, add a shell alias that sets ANTHROPIC_BASE_URL to openrouter.ai/api, point ANTHROPIC_MODEL at the GLM model slug, and type glm. The same approach works in OpenCode and Crush. Web search requires adding the Exa.ai MCP. Four procurement paths exist: the z.ai flat-monthly plan, OpenRouter per-token, direct inference hosts (Fireworks/DeepInfra/GMI at ~$0.72-0.90/1M tokens), or a 2-bit quantized self-hosted version that retains 82% accuracy and runs on a 256 GB Mac.

Chapters

Where the time goes.

00:00-01:03

01 · Cold open — benchmark saturation thesis

States the claim immediately, previews 40 demo outputs across 7 categories, argues that benchmarks are saturated and taste is the new signal.

01:03-04:20

02 · Seven side-by-side demos

Live walkthrough in Antigravity IDE: nebula spiral (Opus too bright), four-stroke engine explainer, rainbow physics, low-poly terrain, landing pages, mini-game, slide decks.

04:20-07:47

03 · Setup in Claude Code via OpenRouter

5-step whiteboard diagram. Live demo: sign up for OpenRouter, create API key, let Claude Code agent auto-configure the glm alias in a fresh directory.

07:47-09:16

04 · Web search gotcha — Exa.ai fix

GLM cannot use Claude Code native web search (Anthropic-specific). Fix: add Exa.ai MCP. Demo shows native search failing vs Exa working.

09:16-11:43

05 · Open harnesses — OpenCode + Crush

Same OpenRouter model slug works in both. brew install commands, JSON config edits, live verification in each harness.

11:43-12:55

06 · Why Claude Code still wins

Personal preference: Claude Code UX is best. All models converging means UX matters more than 0.01% benchmark delta.

12:55-13:57

07 · Procurement map — four paths

Whiteboard mind-map: z.ai coding plan, OpenRouter per-token (recommended), direct hosts (Fireworks/DeepInfra/GMI ~$0.72-0.90/1M tokens), self-host quantized (Unsloth 2-bit, 82% accuracy, 256 GB Mac).

13:57-14:34

08 · Outro + CTA

Maker School, LeftClick agency, Clarivo SaaS, mentioned as a brief afterthought after 14 minutes of value.

Atomic Insights

Lines worth screenshotting.

  • Benchmarks are now saturated at the frontier — Opus 4.8 and GLM-5.2 score similarly, making visual taste the only real differentiator.
  • GLM-5.2 costs roughly one-fifth of Opus 4.8 API pricing for the same token count via OpenRouter.
  • You can run any OpenRouter model inside Claude Code by setting ANTHROPIC_BASE_URL to openrouter.ai/api and ANTHROPIC_MODEL to the target model slug.
  • Claude Code web search is Anthropic-specific and breaks when the underlying model is not Claude — Exa.ai MCP is the correct fix.
  • The same five-step OpenRouter setup works identically in OpenCode and Crush; model slug stays the same, config location differs.
  • GLM-5.2's 700B parameters can be quantized to 2 bits while retaining 82% accuracy, and the quantized model runs on a 256 GB Mac with unified RAM.
  • A local quantized model is the only AI setup where the provider literally cannot revoke your access.
  • OpenRouter auto-routes across 12+ providers to keep per-token cost as low as possible — no manual provider management needed.
  • z.ai offers an $80/month Max coding plan directly analogous to Claude Max but for their own model.
  • Side-by-side naively prompted outputs show GLM-5.2 choosing serif typography and richer visual hierarchies versus Opus 4.8's blander defaults.
Takeaway

Four ways to stop overpaying for frontier model access.

WHAT TO LEARN

When benchmark scores converge, cost and visual taste become the only meaningful criteria — and GLM-5.2 currently wins both against the leading closed-source model for creative-coding tasks.

  • Benchmark saturation is real: at the current frontier, leading models score within noise of each other, making side-by-side output quality the only honest comparison method.
  • GLM-5.2 produces visibly better visual outputs than Opus 4.8 on creative-coding tasks — richer typography, cleaner layouts, more polished UI — at roughly one-fifth the API cost.
  • Any provider whose API follows Anthropic's spec can be wired into Claude Code by setting ANTHROPIC_BASE_URL to the provider endpoint (and ANTHROPIC_MODEL to the model slug) in a shell alias.
  • GLM has no built-in web search, and Claude Code's native web search does not work with it; the fix is the Exa.ai MCP, which gives any model structured web-search capability.
  • The same OpenRouter model slug works identically across Claude Code, OpenCode, and Crush — harness choice comes down to UX preference, not model availability.
  • OpenRouter auto-routing across 12+ providers is the lowest-friction per-token access path; direct hosts offer slightly lower latency at similar cost.
  • A 2-bit quantized version of GLM-5.2 retains 82% accuracy and runs locally on a 256 GB Mac — the only configuration where no external party can revoke access.
  • The cost difference between a flat-monthly coding plan and per-token API pricing depends entirely on usage volume; high-throughput users should calculate their crossover point before committing.
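The crossover point in the last bullet is simple arithmetic. A minimal sketch, assuming a single blended per-token price (real pricing usually splits input and output tokens) and a flat plan with no usage cap. The $80 plan price and the $0.72-0.90 per 1M token range are the figures quoted in the video for the z.ai Max plan and the direct hosts; the function name is illustrative.

```python
def crossover_tokens(flat_monthly_usd: float, usd_per_million_tokens: float) -> float:
    """Monthly token volume at which per-token spend equals the flat plan price."""
    if usd_per_million_tokens <= 0:
        raise ValueError("per-token price must be positive")
    return flat_monthly_usd / usd_per_million_tokens * 1_000_000


FLAT_PLAN_USD = 80.0  # z.ai Max plan price quoted in the video

# Low and high ends of the per-1M-token range quoted for direct hosts.
for price in (0.72, 0.90):
    tokens = crossover_tokens(FLAT_PLAN_USD, price)
    print(f"${price:.2f}/1M tokens: break-even at {tokens / 1e6:.1f}M tokens/month")
```

Below the break-even volume, per-token access is cheaper; above it, the flat plan wins, provided the plan's own limits allow that much usage.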
Glossary

Terms worth knowing.

GLM-5.2
A 700-billion-parameter open-weights language model by Zhipu AI (z.ai), competitive with top closed-source models on code and visual generation tasks.
OpenRouter
An API gateway that routes requests across 12+ model providers and automatically selects the cheapest available endpoint, presenting a single OpenAI-compatible API surface.
ANTHROPIC_BASE_URL
An environment variable Claude Code reads to override the default Anthropic API endpoint, enabling the harness to point at any provider whose API follows Anthropic's spec.
Exa.ai
A web-search API service designed for AI agents that provides structured search results via an MCP server, usable inside Claude Code or any harness.
OpenCode
An open-source command-line coding agent harness installable via brew, configurable to use any model via a JSON config file.
Crush
Another open-source terminal coding agent harness configurable for any OpenAI-compatible model endpoint.
Benchmark saturation
The condition where leading AI models score so similarly on standard evaluation tests that the tests no longer meaningfully distinguish their real-world capabilities.
2-bit quantization
A compression technique that reduces model weights from 16 or 32 bits to 2 bits, shrinking file size by roughly 84% while retaining 82% of the original model accuracy in this case.
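A back-of-the-envelope check on the 256 GB claim, as a sketch under stated assumptions: a 16-bit baseline, weights only (no KV cache or activations), and decimal gigabytes. The 700B parameter count and the 84% size reduction are the figures from the video; everything else is arithmetic.

```python
PARAMS = 700e9  # parameter count quoted in the video


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


full = weights_gb(PARAMS, 16)     # assumed 16-bit baseline
two_bit = weights_gb(PARAMS, 2)   # every weight stored at exactly 2 bits
reduction = 1 - two_bit / full    # size reduction for a pure 2-bit model

# Average bits per weight implied by the 84% reduction quoted in the video.
implied_bits = 16 * (1 - 0.84)
implied_gb = weights_gb(PARAMS, implied_bits)

print(f"16-bit weights: {full:.0f} GB")
print(f"2-bit weights:  {two_bit:.0f} GB ({reduction:.1%} smaller)")
print(f"84% reduction implies ~{implied_bits:.2f} bits/weight, ~{implied_gb:.0f} GB")
```

A pure 2-bit model would be 87.5% smaller than the 16-bit baseline, so the quoted 84% would be consistent with some weights being kept at higher precision. That reading is a guess, not something the video states. Either way the weights land under 256 GB.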
Resources

Things they pointed at.

08:00 · tool · Exa.ai
09:16 · tool · OpenCode
09:16 · tool · Crush
13:10 · tool · DeepInfra
13:10 · tool · GMI Cloud
14:00 · product · Maker School
14:10 · product · LeftClick
14:20 · product · Clarivo
Quotables

Lines you could clip.

01:03
We're at the point now where benchmarks are effectively saturated.
Bold claim, no hedging, sets up the entire video thesis in one sentence. (TikTok hook)
01:15
To really understand a model, you sort of need to take it from a taste perspective.
Reframes the industry conversation around benchmarks. (IG reel cold open)
03:30
GLM 5.2 just has sort of like a higher style multiplier, if that makes sense.
The coined term "style multiplier" is a novel and memorable framing. (Newsletter pull-quote)
13:35
Nobody will ever be able to take away this from you unless they literally show up to your door.
Visceral image of model sovereignty. Funny and serious simultaneously. (TikTok hook)
13:57
I do believe that GLM 5.2 right now has better taste than Opus 4.8.
Clean declarative closer on the main thesis; works as a standalone opinion clip. (IG reel cold open)
The Script

Word for word.

00:00Hey. So GLM 5.2 is pretty good, and you don't have to take it from me. I just had Opus 4.8, sort of one-v-one GLM 5.2 across something like 40 different scenes from creation of 3D and WebGL scenes, to creation of interactive explainers, to a bunch of dashboards, to basically full stack apps, landing pages, games.
00:18And I'm just gonna spend a couple minutes going through those before showing you the simplest and easiest way to set it up within a Claude Code harness, and then a couple of other harnesses as well. And then I'm even going to go as far as to give you guys the best, most cost effective vendors. Because I think that, you know, in an era where AI models are capable of getting taken down at a whim or, you know, getting significantly more expensive because of API pricing, we're at the point now where, you know, GLM 5.2 or other open models legitimately might make sense.
00:42So what I've done here is I've constructed seven demos just to show you guys how good GLM 5.2 is compared to the current state of the art, sort of like closed source models. And the first is this playable mini game up here, then the 3D WebGL scene and interactive explainer, dashboard, and so on. Now you might be wondering, Nick, why the hell are you just going through a bunch of 3D scenes?
00:59And the reason why is because we're at the point where the benchmarks are effectively saturated. GLM 5.2 scores pretty similarly to, you know, Opus 4.8 on a variety of them. And I think that benchmarks no longer really fully encapsulate what it is to use a model.
01:12To really understand a model and be able to pit one against the other, you sort of need to to to take it from a taste perspective, which is what I'm doing here. And so as you see here, we have two. One from Opus 4.8 making a nebula spiral, then one over here that's sort of making a a nebula spiral as well.
01:25And if I show you the GLM 5.2 one, I mean, it looks gorgeous. Right?
01:29Super clean. You can recenter auto orbit, turn that on or off, increase the glow particle size. Look at what happens with the Opus one.
01:38We literally can't even see the galaxy to begin with because it's just too bright. So obvious error here. Right?
01:44Opus significantly underperformed relative to GLM 5.2. That should be telling you something. If we go over to interactive explainers, we see on the left hand side with the Opus 4.8 ones.
01:53Now keep in mind, these are all prompted pretty naively. But even with said naive prompt, I think you guys will agree. The Opus 4.8 one really struggles in comparison to just the tasteful style of GLM 5.2.
02:05I mean, that is Opus 4.8. This is GLM 5.2. This is way higher quality, much sexier, much better, to be completely honest.
02:13And I think, realistically, it could probably teach me how this thing works way more than Opus's. Just to keep everything short on time, I can show you how this actually is set up. This is the Opus one, so how your rainbow forms.
02:24This is the GLM 5.2 one. And I think you guys could see, even its selection of like serif fonts and stuff like that is significantly cleaner than what we have over here.
02:33Okay. So, I mean, yeah, nice. We're using these serif fonts and all, and and that was part of the idea, but it's nowhere near as clean, I would say, as just the way that GLM 5.2 did.
02:41This next test here is sort of like a low poly terrain flyover, so you can use your mouse and basically move it around like this. Let me just zoom in a little bit more for you guys. If I open this fully in a new tab, I mean, is pretty badass.
02:52Right? That's the Opus one. And this one over here is the GLM 5.2.
02:55So moving to the Opus one first, we have endless low poly skies that are being procedurally generated. And if I go over to, you know, GLMs flyover, we see that what it's done is produced what I would consider to be, genuinely speaking, a higher quality output. Now it's done this with low poly, which is why I think that cloud up there sort of looks similar to the ground.
03:13But, I mean, like, taste wise, again, I think GLM 5.2 actually won the procedurally infinitely generated terrain challenge.
03:22So we can see landing pages, same sort of idea. I just feel that GLM 5.2 just has sort of like a higher style multiplier, if that makes sense. I'm keeping in mind that it does so for way cheaper than Opus 4.8 here.
03:34Uh, hopefully, you guys will see why I'd be happy to design a page like this with GLM. And then finally, we have some mini games. And, um, interestingly enough, this is probably the area that I think GLM struggled the most.
03:44Now, on the left hand side is sort of like a tower stacker game. You guys may have seen this at like a, I don't know, a casino or like one of those cool arcades. And as you see, Opus did a pretty good job.
03:52We can stack these no problem. Um, you know, technically, the one by GLM works as well, but it's really fast. And I don't know if it fully understands how difficult this is.
04:01Oh, boy. There we go. Only made it for three.
04:03Okay. So yeah, we have some more, some slide decks and whatnot. I think you guys get the idea.
04:07GLM's pretty good, actually. And what I'm gonna do in this video is just run you through everything you need to both set it up, then apply it to other harnesses, like OpenCode, Crush, things outside of the Claude Code structure. I'm sure you guys have some other plans and providers, and then testing on real queries, I guess I kinda did that already because I just wanted to hook you with the video.
04:25The very first thing I wanna do is show you guys how to set up GLM 5.2 in Claude Code. And the reason why I'm doing it in Claude Code is I just think it's probably the most intuitive for most people that are watching my channel. Just wanna give it to you guys really easily and simply.
04:35Um, so I'm just gonna head over here to Antigravity, which is currently running two Claude Code instances. And as you see here, we currently have Opus 4.8 running with one mil context inside of Claude Code. You know, I could say, hello, how's it going?
04:45And it'll, you know, respond to me, which is pretty neat. Well, what we wanna do is we basically wanna have that exact same structure. We just wanna have it with GLM.
04:52And if you go over here to the side, if you just type GLM because I've already done the setup, this is what that's gonna look like. So now we basically have this running in Claude Code's harness, but the actual intelligence of the model is underscored by GLM 5.2. So I'm actually gonna say, hey, what model are you running?
05:09And I should note the reason why Opus executed so quickly versus GLM here is just because I have fast mode enabled. They're actually pretty, uh, similar in terms of tokens per second. So, you know, as you see it saying I'm doing GLM 5.2 via OpenRouter.
05:20It's one of the three terminal harnesses you set up under GLM, not a Claude model under the hood right now. What it's doing is it's using Claude Code through GLM. Okay.
05:27So how do you actually do that? Well, it's pretty easy. You just need to use a service like OpenRouter.
05:30I'm gonna start with that. It's simply because I think it's the easiest. Um, we're gonna head over here to the top right hand corner where it says sign up.
05:36I'm gonna insert some information. Once you're done, you'll get a verification code on your email. So I'm gonna head back over here, pump that puppy in.
05:42And now you can see we actually have it in the API. And you can see twenty eight seconds ago, I spent about $4. So from here, you'll need to add a new key.
05:49Hey. This is a new key. We'll leave the credit limit as unlimited.
05:53Um, in my case, I'm gonna set an expiration for one hour because I don't wanna leak this and then have you guys, uh, ping me a million times. And I'm just gonna copy it over. Okay?
06:00It's it's basically that easy. Once you're done, all you really have to do is just copy this screenshot right here, the one that I just gave you, literally.
06:08K? You could pause this video, take a screenshot of this, and then, you know, continue. Head over to your Antigravity IDE or wherever the other agent is.
06:16I'm just gonna do it in Claude Code, so it's similar to what you guys are realistically probably gonna be using. Paste this in and say, I want you to set up GLM in a Claude Code instance.
06:26Note, we've already done this before. I'm just doing a demo for the video. So run this in another directory.
06:32And I'm just doing that because I don't want this to say, well, you've already done it. You don't need to do it anymore. So I literally just screenshotted this, k, which you guys just saw a moment ago, and I pasted it in here.
06:43And you can actually already do this. So it's now gonna spin up a fresh directory and verify that the whole path actually works end to end. It's that straightforward.
06:50Um, now, a couple people will probably yell at me because I'm doing my API key directly in the text. It's not as good to do the API key or or stick it directly in the text. But to be honest, for most purposes, it's not that big of a deal either.
07:00Um, if somebody gets access to your device, they will inevitably get access to your API key as well. Once all of this is done, and I did this in almost entirely real time, scrolling back here, it's a GLM 5.2 is set up and verified end to end in a fresh demo directory. So we have everything over here.
07:14We just need the GLM collect command. Um, and basically, the way that it works is Anthropic has this key called ANTHROPIC_BASE_URL equals. And then, um, you can swap any model you want in there as long as you have the API and the API spec is similar to Anthropic's and sort of expects it.
07:30So what we did right over here is we basically did that. Um, all I need to do now is just open up a fresh directory in terminal, cd to glm demo, and then type glm.
07:39And guess what? Boom. We're here.
07:41We actually have glm. So that's how you run it as easily as that inside of Claude Code. It's very straightforward.
07:48A couple of gotchas that some people have pointed out. In order for glm to perform web searches, because it doesn't have, like, built in web search functionality within the model, it tries to use Anthropic's or Claude Code's, and that doesn't work 100% out of the box.
08:00So if you wanted to, you could integrate this with Exa AI. Exa AI is another software platform that basically provides AI agents the ability to search the web.
08:10And right now, it's sort of at, like, the frontier, the cutting edge. So, you know, European series b fintechs with 50 plus employees, recent rounds, just pumping that into a search term, and then, you know, that goes and actually does the web call. That's in contrast to, like, search web for Nick Saraev.
08:23If I just sort of type this out natively, we don't have the exa.ai integration. It's not really gonna work.
08:28And so that's what's going on right now. Okay. So, yeah, I mean, that's very fairly straightforward.
08:33All you need to do if you wanna set that up is just say, set up exa.ai for web search within and then making it a little bit easier for you guys to see here. If I just type in set up exa.ai for web search within this new GLM directory, then it'll go through, ask you for your exa.ai API key.
08:51And I think they give you, like, $10 in free credits or something. So you can actually replicate that fairly easily for very little money. Alright.
08:56Next up, let's talk about how to set up GLM 5.2 in various open harnesses. So in open code, all you need to do is type brew install open code. For crush, all you need to do is type brew install crush.
09:05There are a couple of edits that you have to make in order to have this run for you. But what I'm gonna do, you know, just to show you guys how easy and simple it is, so I'm just gonna take a screenshot of this. I'm gonna go back here, and then I'm gonna clear my chat.
09:17And I'm just gonna pump this in. I'll say, I want to set up GLM5.2 on both of these harnesses.
09:23Go get the harnesses, then set them up with GLM5.2 according to this screenshot.
09:29Same thing that we did before, you know. Hopefully, you guys see the reason why I'm doing this because I wanna show you how simple and easy it is for you to take any resource on the Internet, whether it's a video, a still, a piece of text, paste it directly in here, and then it'll go and build. In my case, we've actually already done the installation for both, so it'll occur, uh, fairly fairly quickly.
09:46Yours will probably take a little bit more time because you actually have to go and download it. Um, but in this case, I've actually had it go and then test something. Reply with exactly one short sentence stating which AI model you are.
09:55It's powered by this. It's confirmed. Now it's gonna do it on Crush.
09:58And they're actually gonna like walk through how to use Crush in OpenCode as well. Great. So now I'll say open these up in new tabs, so I can show my audience.
10:06K. It's gonna ask me some for some permission here, so that's what's going on here. And then it's trying to control my thing using accessibility features, basically to take screenshots.
10:15That's okay. We don't need it. And now I'm just gonna say, would you like to initialize now?
10:19Oh. Hey, what's up? I think I might have selected a bunch of stuff, which is why it looks so silly and dumb.
10:26Zooming way in, as we as you see, this is the crush harness. K? And right now, um, this is running GLM 5.2.
10:34I don't personally like the crush harness very much. I don't really like other harnesses but Claude Code to be completely honest. But, you know, we have it over here, so we could talk to it just like we're talking to basically anything.
10:44You know, it's it's purring right now. Okay. I could say, hey.
10:48What model are you using? And then it's gonna go do a bunch of calculations. Let me check the model.
10:53I'm using z dash g l m 5.2 via OpenRouter for both small and large models. Very cool. K.
10:58Same thing with open code. If we just go back here to this, open this puppy up. If I zoom way in, k, you know, note that the text is black on a black background simply because I'd never used this terminal, and so I haven't actually set up any themes.
11:12But, hopefully, you guys could see if you did set up a theme with light mode or something like that. It'd be fairly straightforward to see. I could say, hey.
11:19What model are you running? And I could zoom right in, and then you could see it's now going through the GLM 5.2 builds. In the bottom left hand corner, it's doing a lot of thinking, and boom, it's done.
11:30Thank you. Wonderful. So this is also tracking spend on the right hand side.
11:34It's connected to Exa sort of out of the box, assuming that you set it up. And honestly, it's really that straightforward and easy to run at whatever harness you want with GLM 5.2. Now personally, I think Claude Code is just the best and the easiest.
11:46Some people would disagree with me on this, and that's fine. There are also some statistical reasons why people would disagree with me. And I think at the end of the day, what one needs to realize is basically all of these models are converging on a similar level of intelligence.
11:58And so it's just more about the usability and the UX at this point than it is about necessarily scoring point 01% more on a benchmark or being two or five tokens faster. But that that's just my thoughts on the matter. Now that we've set up GLM 5.2, uh, the next question is where do we get it to keep it as cost effective as possible?
12:13And really, there are four main routes right now. The first is the z.ai coding plan. Now this is very similar to like your Claude Max plan or something.
12:22There's a Lite, a Pro, and then there's an actual $80 a month flat monthly Max plan. You do that on z.ai, which is where they came up with the model.
12:29That's like the team that came up with GLM. You can also use OpenRouter. So here's where you pay per token.
12:33It's what we just used. Essentially, what they do is they try and auto route to various providers that have API endpoints where they can call in order to get, like, the results back, and then they will try and arbitrage them to keep them as cheap as humanly possible for you. So I really like OpenRouter because for me, it's just plug and play, and this is personally what I use.
12:48I'm not sponsored by them. I have no reason. Nobody's holding a gun to my head.
12:52Um, but I just think they're pretty solid and pretty straightforward. Then you have some direct hosts: Fireworks, DeepInfra, GMI. These are just dedicated providers as opposed to OpenRouter picking the best one.
13:01And your last option is sort of self hosting. Um, self hosting GLM 5.2 will be pretty difficult because it is quite large, you know, 700,000,000,000 parameters. But I did just find a way in case somebody here wanted to.
13:12This fella here has set up GLM 5.2, uh, like, quantized the hell out of it, made it really, really small, so it's two bits, and it still retains 82% accuracy and can be run on a 256 gigabyte Mac or even some RAM or VRAM setups. It's easily the strongest local model that I think you can run at the moment, and you can do so despite the fact that it's not as smart, um, at an 84% size reduction literally on your computer.
13:37And if you think about it that way, like, nobody will ever be able to take away this from you unless, you know, you they take away they literally show up to your door, knock on it, take away your computer too. Okay. So, hopefully, I showed you guys how feasible running actual high quality open source models both on your computer and then through API call, um, to arbitrage inference costs can be.
13:53I do believe that GLM 5.2 right now has better taste than Opus 4.8. Um, you know, when Fable comes back and so on and so forth, things might be a little bit different. But, uh, right now, it's it's definitely my daily driver, and I'm coring the hell out of it basically twenty four seven.
14:06But highly recommend that you guys give it a try as well. There are a lot of free API tokens out there up for grabs. If you guys like this sort of thing and wanna learn how to use these models for financial purposes, definitely check out Maker School.
14:17It's my daily accountability program that shows you how to acquire a customer that pays you money for one of these services within ninety days or your money back. We also offer done for you and done with you implementation services at LeftClick. And I also run a SaaS called Clarivo that helps you double your pickup rates.
14:31Thank you very much for your time. Have a lovely rest of the day.
The Hook

The bait, then the rug-pull.

The claim lands in the first four seconds, unhedged: GLM-5.2 is better, and there is proof. Before the viewer can object, forty side-by-side outputs are already loading — seven demo categories, two models, zero setup time wasted. The title does the math so the video does not have to.

Frameworks

Named ideas worth stealing.

01:03 · concept

Benchmark Saturation and Taste Test

When all frontier models score similarly on eval suites, the only meaningful comparison is subjective visual/aesthetic quality. Side-by-side naive prompts become the new benchmark.

Steal for: Any model comparison video or selection argument
12:55 · model

4-Path GLM Procurement Map

  1. z.ai coding plan (flat monthly)
  2. OpenRouter (per-token, auto-routes)
  3. Direct hosts: Fireworks / DeepInfra / GMI
  4. Self-host quantized local

Four distinct ways to access GLM-5.2, ranging from managed flat-monthly to fully local.

Steal for: Any model access or cost-optimization breakdown
07:00 · concept

ANTHROPIC_BASE_URL Swap Pattern

Any API that follows Anthropic's spec can be injected into Claude Code by overriding ANTHROPIC_BASE_URL plus ANTHROPIC_MODEL in a shell alias. One alias per model gives instant switching between models inside the same harness.

Steal for: Developer tutorials on multi-model Claude Code setups
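The swap pattern can be sketched in a few lines. This is a hedged illustration, not the exact alias shown in the video: ANTHROPIC_BASE_URL, ANTHROPIC_MODEL, and the openrouter.ai/api endpoint come from the breakdown above, while ANTHROPIC_AUTH_TOKEN as the place to pass the key, the OPENROUTER_API_KEY variable name, and the model slug z-ai/glm-5.2 are assumptions to check against the Claude Code and OpenRouter documentation. The alias reads the key from an environment variable instead of embedding it in the text.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api"  # endpoint named in the video
GLM_MODEL_SLUG = "z-ai/glm-5.2"  # hypothetical slug; confirm on OpenRouter


def build_glm_env(api_key: str, model: str = GLM_MODEL_SLUG, base_env=None) -> dict:
    """Return a copy of the environment with the Claude Code overrides applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = OPENROUTER_BASE_URL
    env["ANTHROPIC_AUTH_TOKEN"] = api_key  # assumption: key is passed via this variable
    env["ANTHROPIC_MODEL"] = model
    return env


def alias_line(name: str, model: str = GLM_MODEL_SLUG,
               key_var: str = "OPENROUTER_API_KEY") -> str:
    """Render a shell alias that expands the key from key_var when it is invoked."""
    return (
        f"alias {name}='ANTHROPIC_BASE_URL={OPENROUTER_BASE_URL} "
        f'ANTHROPIC_AUTH_TOKEN="${key_var}" '
        f"ANTHROPIC_MODEL={model} claude'"
    )


# Line to paste into a shell profile such as ~/.zshrc.
print(alias_line("glm"))
```

The dictionary from build_glm_env can also be handed to subprocess.run(["claude"], env=...) to launch the harness from a script, and calling alias_line with a different name and slug gives one alias per model.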
CTA Breakdown

How they asked for the click.

VERBAL ASK
13:57 · product
If you guys like this sort of thing and wanna learn how to use these models for financial purposes, definitely check out Maker School. It's my daily accountability program that shows you how to acquire a customer that pays you money for one of these services within ninety days or your money back.

Triple-CTA stacked at the end: Maker School (flagship program), LeftClick (agency), Clarivo (SaaS). Stated as an afterthought after 14 minutes of pure value delivery — no pressure, just a mention.

Storyboard

Visual structure at a glance.

00:00 · hook · open
00:12 · value · demo gallery
01:03 · promise · 7 categories
01:15 · value · nebula demo
03:18 · transition · roadmap
06:01 · value · setup diagram
07:47 · value · exa fix
09:16 · value · harnesses
12:09 · value · where to get
13:57 · cta · CTA