A 23-minute supply-chain autopsy explaining why Elon's reckless GPU overbuy is now the most valuable compute position in the world.
Posted
2 days ago
Duration
Format
Essay
educational
Views
69.4K
2.5K likes
Big Idea
The argument in one line.
The AI compute crisis is not one bottleneck but five that must scale in lockstep, and whoever locked in allocation years early — SpaceX and OpenAI — now holds leverage over every company that didn't.
Who This Is For
Read if. Skip if.
READ IF YOU ARE…
You work in software and want to understand why AI rate limits and API costs keep tightening despite companies raising billions.
You are curious why Anthropic's Claude limits feel stingier than OpenAI's despite similar model quality.
You build on cloud infrastructure and want a mental model of why GPU availability and pricing are so volatile.
You follow the AI industry and want a grounded explanation of the Google/Anthropic SpaceX compute deals.
SKIP IF…
You are looking for investment analysis or stock picks — this is an explainer, not financial advice.
You already have a deep background in semiconductor supply chains.
TL;DR
The full version, fast.
Every major AI company, including Microsoft, Google, and Anthropic, is currently capacity-constrained not by model quality but by raw compute availability. The supply chain runs through five choke points that all have to expand together: TSMC silicon fabrication (8-10 year lead times), high-bandwidth memory (only three manufacturers, all pivoting away from consumer products), hard drives (Western Digital sold out through 2026), power grids (US grid additions have trailed China's every year since 2016), and manufacturing time itself. Elon bet on this bottleneck early and overbought compute for SpaceX and xAI; when xAI underperformed, he started renting the capacity to Google and Anthropic for roughly $1B/month each, enough to pay back the $3-4B Colossus build in about three to four months of a single contract.
Opens with the shared constraint across Microsoft, Google, and Anthropic; WhisperFlow dictation sponsor demo
03:16 – 04:13
02 · The news trigger
Google paying SpaceX $920M/month for compute is the event that prompted the video; Anthropic's $1B/month deal is the backdrop
04:14 – 05:04
03 · Framing: the compute crisis
Names the core problem: demand has outpaced supply at every layer of the stack
05:04 – 07:21
04 · Layer 1: TSMC fabrication
How TSMC works, Apple's early lock-in, Nvidia's growing share, 8-10 year build timelines
07:21 – 09:45
05 · Layer 2: High-bandwidth memory
Only three manufacturers; Micron shut down Crucial consumer brand; consumer allocation shifted entirely to data center
09:46 – 11:41
06 · Layer 3: Hard drives
Western Digital sold out through 2026; prices doubled since 2024; consumers priced out by data center demand
11:41 – 13:15
07 · Layer 4: Power grid
US power grid growth lags China's dramatically; commercial demand will exceed residential in 2027 for first time on record
13:15 – 17:58
08 · Why can't we just make more?
8-10 year fab lead times; TSMC's RROD history shows how hard this is; all five layers must scale in sync
17:58 – 21:03
09 · Why SpaceX isn't affected
Elon overbought compute for xAI; Grok underperformed; now renting Colossus to Anthropic and Google at $1B/month — self-financing in 4 months
21:03 – 23:21
10 · The only real winner: Nvidia
OpenAI also bet early; Anthropic and Google are paying for their conservative bets; Nvidia wins regardless of how individual bottlenecks resolve
Atomic Insights
Lines worth screenshotting.
Microsoft, Google, and Anthropic are all revenue-constrained by compute availability, not by market demand or product quality.
Micron shut down its Crucial consumer memory brand entirely, reallocating all production to data center and GPU memory.
Western Digital was sold out of hard drives for all of 2026 as of February, with most of the year still ahead.
In any given year since 2016, China has added more to its grid than the US has managed since the 1990s.
New semiconductor fab capacity takes 8-10 years to build; TSMC allocation is sold years before the presses exist.
Apple locked in TSMC allocation so aggressively early that it remains relatively insulated from the GPU shortage while everyone else scrambles.
SpaceX's Colossus cluster cost $3-4B to build and now rents for $1B/month: a full payback in three to four months at that rate.
Google, which manufactures its own TPUs and even rents compute to Meta, still had to go buy capacity from SpaceX.
Anthropic's rate limits are stingier than OpenAI's because OpenAI bet on compute scarcity years ago while Anthropic was conservative.
All five supply-chain layers must expand in sync — if TSMC 10x's production but HBM doesn't, the bottleneck just moves downstream.
The Xbox Red Ring of Death was caused by TSMC using a chip-sealing compound that failed at temperature — an early signal that their process wasn't production-grade.
Nvidia cannot make enough GPUs to meet current demand and benefits whether individual bottlenecks resolve or not, making it the clearest winner in the current moment.
Standard PCIe H100s are simply unavailable on the open market; demand has completely absorbed supply.
Hard drives that cost $170 for 16TB in 2024 now run $360 — more than double in under two years as data center demand crowds out consumer supply.
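The payback and price claims in the list above are simple ratios. Here is a minimal Python sketch of that arithmetic, using only figures quoted in the video. The function names are made up for illustration, and the payback estimate assumes rental income is the only cash flow (no power, staffing, or financing costs), which the video does not address.

```python
def payback_months(build_cost_usd: float, monthly_rent_usd: float) -> float:
    """Months of rental income needed to cover the build cost.

    Assumption: operating costs are ignored, so this is a best case.
    """
    return build_cost_usd / monthly_rent_usd


def price_multiple(old_price_usd: float, new_price_usd: float) -> float:
    """How many times the old price the new price is."""
    return new_price_usd / old_price_usd


if __name__ == "__main__":
    # Colossus: $3-4B reported build cost, $1B/month rental (figures from the video).
    for cost in (3e9, 4e9):
        print(f"${cost / 1e9:.0f}B build -> {payback_months(cost, 1e9):.1f} months")

    # 16TB hard drive: $170 in 2024 versus $360 now (figures from the video).
    print(f"16TB drive price multiple: {price_multiple(170, 360):.2f}x")
```

On these numbers the payback window is a range of three to four months rather than a single figure, and the drive price is a little over double.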
Takeaway
Five bottlenecks that must all move together.
WHAT TO LEARN
The AI compute shortage is not a single supply problem — it is five interdependent constraints that only clear when every layer scales at once.
Semiconductor fabrication has an 8-10 year lead time from decision to volume production, meaning today's shortage reflects bets not made a decade ago.
High-bandwidth memory production is controlled by only three companies worldwide, all of which have pivoted sharply away from consumer products toward data center supply.
Hard drive and storage availability has become a binding constraint on GPU deployments — companies cannot run GPUs they own if they can't source enough storage to feed them.
US power grid growth has fallen so far behind that commercial electricity demand is expected to exceed residential demand for the first time in 2027 — a structural limit on data center expansion.
When all five layers are constrained simultaneously, any single company that solved one layer early captures outsized leverage over those who didn't — as SpaceX's $1B/month rental income illustrates.
The companies with the most generous AI API rate limits today (OpenAI) are the ones that made aggressive compute bets years before demand materialized, not the ones with the best models.
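The "all layers must scale in sync" point above can be sketched as a toy model in which deployable compute is capped by the tightest layer. This is an illustrative simplification, not a model taken from the video: the capacities are hypothetical normalized units, the names `bottleneck` and `scaled` are invented here, and the fifth constraint (build time) is left out because it is a delay rather than a capacity.

```python
def bottleneck(capacity: dict[str, float]) -> tuple[str, float]:
    """Return the tightest layer and the output it allows.

    With ties, the first tied layer in dict order is reported.
    """
    layer = min(capacity, key=capacity.get)
    return layer, capacity[layer]


def scaled(capacity: dict[str, float], layer: str, factor: float) -> dict[str, float]:
    """Copy of `capacity` with one layer multiplied by `factor`."""
    return {**capacity, layer: capacity[layer] * factor}


# Hypothetical baseline: every layer supports 1.0 unit of deployed compute.
BASELINE = {"fab": 1.0, "hbm": 1.0, "storage": 1.0, "power": 1.0}

if __name__ == "__main__":
    fab_10x = scaled(BASELINE, "fab", 10)
    print(bottleneck(fab_10x))      # fab grew 10x, output is still capped elsewhere

    fab_hbm_10x = scaled(fab_10x, "hbm", 10)
    print(bottleneck(fab_hbm_10x))  # the cap moves on to storage and power

    everything_10x = {name: value * 10 for name, value in BASELINE.items()}
    print(bottleneck(everything_10x))  # only now does output actually rise
```

Taking the minimum is the simplest way to express a serial constraint: scaling one layer by 10x changes which layer is reported as the bottleneck, but not the output, until every layer has moved.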
Glossary
Terms worth knowing.
TSMC
Taiwan Semiconductor Manufacturing Company — the dominant contract chipmaker that fabs silicon for Apple, Nvidia, AMD, and nearly every other chip designer. Companies submit designs; TSMC handles the physical manufacturing.
High-bandwidth memory (HBM)
A type of stacked DRAM packaged right next to the GPU die that provides the massive memory bandwidth AI workloads require. Only SK Hynix, Samsung, and Micron produce it.
Colossus cluster
xAI's large-scale GPU supercluster built to train and run Grok. The initial phase reportedly cost $3-4B and is now being rented to Anthropic and Google for ~$1B/month each.
Allocation
A pre-purchased commitment to a set amount of manufacturing output from a fab or memory maker, often bought years in advance. Companies like Apple hold multi-year TSMC allocations that insulate them from spot shortages.
Fab
Short for fabrication plant — the physical facility where semiconductors are manufactured. Building a new fab takes 8-10 years from decision to volume production.
“Everyone's compute constrained except for one company — SpaceX.”
Clean standalone thesis, zero context needed → TikTok hook
21:10
“In four months, it pays for itself.”
Punchy financial punchline on the Colossus rental math → IG reel cold open
20:20
“Google sells compute and they sell models. So despite the fact that they sell and rent compute to companies like Meta, they are still so low they have to go buy it from companies like SpaceX.”
Irony lands hard: the compute seller is itself supply-constrained → newsletter pull-quote
21:53
“That means that 3% of Google's total revenue is now going to SpaceX — one of their competitors.”
Concrete percentage makes the absurdity tangible → TikTok hook
22:33
“The only real winner in all of this... is Nvidia.”
Satisfying payoff to the entire essay → IG reel cold open
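The 3% figure quoted above can be checked with one line of arithmetic. This is a rough Python sketch using the numbers stated in the video ($920M to $1B per month, roughly $400B in annual Google revenue); `annual_share` is a name made up for this example.

```python
def annual_share(monthly_payment_usd: float, annual_revenue_usd: float) -> float:
    """Fraction of annual revenue consumed by a fixed monthly payment."""
    return monthly_payment_usd * 12 / annual_revenue_usd


if __name__ == "__main__":
    revenue = 400e9  # Google's annual revenue as stated in the video
    for monthly in (920e6, 1e9):
        share = annual_share(monthly, revenue)
        print(f"${monthly / 1e6:,.0f}M/month -> {share:.2%} of revenue")
```

The rounded $1B/month figure gives exactly 3%, while the reported $920M/month comes out slightly lower, so "3%" is best read as an approximation.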
The Script
Word for word.
00:00Do you know what Microsoft, Google, and Anthropic all have in common? I'll give you a hint. It's the title of the video.
00:05They're all massively constrained by compute. The CEO of Microsoft said directly that they grew a ton in Q1, but the problem that they have is that even with the additional data center capacity they're bringing online, they expect to remain capacity constrained through the first half of the fiscal year. That literally is them saying we cannot make more money because we don't have enough compute for it.
00:26Here is Sundar, the CEO of Alphabet and Google, saying directly that they are supply constrained even as they're ramping up capacity. This is a company that makes their own chips and they still don't have enough. That's why Anthropic partnered with a company they hate, SpaceX.
00:41The same Anthropic that banned xAI and SpaceX from using their models because they were concerned about distilling is now paying a billion dollars a month to use SpaceX's spare compute. And apparently, Google thought this was a good idea because now they're paying SpaceX $920 million a month for compute as well because everyone is desperate for GPUs.
00:59Everwrite checks, standard H100s on PCIe are just not available anymore because the demand is so insane. What's even crazier is this goes beyond GPUs, with companies like Western Digital, the hard drive company, sold out for all of 2026 as of February.
01:14Things are pretty crazy with compute right now, and I don't think most people understand the severity of the problem. I wanna do my best to break this all down for you guys, but if I wanna be able to afford the RAM that I just bought, I need to take a quick break for today's sponsor.
01:28Today's sponsor is without question the single product that people ask me about the most in my comments section. You've seen me use it before. It's WhisperFlow.
01:35Wait, that's ChatGPT. Isn't there already a voice to text thing there? There is, but it's nowhere near as useful as WhisperFlow.
01:41Just watch. I'd like you to explore the T3 Code codebase at T3 Code URL and tell me what you think. Notice what it did there.
01:50It replaced the words T3 Code URL with the GitHub URL. That's not a thing it will do by default, although it does do a lot well. Like it auto-capitalizes things correctly, it knows proper nouns, it learns from the mistakes.
02:01When you go back and make changes, it saves those. It's awesome. What it's way cooler for is the ability to set up snippets, and I use these all the time.
02:09I can even do silly things like, who is today's sponsor? Today's sponsor is WhisperFlow, is what it puts out instead. I'm a fast typer, so I didn't think WhisperFlow would be for me, but the more I use it, the more I realize it makes me better at prompting.
02:21I'll actually talk a bit more and go in detail in ways I wouldn't have when I'm using my keyboard, but even cooler is when you combine that with the power of their snippets. I also don't like loading up my projects with skills, because I find when I do, the agents reach for them when I don't want to. So treating snippets as a way to save a bunch of skills has been really cool.
02:39For example, grill me skill. It just pasted the Matt Pocock grill me skill. So I don't have to have it saved in the project, I can just type it, hit enter, and now it's going to grill me.
02:48I could go a bit further if I want. For example, g stack office hours. Yes, I pasted the entire g stack office hours skill into my WhisperFlow so I can say three words and have it run.
02:59Because you should feel a little guilt before you run it. There's so many little things WhisperFlow gets for developers, like their way of hacking Cursor to make the file paths work is crazy.
03:09They're a really good team, they built really good software, I didn't think I would like it, and I ended up loving it. See why I'm so hyped at soydev.link/whisperflow. The reason I'm making this video today is because of the Google SpaceX deal.
03:20The Anthropic SpaceX deal surprised me, but like Anthropic's not a compute company, they're a model company. Makes sense that they didn't have infinite GPUs.
03:31Google is making their own compute. Google manufactures their own TPUs.
03:36They build the chips that they plan to run their inference on. In fact, in February of this year, Meta signed a multibillion-dollar deal to rent chips from Google. So while Anthropic was buying from a competitor, because like SpaceX is Grok, they make AI models, Anthropic is Anthropic, they make AI models, they weren't buying the thing they compete with.
03:55They're buying the tools they need to increase the competitive nature of their product. Google sells compute and they sell models. So despite the fact that they sell and rent compute to companies like Meta, they are still so low they have to go buy it from companies like SpaceX.
04:11And that is how we got here, the compute crisis. The amount of compute available has gone up meaningfully year over year, but nowhere near as fast as the demand has gone up.
04:23There are many layers to this problem. Obviously, there is the massive demand, but there's also the complicated supply chain problem here. Because making GPUs is not trivial.
04:33One of the other severely underrated problems here is actually power availability. Because as more compute comes online, we need more power for it.
04:41I'm gonna do my best to visualize this problem, but it's admittedly going to be difficult. So bear with me as I try to figure this out. We're gonna go through the layers of how sand effectively becomes the prompts that you're writing and getting responses to.
04:54We're gonna start a little higher up the stack than we probably should. Like, I'm tempted to go into quartz, but we'll avoid it. And we'll start with where most of the things that matter do.
05:03TSMC. TSMC is the Taiwan Semiconductor Manufacturing Company. It was formed by a person who used to work at Texas Instruments in the US who left back to his home country of Taiwan in order to build better manufacturing of semiconductors as a generic layer for other companies.
05:18Previously, companies would make their own semiconductors and fab their own, like, process and also make the processors themselves.
05:26But TSMC doesn't sell something you buy as an end user. You can't put a TSMC chip into your computer.
05:32You give TSMC the plans on how you wanna manufacture your chip and then they help you with the manufacturing process for it. So every company doing compute now from Apple to Nvidia to AMD to Intel works with TSMC to fab the silicon that they use for their chips.
05:49Apple is one of the companies that bet on them biggest initially, and others have slowly started to realize TSMC's manufacturing is just far beyond anywhere else in the world and have relied on it more and more heavily as a result. As I mentioned before, some amount of their allocation is already purchased upfront by Apple. Apple's historically been such a big customer of TSMC and has so many crazy deals with them.
06:10They have managed to hold strong with their allocation for a while. That's why they are not having the issues with making new computers or manufacturing new phones that a lot of other companies have. Because this particular spot in the pipeline and in the supply chain is really strongly purchased and agreed upon for them.
06:29So Apple is still relatively marked safe, at least in this layer. Don't worry though, that will change as we go. The rest of this was split across lots of other companies.
06:39But over time, the section of this that is for Nvidia has grown massively. So every couple months, Nvidia wants to increase their manufacturing more, and as a result, the amount of this that belongs to them gets bigger and bigger.
06:53Let's just say it's like the majority here, and then whatever's left is everyone else. So Apple gets their little share here, Nvidia has a big chunk here, and then there is whatever's going on up here. This is just one of the things Nvidia needs to make a GPU though.
07:07So right now, the size of how much Nvidia can do is at best this big. Because this is the amount of TSMC manufacturing they have. This is how much they can do at best.
07:17But there are other things they need in order to make their GPUs. Because not only do they need all of the TSMC manufacturing for it, they also need memory.
07:27And high bandwidth memory manufacturing is a very interesting space. Because historically, like, it was important to have good RAM, and there was lots of companies that would purchase from the high bandwidth memory manufacturers. Most of the time though, their work was going into consumer devices.
07:44They would sell NAND chips that could be used for RAM or for SSDs, and they would sell that to companies that needed RAM for their devices, whether it was Qualcomm to put in phones or Apple to put in phones, or if it was to companies like Crucial or SK Hynix or whoever else that makes memory for users to put into their computers, or if it was to Dell to make memory chips that they would put in their computers.
08:06The high bandwidth memory chips were a very diverse set of places they would go, but there was really only three manufacturers that mattered.
08:16SK Hynix, Samsung, and Micron. Historically, they have split their allocation across lots of different groups.
08:23But now the demand for things like GPUs is so absurdly high that they have reallocated entirely.
08:31SK Hynix maintained a consumer brand of memory called Crucial, and most of the RAM in most of my computers here is from Crucial. Crucial no longer exists.
08:41Sorry. It was Micron, not SK Hynix. My bad on the memory there.
08:45There's three of them. Sorry. I made a mistake.
08:47Micron was the owner of Crucial. And Micron has decided to shut down Crucial. Micron has made the difficult decision to wind down the Crucial consumer business.
08:56Micron will ship Crucial consumer products through February of this year with warranty and support continuing. Micron Crucial consumer products may continue to be available for purchase from distributors and resellers for some time. That is the case.
09:07I bought 64 gigs of Crucial memory a few days ago. It was very expensive. So all of this allocation used to be split across consumer, manufacturing of consumer hardware, lots of other devices, and then data center use cases and GPU use cases.
09:23Since AI requires so much RAM to run, even smaller models like DeepSeek V4 Flash require over 100 gigs of RAM. Memory is super valuable to these businesses. So the need for it has skyrocketed and with that, prices have skyrocketed as well.
09:38So Nvidia needs an allocation of this as well. And the price of that allocation is skyrocketing massively.
09:46I will say that the numbers in these charts, like the percentage split here, are not super accurate. TSMC has reported that Nvidia is only about a fourth or so of their manufacturing allocation right now.
09:57So this is a huge chunk here. This is just meant to emphasize the point not to be literally TSMC's majority Nvidia.
10:03Just trying to make this as easy to visualize as possible. Give you some creative wiggle room, okay?
10:08So we have these two key components that are necessary for NVIDIA to be able to make GPUs. But there are other layers here as well. And not all of them are directly in front of NVIDIA either.
10:18Because in order to run those NVIDIA GPUs, you need a few other things. First, you need hard drives.
10:24You need somewhere to store the data that these GPUs are actually operating on. And apparently, the demand for hard drives is skyrocketing like never before. Last year, I bought four 16 terabyte hard drives refurbished from ServerPartDeals, one of my favorite sources for hard drives, for about $170 each.
10:42Worse 16 terabyte drives than what I have in my NAS right now are going for $360, more than 2x what I paid in 2024. Insane.
10:54I just bought a handful of 28 terabyte drives, and they were, like, 600 plus each.
11:01I spent like $3,500 on hard drives recently. It's crazy.
11:05So hard drive manufacturing is also very tight right now and something that like consumers have barely even needed for a while because we all moved to flash storage.
11:14Hard drives are still useful for lots of big archival stuff, especially when they were cheap. But now hard drives are like as expensive as SSDs were not long ago, which is unbelievable to me.
11:26So if a company is able to buy all the NVIDIA GPUs they need, but they can't get enough hard drives to actually run them, then they're not gonna buy those GPUs because they're making all of these decisions upfront. So if there aren't enough hard drives, then NVIDIA's manufacturing capabilities barely even matter anymore.
11:41But then, all of this gets bottlenecked by yet another layer, power. How much power is available? Power grids are struggling right now.
11:49Electricity demand growth is led by an increase in the commercial sector, which is expected to outpace residential demand in 2027 for the first time on record. We are now at the point where industry use of power in the US is higher than consumer residential use.
12:06This is why these big compute companies like Microsoft with Azure are starting to invest in power, like actual introduction of new power plants to the grid, trying to get nuclear energy unblocked and more. And when you compare the rate of electricity generation in the US to China's, you see how much we have to catch up here.
12:28We are not introducing more power to the grid anywhere near fast enough. The amount of increase to the grid that China does in any given year from 2016 onwards is higher than we have done since the nineties.
12:42We should still be focused on making things more efficient, but we also need to have more energy and ideally more clean energy from resources that we know we can get it from. This chart scares me. I don't think we've properly prepared our nation for the increasing demand of power to get where we want to in the AI race.
13:00And if any one of these sections gets any smaller, it effectively works as a filter preventing NVIDIA from selling more GPUs and preventing AI businesses from being able to grow and increase the number of customers they have. Everyone's compute constrained except for one company, SpaceX.
13:18And this raises a couple important questions. One is, why isn't SpaceX affected?
13:24Why are they so capable of having all this spare compute that none of their competitors have? And then we have question two, which is why not just make more? Like, why can't we just create more high bandwidth memory?
13:36Why can't we just fab more silicon? Why can't we just make more GPUs or create more hard drives or increase the power availability by making more power grids? Why can't we just make more?
13:46Well, as much as AI has accelerated our ability to shit out new software, it has not made it particularly faster to break ground and create new manufacturing for things like silicon.
13:58Estimates are that additional fabrication capabilities can take as much as eight to ten years to build up. When they decide they wanna make more chips and they wanna do more manufacturing, they have to start planning eight plus years ahead, and they're often selling the allocation off of these theoretical presses that don't exist yet, six to eight years ahead.
14:20Apple's deals are insanely long term in that regard. Despite the insane growth we've seen at companies like Nvidia, TSMC has only seen about a 40% growth in their revenue year over year.
14:32Not because the demand isn't massively skyrocketing, simply because they can't supply the demand as it skyrockets. People are buying out allocation years in advance because they have to, because they can't get it right now.
14:46There's a reason TSMC is the only company doing this well, and it's because they invested really heavily, really early, and it took them a decade and a half of failures to get there. Ready for a really funny fact about TSMC that not a lot of people know?
15:00Remember this? Bet you didn't know this was TSMC's fault. Microsoft and Nvidia were two of the first companies to bet heavily on TSMC manufacturing.
15:10TSMC used an external vendor for the, effectively, the glue that they would use to seal the chip.
15:18And that manufacturer was not correct about the thermal range in which that piece actually operated properly.
15:26And the cause of the Red Ring of Death was that getting really hot and then cooling over and over again would cause that glue to loosen, causing the chip to come slightly off of the slot that it was meant to be in. That's also why you could wrap your Xbox in a towel and turn it on until it overheated or throw it in your literal oven, and that would temporarily fix it.
15:45Because when the glue got warmed up again, the chip would fall back into the slot it's supposed to be in. But then when the glue got cold, it would pull the chip out of the slot. And that was because TSMC didn't have good enough process to detect these types of failures in their manufacturing.
16:00This was early in their history. So the Xbox failed so Nvidia could win. As silly as that is, it just took a long time for this company to get their shit together.
16:10Not that long ago, TSMC's process was not thorough enough because they were still new and getting all this shit right is hard.
16:17And the result was that they weren't even reliable enough for game consoles. Now we're relying on them to power the entire fucking world. And the same goes for all of these other sources of manufacturing.
16:28Things like high bandwidth memory are not easy to produce. There's a reason only three companies in the world can do this. And those three companies are also making all of the chips that go into all the SSDs that we use, all of the phones that we use, all the SD cards and CFast cards that we use for our cameras and things.
16:44All of that is made by three companies. And those three companies can make the same thing for Nvidia instead. Why would they sell us consumer chips that sell okay at reasonable prices when they could sell way more to Nvidia?
16:59So to put it simply, we are trying our hardest to manufacture more, but it's going to take a long time. And if you get your bets wrong here, you're kind of screwed. If TSMC ramps up production, so they're making 10 times more silicon, but the hard drive sector doesn't pick up enough, or HBM doesn't pick up enough, then they just spent billions of dollars spinning out fabs for demand they no longer have because everybody is constrained.
17:25I guarantee you, if Microsoft could snap their fingers and spend three times more money to get two times more compute, they would do it immediately, but they can't because every single one of these bottlenecks needs to be resolved together. If TSMC does 10 x production, we just get constrained on HBM.
17:41If high bandwidth memory also 10 x's, then we get constrained on hard drives and power. We need to bump everything up. And if any company bets too hard on one specific piece that they are in this puzzle, they get screwed.
17:55And this also ties into the first question of why isn't SpaceX affected? Because SpaceX and Anthropic are kind of opposites here. Anthropic had the concern if they overbuy GPUs and they don't have the demand for their inference or the models don't work as good with scaling laws as they hope where more compute means better model.
18:14If any of that goes wrong and they purchase too much compute, they're out of money and they fail. Anthropic was a little conservative with their compute bets last year, and that has screwed them because now the compute they could have bought last year isn't available anymore.
18:27Elon had a lot of conviction about compute becoming a bottleneck. He believed this was going to be a really big deal. So he overbought compute for SpaceX and Grok and xAI.
18:38That went kind of poorly for them because Grok just didn't do great. A lot of the best researchers that were at xAI have since left. The progress they're seeing just isn't great.
18:49But they already bought all of this compute. Thankfully, they learned how valuable that is and that if they can't use it, someone can.
18:56And now this compute that they spent a lot of money on, let's see. How much did xAI spend on Colossus 1?
19:04Apparently, building the initial phase for the Colossus deployment was only $3 to $4 billion. They are now renting that compute for $1 billion a month.
19:14In four months, it pays for itself. Another one of the reasons xAI was actually kind of well equipped for this is the power constraints, because Elon, with Tesla, knows a lot about power and was able to make deals with Tesla battery manufacturing in order to make sure that their power would be reliable. And if they were ever constrained by the grid, they would have enough backup power to last for some amount of time.
19:34They also did crazy stuff like gas powered generators, which is funny from the guy who made the electric car company that made electric seem much more viable, to then go burn a bunch of gas in order to power his GPUs to make racist AI. But at least they're making money off it now. So it turns out that compute was as much a bottleneck as Elon had predicted, but their need for it didn't go up as much as he predicted.
19:56So in order to make money off that bet, he's now reselling it to companies like Anthropic and like Google, which is still just so crazy to me. Like, I honestly thought it was a troll when I saw this post today that SpaceX is now doing a billion dollars of compute a month through Google. That Google's paying them a billion dollars every single month, $12 billion a year almost, just for compute.
20:16Google's revenue last year was $400 billion. That means that 3% of Google's total revenue is now going to xAI, is now going to SpaceX, is now going to one of their competitors. This is up there with Google's deal with Apple, where they pay Apple billions of dollars a year to be the default search engine.
20:35Like, this is that level of crazy. The point I'm trying to make here is that however bad you think the compute crisis is, it is probably worse.
20:43Whenever you think you see these companies pinching pennies because they wanna try and squeeze more money out of you, it's probably not that. They're probably just dealing with the compute crisis because they just don't have enough compute available for all the demand they're getting. And if you think they can just fix this by making more stuff, they kind of can if everybody makes exactly more enough and the demand sustains for long enough.
21:04But right now, the winner is whoever has the GPUs. And at this point in time, that winner is funny enough of all companies, apparently SpaceX, but also OpenAI, who I've managed to not mention at all so far.
21:17 Because OpenAI made the bet a couple years ago that compute would matter, that scaling laws would matter, and would go and buy all the compute they possibly could. This is the real reason why Anthropic's rate limits are so much less generous than OpenAI's are.
21:31 They have been considerate of the compute crisis since before it even really started. And that put OpenAI in a really good spot. Google pretended they could work around it by making their own chips.
21:41 That didn't go great for them, and now they're stuck paying Elon for their mistake. Anthropic didn't want to overspend on compute. They screwed up, and now they're paying Elon for their mistake.
21:49 There's only really one winner in all of this. And that winner's NVIDIA. Because they know how in demand their stuff is and they literally cannot manufacture enough to keep up with the demand that exists.
22:00 They can't really make less money right now because as long as any of these other bottlenecks get resolved, the number of GPUs they can sell just keeps going up. They cannot make enough.
22:11 And if everything resolves, it does great for them. If most things resolve, it still does pretty great for them. So Nvidia, congrats.
22:18 You're gonna be holding your position in the stock market for a while, it seems, and this is not meant to be financial advice. This is just my read of how chaotic things are. I hope this breakdown is helpful as you question why your Claude Code keeps running out of usage every couple hours.
22:32 I think I've said all I have to on this one. If you wanna build a computer, now might not seem like the time, but it's going to get a lot worse before it gets better. So if you're staring at an SSD that you've wanted for a while or some RAM or a GPU, and you've been holding off hoping that prices go down, this is not financial advice, but realistically speaking, I don't expect this stuff to get cheaper anytime soon.
22:51 The demand is just too insane and the world is changing around us. I don't think our phones are going to keep getting faster and more powerful. I think they're gonna rely on the cloud more and more as the compute that we use every day gets centralized in these big players' hands.
23:04 The world's gonna look very different than it did when I was a kid building computers for my neighbors, and I don't know if I like that. I'm just trying to do my best to share where things are going so we can all understand and have better conversations about it. Let me know how y'all feel about the compute crisis and if you think I'm overblowing it.
Every major AI lab is short on compute, and the title of this video is the punchline: the one entity that isn't constrained is SpaceX, because Elon overbought GPUs years ago on a conviction most people thought was reckless.
Frameworks
Named ideas worth stealing.
04:14 model
The Five Bottleneck Stack
TSMC silicon fabrication
High-bandwidth memory
Hard drives / storage
Power grid
Manufacturing lead time (8-10 years)
Each layer of the AI compute supply chain represents a separate bottleneck. All five must scale together — if one expands while the others don't, the constraint simply moves downstream.
Steal for any systems-thinking breakdown of an infrastructure problem
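One way to picture the stack is as a chain whose output is capped by its tightest layer. The sketch below is a toy model under that assumption: the layer keys mirror the five items above, while the capacity numbers are hypothetical, unitless values chosen only for illustration.

```python
def binding_constraint(capacity: dict[str, float]) -> tuple[str, float]:
    """Return the tightest layer and the capacity it allows.

    Toy assumption: deliverable compute equals the minimum capacity
    across all layers of the stack.
    """
    layer = min(capacity, key=capacity.get)
    return layer, capacity[layer]


# Hypothetical, unitless capacities. These are not figures from the video.
baseline = {
    "silicon_fab": 100,
    "high_bandwidth_memory": 80,
    "storage": 120,
    "power_grid": 90,
    "manufacturing_time": 110,
}

print(binding_constraint(baseline))
# prints ('high_bandwidth_memory', 80)

# Expand only memory: total output barely improves, and the constraint
# moves to the next tightest layer.
memory_expanded = {**baseline, "high_bandwidth_memory": 200}
print(binding_constraint(memory_expanded))
# prints ('power_grid', 90)
```

In this toy model, more than doubling memory capacity lifts deliverable output from 80 to only 90, which is the sense in which all five layers have to scale together.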
18:38 concept
The Overbuy Paradox
Elon's aggressive GPU overbuy for xAI/Grok looked like a mistake when Grok underperformed. In a constrained market it became a leverage asset worth $1B/month in rental income.
Steal for contrarian bets on infrastructure scarcity
CTA Breakdown
How they asked for the click.
VERBAL ASK
22:18 next-video
“Let me know how y'all feel about the compute crisis and if you think I'm overblowing it. And until next time, peace nerds.”
Soft close — no hard ask, no subscribe push. Invites comment engagement.