Modern Creator
Theo - t3․gg · YouTube

I don't have time to build these things, will you

A 44-minute wishlist from a burned-out builder who wants solo devs to tackle the infrastructure problems that have gone unsolved for a decade.

Posted: 3 days ago
Duration: about 44 minutes
Format: Essay, comedic-rant
Views: 73.1K
Likes: 2.5K
Big Idea

The argument in one line.

AI agents have finally lowered the cost of building foundational infrastructure enough that a solo developer with patience can now rebuild npm, Git, mobile platforms, or team chat from scratch -- and someone probably should.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • A developer who feels stuck picking what to build next and wants a curated list of genuinely hard, genuinely needed problems.
  • A builder interested in developer tooling, package management, source control, or mobile platforms who wants to understand where the current tools are broken.
  • Someone who has already shipped a few projects and wants to level up to infrastructure-scale ambitions.
  • A solo founder who uses agents heavily and wants to understand the gaps agents expose in existing tooling.
SKIP IF…
  • You want a tutorial or step-by-step how-to -- this is a rant-wishlist, not a spec.
  • You are looking for beginner-friendly project ideas; every idea here requires substantial prior knowledge of the ecosystem being disrupted.
TL;DR

The full version, fast.

The video argues that the cost of rebuilding foundational developer infrastructure has collapsed thanks to AI agents, making previously impossible solo projects viable. It walks through six specific gaps: npm lacks security primitives, revocable releases, and meaningful risk signaling; Git has no concept of private files or granular permissions; no tool syncs a code folder across machines the way Dropbox syncs files; mobile development is so hostile that a generation of potential platform builders went to the web instead; team chat still uses the wrong primitive (messages instead of posts); and the benchmark ecosystem is too thin to capture how models actually fail in practice. The call to action is direct: pick one and build it.

Chapters

Where the time goes.

00:00 - 01:50

01 · Cold open -- the list

Personal framing: what we build matters more than how. Teases a hidden list of ideas on a Tldraw canvas.

01:50 - 04:00

02 · Sponsor -- CodeRabbit

CodeRabbit demo: change stacks that break large PRs into readable layers.

04:00 - 13:05

03 · Idea 1 -- Better npm and npx

Problems: security, unpublishable releases, name squatting. Solutions: revocable releases under a threshold, paid AI audits, richer NPX risk info, private registries.

13:05 - 23:07

04 · Idea 2 -- Better git

Git has no granular permissioning, no private files, commits are a bad abstraction. References JJ, worktrees, APFS performance disaster on Mac, and a move toward in-memory file systems for agents.

23:07 - 26:27

05 · Idea 3 -- Dropbox for devs

A unified code folder synced across all machines with lazy-pull semantics, so env vars and project structure are identical everywhere without submodule hell.

26:27 - 37:02

06 · Idea 4 -- New mobile platform

CyanogenMod and Paranoid Android as proof of what an open experimental mobile OS looks like. BlackBerry 10 as proof Android runtime compatibility is achievable. App Store and Play Store both hostile. Window closing as Android tightens.

37:02 - 41:30

07 · Idea 5 -- Better Slack

Slack optimizes for sending, not reading or prioritizing. Posts (not messages) are the right primitive. Facebook Workplace had it right and just shut down. Agents need a participatory context model.

41:30 - 44:17

08 · Idea 6 -- More benchmarks

Community-built benchmarks measuring real failure cases are underproduced. SkateBench as example. GitBench as a new example. Labs will optimize for any published score.

44:17 - 44:34

09 · Close

Acknowledges he probably won't try the things people build -- but if it gains adoption he will. Encourages boiling the ocean.

Atomic Insights

Lines worth screenshotting.

  • npm has no way to revoke a published package even if it was a typo, which is why a wrong version number on TanStack Query is now permanent.
  • The TanStack namespace on npm is not owned by Tanner Linsley -- a squatter holds it and sold it to a third-party company after Tanner refused to pay.
  • When you run npx, the only info you get before executing unknown code is a version number -- no size, no author, no permissions, no risk score.
  • Git is built around the assumption that the repo is the permission boundary, not the contents of the repo -- and that assumption is what makes .env files dangerous.
  • Linux security patches are exploited before they are announced because every agent scanning the public repo can identify a fix and reverse-engineer the vulnerability.
  • An M4 MacBook Pro takes 31 seconds to recreate a fresh pnpm install from cache; the same task takes 6.8 seconds on a mid-range AMD machine running Ubuntu with a standard SSD.
  • APFS, Apple's file system, performs catastrophically on workloads involving many small file writes -- the kind that cloning and installing npm packages produces.
  • CyanogenMod-era Android made it easier to fork the entire OS than to submit an app to the Play Store -- and that hostile environment pushed great developers to the web.
  • Paul Henschel (creator of Zustand, React Three Fiber, Poimandres) built a custom Android ROM before he ever shipped a web package, because app distribution was harder than OS hacking.
  • BlackBerry 10 proved that a non-Android OS can run Android apps natively, which means the ecosystem lock-in argument against a new mobile platform is no longer valid.
  • Facebook Workplace, which had the closest thing to a correct team-chat primitive, shut down two weeks before this video was recorded.
  • Slack is built for sending messages, not for reading them, prioritizing them, or letting agents participate meaningfully in the same context as humans.
  • A community-built skateboard-trick benchmark revealed meaningful differences between models on 3D spatial reasoning -- proof that weird niche benchmarks have real research value.
  • The best way to push a lab to fix a model weakness is to publish a benchmark that measures it -- labs will optimize for any score that exists.
  • Ideas are cheap; the bottleneck has always been the cost of execution, and agents are collapsing that cost for infrastructure-scale projects.
Takeaway

Six infrastructure problems that are finally small enough to build.

WHAT TO LEARN

The cost of rebuilding foundational developer tools has collapsed, but most builders are still thinking in terms of apps and features rather than platforms and primitives.

03 · Idea 1 -- Better npm and npx
  • npm has no revocation mechanism, no meaningful risk signaling at install time, and no enforcement against name squatters -- each of these is a discrete product, not just a complaint.
  • Agents executing NPX commands from skill files are a new attack surface: a malicious package takeover can silently execute arbitrary code inside an agent's workflow with no warning.
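The revocation threshold and the pre-run risk signal from this chapter can be sketched roughly as follows. This is a minimal illustration, not how npm or npx work today: `Release`, `RiskSummary`, `can_revoke`, and `agent_decision` are hypothetical names, the 100-install and five-hour limits are the figures quoted in the video, and the 0.8 score cutoff is an arbitrary assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds quoted in the video; a real registry would have to tune these.
MAX_INSTALLS_FOR_REVOKE = 100
MAX_AGE_FOR_REVOKE = timedelta(hours=5)


@dataclass
class Release:
    """Hypothetical record a registry might keep per published version."""
    name: str
    version: str
    published_at: datetime
    install_count: int


def can_revoke(release: Release, now: datetime) -> bool:
    """A release stays revocable while it is barely installed OR very fresh."""
    age = now - release.published_at
    return (
        release.install_count < MAX_INSTALLS_FOR_REVOKE
        or age < MAX_AGE_FOR_REVOKE
    )


@dataclass
class RiskSummary:
    """Hypothetical metadata a runner could show before executing a package."""
    unpacked_bytes: int
    last_publisher: str
    obfuscated: bool
    permissions: list[str]
    safety_score: float  # 0.0 (likely malicious) to 1.0 (likely safe)


def agent_decision(summary: RiskSummary, min_score: float = 0.8) -> str:
    """Return 'run' or 'ask_user' so an agent escalates instead of executing blindly."""
    if summary.obfuscated or summary.safety_score < min_score:
        return "ask_user"
    return "run"
```

The point of returning a decision string rather than raising is that an agent reading a skill file can surface the warning to its user, which is the behavior the video asks for.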
04 · Idea 2 -- Better git
  • Git's core failure is that it treats the repo as the permission boundary rather than individual files or changes -- env file hacks, secret managers, and split repos are all workarounds for this missing primitive.
  • APFS performs catastrophically on small-file-creation workloads; a mid-range Ubuntu machine with a standard SSD clones and installs a project up to five times faster than an M4 MacBook Pro.
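To make the "permission boundary inside the repo" idea concrete, here is a small sketch of per-path read rules. Git has no such feature; the `POLICY` format, the group names, and the `can_read` and `visible_paths` helpers are all assumptions made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: the first matching pattern wins, and any path that
# matches no pattern is readable by everyone with repo access.
POLICY = [
    (".env*", {"owners"}),
    ("packages/billing/*", {"owners", "billing-team"}),
]


def can_read(path: str, groups: set[str]) -> bool:
    """Check one path against the policy for a user in the given groups."""
    for pattern, allowed in POLICY:
        if fnmatchcase(path, pattern):
            return bool(allowed & groups)
    return True


def visible_paths(paths: list[str], groups: set[str]) -> list[str]:
    """Filter a checkout down to the files this user is allowed to see."""
    return [p for p in paths if can_read(p, groups)]
```

With a rule set like this, a committed `.env` file would simply never be materialized for a contractor, which is the workaround that secret managers and split repos approximate today.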
05 · Idea 3 -- Dropbox for devs
  • Syncing a code folder across machines with lazy-pull semantics -- load the files only when touched -- is unsolved and distinct from git submodules or cloud IDEs.
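Lazy-pull semantics can be illustrated with a tiny in-memory model: the folder knows every path up front but only downloads a file the first time it is read. `LazyFolder` and its `fetch` callback are hypothetical; a real tool would hook into the file system rather than a Python class.

```python
from typing import Callable


class LazyFolder:
    """Sketch of a synced folder that pulls file contents only on first read."""

    def __init__(self, manifest: list[str], fetch: Callable[[str], bytes]):
        self._manifest = set(manifest)      # every path is known immediately
        self._fetch = fetch                 # assumed remote download function
        self._cache: dict[str, bytes] = {}  # contents pulled so far

    def listdir(self) -> list[str]:
        """Listing is cheap: it never triggers a download."""
        return sorted(self._manifest)

    def read(self, path: str) -> bytes:
        """Fetch the file on first touch, then serve it from the local cache."""
        if path not in self._manifest:
            raise FileNotFoundError(path)
        if path not in self._cache:
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def materialized(self) -> list[str]:
        """Paths whose contents actually live on this machine."""
        return sorted(self._cache)
```

The design choice worth noticing is that the project structure is identical on every machine from the first second, while bandwidth and disk are only spent on files that are touched.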
06 · Idea 4 -- New mobile platform
  • The reason a generation of talented developers left mobile for the web is not because the web is better -- it is because the barrier to distributing something on mobile was higher than forking an entire OS.
  • Android's runtime has been successfully embedded in a third-party OS before (BlackBerry 10), which removes the ecosystem argument against a new open mobile platform.
  • The window for a new open mobile platform is closing as Android tightens its bootloader policies.
07 · Idea 5 -- Better Slack
  • Slack's thread model buries active conversations behind a time-ordered feed -- the right primitive is a post that resurfaces when replied to, with nested comments rather than flat threads.
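A post that resurfaces when replied to amounts to sorting the feed by latest activity instead of creation time. The sketch below assumes hypothetical `Post` and `Comment` types with integer timestamps; it is not modeled on any real Slack or Workplace API.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    at: int  # timestamp in seconds
    replies: list["Comment"] = field(default_factory=list)  # nested, not flat


@dataclass
class Post:
    title: str
    created_at: int
    comments: list[Comment] = field(default_factory=list)

    def last_activity(self) -> int:
        """Newest timestamp anywhere in the post's comment tree."""
        def latest(comment: Comment) -> int:
            return max([comment.at] + [latest(r) for r in comment.replies])

        return max([self.created_at] + [latest(c) for c in self.comments])


def feed(posts: list[Post]) -> list[Post]:
    """Order posts by most recent activity so a reply bumps the post to the top."""
    return sorted(posts, key=lambda p: p.last_activity(), reverse=True)
```

In a message-first model the old conversation would stay buried at its original position; here a reply three levels deep is enough to bring the whole post back into view.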
08 · Idea 6 -- More benchmarks
  • A benchmark measuring a specific model failure -- even a niche one -- creates pressure on labs to fix the weakness faster than any bug report or forum post.
Glossary

Terms worth knowing.

NPX
The executable runner bundled with npm that lets you run a package as a CLI command without permanently installing it -- e.g., npx create-react-app.
Name squatting
Registering a package name on a public registry with no real intent to maintain it, then holding it hostage to extract payment from the legitimate author or project.
JJ (Jujutsu)
An experimental source control system that replaces branches and commits with snapshots and tags, designed to eliminate much of the ergonomic friction of daily Git use.
CyanogenMod
A community-built custom Android distribution popular from roughly 2009-2016 that offered a cleaner, more customizable Android experience than manufacturer builds; succeeded by LineageOS.
Paranoid Android (ROM)
A custom Android ROM created by Paul Henschel (0xCA0a), later known as the creator of Zustand and React Three Fiber, before he transitioned to web development.
BlackBerry 10
BlackBerry's 2013 mobile operating system that included a full Android runtime, allowing Android apps to run on BlackBerry hardware without native ports.
APFS
Apple File System, used on macOS and iOS devices, which performs poorly on workloads involving creation of many small files simultaneously.
Just Bash
A JavaScript/TypeScript layer that emulates Bash in memory, allowing AI agents to run shell-like commands without requiring a real Linux kernel or file system.
Socket.dev
A security company that uses AI to detect malicious npm packages, often identifying exploits before npm's own security team does.
GitBench
A benchmark for measuring how well AI agents perform Git-related tasks, created by researcher Centimeters Griffin.
Facebook Workplace
A now-discontinued enterprise communication product from Meta that used a post-and-nested-comments model rather than Slack's channel-and-thread model.
Resources

Things they pointed at.

12:32 · product · Socket.dev
18:06 · product · Delta DB (Zed)
18:06 · product · Origin (Cursor)
19:13 · tool · Just Bash
38:43 · product · Facebook Workplace
42:37 · tool · GitBench
01:50 · product · CodeRabbit
Quotables

Lines you could clip.

01:02
Ideas are still cheap. I don't think I'm special for having a whole bunch of ideas. I just like building shit.
standalone opener -- no setup needed · TikTok hook
09:11
All I get is this random fucking version number. That's insane.
visceral reaction shot during NPX demo · IG reel cold open
14:30
The fact that all of these companies for managing secrets exist, even though in the end what it resolves to is just a fucking random file on your computer, shows that Git is failing us.
tight thesis -- works as standalone clip · newsletter pull-quote
29:48
Customizing Android itself was easier than building an app.
one sentence, punchy, universally relatable frustration · TikTok hook
32:01
Paul would've become a mobile dev if mobile dev didn't fucking suck.
emotional anchor for a great story -- lands hard even without context · IG reel cold open
The Script

Word for word.

00:00I spend a lot of time talking about the best ways to build. It's kinda my whole thing. I'm really nerdy about the details of how we architect software.
00:07But things have changed, and how we build matters less in a lot of ways. What still matters and arguably always has is what we're building. Deciding on good projects to work on is hard and I know a lot of people are scared that if they pick the wrong thing, they're just wasting their time and energy.
00:21I've always loved giving advice on how to avoid this and I usually focus on one specific thing. Solve problems that you actually have. When you start solving those problems, you'll find more and then the rabbit hole continues until eventually you find something actually really useful.
00:35And that's how I have managed to build such cool things throughout my career. I start with one problem and then I find another and then I find another and I keep yak shaving until eventually I've built something kinda cool. But in a world where ideas suddenly matter a little bit more, getting them right might seem even scarier than it did in the past.
00:51And I've read my comments section. I know how many devs are struggling to figure out what they should be building right now. I have good news for you guys.
00:58I have a list. I have been keeping a list of ideas that I wished somebody would build for a long time. And as much as I want to build them myself, I have to be realistic.
01:08I do not have the time. As powerful as agents are, they make any one of these projects viable for me. They don't make all of them viable for me.
01:16And I wanted to share some of these ideas because ideas are still cheap. I don't think I'm special for having a whole bunch of ideas.
01:23I don't really think I'm that special in general. I just like building shit. The point of this is that I've seen what you guys do when you're given ideas.
01:32I watched the chaos that unfolded when I mentioned how much I like Niri and how I want an environment like that to use for coding on my Mac. I've seen all of the insane things you guys are capable of building when given hints of the types of stuff that I wish existed. So I'm gonna stop giving hints.
01:47I'm just gonna give you all of it. I have all the ideas written down, but hidden behind this green wall. And to give you a taste of what I have in mind here, the first idea is better NPM.
01:59I have so much to say about all of these ideas, and maybe, just maybe, one of these will become an awesome product some of y'all can build. But in order for me to justify giving all of this away for free, I need to make a little bit of money off someone else's good idea, which I'm gonna do with a quick break for today's sponsor.
02:13Here's a hot take you might not expect from me. I think it's important that you continue to read your code, especially if you have agents writing it. But let's be realistic here.
02:20No one's reading a PR like this. 11,000 lines added?
02:24Come on. AI's resulting in way more of these big PRs, and sure, the AI can review it, but you need to know what's going on. Finding the things that matter in a pile of slop like this is nearly impossible.
02:34At the very least, this PR was reviewed already by CodeRabbit. Wait. What's that button?
02:39Review change stack? I've been waiting for this for so long. I've been complaining forever about how unreadable these giant PRs are and how stupid it is that we get the code in an alphabetically listed order.
02:51It makes no sense whatsoever. We've been dealing with this for years for no reason, and now CodeRabbit solved it. They break your PRs up into layers that prioritize the things that actually matter to make it way easier to read through what happened.
03:03You can mark sections as viewed as you go, which makes it way easier to track what you've actually looked at. There's even a mini map on the right side that shows you where different parts that matter are and describes them, which makes it so much easier to look through this type of work. And it's honestly just so nice to hover over the different things that PR did and get a summary of what's actually going on.
03:22Honestly, pretty much anything is better than GitHub's code review platform, but this is night and day difference. Read more code with less pain at soydev.link/coderabbit.
03:30We need to talk about better NPM, and also specifically better NPX, because I have so many thoughts here.
03:39NPM has problems. It is also an incredible piece of software, and I'm very thankful it exists. We should talk a bit about the problems with NPM.
03:48I'm gonna go through this a little fast, because remember, we have a lot of other ideas to go through as well. Obviously, one of the biggest right now that we are seeing every day is security.
03:58NPM, by nature, is going to be a big target for hackers and malicious actors. And every time they find a new way to exploit NPM, it becomes harder for good intentioned, good faith devs to use it as well.
04:11Every additional layer they put in makes life harder for us as devs, and I have been fighting so much to get packages added to NPM recently.
04:21It's regularly becoming the hardest part of the projects and the products that I'm building. That's kind of the other issue if we're being real here. Publishing is too hard.
04:30And it also has a lot of negative consequences if you get it wrong. For example, if you accidentally publish the wrong version number, you're never taking that down. That's out there forever.
04:41And even some really big package maintainers that have accidentally typoed a number on a version release, for example, something like TanStack Query has screwed this up, and now React Query's latest version is just not what it's supposed to be, and they are not able to revoke that publication because NPM is so paranoid about an old app that had specific dependencies not being able to be rebuilt because some of those dependencies vanished years later.
05:09These are all absolutely solvable problems. It's just that changing them right now with the way NPM works today would come with a lot of risks and potentially damage the ecosystem and the things we rely on.
05:20Just a handful of some things I would like to see in a new NPM platform. First off, it'd be really cool for revoking releases at a threshold.
05:30For example, if my package has been installed under 100 times or it's been up for under five hours, I should be able to revoke it, obviously. Similarly, due to the nature of these more malicious things that are being published, I should be able to pay to audit every release.
05:45Why can't I put in an API key for Anthropic or put in a credit card number in NPM so that they will audit and compare the diffs of every release and give their vibe check as to whether or not that release is safe or intended. On that note, we need way more visibility on what the packages do.
06:04Not just like what permissions they need, but things like is this an obfuscated package or is this de obfuscated? Is this readable JavaScript or not? Is it open source or not?
06:14Is it backed by people we know who published the last release? There should be more metadata associated with given packages, both in, like, the NPM site when you view it, but also in your own CLI when you install it.
06:25Let's say somebody published a malicious package on NPM named, I don't know, isodd with a zero instead of an o. There's, of course, the real isodd package, which you also shouldn't install.
06:38But the fact that it would be possible for somebody to make an isodd with the zero, that is a truly, at its core, malicious package that, like, reads your file system, that accesses your network, that does a bunch of sketchy shit it shouldn't. And you would not see anything different installing that than you would see installing this, is a fundamental design failure in NPM itself.
06:59Different packages that have different needs and different complexity come with different risk.
07:06And that risk should be upfront to you when you decide what you're doing. And on that note, people are already noticing something.
07:14Name squatting should be killed. There should be a person or a series of agents that will verify submissions requesting names that exist to be handed over to them that does a good enough job vetting that you can actually deal with this stuff.
07:29Fun fact, the TanStack package on NPM isn't owned by Tanner. It's owned by some dirtbag who was squatting on it, trying to get Tanner to pay a whole bunch of money, or when Tanner refused, he sold it to some cringey company, thirty Tools. Might even be his company.
07:42I wouldn't recommend clicking the link. It's probably a scam. This should be illegal.
07:47And any real open source platform that knows a fucking thing about the ecosystem should hard ban this. But NPM is too busy. Shut the fuck up.
07:56They're useless. There's no excuse for how NPM is not doing their jobs lately. The Ninja Squad needs to be absolutely decimated.
08:03And on that note, I wanna talk a bit about NPX, the executable part of NPM. Because I would argue, potentially controversially, that this is actually the more fun entry point to rethink things here.
08:15If I build a random script or a tool that my agents need that's a part of their skills or whatever, distributing that code is annoying. It's really annoying. And I have a bunch of packages that I don't actually expect people to ever install, but they are very useful when you use them over NPX.
08:33The idea of NPX as a shared executable layer, similar to how the browser lets you go to different sites to do different things, NPX lets you use different code to solve different problems.
08:46I really like the idea of going further with NPX. What do I mean by this? Well, if you run NPX now on any given thing, let's say NPX slot slop, for example. Since I've already used this package, it doesn't have to confirm anything and it's good to go.
09:04But if I change this to, like, at latest or something, it asks for permission. It needs to install the following package. Is it okay to proceed?
09:11I have to choose yes or no. How the fuck do I know anything here? What info do I get here?
09:16This is so insultingly useless. There is nothing I can do here to get any info on if what I'm about to do is safe or not.
09:24Imagine it gave how big this app is. Imagine if it gave the author who most recently changed something. Imagine it gave a score for how likely it thinks it's safe or not.
09:35Imagine it gave you the permissions that it has when it runs. All I get is this random fucking version number. That's insane.
09:43This is such a useful thing, and not just for us as humans running these things, by the way, for our agents too. One of the real concerns we have now, and we should hold this concern because it's very, very real right now, is that if your agents have commands they can run in a skill MD, for example. There's a skill that says, here is a thing you can do.
10:03Here's the command that you run, and the command is an NPX command, and then somebody maliciously takes over that package, your agent can entirely unknowingly execute malicious code.
10:14Imagine that this put out a little more info and your agent could read it and make a decision or highlight to you as the user, hey, I was supposed to run this command. I got a heads up, this might be an insecure thing. What do you wanna do about it?
10:27And again, since the value of this is largely small open source scripts, imagine that you can pay a small amount of money to have these things audited so that a security score comes up when people run them.
10:39If you ship a small bit of open source code and you pay 50¢ for an agent to read it and give a rough idea of what likelihood it is to be good or bad, and that is run on a third party that is verified, because again, you can't just run this on your own computer because you can fake those results, a rough idea of how risky every install is would be so useful.
11:00It would be so useful. And it would make me want to use NPX way more. If you combine this with private registries where you can run a command to get access to a bunch of packages that I have in my environment, that are not publicly released, but you're getting from me and my bucket, That would be so cool.
11:16Like, if you and I could both have our own TanStack package or something that might be different for our use cases, if I could publish to my private registry, and now that is the default over the public registry, that'd be great.
11:29There's so many little things here. The idea of shareable software in the form of packages is incredible, but the architecture we have for it assumes that every single package was really expensive to make, and has a maintainer that's willing to spend a lot of time dealing with NPM.
11:44That is stupid and wrong and really, really needs to be addressed. Doing this right requires a lot of pieces. You have to build the integrations for publishing.
11:52You have to build the place that they're published. You have to build the CDN where all of this code exists. You have to build the platform for verification, you have to build the registries, you have to build the CLIs, you have to build a lot of shit for this.
12:03But I think you can now. That's why I did the video about building bigger, because I want you guys to think this way. Rebuilding NPM made no sense before because it would be too expensive and probably wouldn't get users.
12:13Now it's a lot cheaper. Maybe go do it. One more note on this.
12:17There are already companies like Socket that figured out how to use AI to audit new releases and find these exploits when they happen. Socket figures out when NPM exploits happen before NPM themselves do. There is so much room to build better things on top of and instead of NPM.
12:35I hope more people take the opportunity. And that's just idea number one, guys. I know that one seems bold, so let's get to something a little more reasonable for the next one, like, I don't know, reinventing source control from scratch.
12:48I've been complaining about this one for a while. I don't think Git is the right abstraction for a lot of things. Git was so much better than all the source control we had before it that it became the standard, and it became a standard for a good reason.
13:02It is so much better than almost anything else that existed. But a lot has changed since Git was introduced, and both Git and GitHub feel like they're rotting at the core, because the needs they were built for are very different from the needs we have today.
13:16This is gonna sound like a really silly question, but hear me out. Why can't we commit dot ENV files? Think about this.
13:22Think about it deeply. There's an obvious answer here, which is because then everybody who has access to the repo has access to all of your sensitive environment variables. But why?
13:32Well, their answer is because that's just how Git works. When something's in the repo, now everyone has access. And once it's in the repo once, it's in there forever.
13:40If you decide to open source later, those environment variables are there. If you hire and then fire somebody, they had those environment variables and they probably still do because you probably had to give them the file anyways. When you let another team work on your project, even if they don't need the environment variables, if they're in the repo, that team has them.
13:55You could try to solve this in a gluey way by adding a service on top of your code base that just manages environment files and environment variables. And I know a lot of people are doing this, I know there's a lot of cool companies that have tried to build solutions there. But my argument is a bit different.
14:10The fact that all of these companies for managing secrets exist, even though in the end what it resolves to is just a fucking random file on your computer, shows that Git is failing us. This is just one of the ways in which Git is failing us right now. Why can't I have private files in my Git repo?
14:28Why can't I have some files that only certain people have access to and everyone else doesn't? Why can't I have a branch that is private? Why can't I have a pull request that is private until it merges?
14:38Or even better, why can't I delay when my merges go public or are seen by other people on the team? Why is there no concept of granular permissioning whatsoever on top of Git? There's a reason for that.
14:51The reason is because it was built for Linux. And none of that was necessary for the development of Linux. Now it kind of is necessary for Linux, though, because when there are critical bugs or safety and security issues in Linux and they get patched, everyone has agents running that read every patch and say, hey, is there anything in here that might have been a security fix?
15:09And now you're getting zero days before they're even announced. Imagine if the Linux team could merge a security fix, cut a release, send it to all of the people who are maintaining Linux distributions that are vulnerable to the exploit, and get it all patched before the code itself is even public.
15:24Is that true open source? Probably not. But I don't fucking care anymore.
15:29We're in the middle of a security crisis, and we're bickering over where we should store files still. What the fuck went wrong? Sorry for the slight crash out.
15:36I'm just really mad these problems aren't solved and could kill open source in spirit if they're not fixed. We need ways to securely merge code and cut releases without all of that code being visible to the whole world. The idea of private and public being a repo level setting instead of a change level option is insane, and Git itself is built deeply around the assumption that the repo is what has permissions, not the contents of the repo.
16:02There have been lots of attempts to explore what this could look like, from Delta DB over at Zed, to the new Origin stuff Cursor just released, but most of it is trying to do stuff like add more context for the agents to have, or stuff like making it easier to clone the repo so multiple agents can work on it in parallel.
16:19None of them are trying to address the fundamental problems within Git. I actually made a thread about this a few weeks ago, and I'm gonna read through it quick because I think it really showcases what I'm talking about here. I'm using my AI psychosis to fix clouds for agents.
16:32Someone else needs to use their psychosis to fix source control. I would do it myself, but I'm too deep on the cloud thing. GitHub is dying and Git is not the right primitive.
16:39I'll dump some thoughts here. First, I said open source should not always mean 100% of our code is public 100% of the time. How much energy do we have to put into preventing ENV leaks in source control?
16:49How many miserable ways have we reinvented sharing of those instead? How many projects would be open source if they could hide in flight PRs? I know, for example, Claude Code would be much more likely to be open source if they didn't have to show all of the things they were working on all of the time because half of the time, it doesn't actually end up shipping and people would have seen that and been annoyed.
17:07Like, you wanna hide the work that isn't done, and Git does not let you do that. How many security fixes are sitting unpublished because they will be exploited as soon as they appear in the tracker? How much better would life be if I could have a mono repo with some sub packages that are private without having to split into multiple repos?
17:21Personally, I have had a lot of projects that I had to break up into multiple repos because I wanted to open source a bunch of them, but I couldn't open source the whole thing for various reasons. The fact that I have to shape the way I do work around what I want to share instead of using my tools to shape what is shared is just stupid.
17:40And it's silly that I fixate so much on this environment variable thing. But we have just become normalized to this as an obvious thing for no good fucking reason. It's just dumb, but it's how it works, so we accept it.
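To make the "private sub packages in one monorepo" idea concrete, here is a minimal sketch of path-based visibility rules applied to a repo snapshot. Nothing here is a Git feature or anything described in the video: the rule format, the `RULES` list, and the example paths are all assumptions made up for illustration.

```python
from fnmatch import fnmatch

# Hypothetical visibility rules: the first matching pattern wins.
# Neither the rule format nor these paths come from Git or the video.
RULES = [
    ("packages/billing/*", "private"),
    ("*.env", "private"),
    ("*", "public"),
]

def visibility(path: str) -> str:
    """Return the visibility of a path under the first matching rule."""
    for pattern, level in RULES:
        if fnmatch(path, pattern):
            return level
    return "private"  # fail closed if no rule matches

def public_view(paths: list[str]) -> list[str]:
    """Filter a snapshot down to what an anonymous reader may see."""
    return [p for p in paths if visibility(p) == "public"]

snapshot = [
    "README.md",
    "packages/ui/button.tsx",
    "packages/billing/stripe.ts",
    "apps/web/.env",
]
print(public_view(snapshot))  # ['README.md', 'packages/ui/button.tsx']
```

The point of the sketch is that the tool decides what is shared, so the directory layout does not have to be reshaped around it.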
17:52Going a bit further here, I think commits are bad. I don't think they're terrible. I think they're a reasonable base unit, but they don't really work well with the way that we're building today, and branches are even worse.
18:05I do really like how JJ does this. I am resisting the urge to go all in on JJ at this point because it doesn't solve the problems I care the most about, but the ones it does solve, it solves so well that it feels much better to use. JJ solved a lot of ergonomic issues of source control management for devs, and I love it for that.
18:22It's what got me thinking more deeply about what would it look like if we unfucked Git. And a lot of the pieces there are great. The idea of snapshots and tags instead of branches and commits is so strong.
18:35In a world where we're used to thinking of commits all the time and worrying about our history constantly, JJ was a breath of fresh air and showed that we're wasting so much of our time thinking about things that don't matter. On that note, worktrees are atrocious. It is actually hilarious how bad worktrees are.
18:50I had a cloned repo that was annoying to work with a few days ago because one of the worktrees with an agent checked out main, and now I can't check out main in the actual main directory because one of the random worktrees happened to have taken it hostage. Insanity. Holy shit.
19:06It's so bad. I really don't like worktrees as a Git primitive at all. And one last piece, and this is where it starts to get even more controversial: I don't think source control should require real operating systems or file systems.
19:18The fact that you're expected to interface with Git via a CLI in a real environment with real files is stupid in a world where we have awesome tools like Just Bash. If you're not familiar with Just Bash, it is a full JavaScript or TypeScript layer that emulates Bash so that you can run an agent like a Claude Code or Codex type thing without having a real Linux kernel and a real file system.
19:42Instead, it can run entirely inside of memory, inside of JavaScript, and not know any better. It's a lot easier to clone shit randomly within memory than it is to move files around a whole bunch on your system.
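A rough sketch of why cloning in memory is cheap, assuming a toy file tree stored as a dictionary of paths to immutable bytes. This is not how Just Bash works internally; `MemoryRepo` and its methods are hypothetical names used only to illustrate the idea.

```python
from typing import Optional

class MemoryRepo:
    """A toy in-memory file tree: path -> immutable bytes."""

    def __init__(self, files: Optional[dict] = None):
        self.files = dict(files or {})

    def clone(self) -> "MemoryRepo":
        # Copies only the path table. The bytes objects are shared because
        # they are immutable, so no file contents are duplicated.
        return MemoryRepo(self.files)

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]

base = MemoryRepo({"src/index.ts": b"export {}", "README.md": b"# demo"})
agent_a = base.clone()
agent_b = base.clone()
agent_a.write("src/index.ts", b"export const a = 1")

print(base.read("src/index.ts"))     # b'export {}'
print(agent_a.read("src/index.ts"))  # b'export const a = 1'
print(agent_b.read("README.md") is base.read("README.md"))  # True
```

Each agent gets its own view, and a write in one clone never touches the base or the other clones.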
19:54Slight tangent, but I wanna vent about this because I really want this fixed and I don't know where else to complain. Nullvox Populi shared this benchmark with me earlier this year, and it has haunted me since. This is a benchmark on disk performance for tools like Git and pnpm across different platforms.
20:12He sent this in reply to a post I made about how fast the SSDs are on my M5 Mac. The SSD is super fast. I was really excited for, like, bulk reads and writes and shit on this machine.
20:21And then I looked at his benchmark. The way this benchmark works is you clone the project, which has a bunch of sub frameworks in it that are boilerplates that have a bunch of random shit installed.
20:34And the benchmark is cloning all of this, pnpm installing all of this from cache, and measuring how long it takes for the files to be created.
20:43Because there is no network access being done here. Everything's already cached. It's just recreating the contents in the directories.
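The benchmark itself is not reproduced here. As a rough stand-in, the sketch below times the same kind of work, creating many small files with no network involved, using only the Python standard library. The file count, file size, and directory layout are arbitrary assumptions, so the numbers it prints are not comparable to the ones quoted in the video.

```python
import tempfile
import time
from pathlib import Path

def time_small_file_creation(count: int = 2000, size: int = 256) -> float:
    """Create `count` files of `size` bytes each and return elapsed seconds."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as root:
        start = time.perf_counter()
        for i in range(count):
            # Spread files across subdirectories, loosely like node_modules.
            directory = Path(root) / f"pkg{i % 50}"
            directory.mkdir(exist_ok=True)
            (directory / f"file{i}.js").write_bytes(payload)
        # Elapsed time is computed before the temp directory is cleaned up.
        return time.perf_counter() - start

elapsed = time_small_file_creation()
print(f"created 2000 small files in {elapsed:.3f}s")
```

Running the same script on a Mac and a Linux machine is one cheap way to see whether the gap shows up on your own hardware.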
20:50And the results haunt me. With an old middle range AMD CPU and admittedly a lot of RAM, the clean install took 6.8 seconds on Ubuntu, to be clear, with a normal Western Digital SSD.
21:04On an M4 chip with the really fancy Apple SSD, the exact same thing took 31 seconds.
21:13This appears to be a massive problem with APFS, which is Apple's file system, where creating a lot of small files sucks, to the point where an M1 Ultra could take upwards of 140 seconds to do this.
21:26140 seconds for a task that a similar MacBook running Ubuntu instead can do in 3 to 12 seconds. That's insane. That is actually insane.
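For scale, the ratios implied by the timings quoted above work out as follows. The only inputs are the figures stated in the transcript; the variable names are illustrative.

```python
# Timings quoted in the transcript, in seconds.
ubuntu_amd = 6.8
m4_mac = 31
m1_ultra_worst = 140
linux_range = (3, 12)

print(f"M4 vs Ubuntu box: {m4_mac / ubuntu_amd:.1f}x slower")  # 4.6x
print(
    f"M1 Ultra vs Linux range: {m1_ultra_worst / linux_range[1]:.1f}x "
    f"to {m1_ultra_worst / linux_range[0]:.1f}x slower"
)  # 11.7x to 46.7x
```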
21:36APFS is garbage. At the very least, it's garbage at these types of small file reads and writes, and all of these numbers show it. This type of thing just sucks.
21:45It's really bad. And it makes spinning up lots of small environments for your agents to work in just bad. There is no excuse for it to be this bad.
21:54This means that crazy solutions like a RAM disk that are using other file system technologies actually make sense. Apparently, this is all fsync causing problems.
22:04I am not deep enough to know. I don't care. All I know is it sucks, and whenever I move to one of my Linux machines, cloning, installing, and all those things feels so much better than it does on a Mac right now.
22:16This is one of the many reasons I think we should be moving away from file systems. They're a rat's nest full of weird problems and assumptions that are platform specific, and something that works great on your Ubuntu machine might run like shit on a Mac just because they have some weird thing happening in the file system layer.
22:31Do you know what doesn't have these problems? Node isolates. If all of the content here just lived inside of memory, inside of something else, you don't have to worry about the weird implementation details of the file system on your fucking computer.
22:44Yeah. I'm annoyed about all this.
22:46I hope someday someone fixes the problems for macOS. But I'm done with file systems. I almost put file systems in this list.
22:54I know better.
22:58So instead, I put Dropbox for devs, because I'm stupid. And it's basically what I had in mind anyways. I have a handful of different machines that I'm using for building with agents right now.
23:08I have my Mac Mini in my other room. I have another Mac Mini downstairs that's supposed to just be doing home automation shit. I have a GMKtec box that should actually be arriving today, so I could set up a similar thing on Ubuntu instead, and not deal with macOS's bullshit, and also have a little bit more RAM.
23:22I have so much hell just managing the content of all of those machines.
23:29I can't tell you how many times I spun up a work tree and forgot to pull the latest main, and now it's building on a stale base. I can't tell you how many times I didn't have the right environment variables on one computer, but I did on another. I can't tell you how many times I didn't know where a project was on one machine because I architected my directory with all my code different than somewhere else.
23:48I do my best to clone things in the same places all of the time, but I don't always succeed. Do you know what I don't have this problem with? Dropbox.
23:55Because Dropbox is one structure that exists on all of my machines that use it. On my NAS that backs up all my Dropbox content, on my editor's computer that downloads all of the video content for it, and on my laptop where I do all my graphics work, the structure and the contents of all those folders is the same across all of them.
24:12But Theo, we already have Git. Well, you know how I feel about that. But more importantly, how do you manage your Git repos without making another Git repo and then dealing with submodule hell, which no one's actually going to deal with?
24:24Imagine you had your code folder, the folder where all your projects are on your computer, and then you go spin up your Mac Mini, and everything is there. It's all there the same way it was. Your environment variables can sync totally fine.
24:36You'll have to do some weird hacks around node modules because they're different on different OSs and whatnot. But imagine you could just have your code folder on all of your machines without actual effort. Nothing is built to do this right right now, and you have to build a lot of pieces.
24:50I actually started with a project called FS2, meant to be file system two. But even that wasn't going to go far enough for what I have in mind. My dream here would be that I have my code folder structured one specific way with all the different subfolders and whatnot.
25:05And then when I spin up an agent in the cloud, or my Mac Mini, or anything else, the contents are all there, or at the very least, the structure is there, and once you navigate to or try to explore or touch any of the files in a given section of it, it will pull that part down in the moment at that time. None of the existing solutions come close to what I have in mind here, which is let me use the stuff I'm already using, but take over this directory so new things will appear in it automatically without additional effort.
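The "structure is there, contents arrive when touched" behavior can be sketched as a folder object that knows every path up front but fetches contents only on first read. This is a minimal sketch under assumptions: `LazyFolder`, `fake_remote`, and the example paths are hypothetical, and a real tool would hook into the file system rather than expose a Python class.

```python
from typing import Callable

class LazyFolder:
    """Knows every path up front, but fetches contents only on first read."""

    def __init__(self, paths: list[str], fetch: Callable[[str], bytes]):
        self._fetch = fetch
        self._cache: dict[str, bytes] = {}
        self._paths = set(paths)

    def listdir(self) -> list[str]:
        # Structure is available without downloading anything.
        return sorted(self._paths)

    def read(self, path: str) -> bytes:
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._cache:
            self._cache[path] = self._fetch(path)  # pulled down on demand
        return self._cache[path]

fetched: list[str] = []

def fake_remote(path: str) -> bytes:
    """Stand-in for a network call to whatever stores the real contents."""
    fetched.append(path)
    return f"contents of {path}".encode()

code = LazyFolder(["app/package.json", "bench/README.md"], fake_remote)
print(code.listdir())  # ['app/package.json', 'bench/README.md']
code.read("app/package.json")
code.read("app/package.json")  # second read is served from the cache
print(fetched)  # ['app/package.json']
```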
25:32Tavy in chat touched on a rough piece of what I'm imagining here, which is: imagine something like Google Drive or Dropbox having their own equivalent of a .gitignore. And Robert in chat here said, dude, I've been dreaming of exactly what you're describing for so long, but I don't have the skills to make it myself.
25:50Have you proven that to yourself yet? Have you proven that the cool models and agents and dev tools we have can't get you over the line here? Because what matters to build something like this isn't necessarily your capability or your knowledge.
26:03It's your token budget and your patience. If you're patient enough to go through the hard parts to do this right, Robert, you can absolutely do it. And I really hope multiple people in my audience go and try to do this.
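The ".gitignore equivalent for a sync tool" idea from chat could look something like the sketch below: a list of patterns checked against every segment of a path before it is synced. The `IGNORE` list is an assumption, and the matching is deliberately simpler than real gitignore semantics (no negation, no anchoring).

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical ignore patterns, matched against each path segment.
IGNORE = ["node_modules", ".next", "dist", "*.log"]

def should_sync(path: str) -> bool:
    """Return False if any segment of the path matches an ignore pattern."""
    parts = PurePosixPath(path).parts
    return not any(fnmatch(part, pattern) for part in parts for pattern in IGNORE)

candidates = [
    "blog/src/index.ts",
    "blog/node_modules/react/index.js",
    "blog/.env",
    "api/debug.log",
]
print([p for p in candidates if should_sync(p)])
# ['blog/src/index.ts', 'blog/.env']
```

Note that `.env` files sync here while `node_modules` does not, matching the split described in the transcript.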
26:13So what is next on my list? We're definitely gonna ramp down a bit. Right?
26:17Right? With a new mobile platform? This one hurts for me to even share.
26:26But I'm scared if we don't do this now, we'll never be able to. A lot of people don't know this, but I used to be an Android fanboy. I was like the Android guy in my high school.
26:36I used to give people shit for using iPhones when I was a kid. Obviously, that has since changed.
26:42I got the sinful orange iPhone now. But I am what I am, and what I am is a person who likes good experiences on their devices, and I am a person who uses their phone heavily enough and relies on it often enough.
26:54Having a phone that works and works well with all the applications I rely on is important. But there are things that I can't stop thinking about when I think about mobile. Things like Apple's inane, god awful policies about what's allowed to be distributed and paid for on the App Store.
27:12The fact that I can use my credit card to order an Uber or a DoorDash, but I can't use it to buy a game unless I do it through Apple Pay, because Apple has arbitrarily decided digital goods are something they get 30% of and that has to go through their payment systems. The fact that they are banning all sorts of shit just because they don't like it.
27:29The fact that the newest Expo release isn't allowed on the App Store, and if you want it, you have to get it set up yourself manually with an account that you're paying $100 a year for, and then have to spend hours setting it up. Or here's a silly one that's one of my favorites. The Info.plist file that describes your application and its permissions needs to have the developer and the team in the file, which means your source control has one user's config hard coded in it.
27:55It's insane. The developer experience for iOS is so horrible that I can't imagine anything worse until I try building an Android app.
28:05Android makes it a little easier to start building and get it on your phone, but then there's the chaos of getting it actually distributed on the Play Store and the absurd, opaque nature of when they decide to ban your shit. At least Apple gives you a reason, admittedly a bullshit one, why they're not letting you release. Google just arbitrarily says, nope, no release, and doesn't give you enough info to fix it.
28:23Both of these are terrible. And I forget how bad it is until somebody on my team asks me for my home address so they can put it into the filing to get approval so they can start building the app again. It's so bad.
28:36I'm gonna tell a real crazy anecdote here, and somebody touched on it in chat. I'm gonna talk about CyanogenMod a bit. Most of you guys are probably too young to have any idea what CyanogenMod was.
28:47CyanogenMod was a custom build of Android.
28:51Android is the operating system on the majority of mobile phones, but it's also an open source OS and platform. And most phone manufacturers add a bunch of junk that most people probably wouldn't want.
29:02CyanogenMod was a community effort to make Android better. It was meant to be vanilla Android with a couple niceties and things like the ability to more easily overclock your system, change your status bar colors, get rid of bloatware, make your phone faster.
29:19You can even install custom modded kernels and shit. It was so fun. And I was really involved in CyanogenMod back in the day.
29:26And here is where I will drop my spiciest take about where mobile was when I was a kid. The reason I got so into CyanogenMod and customizing and writing code for the OS itself is because, as stupid as this is, customizing Android itself was easier than building an app.
29:42What the fuck? Imagine a world where it's easier to fork Chrome and build new features in it than it is to put up a website. Do you know how insane that would be?
29:52Can you fathom a world where building a browser and editing your browser is easier than getting something online?
29:59That was the case when I grew up on Android. If you had one of the phones that could be easily rooted or have its bootloader unlocked, or one of the ones that somebody discovered the right way to do that with, flashing a new OS was so easy, and building your own was relatively trivial.
30:16I remember the era where there were, like, 20 different flavors of custom Android ROMs being made by independent devs doing it for fun. Ready for a real trippy one? Very few people here have heard about Paranoid Android, I'm sure.
30:29Not the song by Radiohead, the custom ROM. Drop ones in chat if you ever heard about Paranoid Android before.
30:37More ones than I expected. Now drop a two in chat if you know the person who made it. I don't think any of you guys do, because this one blows me away every time I learn it.
30:48Paranoid Android was founded by Paul Henschel. Paul Henschel is also known as 0xca0a, the creator of Poimandres, the creator of Zustand, the creator of React Three Fiber. One of the best React community devs in the world started with open source ROM hack design and development.
31:12I ran an operating system this guy created before I installed a package that he created. And here is my spiciest take. Paul would've become a mobile dev if mobile dev didn't fucking suck.
31:23I haven't talked with him about this in-depth, but if he's anything like me, and having talked to him before, I'm pretty sure he is. He made Paranoid Android because he wanted to build things on his phone, and building apps sucked.
31:34Other ROMs pissed him off, so he built his own. And it went really well. And then he kept trying to build apps and other things, and it still sucked.
31:42And he found the web sucked less, so he went there instead. And Android lost one of the greatest developers I've ever met, because the platform was too hard to build for. It was easier to rebuild the platform than it was to build apps for the platform.
31:56And now, I have to do one more rabbit hole here. I promise this one's worth it. We'll talk about BlackBerry for a second.
32:03BlackBerry was the first winner of the smartphone wars, and it has since died hard and been bought by TCL, the panel company in China, in order to experiment with mobile screen development.
32:15The reason I wanna talk about BlackBerry is BlackBerry 10. Fun fact, I worked at Staples as a salesperson and technician when the BlackBerry phones using BlackBerry 10 came out, and it was very hard to explain it to people. Because it was a BlackBerry.
32:30It ran their own proprietary BlackBerry OS, but it could also run Android apps. They had a complete Android runtime built in for running Android applications.
32:41This was a huge deal because it meant that the incredible ecosystem of apps that were available on Android would work on your BlackBerry even though there wasn't much software available specifically for BlackBerry.
32:53You had BlackBerry Special Apps, which were really good at the time, but you also had Android's ecosystem too. And that combo made it seem really enticing, but there were some problems here.
33:04First off, there just wasn't much reason to go with this instead of an Android phone, especially because the Android apps performed a bit worse. The CPUs available for phones weren't as good either, so the virtualization through the runtime was at higher cost than it would hypothetically be today if someone did something similar.
33:19And then there's just the fact that BlackBerry itself kinda sucked, and the software that they built was closed source and only ran on BlackBerry devices, and it didn't offer anything new. So why am I talking about BlackBerry now? Well, I'm talking about it because they proved you can build a different OS and still support Android apps.
33:36So the historic problem that existed in building a new mobile operating system, which is that you would lose the whole ecosystem, doesn't have to apply. I'm not one to pretend Android apps are just as good as the iOS equivalents.
33:46Believe me, I know. I have a folding phone. I've experienced just how bad Android can get.
33:50But I still need a place to put apps. And the fact that it is so much work to even try to distribute an app other people can use is insane right now.
34:00And what I'm imagining, what I'm dreaming of, is a future where we do something similar to what we did in the Paranoid Android and CyanogenMod era, the peak of custom OSes, before we got to the era of LineageOS, where we're just trying to maintain a good minimal private open fork.
34:17I want something experimental. I want something that works with Android and Android apps, but is something fundamentally different. Something that encourages people to build on the platform.
34:27Something that makes it easy to customize and experiment and build new apps that can do new things. Something that makes it easy for me to see something someone is demoing, scan a QR code, and have it on my device working. Something like NPM, but for mobile.
34:41But to do that, you need to go to the OS. And I think now might be the time to do that. In fact, I think now might be the last time to do that, as Android is getting more and more closed.
34:52What would it look like to have a mobile OS that encouraged you to develop on it, to customize it, to build whatever you want, both as a developer and as a user? Imagine a world where you had access to all the apps you rely on every day, but also a platform where you could build new things on top. Imagine an app ecosystem that encourages you to fork and modify within the apps themselves.
35:13An app ecosystem that doesn't block you from doing just in time compilation, that lets you do crazy shit. Ready for the hottest take?
35:20We already know what this looks like. It looks like the internet. It looks like Linux.
35:25It looks like Windows and macOS to an extent. And it looks a lot more open and a lot more progressive than what we have on mobile right now.
35:34I dream of a world where mobile feels accessible to do cool things on, and I'm scared we might never see that world because we have a duopoly of people who just don't care. Apple benefits too greatly from their 30% cut to ever make software distribution easier.
35:51Android benefits greatly from having basically no money put into it by Google at all, and just slowly languishing and dying. So the likelihood Google does anything that Apple isn't doing first is near zero. It was a joke on the Android team back in the day that the best way to get your good ideas to actually ship in Android was to leak them to Apple so Apple would add them, and then suddenly Google would give you permission to do it. Android's kind of become a fucking joke, and that sucks because it shouldn't be.
36:19It's an open platform. And thankfully, there are still enough devices shipping with open bootloaders that you have a real chance to do something better. What would it look like to rethink the mobile platform to support Android, but to be something else?
36:33I don't know, and I'm scared that I never will get to. So hopefully somebody here will be inspired enough to go do it yourself, because now is the chance. Now is the last time.
36:41Speaking of things where we might have our last chance right now, I wanna complain about Slack a bunch. Oh, Slack.
36:48Slack has a real lock in problem that's gonna be really hard to defeat, because Slack's connection system, where I can have a shared channel between two companies, is really powerful.
37:00And almost every channel I have in Slack right now is just there so I can talk to another company. But there are so many problems in Slack right now that it feels miserable to use. The lack of inline replies is absurd.
37:14You have to do a thread to reply. Threads themselves are pretty bad too, because they just fall back in the history, even if they're still active, and finding them's even harder as a result.
37:23I can't reply to one message inside of a thread. I have to reply in the thread, and maybe manually quote parts myself. Don't get me started on the code blocks and shit.
37:32But then we have a new user of Slack where it doesn't work really at all for them. Agents. We have been trying to brute force agents into Slack for a long time, and all it has done, at least in my case, is remind me just how bad of a platform Slack itself is.
37:48Slack is built for sending messages, nothing else. It is not meant for reading messages.
37:54It is not meant for prioritizing work. It's not meant for getting status updates. It's not meant for using.
37:58It's meant for sending. And I dream of a world where that is not the case. I dream of a world where I have a chat app that helps me prioritize what I'm supposed to be doing.
38:07That brings up recent things, even if they're happening in an old thread. That makes it easier to branch off context, take a sub comment, and send an agent to go explore, and then come back with feedback. I want infinite nesting.
38:20I want threads that make sense. I want replies that make sense. I want agents to be able to come in and be part of the same control plane I'm in in a way that is logical.
38:29And what I want, and this hurts me, this really hurts me, I want Facebook Workplace. We've all used Facebook at some point.
38:38When you make a post on Facebook, it's now there. It could be in a group, it could be on your wall, it could be a lot of different places. You can post on somebody else's wall even.
38:46Once that post is there, you can leave top level comments for things you want to respond to on the immediate post, but you can also nest comments. You can do threading within a given comment on a post on Facebook.
39:00You can sub nest within that too, where if one person leaves a comment saying, hey, I'm not sure about this, and then two people reply with different takes, you can reply to both of them individually without it clogging up the main thread. And most importantly, when someone leaves a comment on an old post, that post gets brought to the top.
39:19Why the fuck don't threads work that way in anything else? Why is it that, in literally every other app, when there's an old thread and I leave a reply in it, the thread stays old unless you happen to have notifications on for it? Facebook Workplace is the closest thing I've ever seen to a good context management project and product for working with a team on real work.
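The two behaviors described above, comments that nest to any depth and posts that jump back to the top when anything inside them gets a reply, fit in a small data model. This is a sketch under assumptions, not Facebook Workplace's actual design: `Post`, `Comment`, `feed`, and the logical clock are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

_clock = count(1)  # logical clock standing in for real timestamps

@dataclass
class Comment:
    author: str
    text: str
    replies: list["Comment"] = field(default_factory=list)

@dataclass
class Post:
    title: str
    comments: list[Comment] = field(default_factory=list)
    last_activity: int = field(default_factory=lambda: next(_clock))

    def reply(self, parent: "Comment | None", comment: Comment) -> Comment:
        """Attach a comment at any depth and bump the post."""
        (self.comments if parent is None else parent.replies).append(comment)
        self.last_activity = next(_clock)
        return comment

def feed(posts: list[Post]) -> list[str]:
    """Most recently active posts first, regardless of creation order."""
    ordered = sorted(posts, key=lambda p: p.last_activity, reverse=True)
    return [p.title for p in ordered]

old = Post("Migrate billing service")
new = Post("Pick a logging library")
print(feed([old, new]))  # ['Pick a logging library', 'Migrate billing service']

doubt = old.reply(None, Comment("ana", "Not sure about this"))
old.reply(doubt, Comment("agent", "Explored it; report attached"))
print(feed([old, new]))  # ['Migrate billing service', 'Pick a logging library']
```

An agent replying deep inside an old post bumps it the same way a human reply would, which is the property the transcript is asking for.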
39:43Posts were a much better primitive than Slack messages. The problem is that we have a weird breakdown with chats right now between messages, replies, threads, channels, and companies.
39:56And none of those are the right abstraction, and we're stuck fighting them all of the time. I think posts are a much better primitive because they fit somewhere between something like a channel and something like a thread.
40:09And then threads are the sub primitive on a post that makes them very easy to interface with. And not just for humans, for agents too. So why don't I just use Facebook Workplace?
40:20It's because they shut it down two weeks ago. The one platform that could've done what I wanted doesn't even care enough to keep iterating. They announced that they were ending all development in August of last year.
40:31I want this so bad. I want this so bad. I even started building this one myself, but I've been too busy to go anywhere with it.
40:38I want something like Slack that feels more like Facebook, that is built to be way easier to interface with agents as well.
40:47Now imagine combining this with something like Hermes Agent, where instead of having a bunch of threads spun up inside of Discord that are impossible to manage, still better than doing it in fucking Telegram, by the way, instead of that, you have an actual content system. You have a group where you post the things you wanna work on, and then when your agent replies to the post, it gets bumped back up to the top.
41:08So good. And I wish it existed. Apparently, Teams has some of these ideas baked in somewhere, which is cool, but it's also Microsoft Teams, so it'll never be useful.
41:17Let's be real. I want this as an open source standard that is easy to adopt and play with, not to replace Slack, but to slowly replace Slack.
41:29I have one last thing I have to talk about, and I'll keep this one short. Benchmarks. We need more benchmarks.
41:35We need weird benchmarks. We need benchmarks that are written by people other than researchers and labs because we need better ways to measure the capabilities of models. It's silly, but my stupid SkateBench, the benchmark where I measure how well models can name a trick given a description of a skateboard trick, has turned out to be really useful, and a number of researchers in labs have hit me up asking questions because the numbers fascinate them.
41:58Because it's somewhere between a complex grammar bench, a niche English, like, language bench, and a 3D spatial reasoning bench. And we need more benchmarks.
42:09We need people to take the work that they try to use AI for that fails and save it in a reproducible way so they can try it again. We need benchmarks that measure how good agents are at stuff like Git. Centimeters Griffin just made GitBench for that, which I'm really excited about.
42:24We need benchmarks that measure everything from weird hypotheses to real work, and everything between as well. We need benchmarks that measure how well models can determine what a picture from the sky is of. We need benchmarks that can determine which models are best at diagnosing cancer given random screenshots of scans that people had done with MRI machines.
42:45We need more ways to measure the capabilities of models. We need a lot more of them.
42:51And we just don't have enough. Go build some weird benchmarks, especially if you have a problem that agents suck at. Building a benchmark that shows that all agents suck at it is one of the best ways to incentivize the labs to fix it.
43:03If you really love an obscure programming language like Crystal or something, and you notice that the models suck at it, make a bench that measures it to show the world and to show the researchers that models are bad at that language. As soon as there's a way they can measure it, they'll go hard to try and ramp up their scores.
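A benchmark in this sense does not need much machinery. The sketch below is one minimal shape it could take: saved cases with an expected answer, a function that calls a model, and a pass rate. The cases, `stub_model`, and exact-match scoring are all placeholder assumptions; a real benchmark would load saved failures from disk and call an actual model API.

```python
from typing import Callable

# Placeholder cases; a real benchmark would load saved failures from disk.
CASES = [
    {"prompt": "capital of France", "expected": "paris"},
    {"prompt": "2 + 2", "expected": "4"},
    {"prompt": "opposite of hot", "expected": "cold"},
]

def run_benchmark(model: Callable[[str], str], cases=CASES) -> float:
    """Return the fraction of cases where the model's answer matches exactly."""
    passed = sum(
        model(case["prompt"]).strip().lower() == case["expected"]
        for case in cases
    )
    return passed / len(cases)

def stub_model(prompt: str) -> str:
    """Stand-in for a real model call; it knows only one answer."""
    return {"2 + 2": "4"}.get(prompt, "unknown")

print(f"score: {run_benchmark(stub_model):.2f}")  # score: 0.33
```

Swapping `stub_model` for a function that calls a real model is the only change needed to get a comparable number per model.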
43:17Go build some benchmarks. You'll be surprised how much you learn, and also how valuable those measurements can be. I think I've covered all the random ideas that I really want to will into existence here.
43:28I just want these things to happen. And I will give you guys the warning in advance, if you do build one of these things, the chances that I try yours are relatively low. But if you make it successful enough that I see others using it, that I see my team using it, that I see people talking about it and posting it, I will absolutely hop in to give it a look myself.
43:47I want these things to exist way more than I want to build them, and I'm hoping someone else will step up and build a handful of them, even just to push software forward as we challenge more of our existing assumptions about how things are supposed to work. It's time to build bigger stuff, and I hope this helps give you some ideas on cool things to build.
44:05And if you have your own different ones, you should go do that instead. The point here is to try and push you to build bigger solutions to harder problems than would have made sense before. So what are you waiting for?
44:15Go kick up an agent and try one of these things out. See what you can build. I bet you'll be surprised just how far you can go.
44:21I know I have been myself. Go experiment, go build, go challenge things that you didn't think were possible, go boil the ocean. It's a really fun experience to do.
44:30Let me know how it goes, and until next time, peace nerds.
The Hook

The bait, then the rug-pull.

The title is a dare. For 44 minutes, a builder with a list he has been keeping for years walks through six pieces of developer infrastructure that are broken, missing, or dying -- and explains exactly why he cannot fix them himself, but thinks you probably can.

Frameworks

Named ideas worth stealing.

05:50 concept

Revocable release threshold

Allow package authors to revoke a published version if it has fewer than N installs or has been up fewer than X hours -- protecting against typos without breaking the ecosystem.

Steal for: any publishing platform with a public artifact store
06:24 concept

Paid AI audit layer for packages

Pay a small per-release fee to have a third-party agent read the diff, compare it to prior releases, and publish a risk score visible to all installers.

Steal for: npm alternative, private registry, or security product
38:43 model

Posts as the team-chat primitive

Replace Slack's message/thread/channel hierarchy with Facebook-style posts that support nested comments and surface back to top when replied to -- making old context findable and agent participation natural.

Steal for: team chat product, async collaboration tool
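The revocable release threshold listed above reduces to a single predicate. A minimal sketch, assuming the rule is exactly "fewer than N installs or live for fewer than X hours"; the default thresholds of 100 installs and 24 hours are made-up values, not numbers from the video.

```python
def can_revoke(installs: int, hours_live: float,
               max_installs: int = 100, max_hours: float = 24.0) -> bool:
    """Allow revocation while a release is still lightly used or very new."""
    return installs < max_installs or hours_live < max_hours

print(can_revoke(installs=3, hours_live=0.5))      # True
print(can_revoke(installs=50_000, hours_live=2))   # True
print(can_revoke(installs=50_000, hours_live=72))  # False
```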
CTA Breakdown

How they asked for the click.

VERBAL ASK
43:20 next-video
So what are you waiting for? Go kick up an agent and try one of these things out. See what you can build.

Soft close -- no subscribe ask, no link. Pure challenge framing. Consistent with the video's tone of assuming the audience is already capable.

Storyboard

Visual structure at a glance.

00:00 hook: open
01:56 promise: ideas list reveal
04:10 value: npm problems
13:05 value: git problems
23:07 value: dropbox for devs
29:14 value: mobile CyanogenMod
37:02 value: better slack
42:03 value: benchmarks + full list
44:17 cta: close