Modern Creator Network
Better Stack · YouTube · 06:25

Developers Finally Got an Open-Source Voice AI Platform (Dograh)

A 6-minute dev tutorial reverse-engineering the open-source VAPI alternative that gives you visual workflow building, full observability, and self-hosting — without the platform tax.

Posted: 2 days ago
Duration: 06:25
Format: Tutorial · educational
Channel: Better Stack
§ 01 · The Hook

The bait, then the rug-pull.

You shipped a voice AI agent. It worked. Then the bill arrived: LLM, voice, telephony, and a platform fee, stacked four layers deep. That is the problem Dograh is trying to solve, and Better Stack walks through the entire platform in under seven minutes, from Docker spin-up to live test call to a landscape comparison that names the main hosted platforms and raw frameworks.

§ · Stated Promise

What the video promised.

stated at 00:18: "Today I will show you Dograh, an open source VAPI alternative you can self host, inspect, and control." · delivered at 01:29
§ · Chapters

Where the time goes.

00:00-00:25

01 · Stop Renting Your Voice AI Stack

Hook: stacked fees and no ownership. Sets up the core developer pain before the product is named.

00:26-00:59

02 · Why AI Phone Agents Get Expensive Fast

Animated pipeline diagram (phone call to STT to LLM to TTS). Looks simple from the outside — reality is messier.

01:00-01:28

03 · Voice AI Is Not Just ChatGPT With a Phone Number

Real calls: interruptions, silences, topic pivots, weird questions. When it breaks, "the bot gave a bad answer" is not enough.

01:29-01:56

04 · Dograh Demo: Build a Voice AI Agent Locally

Clone GitHub then cd then docker compose up. Docker-first as a developer credibility signal.

01:57-03:35

05 · Creating a Lead Qualification AI Phone Agent

Visual workflow builder: prompt node, qualification step, API tool call, branch, transfer. Live test call with AI agent Sarah. Post-call observability: transcript, trace, tool call log, recording.

03:36-04:10

06 · What Is Dograh?

Three things: Voice Engine plus Visual Workflow Builder plus Platform Layer (testing, tracing, recordings, analytics).

04:11-04:34

07 · Voice AI Agent Workflow

Animated: Map the flow. Skip the boilerplate. BYOP — bring your own LLM and TTS providers.

04:35-04:46

08 · Testing, Tracing, Recordings, Analytics

Open source means inspect, change, self-host. The low GitHub star count signals an early-stage find.

04:47-05:22

09 · VAPI, Bland, Retell: Fast but Less Control

Hosted platforms move fast but pricing, limits, and deployment options are out of your hands.

05:23-05:49

10 · Pipecat and Vocode: Flexible but More Glue

Raw frameworks give control but require building everything — no UI, no workflow editor.

05:50-06:25

11 · Where Dograh Fits for Devs

Write code where code matters, use the builder where your flow matters. Subscribe CTA.
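
The call path from chapters 02 and 03 (speech to text, then LLM, then text to speech), together with the need to know which stage failed, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three stage functions are stubs, and every name here (TraceEvent, run_stage, handle_turn) is hypothetical rather than part of Dograh or any provider SDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TraceEvent:
    """One recorded step of a call turn (hypothetical structure)."""
    stage: str
    received: str
    produced: str


def fake_stt(audio: str) -> str:
    # Stub speech-to-text: strips a made-up "audio:" prefix.
    return audio.replace("audio:", "", 1)


def fake_llm(text: str) -> str:
    # Stub LLM: echoes the caller's words back.
    return f"You said: {text}"


def fake_tts(text: str) -> str:
    # Stub text-to-speech: re-adds the made-up "audio:" prefix.
    return "audio:" + text


def run_stage(trace: List[TraceEvent], stage: str,
              fn: Callable[[str], str], payload: str) -> str:
    # Run one stage and record its input and output in the trace.
    result = fn(payload)
    trace.append(TraceEvent(stage, payload, result))
    return result


def handle_turn(audio_in: str) -> Tuple[str, List[TraceEvent]]:
    # One caller turn: STT -> LLM -> TTS, with a trace entry per stage.
    trace: List[TraceEvent] = []
    text = run_stage(trace, "stt", fake_stt, audio_in)
    reply = run_stage(trace, "llm", fake_llm, text)
    audio_out = run_stage(trace, "tts", fake_tts, reply)
    return audio_out, trace


if __name__ == "__main__":
    audio_out, trace = handle_turn("audio:hello")
    for event in trace:
        print(f"{event.stage}: {event.received!r} -> {event.produced!r}")
```

Because every stage writes to the same trace, a bad answer can be traced back to the stage that produced it, which is the kind of evidence the video asks for.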

§ · Storyboard

Visual structure at a glance.

00:00 · hook · hook
00:26 · promise · pipeline
01:29 · value · docker demo
01:57 · value · agent build
03:36 · value · what is it
04:47 · value · comparison
06:10 · cta · cta
§ · Frameworks

Named ideas worth stealing.

04:47 · model

The Three-Tier Voice AI Landscape

  1. Hosted platforms (VAPI, Bland, Retell) — fast, locked in
  2. Raw frameworks (Pipecat, Vocode, LiveKit) — flexible, high glue
  3. Open-source platforms (Dograh) — builder UX plus self-hosting plus observability

A positioning triangle for any developer tool category: speed vs. control vs. ownership.

Steal for: Positioning JoeFlow against SaaS alternatives. Replace the three categories with your own tier labels.
03:36 · list

The Three Things Product Explanation

  1. Voice Engine
  2. Visual Workflow Builder
  3. Platform Layer

Dograh reduces to three named components, each solving a distinct layer of the problem.

Steal for: Any product with multiple components. Lead with the three nouns, then expand each one.
§ · Quotables

Lines you could clip.

00:00
That's not even the worst part. The worst part, you still don't really even own the system.
Strong emotional escalation; sets up pain before the product is named · TikTok hook
01:08
A voice agent is not just ChatGPT with a phone number, it is a live system with a bunch of moving parts.
Debunks a naive assumption developers actually hold · IG reel cold open
02:48
The value is not no code. The value is not wasting code trying to tie everything together.
Tight reframe of what no-code means for developers · newsletter pull-quote
05:57
Write code where code matters, use the builder where your flow matters, inspect the runtime when things break, and swap providers when costs change.
Four-part maxim; quotable thesis statement for the whole video · IG reel cold open
§ · Pacing

How they spent the runtime.

Hook length: 21s
Info density: high
Filler: 5%
§ · Resources Mentioned

Things they pointed at.

04:47 · product · VAPI
04:47 · product · Bland AI
04:47 · product · Retell AI
05:23 · product · Pipecat
05:27 · product · Vocode
05:29 · product · LiveKit
§ · CTA Breakdown

How they asked for the click.

06:10 · subscribe
If you enjoy coding tools like this, be sure to subscribe to the BetterStack channel. We will see you in another video.

Clean verbal close with on-screen SUBSCRIBED animation. Mid-roll subscribe ask also appears at ~1:26. No product upsell or link CTA in closing.

§ · The Script

Word for word.

Speaker 1
00:00 You just built a voice AI agent, it works, then the bill shows up and you're paying for the LLM, the voice, the phone call, and then another platform fee on top of that. That's not even the worst part. The worst part, you still don't really even own the system.
00:14 Today, I'll show you Dograh, an open source VAPI alternative you can self host, inspect, and control.
00:26 Voice AI nowadays can look somewhat simple from the outside. Take a phone call, turn speech into text, send it to the LLM, turn the answer back into speech, it's done.
00:37 That's easy. Right? Well, as any of us know who've tried this, not really, because real calls are messy.
00:44 People interrupt, people go silent, they're gonna change topics, they can ask really weird questions. Your agent needs to call APIs and when it breaks, you need to know why.
00:56 That is where most voice AI projects become more of a pain. A voice agent is not just ChatGPT with a phone number, it is a live system with a bunch of moving parts, right? That's speech to text, LLM, text to speech, state, tool calls, a boatload of other things, you get it.
01:12 There's a lot of moving parts that we don't actually see happening. And when the call fails, "the bot gave a bad answer" is not enough. Was it the prompt?
01:20 Was it the model? What was it? Why did it fail?
01:23 And this is where Dograh comes in. If you enjoy coding tools that speed up your workflow, be sure to subscribe. We have videos coming out all the time.
01:31 Alright. Now, let's look at this in practice. I'm gonna start locally because if a tool says it's built for devs, I wanna see Docker before anything else. This was super easy to spin up.
01:40 I'm gonna clone it from GitHub, I'm gonna cd into the folder and then I just have to run docker compose up. That's simple enough, easy enough for us. Once the containers are running, we can jump into the Dograh UI.
01:53 Now I'll build a simple lead qualification agent. So what do I mean by that? Someone's gonna call in, the agent asks what they wanna build, then it asks about the company, the size, the budget, small things like this.
02:06 It'll then call an API tool to create or update a CRM lead if we embed that. And maybe I could even say if the lead was qualified, it transfers to a human. So I add a prompt node, then a qualification step, then an API tool call and then I can add a branch and a transfer.
02:28 There's no custom orchestration code yet and that's kind of the point here. This looks like a no code canvas but for devs, and the value is not no code. The value is not wasting code trying to tie everything together.
02:41 Now, let's try to run a test call here.
Speaker 2
02:44 Hi. This is Sarah from inbound calls. Are you still there?
Speaker 1
02:49 We're looking for an AI phone agent for inbound demo requests.
Speaker 2
02:55 That's great. I can definitely help you with that. To make sure I connect you with the right solution, could you tell me a little more about what you're looking to achieve with an AI phone agent for your inbound demo requests?
Speaker 1
03:06 Let's say around twenty thousand minutes.
Speaker 2
Thanks for sharing that. And what is your company size and industry?
Speaker 1
03:12 Now we can see the transcript here. We can see the trace. We can see the tool call that actually happened and we can see the state changes.
03:22 Plus here's the recording, which I wanted in the first place, and that is what I want as a dev. Not just "the bot worked": I wanna know why it worked. When it fails, I want evidence of this actually happening. So what is Dograh? Dograh appears to give us three different things out of all this.
03:40 We get a voice agent, a visual workflow builder and the platform layer you usually have to build yourself. The voice engine is the part that connects the caller, the phone provider, speech to text, the LLM, and text to speech. That is what makes the call actually happen.
03:57 The workflow builder is where you design the logic of this whole system, so instead of hard coding every prompt, branch, API call and transfer, you can map out the flow visually. So huge win here, I like these kind of maps.
04:10 Ask this question, wait for the answer. That's kind of what we're mapping out here.
04:14 I can call this API, branch here, transfer there. That kind of logic should be easy to change. Then to all this, there's the platform layer: testing, tracing, recordings, analytics. That is the boring stuff every serious voice project eventually needs.
04:30 With all this, you can bring your own providers, your own LLM and your own TTS. Because Dograh is open source, you can inspect the code, change how it works and self host it. As of this recording, GitHub stars are low.
04:42 So this is a super new find that I found but it's honestly a rather cool one. Now let's compare Dograh to other things we already have out here. You have three main ways to build voice agents.
04:53 First is hosted platforms: VAPI, Bland, Retell. These are good when you wanna move fast and you don't wanna run infrastructure. You get clean dashboards, APIs, transcript, testing tools, all that's really useful, but you start to lose control right there.
05:07 If the platform changes pricing, you deal with it. If the platform changes limits, deal with it.
05:14 Right? If you need custom deployment, anything like that, again, you might hit a wall.
05:19 Hosted tools are fast though, so I guess that's a win. You have some of these raw frameworks, like I came across Pipecat, Vocode, LiveKit I think is one of them.
05:30 These give you a lot more control, you can build almost anything. But now, you're building everything around this framework, the UI, the workflow editor, so that's a big trade off using things like that.
05:42 Now, Dograh is still way too new but it's here, so I think their bet is kinda simple. What if you could use a visual voice agent builder without giving up the self hosting, choosing a provider, tracing and control? That's what this appears to be.
05:58 Write code where code matters, use the builder where your flow matters, inspect the runtime when things break, and swap providers when costs change. Self hosting gives us a lot of control which is huge. VAPI, Bland, and Retell are best for fast hosted deployment, but the trade off is cost, lock in and less control.
06:19 If you enjoy coding tools like this, be sure to subscribe to the BetterStack channel. We'll see you in another video.
§ · For Joe

Steal the format.

Better Stack playbook

Pain hook, then problem depth, then live demo with observability layer, then landscape positioning: this is a repeatable formula for any dev tool reveal.

  • Open with the financial or control pain, not the product name — let the problem breathe for 20+ seconds before the solution appears.
  • Show Docker first if your audience is developers — it is a credibility signal, not a friction warning.
  • The demo must include failure-state tooling (trace, logs, recording) — showing only the happy path reads as marketing, not engineering.
  • Use a named three-tier landscape comparison to position against both over-controlled and under-controlled alternatives.
  • The title formula Developers Finally Got [category] [tool name] signals arrival and relief — test it for JoeFlow or any tool reveal.
  • Subscribe mid-roll at ~1:26 (after pain is established, before demo) is well-timed — viewers are engaged but not yet at peak value delivery.
§ · For You

What this means if you are building with voice AI.

For devs evaluating the stack

If you are paying platform fees on VAPI or Bland and feeling locked in, Dograh is worth an afternoon evaluation: it is open source, self-hostable, and spins up with a single docker compose up once the repo is cloned.

  • Try the Docker demo locally before committing — it takes minutes and costs nothing.
  • Dograh is early-stage (low GitHub stars at recording) — treat it as a beta bet, not a production default.
  • Bring your own LLM and TTS providers to avoid double-stacking fees.
  • Use the workflow builder for call flow logic and drop to code only for custom integrations.
  • The observability layer (trace, recording, tool call log) is what separates a real voice AI platform from a chatbot with a phone number.
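
The lead qualification flow from the demo (prompt, qualification, API tool call, branch, transfer) can be pictured as plain control flow. The following is a minimal Python sketch, not Dograh's workflow format: the Lead fields, the 10,000-minute threshold, the step labels, and the in-memory dict standing in for a CRM are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Lead:
    """Answers collected during the call (hypothetical fields)."""
    goal: str
    company_size: int
    monthly_minutes: int


def is_qualified(lead: Lead, min_minutes: int = 10_000) -> bool:
    # Assumed rule: enough expected call volume means qualified.
    return lead.monthly_minutes >= min_minutes


def upsert_crm_lead(crm: Dict[str, Lead], caller_id: str, lead: Lead) -> None:
    # Stand-in for the API tool call: create or update the lead record.
    crm[caller_id] = lead


def run_workflow(caller_id: str, lead: Lead, crm: Dict[str, Lead]) -> List[str]:
    # Returns the ordered list of steps taken, which doubles as a simple trace.
    steps = ["prompt", "qualification"]
    upsert_crm_lead(crm, caller_id, lead)
    steps.append("api_tool_call")
    if is_qualified(lead):
        steps += ["branch:qualified", "transfer_to_human"]
    else:
        steps += ["branch:not_qualified", "end_call"]
    return steps


if __name__ == "__main__":
    crm: Dict[str, Lead] = {}
    lead = Lead("inbound demo requests", company_size=50, monthly_minutes=20_000)
    print(run_workflow("caller-1", lead, crm))
```

A visual builder expresses the same branch as nodes and edges; the point of the sketch is that the logic itself is small, while the glue around it (telephony, providers, tracing) is what a platform layer takes off your hands.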