Modern Creator
Dan Kieft · YouTube

My AI Avatar Clone is So Realistic It Replaced Me

An 18-minute walkthrough of building a photorealistic AI video clone from a selfie, an audio clip, and a structured timeline prompt.

Posted
1 week ago
Duration
Format
Tutorial
educational
Views
73.1K
Big Idea

The argument in one line.

AI video generation has crossed the realism threshold where a single selfie and a 13-second audio clip can produce talking-head clips that fool real people, and the bottleneck is now credits, not technical skill.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • A solo creator who wants to produce video content without being on camera for every shot.
  • Someone evaluating whether current AI avatar tools can pass the indistinguishable-from-real test.
  • A builder who wants a step-by-step breakdown of timeline prompting for CDance/Higgsfield.
  • Anyone assessing whether Higgsfield, HeyGen, or ElevenLabs is worth the monthly credit cost for short-form production.
SKIP IF…
  • You need clips longer than 13 seconds; the platform's hard limit will frustrate you.
  • You are looking for a free workflow; usable output quality requires a paid Higgsfield plan.
TL;DR

The full version, fast.

AI video cloning has reached a point where a selfie and a short audio recording are enough to generate talking-head footage that fools real people. The video covers the full pipeline on Higgsfield CDance: capture a selfie, record or clone your voice via Audacity, ElevenLabs, or Higgsfield's built-in cloner, then write a structured timeline prompt whose eight defined fields remove the ambiguity the model would otherwise fill with guesses. Clips are capped at 13 seconds, credit costs scale steeply with resolution, and 720p is the pragmatic default for social output.

Chapters

Where the time goes.

00:00-01:04

01 · Hook and clone reveal

AI clone walks and talks outdoors; real Dan reveals the swap in studio and drinks water as proof.

01:04-01:42

02 · What this video covers

Roadmap: make your own clone, use cases, voice tricks. Tool: Higgsfield CDance. Link to Skool community for prompts.

01:42-04:51

03 · Why CDance looks so good

Detail (skin texture, pores, imperfections), natural dialogue (pauses/ums/glances), natural movement (body language). Side-by-side with the older Veo 3 clone and HeyGen.

04:51-06:00

04 · Credits and accessibility

Anyone can now create without being on camera. Only real limitation is credits.

06:00-07:43

05 · Making your reference

Why a selfie beats an AI character sheet for skin detail. Top-down selfie tip for outfit consistency.

07:43-09:20

06 · Making the AI sound like you

Three methods: Audacity recording uploaded as an audio reference (under 13s), Higgsfield built-in voice cloner (sample up to two minutes), ElevenLabs professional clone.

09:20-12:35

07 · Timeline prompting breakdown

Eight-field prompt structure: FORMAT, SUBJECT, WARDROBE, ENVIRONMENT, STYLE ANCHOR, DELIVERY, LOGIC RULE, and NEGATIVE PROMPT, followed by the timed ACTION block.

12:35-13:57

08 · Output quality and tradeoffs

1080p used for YouTube. 720p significantly cheaper. MD file template for Claude/ChatGPT to auto-generate prompts.

13:57-16:00

09 · Real-world use cases

Fictional AI accounts (130K-400K followers), entertainment characters, podcast clips.

16:00-18:01

10 · AI VFX, AI ads, credit costs

Minecraft VFX intro. Uniqlo ad (4 shots, ~500 credits, max $25). Credit cost philosophy. Final CTA.

Atomic Insights

Lines worth screenshotting.

  • A selfie produces better skin detail than an AI-generated character sheet because real photographs preserve imperfections that generative models smooth over.
  • The 13-second clip limit in Higgsfield is a hard ceiling; stitching clips for longer content multiplies credit cost linearly.
  • The absence of natural micro-pauses, ums, and off-camera glances is the primary tell for AI-generated video and matters more than visual resolution.
  • A timeline prompt with eight defined fields outperforms freeform prompting by constraining every degree of freedom the model would otherwise guess.
  • 1080p costs noticeably more credits than 720p per generation (the video calls 720p "a lot cheaper" without giving a ratio); for social feeds, 720p is the pragmatic default.
  • Uploading a prompt-template MD file into Claude or ChatGPT reduces prompt writing to a single plain-English sentence describing the scene.
  • AI accounts built around fictional characters have reached 130K-400K followers and monetize through courses without the creator ever appearing on screen.
  • Audio references uploaded to CDance must be under 13 seconds, a cap the creator cannot explain; Higgsfield's built-in voice cloner instead takes a sample of up to two minutes.
  • For the best possible voice clone, the creator recommends an ElevenLabs professional voice clone over Higgsfield's native cloner.
  • CDance's strongest differentiator over HeyGen-era avatars is replication of unconscious body language during pauses, not just lip sync.
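
The credit arithmetic behind these points can be sketched in Python. The only figures taken from the video are the Uniqlo ad numbers (four shots, each generated two or three times, about 500 credits, at most $25). The per-generation credit figure is derived from those by assuming three takes per shot, and `estimate_cost` is a hypothetical helper, not a Higgsfield API. Real pricing depends on plan and resolution.

```python
# Rough credit budgeting, based only on the Uniqlo ad figures quoted in the video.
# Assumption: ~500 credits covered 4 shots x 3 takes; real pricing varies by plan.

AD_CREDITS = 500        # "500 credits max" for the whole ad
AD_MAX_DOLLARS = 25.0   # upper bound quoted for the most expensive case
AD_SHOTS = 4
AD_TAKES_PER_SHOT = 3   # the video says "two or three times"; 3 is assumed here

DOLLARS_PER_CREDIT = AD_MAX_DOLLARS / AD_CREDITS
CREDITS_PER_GENERATION = AD_CREDITS / (AD_SHOTS * AD_TAKES_PER_SHOT)


def estimate_cost(shots: int, takes_per_shot: int) -> tuple[float, float]:
    """Return (credits, dollars), scaling linearly with the number of generations."""
    generations = shots * takes_per_shot
    credits = generations * CREDITS_PER_GENERATION
    return credits, credits * DOLLARS_PER_CREDIT


if __name__ == "__main__":
    credits, dollars = estimate_cost(shots=4, takes_per_shot=3)
    print(f"{credits:.0f} credits, about ${dollars:.2f}")
```

Because the estimate is linear in generations, halving the retakes per shot halves the bill, which is why planning each clip before generating matters more than any single setting.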
Takeaway

Why the gap between real and fake closed so fast.

WHAT TO LEARN

AI avatar realism is no longer a resolution problem: it is a behavioral problem, and the tools that solve natural micro-pauses and eye movement are producing indistinguishable output.

  • Missing micro-pauses and off-camera glances are the primary tells for AI-generated video; an avatar that holds eye contact without variation reads as synthetic regardless of visual fidelity.
  • A reference selfie captures skin imperfections that AI-generated character sheets smooth over, making the selfie the better input for photorealism despite being lower resolution.
  • The 13-second clip ceiling forces a scripting discipline: planning content in 9-13 second units before generating, not after, is what separates usable output from wasted credits.
  • Timeline prompting reduces AI guesswork by specifying eight parameters upfront; every degree of freedom left undefined gets filled with generic defaults.
  • Credit costs scale steeply with resolution; 720p output is functionally indistinguishable from 1080p at social-feed viewing sizes and costs a fraction of the credits.
  • Uploading a reusable prompt-template file into an AI assistant converts complex timeline prompting into a single plain-English sentence, removing the skill barrier for consistent generation.
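
Planning in clip-sized units can be sketched as a small Python helper. The 13-second ceiling comes from the video; splitting the runtime into equal-length clips is an assumption made here for simplicity (in practice you would break at sentence boundaries), and `plan_clips` is a hypothetical name.

```python
import math

MAX_CLIP_SECONDS = 13.0  # clip ceiling described in the video


def plan_clips(total_seconds: float, max_clip: float = MAX_CLIP_SECONDS) -> list[float]:
    """Split a script's runtime into the fewest equal-length clips under the cap."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    clip_count = math.ceil(total_seconds / max_clip)
    return [total_seconds / clip_count] * clip_count


if __name__ == "__main__":
    for length in plan_clips(60):
        print(f"clip of {length:.1f}s")
```

A 60-second script needs five generations under this scheme, so the clip count, not the script length, is the number to multiply against your per-generation credit cost.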
Glossary

Terms worth knowing.

CDance
The video generation model inside Higgsfield AI that accepts image and audio references to produce lip-synced talking-head video clips.
Timeline prompt
A structured prompt format for AI video generation that breaks the request into labeled fields to reduce model guesswork.
Character sheet
A set of AI-generated reference images showing a person from multiple angles, used as input for consistent character generation across scenes.
Higgsfield
An AI video platform hosting the CDance model along with voice cloning and video editing tools, priced on a monthly credit system.
Quotables

Lines you could clip.

00:16
I used to have an AI clone and everyone called me out on it. They all noticed it was fake. But if you can't tell, this is AI too.
Before/after payoff in two sentences · TikTok hook
03:34
If you don't have pauses, ums, and ahs, you have a big chance that it's AI generated.
Counterintuitive insight: the tell is behavioral not visual · IG reel cold open
05:21
Anyone can now be a creator. You don't have any limitations anymore. The only limitation you have is credits.
Democratization claim plus honest caveat in one breath · newsletter pull-quote
The Script

Word for word.

00:00Okay. This is crazy. Come a bit closer because we need to talk about AI clones.
00:06What you're seeing right now is my very own clone. This is not real. This is all AI.
00:12Just look how realistic the movements are. Yeah. That's scary.
00:17Right? Like, this technology has come a far away because I used to have an AI clone and everyone called me out on it. They all noticed it was fake.
00:25But if you can't tell, this is AI too. Now this is what my clone used to look like when I tried to do this with Veo 3, like, nine months ago.
00:34And this is what we got right now. What this means is that you don't have to be in the studio anymore because all I have to do is upload an audio clip of me talking and a reference image, and this is what you get.
00:47Okay. So now it's the real Dan. And to prove this, I'm going to drink this cup of water because we all know AI cannot do that.
01:04Yeah. All jokes aside, this is actually crazy. Right?
01:07Like, did I fool you or not? Let me know in the comments down below. In this video, I will breaking down exactly how you can make your own AI clone that looks exactly like you to make fun videos like I've just shown you.
01:19I'll also break down a few different use cases of how you can make something useful out of this. Now for this video, we're using CDance through Higgsfield. It has all the tools that we need to make these kind of videos.
01:30If you wanna follow along, click the link in the description down below. I'll also add in a link to my Skool community where you can find all of the prompts I used and some other files that might be useful. Okay.
01:42So the big question is, what is so good about these AI clones? Well, there are three reasons why I think this looks good. First, we have the amount of detail.
01:52If you look at these shots, you can see everything from my skin texture to my imperfections, pimples, freckles, like, all of that. You see it on my skin.
02:00If we actually take a look at this other example right here, then let me walk you through it why I think this looks amazing. We got a few things going on and I wanna discuss the detail. My face looks exactly like the image input.
02:12If I show you the image input that I used here, it's this image and it used that quite well. We can even see the bags underneath my eyes because there have just been too much work to do. I also like how this character is walking to the screen.
02:24She's taking a video and it looks like, of course, if you really zoom into it, you will notice, but I I do think it looks realistic. The other thing that I wanna point this guy right here, that's exactly the type of, like, look or the exactly the type of face I would have at Influencers in the Wild. Like, that's why I'm in this studio, locked away from everyone else.
02:43I don't wanna film in public like this because I will have people like this dude looking at me, and that's why I'm just being in the public with my AI. The scooter right here, if we take a look at the light that is reflecting on some of these things. So we see the vending machine right here.
02:57We have the light. Then also on this pole right there, we have the reflection of the light too. Overall, this is becoming so good that people are getting fooled by it.
03:16The other reason why I think these AI clones are so good is the natural dialogue. Just have a listen to this. One of the easiest ways to tell something is AI generated is the dialogue, specifically the flow of it.
03:29Like, we're talking about pauses, ums, and ahs. If you don't have them, you have a big chance that it's AI generated.
03:37That's crazy. Like, tell me that doesn't sound like me. Tell me that doesn't look like me.
03:41Like, the amount of detail and even, like, how I look away where I'm when I'm taking a break, where I'm pausing, like, that's something I do quite often. Like, you you don't see me looking into the camera enough. Like, I as as soon as I start to think about something or, like, I'm I'm, like, looking up or looking to the side I'm doing that here too.
03:58We just take a look at this. Something is AI generated is the dialogue. There you had it.
04:03Just looked at the side. Like, we're talking about The hand movements, it all looks and sound natural. I use a lot of pauses in my speech.
04:11Even here, it is using that. But the ability for CDance to understand that there's pauses, the ability for CDance to not cut out that pause, to to also have my character take a little break when I'm taking a pause and look away maybe. That is just incredible.
04:26If we compare that to what HeyGen made, for example, in this example, not even that old video that I did about HeyGen. I'm going to show you how I generate avatars that look and sound exactly like me. No matter the background, the clothing, or the camera angles, you will remain completely unshakable.
04:44Like the dialogue, the movement, all of it just feels more thick. That also brings me to the last reason why I think this looks so good, which is the natural movement.
04:57That comes to, like, natural movements. This AI understands so well how good the natural movement is. Here's a little vlog that I made, more of a travel thing, but look at the natural movement here.
05:08Most people think of this as scary and inauthentic and all of that, but I'm here to say anyone can now be a creator. You don't have any limitations anymore.
05:28The only limitation you have is is credits. And that is, like, that is a big thing, but I will show you how to save on some credits later. Now that you understand what it is that it makes this look so good, it is time for you to learn, like, how to actually make your own clone.
05:41And it is way simpler than you might think. I will walk you through all the different steps to create your own clone. Because first, you have to make your reference.
05:49Now for making a reference, I literally just use my phone and I took a selfie of me. So this is the selfie that I made. This already is good enough for 99% of regenerations.
06:00The only tip I have is I wished I took, like, a a top down selfie to show my outfit, like a selfie like this. To explain the reason why is in this video right here, I had some outfit issues. As you see, in the first shot, I'm wearing black pants, and in the second shot, I'm wearing beige pants.
06:17Even though I prompted it to be black pants, um, because it doesn't have a reference of me wearing black pants, it can sometimes fuck up. So that's the tip that I wanna give you. If you are doing this as a reference and you wanna include more of your outfit, make sure you maybe also include another reference of your full outfit.
06:33But all in all, the main thing you need is a selfie of you. And I hear you thinking, why are we not using a character sheet? Now let me explain.
06:40This is the exact character sheet I made with g t image two that looks just like me. The reason why I don't use these character sheets for my vlog type videos is because of the level of detail. Right now with this character sheet, we have, like, my full image and we have, like, the side angles, all of that.
06:56But if we zoom in, it's not as detailed as a actual selfie of me. To show you a comparison, I've used the same prompt, but one video had the selfie as a reference and the other video had the character sheet as a reference. If we take a look at them side by side and if we zoom in on my face, then we can see that there's a lot more detail and a lot less smoothness, which also can be good if you have some imperfections.
07:18But on my selfie image, it looked a lot better in my opinion. So that's why I went with the selfie approach. So that is how you make your clone or how you clone yourself.
07:26Okay. The next challenge we have is making the AI sound like you. Now there are three different ways of how you can approach it.
07:33I will first show you the best method. If you have a good mic, then the best thing you can do is record your own audio. I've done this two different ways.
07:41I've used Audacity. This is literally a free tool where you can just plug in your microphone and then literally say anything you want. Then you take that audio and you upload it as a reference inside of CDance.
07:54Now that's the beauty of CDance. It can take in audio references if you didn't know that. Make sure that it is less than thirteen seconds.
08:01For some reason, you can only do thirteen seconds and not fifteen seconds, so keep that in mind when you're generating or recording your voice. The only issue we have is it still alternates and changes your voice. Here's a side by side comparison of my real recording and what CDance made out of it.
08:18Okay. So this is crazy. Come a bit closer because we need to talk about AI clones.
08:23Okay. This is crazy. Come a bit closer because we need to talk about AI clones.
08:29Now, as you hear, the CDance version is different from my original audio output, so expect it to change your voice a little bit. If you wanted to exaggerate a bit more, then make sure that your input audio is also, like, way more expressive. The other thing it does, it can add in some background noise.
08:45Now this works best for talking heads videos that are, like, in a studio environment, like this example. Yeah. That's scary.
08:52Right? Like, this technology has come a far away because I used to have an AI clone and everyone called me out on it.
08:58They all noticed it was fake. If you're not in position to record each and every clip or just takes too long, then you can use this different method, which is cloning your voice. The easiest and cheapest way to do it if you already have Higgsfield is to use audio and then voice over.
09:13Here, you can create your own voice clone by submitting a audio file that is max two minutes long of you speaking. I did that right here. If you want to have the best possible clone, I would recommend using ElevenLabs, making a professional voice clone right there, and using that to make your voice overs.
09:31For example, this is what I sound like with an ElevenLabs voice. This is my voice clone on ElevenLabs. What do you think?
09:37Now that we got the two input references, so we got the image of us, and then we also have the voice segment, we can now start generating. And let me explain how I prompt for this. So the first thing I do is I go to video, then I go to CDance, and then over here, I'm uploading both of these things.
09:53The next thing that you wanna do is you wanna add in your prompt. Now there are so many different ways to prompt it. There's no right or wrong, but the method I'd like to use for this is called timeline prompting.
10:04And the reason why is because it's very specific. It also allows you and any AI tool that you're using to think about, like, what you're actually putting in there. If I go over and break down the prompt, it's like this.
10:14I add in the format. So I'm basically explaining it's a nine seconds single continuous shot. I know you got the setting inside of Higgsfield where you can select, like, how many seconds you have.
10:24Make sure you match that with what you have in your prompt. So then we go over the subject. That's where I add and tag image one.
10:31So in Higgsfield, make sure you add in and tag that image one. Then for the wardrobe, I'm going over, like, same. It's the exact same as image one.
10:39Environment, exact same because it's one continuous shot. It's not that, like, difficult for this one. Then the style anchor.
10:45So I'm going in with a locked studio talking head, podcast YouTuber creator aesthetic. Here, you can change this up to what you have in mind. If it's not a continuous shot, then you can change something like, oh, outside in the park, we have a handheld camera shot, like all of that.
10:58You can basically add in all the different details of how you want it to look. This shot was quite easy, to be honest, because a talking headshot is a static shot. It's all set on a tripod, so we don't have anything difficult going on.
11:09The delivery, it's gonna be conversational, reflective, mid tempo with natural micro pauses, lip sync driven, and it's captured by a podcast mic. Clean direct mic tone, no reverb. Here, you don't wanna simply copy what I did.
11:22If you're outside, if you have any type of background noise, you can add that in here. Now for the logic rule, it's basically to prevent that the AI does anything weird. So I'm just repeating myself with a single continuous shot.
11:33No cuts, no jumps, no zooms. I don't wanna see that in this shot. If you do wanna do that, then prompt it that you wanna have happen.
11:39Negative prompt, no music, no captions. Sometimes it tends to add in captions and music. Then for the action, this is where the timeline starts.
11:47It's not really a timeline prompt because we have just one prompt. That's the only shot in the timeline. So for zero to nine seconds, it is a talking headshot where I am in the studio and I've tagged myself again here, and I am prompting, like, how I deliver these words.
12:03So I'm saying, like, it's relaxed conversational energy. But while he's saying that, his right hand lifts up and he casually points toward the right side, not sure which side that is, gesturing into open space. Then I repeat what I have said, and then I turn back to the lens and I continue, and this is what we got right now.
12:20That's just to give the AI a bit more of a breakdown of what we want to see. So this is what it generated for me. Now this is what my clone used to look like when I tried to do this with Veo 3, like, nine months ago.
12:32And this is what we got right now. Honestly, this is quite impressive. The first time I saw this, I showed it to my friends, and they all thought it was the real me.
12:41The main thing I did here is I did use ten eighty p quality, and that is because I'm using this for YouTube videos, which are upscaled to four k. Ideally, you might even wanna upscale this video, but although sometimes when you upscale, tends to look a bit fake. Um, but, yeah, this is very usable already.
12:55If you do wanna save some credits though, then you might wanna switch over to seven twenty p because as you see, it's a lot lot cheaper, um, than ten eighty p. So that is one tip for you there. Now you don't have to reinvent the wheel and prompt all of this yourself.
13:09Um, you don't get as a polished of a result as I just had. So to save you some time, I've made this MD file, which you can find in the Skool community. Then you upload this MD file into your Claude or into your ChatGPT.
13:21Then you also upload a reference image of you, and you can also even add in your audio file. Then you just start prompting it. It can be a super simple prompt.
13:29Like, I just did a very short prompt where I said, using the MD file and my reference images and my audio file, prompt me a handheld camera scene of this man giving a tour in a boutique small but nice hotel room. It has to be, like, thirteen seconds long, one continuous take. Then I also give a breakdown of what the transcript is saying, like, the audio file is saying, so you don't leave that guesswork up to the AI.
13:52Then it will spit out a prompt. If you copy the prompt, put it into Higgsfield, also adding your references there, and then you hit generate, it will make you something like this.
14:00So this is where I'll be staying for the time being. Here's the bathroom, pretty fancy, and here's the bed, and then there's the huge TV.
14:10And, yeah, I'd say it's pretty nice. Yeah.
14:14Everything was AI generated, even the voice. I used ElevenLabs for that. Now I hear you thinking, how can I use this in real life?
14:21Like, what is actually the benefit of having this AI clone? I've already shown you a number of different examples, but now let's get a bit specific. The main use case of using this right now would be to generate short clips.
14:33Unfortunately, you can only do thirteen second videos, but you can stitch them together to make something longer, but it will cost you a lot of credits. So how are people using this in real life? Let me showcase you a few examples.
14:44So, for example, the first one that I wanna share with you is this page right here. This guy literally went viral for making AI videos of some random guy. Like, it doesn't even have to be you.
14:55Like, I use me as an example right now, but it can be anyone. You use this guy, put him in a suit, and it's, like, completely fake. They make money through this, like, WOB course, I reckon, and they already got, like, 130 k followers.
15:07Now here's another one. I don't know why they're picking old people, I think, for credibility, but this person has almost 400 k followers, even has their own website. And it's probably some dudes just making these clips.
15:18Now, there are also people making more entertainment style videos like here. Again, they didn't use a image of themselves. They have built this imaginary character.
15:27These already have, like, 2,000,000 followers and they're making real money with clips like that. One last example I wanna give you is people using this for podcasts. So, uh, I think the AI creator space is very much a, um, show, don't tell.
15:41Yeah. So what I'm trying to tell you here is that you can make entertainment videos of yourself. You can put yourself on a podcast.
15:46You can make any type of video you can imagine. To get a bit more practical, I've made a few examples of what I think this could be used for. So here, for example, I use it as AI VFX.
15:57If you have played Minecraft, then you'd know exactly what this is. Man, I'd never need to walk ever again. Now this can be a fun intro opener for my reel or for my YouTube videos, and you can do this too.
16:09The other one that I wanna share with you is more of an AI ad. Now here, I made a Uniqlo ad. So there you go, Uniqlo.
16:30That was some free promotion for you. I will send the invoice later. I think this this is, like, four shots.
16:36I generated everything, like, two or three times. In total, this will cost you, like, 500 credits max on Higgsfield, which comes down to depending on which plan you have. Max, that shot would cost you $25, which still feels pretty expensive.
16:50But the way I look at it, like, spend a lot of credits and I burn a lot of money on it, I look at this as, like, a tool. Like, instead of you buying a expensive camera or instead of you buying, like, a literal set and purchasing, like, all the people that can help you with that, renting out all the equipment, you can make something looking half decent with AI, and big brands are already doing it.
17:11My main point is that there are just so many different ways to make cool stuff with AI right now. I used to not even want to be on camera, and now that's even more justified because I can just use a clone of me. I've seen so many people, so many characters that already are doing this.
17:27They are benefiting from it. They are using this in their business, in their social media, in every type of aspect. The main question I have to you is, are you gonna be one of these early adapters or are you just gonna be mad about this?
17:39I can see both ways, but still, I would give it a try. Again, the link to Higgsfield and the link to my prompts are in the description down below. Check it out.
17:47If you wanna see more implications of how I use CDance, then click the video that's on the screen right now. I have a few cool videos that literally explain all kind of different use cases of how you can master CDance, but I also have one about AI filmmaking.
The Hook

The bait, then the rug-pull.

The video opens outdoors with a man speaking to camera on a busy street. He invites you closer, cuts to studio, and reveals it was his AI clone the whole time. Then he drinks a cup of water to prove he is the real one.

Frameworks

Named ideas worth stealing.

09:20 · list

Timeline Prompt 8-Field Structure

  1. FORMAT
  2. SUBJECT
  3. WARDROBE
  4. ENVIRONMENT
  5. STYLE ANCHOR
  6. DELIVERY
  7. LOGIC RULE
  8. NEGATIVE PROMPT, followed by the timed ACTION block

A structured template for CDance prompts that specifies every parameter the model needs, reducing ambiguous generation.

Steal for: Any AI video generation workflow where consistency across clips matters
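
The field structure above can be sketched as a small Python prompt builder. The field labels and the example values are paraphrased from the video; the "LABEL: value" layout, the way the reference image is named, and the `build_timeline_prompt` helper are assumptions, since the source does not show the exact prompt text or Higgsfield's tagging syntax.

```python
# Field labels come from the video; the "LABEL: value" layout is an assumption.
FIELD_ORDER = [
    "FORMAT", "SUBJECT", "WARDROBE", "ENVIRONMENT",
    "STYLE ANCHOR", "DELIVERY", "LOGIC RULE", "NEGATIVE PROMPT",
]


def build_timeline_prompt(fields: dict[str, str],
                          actions: list[tuple[int, int, str]]) -> str:
    """Render the labeled fields, then one ACTION line per (start, end, description)."""
    missing = [name for name in FIELD_ORDER if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    lines = [f"{name}: {fields[name]}" for name in FIELD_ORDER]
    lines.append("ACTION:")
    for start, end, description in actions:
        lines.append(f"{start}-{end}s: {description}")
    return "\n".join(lines)


# Example values paraphrased from the studio talking-head walkthrough.
example_fields = {
    "FORMAT": "9-second single continuous shot",
    "SUBJECT": "the person in image 1",
    "WARDROBE": "same as image 1",
    "ENVIRONMENT": "same as image 1",
    "STYLE ANCHOR": "locked studio talking head, podcast YouTuber creator aesthetic",
    "DELIVERY": "conversational, reflective, mid-tempo, natural micro-pauses, "
                "clean direct mic tone, no reverb",
    "LOGIC RULE": "single continuous shot, no cuts, no jumps, no zooms",
    "NEGATIVE PROMPT": "no music, no captions",
}
example_actions = [(0, 9, "talking-head shot in the studio, relaxed conversational energy")]

prompt = build_timeline_prompt(example_fields, example_actions)
```

Raising on a missing field mirrors the point of the framework: any field left undefined is a decision handed back to the model. The FORMAT duration should match the clip length selected in the Higgsfield interface, as the video notes.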
07:43 · list

Three Voice Input Methods

  1. Manual Audacity recording uploaded as a CDance audio reference (under 13 seconds)
  2. Higgsfield built-in voice cloner (audio sample up to two minutes)
  3. ElevenLabs professional voice clone

Ranked from most authentic to most scalable: manual recording wins on authenticity, ElevenLabs wins on generated voice quality.

Steal for: Deciding which voice pipeline to invest in for an AI avatar workflow
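
For the manual recording route, the transcript says the audio reference must be "less than thirteen seconds". A minimal pre-upload check, assuming the recording was exported from Audacity as an uncompressed WAV file, can be written with Python's standard `wave` module. The function names are hypothetical and nothing here talks to Higgsfield.

```python
import wave

MAX_REFERENCE_SECONDS = 13.0  # limit quoted in the video for CDance audio references


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds using only the standard library."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def fits_reference_limit(path: str, limit: float = MAX_REFERENCE_SECONDS) -> bool:
    """True when the recording is strictly shorter than the limit."""
    return wav_duration_seconds(path) < limit
```

Calling `fits_reference_limit("take1.wav")` before uploading (the file name is a placeholder) catches an over-length take without spending a generation on it.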
CTA Breakdown

How they asked for the click.

VERBAL ASK
17:40 · link
The link to Higgsfield and the link to my prompts are in the description down below.

Standard verbal CTA at end of video; affiliate link to Higgsfield plus free Skool community. Low pressure, no countdown.

Storyboard

Visual structure at a glance.

hook · AI clone outdoor open · 00:00
hook · Real Dan studio reveal · 00:16
demo · AI vlog clip demo · 00:36
proof · Real Dan labeled · 00:48
promise · Higgsfield platform · 01:19
value · Natural dialogue demo · 03:34
value · Selfie vs character sheet · 06:00
value · Timeline prompt on screen · 09:20
value · Claude prompt builder · 13:57
cta · Higgsfield pricing table · 16:46
cta · Hooded figure at laptop · 17:36