Modern Creator
Sightseeing Stan · YouTube

I mixed AI with Real Footage and it's actually scary

Six concrete techniques for blending AI generation into real cinematography — from building collapses to impossible transitions to logo animation.

Posted
1 month ago
Duration
10:12
Format
Tutorial
educational
Views
282.3K
14.2K likes
Big Idea

The argument in one line.

Giving an AI video model a visual start frame and an end frame gives filmmakers more control than text prompting alone — and that single principle underlies every compelling AI filmmaking result in this video.

Who This Is For

Read if. Skip if.

READ IF YOU ARE…
  • A filmmaker or solo content creator who wants cinematic AI effects without a full VFX budget or crew.
  • Someone who already knows DaVinci Resolve basics and wants to understand how AI generation slots into a real post-production pipeline.
  • A creator skeptical of AI tools who wants a grounded, practical demonstration rather than hype.
  • Anyone curious whether the start/end-frame prompting method actually works in practice — the video shows multiple generation attempts, not just the best take.
SKIP IF…
  • You want a philosophical debate about AI replacing human artists — this is a workflow video, not an opinion piece.
  • You have never touched a video editor; this assumes basic compositing familiarity.
  • You are looking for free or open-source AI tools: the generation platforms shown (Higgsfield AI, Kling 3.0) are commercial products.
TL;DR

The full version, fast.

The most useful AI filmmaking technique right now is giving a video generation model both a start frame and an end frame instead of relying on text alone, which gives you far more control over what the model produces. The video demonstrates six workflows, most of them built on this principle: replacing a background with a collapsing building, swapping and animating an indoor environment, creating VFX like white-eye morphs and arm-to-water effects, generating seamless location-crossing transitions, using AI motion control to transplant yourself into an entirely different place, and animating static logos. Generation runs through Higgsfield AI or Kling 3.0, with DaVinci Resolve handling the compositing.

Chapters

Where the time goes.

00:00 - 00:50

01 · Hook + promise

AI-mixed footage cold open, creator intro, promise of six techniques

00:50 - 02:51

02 · Technique 1: Building collapse

Tripod shot + clean plate → Nano Banana image gen → Cinema Studio start/end frame video → DaVinci magic mask composite

02:51 - 04:12

03 · Technique 2: AI environment swap

Indoor shot → Nano Banana inpaint → paint out actor → Cinema Studio animate background → DaVinci power window

04:27 - 06:38

04 · Technique 3: AI VFX (white eyes + arm-to-water)

Still frame → Nano Banana extreme close-up → Kling 3.0 push-in transition; second demo: arm poses to arm-as-water morph

06:38 - 08:22

05 · Technique 4: Impossible transitions

Pre-visualized matching focal-point shots → last/first frame as start/end → Kling 3.0 generates seamless location jump

08:22 - 08:47

06 · Technique 5: AI motion control / location swap

Walking video + generated location image → Cinema Studio 3.0 transplants actor into new environment preserving motion

08:47 - 09:36

07 · Technique 6: Logo animation

Static logo → Cinema Studio prompt (refined with LLM) → animated logo with optional sound design

09:36 - 10:12

08 · Closing thoughts

Reflection on how fast tools are evolving; invitation for audience to share their opinions in comments

Atomic Insights

Lines worth screenshotting.

  • Giving an AI video model a visual start frame and end frame produces more predictable results than any text prompt alone.
  • Treat each AI generation like a take on set — sometimes you get it first try, sometimes you need twelve attempts.
  • A clean plate (the same shot without the actor) is the foundation of every AI background replacement workflow.
  • Shooting with the transition in mind — matching focal point and center framing across two shots — is what makes AI-generated transitions feel natural.
  • AI VFX today is not better than professional VFX artists, but it has opened the door for creators who never had access to those tools at all.
  • The same start/end-frame logic that works for location swaps also works for impossible morphs: arm to water, normal eyes to white eyes.
  • AI motion control in Cinema Studio 3.0 can transplant a walking subject into a completely different location while preserving their motion, a result that goes well beyond what the creator thought was possible.
  • Logo animation via AI is now close to a one-prompt task, and the creator found it helpful to have an LLM refine the animation prompt before sending it to the video model.
  • The creative ceiling for these techniques is, in the creator's words, as high as your imagination — but the floor is set by how clean your mask and clean plate are.
  • These tools are not at the ceiling yet. Every few months they take another leap, and what seems impressive now will look primitive in two years.
Takeaway

The one rule that makes AI video generation actually controllable.

WHAT TO LEARN

Every reliable AI filmmaking result in this video comes from the same discipline: show the model where to start and where to end, then let it fill in the motion.

  • Text prompts alone give the AI too much imagination room — anchoring it with a real or generated start frame and a real or generated end frame narrows the output space to what you actually want.
  • A clean plate (the same shot without the actor) is not optional — the quality of your composite lives or dies on how cleanly you captured the background before the actor stepped in.
  • Treat AI generation like a camera roll: some shots land on the first attempt and others take a dozen, and building that iteration budget into your workflow removes the frustration of failed first attempts.
  • Shooting with the transition in mind — matching the focal point and framing across both sides of a cut — is what separates a jarring jump from a seamless impossible transition.
  • AI VFX today is not competitive with professional VFX pipelines at the high end, but it has removed the barrier to entry entirely for creators who previously had no access to those tools.
  • Logo animation and environment replacement are the two lowest-effort entry points: they require only a still image input and a text prompt, making them realistic day-one experiments.
  • Using a general-purpose LLM to refine a generation prompt before sending it to a video model helped the creator write better prompts; the two tools are complementary, not redundant.
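
The "take on set" idea can be written down as a small loop with a fixed budget. This is only a sketch: `generate` and `accept` are hypothetical stand-ins for a generation run and for your own review of the result, and the default budget of 12 simply echoes the "take 12" line from the video. None of it is any platform's API.

```python
from typing import Callable, Optional, Tuple


def shoot_takes(
    generate: Callable[[int], str],
    accept: Callable[[str], bool],
    max_takes: int = 12,
) -> Tuple[Optional[str], int]:
    """Run up to max_takes generations and return (accepted_clip, takes_used).

    generate(take_number) is a hypothetical stand-in for one generation run.
    accept(clip) is a hypothetical stand-in for the human review step.
    """
    for take in range(1, max_takes + 1):
        clip = generate(take)
        if accept(clip):
            return clip, take
    # Budget exhausted: nothing was accepted.
    return None, max_takes


if __name__ == "__main__":
    # Fake generator for illustration: pretend only the third take is usable.
    clip, used = shoot_takes(
        generate=lambda n: f"take_{n:02d}.mp4",
        accept=lambda c: c == "take_03.mp4",
    )
    print(clip, used)
```

Capping the loop is the point: deciding the budget up front is what turns a failed first attempt into an expected part of the process rather than a surprise.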
Glossary

Terms worth knowing.

Clean plate
A second recording of the exact same shot with the actor removed, used as the background layer in compositing. Locking focus to the same spot for both takes is critical.
Magic mask
A DaVinci Resolve feature that automatically rotoscopes a moving subject out of a shot, isolating them from the background so a different background can be dropped in.
Start/end frame prompting
A video generation technique where you upload a still image as the opening state and a second still as the closing state, letting the model interpolate the motion between them rather than hallucinating it from text alone.
Nano Banana Pro
An image editing and generation tool inside Higgsfield AI used for inpainting — selectively changing parts of a still frame (a background, a body part, an eye) while preserving the rest.
Cinema Studio 3.0 / 3.5
Higgsfield AI's video generation model, used in this video for background animation, location swaps, and logo animation.
Kling 3.0
A separate AI video generation platform used here specifically for transition generation and VFX morphs between two still frames.
Power window
A masking tool in DaVinci Resolve that lets you isolate a geometric region of the frame — used here to cut out a window so an AI-generated background can show through it.
Motion control
A filmmaking technique using a programmable camera rig to reproduce the exact same camera move repeatedly. AI Cinema Studio approximates this by matching camera motion from a reference video and applying it in a generated environment.
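
For readers who want to see what a mask does during compositing, here is a minimal sketch of the standard "over" blend on tiny grayscale images. It is an assumption-laden illustration, not how DaVinci Resolve is implemented: the images are plain nested lists, the power window is reduced to a hard-edged rectangle, and a real magic mask would be a different per-pixel mask on every frame.

```python
from typing import List

Image = List[List[float]]  # grayscale rows, values between 0.0 and 1.0


def composite_over(foreground: Image, background: Image, mask: Image) -> Image:
    """Blend foreground over background: out = mask * fg + (1 - mask) * bg."""
    return [
        [m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, mask)
    ]


def rectangle_mask(width: int, height: int,
                   left: int, top: int, right: int, bottom: int) -> Image:
    """Mask that is 1.0 inside [left, right) x [top, bottom) and 0.0 elsewhere."""
    return [
        [1.0 if left <= x < right and top <= y < bottom else 0.0
         for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    real = [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]       # real footage (bright)
    generated = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]  # AI-generated background (dark)
    # Keep only the middle column of the real footage, like a window cut-out.
    window = rectangle_mask(3, 2, left=1, top=0, right=2, bottom=2)
    print(composite_over(real, generated, window))
```

Where the mask is 1.0 the real footage survives, and where it is 0.0 the generated background shows through. That is why a cleaner mask gives a cleaner result.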
Resources

Things they pointed at.

05:08 · tool · Kling 3.0
01:10 · tool · DaVinci Resolve
09:05 · tool · Claude (LLM)
00:00 · tool · Frameset
00:00 · product · Audiio
00:00 · product · Epidemic Sound
Quotables

Lines you could clip.

01:45
Instead of just relying on a pure text-based prompt, I'm giving the model a visual start and end state. And that gives me much more control and the results are going to be noticeably better.
The key principle of the whole video stated cleanly in two sentences · TikTok hook
02:20
You can think of each generation like a different take on set. Sometimes you nail it from the first go and sometimes you're on take 12 wondering what the heck went wrong.
Reframes AI frustration in a filmmaker's mental model; immediately relatable · IG reel cold open
07:30
The golden rule here is to shoot with the transition in mind. The more intentional you are on set, the better these things are going to work.
Practical directive that applies to any AI filmmaking workflow · newsletter pull-quote
08:26
Whether you like AI or not, this is pretty nuts.
Disarming one-liner that works for any audience position on the AI debate · TikTok hook
The Script

Word for word.

00:00 What you're looking at right now is real footage twisted through the power of AI. And look, I'm not here to tell you that AI is the answer to everything. I'm just a content creator and filmmaker testing a tool.
00:12 And honestly, the speed at which this is evolving is kind of insane. So I just wanted to show you what's actually possible right now. In this video, we're going to explore six different techniques to mix AI with real footage.
00:25 And I'll be straight with you, some of this stuff is a lot more impressive than I expected. And some of it might even make you a little bit uncomfortable. Now I already know that some of you are going to love this and some of you are going to hate it.
00:39 So while you're watching this video, why don't you tell me what your opinion is down in the comments because I would really love to know what you guys think of this. The first shot we're looking at is all real footage mixed with a pretty simple AI technique.
00:55 Here's how it works. I put the camera on a tripod and filmed a shot of myself doing some of the finest acting seen on YouTube. Then I filmed the exact same shot without me in it with the focus still locked onto the same spot.
01:09 That's our clean plate. I brought everything into DaVinci Resolve, did a basic Rec. 709 grade on it, and exported a still frame from that clean plate. Then over in Higgsfield, I headed over to Nano Banana Pro, uploaded that still frame, and prompted it to keep everything the same but make the house collapse.
01:29 This is what I ended up with and now I can start creating the background video. Now there are a few ways you can do this inside of Higgsfield, but what worked best for me was Cinema Studio. I basically uploaded my original still as the start frame and the newly generated collapsed house as the end frame.
01:46 And I think this is the part right here that really makes the difference because instead of just relying on a pure text-based prompt, I'm giving the model a visual start and end state. And that gives me much more control and the results are going to be noticeably better.
02:00 I gave it a prompt saying that I wanted to keep everything the same as in the original shot and after a few seconds the building would collapse and end up like my end frame. And I actually got this result from the first try which is pretty good I'd say. Now it definitely does not always work from the first try.
02:17 Sometimes you'll have to generate a few times to get exactly what you want. You can think of each generation like a different take on set. Sometimes you nail it from the first go and sometimes you're on take 12 wondering what the heck went wrong.
02:31 From there, I brought that footage back into DaVinci Resolve, placed it underneath the original clip of me and used DaVinci's magic mask to mask myself out. And that's pretty much it. Of course, the cleaner your mask, the cleaner your result is going to be.
02:46 But with this technique, you can really take it in a lot of different directions and the creative ceiling is basically as high as your imagination. Now you can also use similar techniques to change parts of your environment by mixing parts of your real footage with AI-generated surroundings tailored specifically for your shot.
03:05 Here we've got a shot where I'm sitting on the couch and walking over towards the window. I want to keep the couch, the window, and of course myself, but I want to make the room a bit more interesting. I'm going to take a still frame from this shot and bring it into Nano Banana Pro again.
03:20 I tell it what I want to change and crucially, I also tell it exactly what I want to keep. If you skip that part, the AI can go rogue and change too much.
03:30 Once I've got something I'm happy with, I paint myself out of the frame directly inside Nano Banana and this gives us a clean background plate. Now you could stop here and use this as a static background, however, we can take it up a notch. I'll bring this empty shot into Higgsfield Cinema Studio and again tell it everything I want unchanged and ask it to animate the TV screen for example.
03:53 And now instead of a static background image, I've actually got an animated background video. I'll bring that into DaVinci Resolve and instead of using magic mask to mask myself out, I'm going to use a power window to mask out the entire window because that's what I want to keep in this shot.
04:18 Finally. I'm alive. Oh, that feels good.
04:22 Anyway, moving on. This next section is all about visual effects,
04:27 but this guy here will explain the rest. Before AI, if you wanted to get any kind of visual effects in your shot, you had to learn all kinds of complicated software and in some cases, you also needed a lot of computing power.
04:40 Now AI hasn't replaced that, but it's opened the door for creators who've never had access to those tools before. Let me show you what I mean. Again, we're starting off from a real shot.
04:51 I'm gonna grab a still frame right here when I look straight into the camera. I bring that into Nano Banana Pro and prompt it with something like extreme close-up of the person's eyes but their eyeballs are white. And as you can tell, you sometimes do get weird results.
05:08 But this one was exactly what I was looking for. Next, I'll head over to Kling 3.0 video generation. I'll upload my original still frame as my start frame and that close-up shot we just generated is going to be my end frame.
05:21 My prompt basically describes a continuous push in towards the eyes where the person closes and opens them and when they open them, their eyes have turned white. You can see it did take a few generations to get what I wanted but this is the one I went with. Everything goes quiet and then I see it all.
05:43 Now are actual VFX and a pro VFX artist going to be better than this? Well, yes, of course, for now at least. Anyway, let me show you one more.
05:52 I've got two still frames here, one with my arm turned in one direction and one with it turned the other way. I'll take the second frame into Nano Banana Pro and ask it to turn my arm into water. Once I've got an image that I'm happy with, I'll upload that as my end frame in Kling 3.0 and the first still frame as my start frame.
06:12 Then all we need is a simple prompt like the person's arm turns into a liquid state. And because I've got my arm turned in one direction as the start frame and in the other direction as the end frame, there's natural motion built into the transition, which is going to help sell the effect.
06:37 So this is going to be a quick technique, but it's using AI to create impossible transitions. The kind of moves that would normally require a motion control rig or a full VFX pipeline to get the job done. Now for this to work, you need to pre-visualize your shots a bit.
06:54 So here my first shot is looking through the trash can as I'm sitting on the bench and my second shot is going to be an over-the-shoulder shot looking at my phone. And I made sure to keep the focal point in the center of the shot for both of them. That's going to help with a smooth transition.
07:10 Now I'll just take the last frame from the first shot and the first frame from the second shot. I'll throw these into Kling 3.0 as my start and end frame and then describe how I want the transition to be. You can play around with different prompts and different models to see what it comes up with and you're not always going to get the best results, but I found that most of the time it actually works really well.
07:33 The golden rule here is, however, to shoot with the transition in mind. The more intentional you are on set, the better these things are going to work and the shots are going to feel much cooler and more natural. Now we've already seen how we can use AI to change our location in a more traditional way.
07:50 However, Cinema Studio 3.0 is so insanely good at motion control and honestly takes this way beyond what I thought was even possible. So step one is to record a video of you walking or driving or basically doing whatever you want. Step two is to create our new location.
08:07 So head over to any of the image generation models on Higgsfield and give it a prompt describing your desired location. I, for example, generated these ones in Higgsfield's Soul Cinema. Then in Higgsfield Cinema Studio 3.0 or, by now, version 3.5, upload your original video and select the location image you generated.
08:26 Now all you need is a pretty basic prompt making sure you explain what you want to keep and that you want it to change the location to the image. And honestly, the results I got here were seriously impressive. Whether you like AI or not, this is pretty nuts.
08:42 Alright. The last thing I want to show you is not so much directly related to filmmaking, but I wanted to add it in here because it really did surprise me.
08:51 I'll upload a logo that I want to animate to Higgsfield Cinema Studio and write a prompt describing what kind of animation I want. I found it helpful to use an LLM like Claude to assist me with turning my thoughts into a better prompt. You can decide on the length of the animation or whether or not you want it to add sound design as well.
09:11 And as easy as that, you're now able to animate logos with minimal effort and get great results for branding, intros, or social content. So those are just six ways you can start mixing AI with real footage right now. But the thing I kept realizing is this isn't the ceiling.
09:29 Every few months these tools take another leap. What you've seen in this video is just a snapshot of where we are today. And I know people are very divided when it comes to AI.
09:40 Some of you see this as the most exciting creative unlocking in a generation while others feel like it's pulling at something that maybe shouldn't be pulled at. And I think both of those opinions are worth hearing so feel free to let me know in the comments where you stand.
09:56 Are you excited? Are you worried? Are you afraid?
09:59 And have you been experimenting with these tools yourself? Anyway, I hope you enjoyed this video and found it interesting. Thanks a lot for watching, and I'll see you guys in the next one.
10:08 Bye.
The Hook

The bait, then the rug-pull.

The opening shot looks like a war-zone aftermath — crumbling buildings, overgrown streets, a lone figure standing in the rubble. It takes sixteen seconds before the creator admits it started as a quiet suburban street. That gap between expectation and reality is the entire argument of the video.

Frameworks

Named ideas worth stealing.

01:50 · concept

Start/End Frame Prompting

Instead of text-only generation, upload a still image as the start state and a generated or real still as the end state. The model interpolates the transition, giving directors visual control equivalent to storyboarding.

Steal for: any AI video project where text prompts produce inconsistent results. Give the model boundaries instead of imagination room.
00:57 · model
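
One way to keep the start frame, end frame, and prompt together is a small record type. The sketch below is hypothetical throughout: `FramePairJob`, its fields, and the "Keep unchanged / Change" wording are illustrative inventions, not the request format of Higgsfield, Kling, or any other platform. It only encodes the two habits the video stresses: supply both frames, and say what to keep before saying what to change.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FramePairJob:
    """Hypothetical description of one start/end-frame generation request."""
    start_frame: str   # path to the still that opens the clip
    end_frame: str     # path to the still that closes the clip
    change: str        # what should happen between the two frames
    keep: List[str] = field(default_factory=list)  # what must stay unchanged

    def prompt(self) -> str:
        """Build a text prompt that states what to keep before what to change."""
        parts = []
        if self.keep:
            parts.append("Keep unchanged: " + ", ".join(self.keep) + ".")
        parts.append("Change: " + self.change + ".")
        return " ".join(parts)


if __name__ == "__main__":
    job = FramePairJob(
        start_frame="clean_plate.png",        # illustrative file names
        end_frame="collapsed_house.png",
        change="after a few seconds the building collapses and ends up like the end frame",
        keep=["camera position", "lighting", "everything else in the original shot"],
    )
    print(job.prompt())
```

Writing the keep-list explicitly mirrors the warning in the transcript that a model left without it can "go rogue and change too much".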

Clean Plate → Inpaint → Animate workflow

  1. Record with actor (original shot)
  2. Record same shot without actor (clean plate)
  3. Export still from clean plate
  4. Inpaint desired change in image editor
  5. Feed original still + inpainted still to video model as start/end frames
  6. Composite actor back over generated background in DaVinci

The six-step pipeline underlying techniques 1 and 2 — clean plate discipline is what separates controllable results from AI chaos.

Steal for: any background replacement or environmental transformation where you want to keep the actor on real footage
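
The still-export steps can also be scripted. The creator exports stills from DaVinci Resolve, so the sketch below swaps in a different tool, the ffmpeg command line, as an assumption about what a scripted version might look like. The helper names and file names are hypothetical, and ffmpeg will not apply the Rec. 709 grade mentioned in the video. The same two commands cover the transition technique, which needs the last frame of one shot and the first frame of the next.

```python
import shlex
from typing import List


def first_frame_cmd(video: str, out_image: str) -> List[str]:
    """ffmpeg arguments that write the first video frame to out_image."""
    return ["ffmpeg", "-y", "-i", video, "-frames:v", "1", out_image]


def last_frame_cmd(video: str, out_image: str, tail_seconds: float = 1.0) -> List[str]:
    """ffmpeg arguments that decode only the final tail_seconds of the video and
    keep overwriting out_image, so the file ends up holding the last frame."""
    return [
        "ffmpeg", "-y",
        "-sseof", f"-{tail_seconds}",  # seek relative to the end of the input
        "-i", video,
        "-update", "1",                # image2 muxer: rewrite a single output file
        out_image,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without ffmpeg.
    print(shlex.join(last_frame_cmd("shot_a.mp4", "start_frame.png")))
    print(shlex.join(first_frame_cmd("shot_b.mp4", "end_frame.png")))
```

To actually run one of these, pass the list to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed.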
CTA Breakdown

How they asked for the click.

VERBAL ASK
09:48 · next-video
Feel free to let me know in the comments where you stand. Are you excited? Are you worried? Are you afraid?

Soft engagement CTA disguised as genuine curiosity about audience's opinion on AI. No subscribe push, no product pitch. Comments-first approach.

Storyboard

Visual structure at a glance.

hook · AI cold open · 00:00
value · clean plate demo · 00:57
value · start/end frame principle · 01:43
value · environment swap · 02:51
value · white eye VFX · 04:27
value · impossible transition shoot · 06:38
value · AI motion control · 08:22
value · logo animation · 09:25
cta · closing reflection · 09:36