This experimental ComfyUI workflow supports:
Image-to-Video (I2V) and Video Extension (V2V) generation using MoviiGen1.1-VACE-GGUF
Faster generation via the CausVid LoRA with a two-sampler setup
Generate the first video as your starting point
Extend the video one segment at a time to gradually build out the full sequence
Cherry-pick the best segments for your final cut
Refine prompts step-by-step as the scene or motion evolves
MoviiGen1.1-VACE-GGUF (by Finanzamt_Endgegner)
Based on Wan2.1, fine-tuned on cinematic 720p+ videos
VACE (All-in-One Video Creation and Editing framework) allows motion control using reference videos (like ControlNet for video)
Native support in ComfyUI via GGUF format
Temporal consistency across the full sequence
Model Downloads:
Official Repo:
Related:
CausVid LoRA (by Kijai)
Speeds up Wan2.1-based video workflows
Works best with a two-sampler split (based on findings shared by Maraan666); see the sketch below:
First few steps: without the CausVid LoRA
Remaining steps: with the CausVid LoRA
This keeps the speed gain while preserving natural motion, which CausVid alone tends to suppress
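To make the split concrete, here is a minimal Python sketch of the step math. The switch point (after 3 of 10 steps here) is an illustrative assumption, not a value prescribed by the workflow:

```python
def split_steps(total_steps: int, switch_at: int):
    """Split one denoising run across two samplers.

    Phase 1 (no CausVid LoRA) covers steps [0, switch_at);
    phase 2 (CausVid LoRA applied) covers [switch_at, total_steps).
    `switch_at` is an illustrative assumption -- tune it per workflow.
    """
    phase1 = (0, switch_at)            # without CausVid LoRA
    phase2 = (switch_at, total_steps)  # with CausVid LoRA
    return phase1, phase2

# Example: 10 total steps, switch to the CausVid sampler after step 3.
print(split_steps(10, 3))  # ((0, 3), (3, 10))
```

In ComfyUI terms this maps to two KSamplerAdvanced nodes sharing the same seed: the first with return_with_leftover_noise enabled, the second with add_noise disabled and start_at_step set to the switch point.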
LoRA Downloads:
14B: Download
Source: lightx2v/Wan2.1-T2V-14B-CausVid
1.3B: Download
Source: tianweiy/CausVid - Bidirectional Checkpoint 2
Official Repo:
Related:
CivitAI: Causvid Lora, massive speedup for Wan2.1 made by Kijai
Reddit: Causvid Lora, massive speedup for Wan2.1 made by Kijai
To Generate a Video from an Image as the First Frame
Enable "First Frame" in the muter node
Upload your input image
Set generation parameters:
Prompts (positive/negative)
Shift
Steps
Seed
Width / Height
Length (frame count)
Sampler
Scheduler
Click Run
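For reference, here is an illustrative set of I2V parameters as a Python dict. Every value below is an assumption for demonstration, not a default shipped with the workflow:

```python
# Illustrative values only -- not the workflow's defaults.
i2v_params = {
    "positive_prompt": "a sailboat drifting across a calm sea at sunset",
    "negative_prompt": "blurry, distorted, low quality",
    "shift": 5.0,        # flow shift
    "steps": 10,         # low step counts are viable with the CausVid split
    "seed": 42,
    "width": 848,        # both dimensions divisible by 16 (see resolution list below)
    "height": 480,
    "length": 81,        # Wan2.1-family models typically expect 4n+1 frame counts
    "sampler": "uni_pc",
    "scheduler": "simple",
}
```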
To Extend an Existing Video
Enable "Video Extension" and "Combine" options
Upload your input video
Set extension parameters:
Overlap Frame Count
Extension Frame Count
Prompts (positive/negative)
Shift
Steps
Seed
Sampler
Scheduler
Click Run
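And an illustrative extension setup, again with assumed values. The exact convention for how the overlap and extension counts combine depends on the workflow's nodes, so treat the comments below as assumptions:

```python
# Illustrative values only -- not the workflow's defaults.
extension_params = {
    "overlap_frames": 16,    # trailing frames of the input reused as motion guidance
    "extension_frames": 64,  # new frames generated beyond the overlap (assumed convention)
    "positive_prompt": "the sailboat turns toward the camera as the waves grow",
    "negative_prompt": "blurry, distorted, low quality",
    "shift": 5.0,
    "steps": 10,
    "seed": 42,
    "sampler": "uni_pc",
    "scheduler": "simple",
}
```

A larger overlap generally gives smoother transitions at the cost of regenerating more redundant frames.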
The base model is a T2V model, not a true I2V model.
I2V is achieved by feeding the reference image into the VACE node rather than preserving it directly.
An I2V model typically keeps the input image as the exact first frame.
Here, VACE treats the image as loose guidance, not strict visual preservation.
Examples:
If your source image lacks an object, but your prompt includes it, that object might be added to the first frame.
If the prompt contradicts the image, some original elements may be missing.
Fine details may degrade over time, especially in extended video generations.
Yes. I ran it on an RTX 5060 Ti with 16GB VRAM using the Q6_K GGUF model.
With GGUF models, you can choose a version that fits your GPU memory:
Q3_X_X (3-bit) for ~8GB VRAM
Q4_X_X (4-bit) for ~12GB
Q5–Q6 for ~16GB
Q8 for ~24GB+
Model & hardware info: https://huggingface.co/QuantStack/MoviiGen1.1-VACE-GGUF
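As a rough rule of thumb, the tiers above can be expressed as a small helper. The boundaries are approximations taken from the list, not hard limits; resolution, clip length, and other loaded models all affect real memory use:

```python
def suggest_quant(vram_gb: float) -> str:
    """Map available VRAM to a GGUF quant tier (rough guide only)."""
    if vram_gb >= 24:
        return "Q8"
    if vram_gb >= 16:
        return "Q5/Q6"
    if vram_gb >= 12:
        return "Q4_X_X"
    return "Q3_X_X"

print(suggest_quant(16))  # Q5/Q6 -- e.g. Q6_K on a 16GB RTX 5060 Ti
```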
This workflow is still experimental, so crashes or poor results are common. Here are some tips:
OOM (out of memory) error = your GPU doesn't have enough VRAM
Use a lower quant model (e.g. Q3 or Q4) to reduce memory usage
Lower the video resolution or clip length to avoid overload
If transitions look bad, try adjusting the prompt or other settings
Generate multiple times, then pick the best clips to stitch together
The "WanVaceToVideo" model only accepts resolutions where both width and height are divisible by 16. If your input resolution doesnโt meet this requirement, youโll likely run into errors or processing failures.
Below are safe resolutions for commonly used aspect ratios, based on standard output heights (320, 368, 480, 544, 640, 720):
Recommended Aspect Ratios & Resolutions (all values divisible by 16)
32:9 -> 1136×320
21:9 -> 752×320, 864×368, 1120×480, 1264×544
2:1 -> 640×320, 736×368, 960×480, 1088×544, 1280×640
16:9 -> 576×320, 656×368, 848×480, 960×544, 1136×640, 1280×720
16:10 -> 512×320, 592×368, 768×480, 864×544, 1024×640, 1152×720
3:2 -> 480×320, 560×368, 720×480, 816×544, 960×640, 1088×720
4:3 -> 432×320, 496×368, 640×480, 720×544, 848×640, 960×720
5:4 -> 400×320, 464×368, 608×480, 688×544, 800×640, 896×720
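If you need a size that is not in the list, a small helper can compute the nearest 16-divisible resolution for any aspect ratio at a target height (the function names here are mine, for illustration):

```python
def snap16(value: float) -> int:
    """Round to the nearest multiple of 16 (minimum 16)."""
    return max(16, round(value / 16) * 16)

def resolution_for_ratio(ratio_w: int, ratio_h: int, height: int) -> tuple[int, int]:
    """16-divisible (width, height) for an aspect ratio at a target height."""
    return snap16(height * ratio_w / ratio_h), snap16(height)

print(resolution_for_ratio(16, 9, 480))  # (848, 480)
print(resolution_for_ratio(21, 9, 544))  # (1264, 544)
```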