WAN 2.2 IMAGE to VIDEO with Caption and Postprocessing

Experimental
tremolo28
about 1 month ago

Workflow: Image -> Autocaption (Prompt) -> WAN I2V with Upscale and Frame Interpolation and Video Extension

  • Creates video clips at 480p or 720p resolution.

There is a Florence caption version and an LTX Prompt Enhancer (LTXPE) version. The LTXPE version uses more VRAM.


V1.0: WAN 2.2 14B Image to Video workflow with LightX2v LoRA support for low step counts (4-8 steps)

  • Wan 2.2 uses two models to process a clip: a High Noise model and a Low Noise model, run in sequence.

  • Compatible with the LightX2v LoRA from Wan 2.1, which lets clips be processed quickly with few steps.

  • Compatible with some Wan 2.1 LoRAs, which must be injected twice (once per model) because of the two-model setup.

  • See notes in workflow.

  • Uses GGUF models.

  • A 5 sec clip with 6 steps @ 480p takes about 4 min, including autoprompt, 2x upscaling to 960p and frame interpolation to 30 fps (RTX 4080 with 16 GB VRAM, 64 GB RAM, sage attention).
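The two-model setup described above can be sketched as a simple step plan. This is only an illustration, not the workflow's actual node graph: the function name, the `StagePlan` fields and the 50/50 switch point are assumptions, since the page does not say at which step the workflow hands over from the High Noise to the Low Noise model. It shows the two points the list makes: the steps are shared between two models run in sequence, and every LoRA has to be attached to both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagePlan:
    model: str        # which of the two Wan 2.2 models runs this stage
    start_step: int   # first step handled by this stage (inclusive)
    end_step: int     # step at which this stage stops (exclusive)
    loras: tuple      # LoRAs attached to this stage's model


def plan_two_stage(total_steps, loras, switch_fraction=0.5):
    """Split a low-step run between the High Noise and Low Noise models.

    switch_fraction is a hypothetical parameter: the share of steps given
    to the High Noise model. The real workflow may use a different split.
    """
    if total_steps < 2:
        raise ValueError("need at least one step per model")
    # Clamp so that each model gets at least one step.
    switch = max(1, min(total_steps - 1, round(total_steps * switch_fraction)))
    shared = tuple(loras)  # the same LoRAs are injected into both models
    return [
        StagePlan("high_noise", 0, switch, shared),
        StagePlan("low_noise", switch, total_steps, shared),
    ]


if __name__ == "__main__":
    for stage in plan_two_stage(6, ["lightx2v"]):
        print(stage)
```

With 6 steps and the assumed even split, the High Noise model handles steps 0 to 3 and the Low Noise model continues from step 3 to 6, each with the LightX2v LoRA attached.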

Models can be downloaded here:

Models (Low & High Noise required): https://huggingface.co/bullerwins/Wan2.2-I2V-A14B-GGUF/tree/main

LightX2v Lora (same as Wan 2.1): https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v/tree/main/loras

Vae (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/vae

Textencoder (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders


WAN 2.2 I2V 5B Model (GGUF) workflow with Florence or LTXPE auto caption

  • Lower quality than the 14B model and currently slower (there is no LightX2v LoRA for it)

  • 720p @ 24 fps

Model (GGUF): https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF/tree/main

VAE: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/vae

Textencoder (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders


Where to save these files within your ComfyUI folder:

Wan GGUF Model -> models/unet

Textencoder -> models/clip

Vae -> models/vae
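The folder mapping above can be written down as a small helper that computes where each downloaded file belongs. This is a minimal sketch: the function name, the `kind` keys and the example filename are made up for illustration, and only the three target folders come from the list above.

```python
from pathlib import Path

# Target folders relative to the ComfyUI root, as listed above.
TARGET_SUBFOLDERS = {
    "model": Path("models") / "unet",         # Wan GGUF model
    "text_encoder": Path("models") / "clip",  # Textencoder
    "vae": Path("models") / "vae",            # Vae
}


def destination(comfyui_root, kind, filename):
    """Return the path a downloaded file should be saved to.

    kind must be one of the keys of TARGET_SUBFOLDERS (hypothetical names).
    """
    if kind not in TARGET_SUBFOLDERS:
        raise KeyError(f"unknown file kind: {kind!r}")
    return Path(comfyui_root) / TARGET_SUBFOLDERS[kind] / filename


if __name__ == "__main__":
    # "example_vae.safetensors" is a placeholder, not a real file name.
    print(destination("ComfyUI", "vae", "example_vae.safetensors"))
```

Moving the files into place (for example with `shutil.move`) is left out on purpose, so the sketch only computes paths and touches nothing on disk.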


Tips:

  • The default LightX2v LoRA strength of 0.8 gives a more realistic look, with more natural hair and skin. For an anime or comic look, increase the strength to 1.0 or higher (black nodes in the workflow).


Info

Base model: Wan Video 14B i2v 480p

Latest version (Experimental): 1 File

About this version: Experimental

Experimental workflow for WAN 2.2 14B. MultiClip generates videos up to 20 sec long (no LoRA support yet).

Normal version with your own prompt, and LTXPE version for autoprompt.

5 Versions
