Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai

Kijai-umt5-xxl-enc-Wan
METAFILM_Ai
6 months ago

Kijai ComfyUI wrapper nodes for WanVideo

WORK IN PROGRESS

Works by @kijaidesign

Huggingface - Kijai/WanVideo_comfy

GitHub - kijai/ComfyUI-WanVideoWrapper

The cover video is from AiWood

https://www.bilibili.com/video/BV1TKP3eVEue

Text encoders to ComfyUI/models/text_encoders

Transformer to ComfyUI/models/diffusion_models

VAE to ComfyUI/models/vae
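
A minimal sketch of fetching the files into those folders with huggingface_hub. The filenames below are assumptions based on the Kijai/WanVideo_comfy listing; verify the exact names in the repo before running:

```python
# Minimal sketch: download Wan2.1 files into the ComfyUI folders above.
# Filenames are assumed from the Kijai/WanVideo_comfy repo listing.
from huggingface_hub import hf_hub_download

repo = "Kijai/WanVideo_comfy"

# Text encoder -> ComfyUI/models/text_encoders
hf_hub_download(repo, "umt5-xxl-enc-bf16.safetensors",
                local_dir="ComfyUI/models/text_encoders")

# Transformer -> ComfyUI/models/diffusion_models (assumed filename)
hf_hub_download(repo, "Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors",
                local_dir="ComfyUI/models/diffusion_models")

# VAE -> ComfyUI/models/vae (assumed filename)
hf_hub_download(repo, "Wan2_1_VAE_bf16.safetensors",
                local_dir="ComfyUI/models/vae")
```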

Right now I have only run the I2V model successfully.

Can't get frame counts under 81 to work; this test was 512x512x81 (width x height x frames).

~16 GB VRAM used with 20 of 40 transformer blocks offloaded
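
For context on the offloading note above, here is a conceptual sketch of block swapping. This is not the wrapper's actual API, just an illustration of the technique: offloaded transformer blocks live on the CPU and visit the GPU one at a time, trading speed for VRAM:

```python
# Conceptual sketch of block offloading ("20 of 40 blocks offloaded"):
# the first `blocks_to_swap` blocks start on the CPU and are moved to
# the GPU only for their own forward pass, then evicted again.
import torch
from torch import nn

def forward_with_block_swap(blocks: nn.ModuleList, x: torch.Tensor,
                            blocks_to_swap: int = 20,
                            device: str = "cuda") -> torch.Tensor:
    for i, block in enumerate(blocks):
        swapped = i < blocks_to_swap      # these blocks start on CPU
        if swapped:
            block.to(device)              # bring the block into VRAM
        x = block(x)
        if swapped:
            block.to("cpu")               # evict before the next block
    return x
```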


Now also available: Comfy-Org/Wan_2.1_ComfyUI_repackaged (weights repackaged for native ComfyUI workflows)


DiffSynth-Studio Inference GUI

Wan-Video LoRA & Finetune training.

DiffSynth-Studio/examples/wanvideo at main · modelscope/DiffSynth-Studio · GitHub


💜 Wan | 🖥️ GitHub | 🤗 Hugging Face | 🤖 ModelScope | 📑 Paper (Coming soon) | 📑 Blog | 💬 WeChat Group | 📖 Discord


Wan: Open and Advanced Large-Scale Video Generative Models

通义万相 (Tongyi Wanxiang) Wan2.1 video model is now open source! A new benchmark for video generation models, supporting Chinese text effects plus high-quality video generation

In this repository, we present Wan2.1, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. Wan2.1 offers these key features:

  • 👍 SOTA Performance: Wan2.1 consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.

  • 👍 Supports Consumer-grade GPUs: The T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate a 5-second 480P video on an RTX 4090 in about 4 minutes (without optimization techniques like quantization). Its performance is even comparable to some closed-source models. (See the inference sketch after this list.)

  • 👍 Multiple Tasks: Wan2.1 excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.

  • 👍 Visual Text Generation: Wan2.1 is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.

  • 👍 Powerful Video VAE: Wan-VAE delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.
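
As referenced in the consumer-GPU bullet above, here is a minimal text-to-video sketch, assuming the Hugging Face diffusers integration (WanPipeline) and the Wan-AI/Wan2.1-T2V-1.3B-Diffusers checkpoint; settings follow the 480P defaults:

```python
# Minimal sketch: Wan2.1 T2V-1.3B text-to-video via diffusers.
# Assumes the WanPipeline integration and the Diffusers-format checkpoint.
import torch
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.utils import export_to_video

model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae",
                                       torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae,
                                   torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # keeps peak VRAM in consumer-GPU range

frames = pipe(
    prompt="A cat walks on the grass, realistic style.",
    height=480, width=832,   # 480P, 16:9
    num_frames=81,           # ~5 s at 16 fps
    guidance_scale=5.0,
).frames[0]
export_to_video(frames, "t2v_480p.mp4", fps=16)
```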

This repository features our T2V-14B model, which establishes a new SOTA performance benchmark among both open-source and closed-source models. It demonstrates exceptional capabilities in generating high-quality visuals with significant motion dynamics. It is also the only video model capable of producing both Chinese and English text and supports video generation at both 480P and 720P resolutions.


What is Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai?

Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai is a Safetensors / Checkpoint AI Model uploaded by AI community user METAFILM_Ai. Despite the generic "Other" base-model label, it is not a Stable Diffusion derivative: it collects the open Wan2.1 family of video foundation models in Safetensors form (Kijai's conversions plus the Comfy-Org repackage) for use in ComfyUI. Its tags (base model, video, basemodel) reflect its practical use-cases: text-to-video and image-to-video generation.

This listing has no ratings yet, but Wan2.1 itself is a popular choice for generating high-quality video from text and image prompts.

Can I download Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai?

Yes! You can download the latest version of Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai from here.

How to use Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai?

To use Wan-AI 万相/ Wan2.1 Video Model (Safetensors) - Comfy&Kijai, download the model files and set up a UI that can run video diffusion models. For this model that means ComfyUI, either with Kijai's WanVideoWrapper nodes or with the Comfy-Org repackaged weights; AUTOMATIC1111 does not support Wan video generation. Then provide the model with a detailed text prompt (plus a source image for I2V) to generate a video. Experiment with different prompts and settings to achieve the desired results. If this sounds a bit complicated, check out our initial guide to Stable Diffusion – it might be of help. And if you really want to dive deep into AI image generation and understand how to set up a UI for Safetensors / Checkpoint AI Models like this one, check out our crash course in AI image generation.
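
For scripted use, ComfyUI also exposes an HTTP API: export a working Wan workflow from the UI via "Save (API Format)" and queue it programmatically. A minimal sketch, assuming ComfyUI is running on its default port; the workflow filename is hypothetical:

```python
# Minimal sketch: queue a saved ComfyUI workflow through its HTTP API.
# Assumes ComfyUI runs locally on the default port (8188) and that
# "wan_workflow_api.json" was exported via "Save (API Format)".
import json
import urllib.request

with open("wan_workflow_api.json") as f:
    workflow = json.load(f)

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())  # returns the prompt_id of the queued job
```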

Download (10.3 GB)

Info

Base model: Other

Version Kijai-umt5-xxl-enc-Wan: 2 Files


About this version: Kijai-umt5-xxl-enc-Wan

umt5-xxl-enc-bf16
