wan-t2v-video

Solid

Build WAN 2.2 Text-to-Video workflows — dual hi-lo models, lightning LoRAs, VACE modules, and KSamplerAdvanced two-pass

AI & Automation 160 stars 30 forks Updated today MIT

Quality Score: 90/100

Stars (weight 20%): 73
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 80
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# WAN 2.2 Text-to-Video (T2V) Workflows

## Overview

WAN 2.2 T2V generates videos from text prompts using a 14B parameter MoE (Mixture of Experts) architecture split across two specialized models:

- **HighNoise model**: Handles early denoising: establishes structure, motion, composition
- **LowNoise model**: Handles late denoising: refines details, sharpens output

This dual-model technique is the same as FLF/I2V (see wan-flf-video skill) but without image conditioning nodes.

**Key difference from I2V/FLF**: T2V does NOT use `CLIPVisionEncode`, `WanFirstLastFrameToVideo`, or any image input. It uses `EmptyHunyuanLatentVideo` for latent initialization and text-only conditioning.

## Models

### UNET (Installed)

| Model | Loader | Notes |
|-------|--------|-------|
| `Wan2_2-T2V-A14B_HIGH_fp8_e4m3fn_scaled_KJ.safetensors` | `UNETLoader` | HighNoise expert, 14.3GB FP8 |
| `Wan2_2-T2V-A14B-LOW_fp8_e4m3fn_scaled_KJ.safetensors` | `UNETLoader` | LowNoise expert, 14.3GB FP8 |

### Text Encoder

| Component | Node | Model | Notes |
|-----------|------|-------|-------|
| **CLIP (T5)** | `CLIPLoader` (type=`wan`) | `umt5_xxl_fp8_e4m3fn_scaled.safetensors` | UMT5-XXL fp8, in clip/ |

### VAE

| Component | Node | Model |
|-----------|------|-------|
| **VAE** | `VAELoader` | `wan_2.1_vae.safetensors` |

### VACE Modules (Installed, For Advanced Control)

| Model | Size | Notes |
|-------|------|-------|
| `Wan2_2_Fun_VACE_module_A14B_HIGH_bf16.safetensors` | 5.8GB | HighNoise VAC...
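
The dual-model split from the Overview can be sketched as a ComfyUI API-format graph built in Python. This is a minimal illustration, not the skill's own workflow: the node IDs, the resolution, the step count, the `split_step` handoff point, the `cfg` value, and the `euler`/`simple` sampler settings are all assumptions chosen for the example. The model filenames, the `wan` CLIP type, and the use of `EmptyHunyuanLatentVideo` come from the tables and notes above. The graph stops at `VAEDecode`; a save or video-combine node would still need to be attached.

```python
import json

# Filenames as listed in the Models tables above.
HIGH_UNET = "Wan2_2-T2V-A14B_HIGH_fp8_e4m3fn_scaled_KJ.safetensors"
LOW_UNET = "Wan2_2-T2V-A14B-LOW_fp8_e4m3fn_scaled_KJ.safetensors"
TEXT_ENCODER = "umt5_xxl_fp8_e4m3fn_scaled.safetensors"
VAE = "wan_2.1_vae.safetensors"


def build_t2v_workflow(prompt, negative="", width=832, height=480, length=81,
                       steps=20, split_step=10, cfg=3.5, seed=0):
    """Return a ComfyUI API-format graph (dict of node id -> node).

    All default values are illustrative assumptions, not tuned settings.
    """
    if not 0 < split_step < steps:
        raise ValueError("split_step must fall strictly between 0 and steps")

    def node(class_type, **inputs):
        return {"class_type": class_type, "inputs": inputs}

    def sampler(model_id, latent_id, start, end, first_pass):
        # The first pass adds noise and hands over a still-noisy latent;
        # the second pass adds no new noise and denoises to completion.
        return node(
            "KSamplerAdvanced",
            model=[model_id, 0], positive=["5", 0], negative=["6", 0],
            latent_image=[latent_id, 0],
            add_noise="enable" if first_pass else "disable",
            noise_seed=seed, steps=steps, cfg=cfg,
            sampler_name="euler", scheduler="simple",
            start_at_step=start, end_at_step=end,
            return_with_leftover_noise="enable" if first_pass else "disable",
        )

    return {
        "1": node("UNETLoader", unet_name=HIGH_UNET, weight_dtype="default"),
        "2": node("UNETLoader", unet_name=LOW_UNET, weight_dtype="default"),
        "3": node("CLIPLoader", clip_name=TEXT_ENCODER, type="wan"),
        "4": node("VAELoader", vae_name=VAE),
        "5": node("CLIPTextEncode", text=prompt, clip=["3", 0]),
        "6": node("CLIPTextEncode", text=negative, clip=["3", 0]),
        # Text-only: the latent starts empty, with no image conditioning nodes.
        "7": node("EmptyHunyuanLatentVideo", width=width, height=height,
                  length=length, batch_size=1),
        "8": sampler("1", "7", 0, split_step, first_pass=True),       # HighNoise
        "9": sampler("2", "8", split_step, steps, first_pass=False),  # LowNoise
        "10": node("VAEDecode", samples=["9", 0], vae=["4", 0]),
    }


if __name__ == "__main__":
    print(json.dumps(build_t2v_workflow("a red fox running through snow"), indent=2))
```

The point of the sketch is the handoff: both samplers share the same total `steps`, and the LowNoise pass starts exactly where the HighNoise pass ends. Where that boundary should sit depends on the models and any LoRAs in use, so treat `split_step` as a value to tune rather than a recommendation.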

Details

Author: artokun
Repository: artokun/comfyui-mcp
Created: 4 months ago
Last Updated: today
Language: TypeScript
License: MIT
