Discover Wan 2.2: Revolutionizing AI Video Creation

Wan 2.2: Turn Words into Cinematic Masterpieces – Power Your Creativity with AI Video Innovation

What is Wan 2.2

Wan 2.2, released on July 28, 2025, is a major leap beyond Wan 2.1 and debuts the first open-source Mixture-of-Experts (MoE) architecture for video diffusion models. Its dual-expert design pairs a high-noise expert, which lays down the initial structure of a clip, with a low-noise expert that refines the details; the model totals 27B parameters but activates only 14B per denoising step, so capacity grows without a matching increase in compute. The training dataset has also been expanded significantly, with 65.6% more images and 83.2% more videos, improving motion, semantics, and visual quality.

Other key advances include movie-quality visuals driven by curated data with detailed labels for lighting, composition, contrast, and color; improved handling of complex motion; and a streamlined 5B hybrid TI2V model built on the Wan2.2-VAE, whose 16×16×4 compression enables 720p@24fps generation on consumer GPUs such as the RTX 4090. Compared with its predecessor, Wan 2.2 achieves lower validation loss and better convergence, leads benchmarks such as Wan-Bench 2.0, and offers greater control, realism, and accessibility.
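
To put the quoted 16×16×4 compression in concrete terms, here is a quick back-of-the-envelope sketch; the frame-count and rounding conventions below are assumptions rather than published specifics.

```python
# Back-of-the-envelope check of the 16x16x4 Wan2.2-VAE compression figure.
# Assumptions (not stated in the release notes): a 5 s, 24 fps, 1280x704 clip,
# a "4n+1" frame-count convention, and 4x temporal / 16x16 spatial downsampling.
frames = 5 * 24 + 1                      # 121 input frames
height, width = 704, 1280

latent_frames = (frames - 1) // 4 + 1            # 4x temporal compression -> 31
latent_h, latent_w = height // 16, width // 16   # 16x16 spatial compression -> 44 x 80

pixels_per_channel = frames * height * width
latent_cells = latent_frames * latent_h * latent_w
print(f"latent grid: {latent_frames} x {latent_h} x {latent_w}")
print(f"spatio-temporal reduction: ~{pixels_per_channel / latent_cells:.0f}x")  # roughly 4*16*16
```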

What's New in Wan 2.2

  • Mixture-of-Experts (MoE) Architecture:

    Wan 2.2 pioneers an open-source MoE for video diffusion: high-noise experts shape the initial layout and low-noise experts refine details, with 27B total parameters but only 14B active per step, delivering better efficiency and quality than Wan 2.1's single-expert diffusion approach.

  • Expanded and Curated Training Data:

    Includes 65.6% more images and 83.2% more videos than Wan 2.1, enriched with labels for lighting, composition, contrast, and color, delivering cinematic visuals and precise prompt adherence.

  • New Hybrid Model Variant (TI2V-5B):

    A compact 5B model with high-compression Wan2.2-VAE, supporting hybrid text-to-video and image-to-video at 720p@24fps, generating 5-second videos in under 9 minutes on GPUs like the RTX 4090 for greater accessibility.

  • Benchmark Dominance and Integrations:

    Leads Wan-Bench 2.0, surpassing open-source and proprietary models; integrates seamlessly with ComfyUI, Diffusers, and Hugging Face, supporting low-VRAM options and prompt extensions for ease of use.

Key Features

MoE Architecture for Dynamic Expertise

Wan 2.2 utilizes a Mixture-of-Experts (MoE) design with high-noise and low-noise experts, totaling 27B parameters but activating only 14B per step for efficiency. This enables superior handling of complex motions and semantics, surpassing traditional models in fluidity and detail.
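
As a rough illustration only (this is not the Wan 2.2 implementation; the real experts are full diffusion transformers and the hand-off point is derived from the noise schedule), the routing idea can be pictured as a timestep-gated choice between two networks:

```python
import torch
import torch.nn as nn

# Simplified sketch of the two-expert routing described above. This is NOT the
# Wan 2.2 implementation: the real experts are full diffusion transformers and
# the switch point comes from the noise schedule, not a hard-coded value.
# The property being illustrated is that only one expert runs per denoising
# step, so only part of the 27B total parameters (about 14B) is ever active.

class TwoExpertDenoiser(nn.Module):
    def __init__(self, dim: int = 64, boundary_t: float = 0.9):
        super().__init__()
        self.high_noise_expert = nn.Linear(dim, dim)  # stand-in: lays out global structure early
        self.low_noise_expert = nn.Linear(dim, dim)   # stand-in: refines fine detail late
        self.boundary_t = boundary_t                  # hypothetical switch-over timestep

    def forward(self, latents: torch.Tensor, t: float) -> torch.Tensor:
        expert = self.high_noise_expert if t >= self.boundary_t else self.low_noise_expert
        return expert(latents)

model = TwoExpertDenoiser()
x = torch.randn(1, 64)
for t in (1.0, 0.95, 0.5, 0.1):  # early (noisy) steps hit the high-noise expert,
    x = model(x, t)              # later (cleaner) steps hit the low-noise expert
```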

Cinematic Aesthetics and Prompt Precision

Trained on data curated with detailed labels for lighting, composition, contrast, and color, Wan 2.2 produces movie-grade visuals. It adheres closely to prompts, generating natural animation with minimal hallucination, which makes it well suited to precise creative control.

Enhanced Motion and Resolution Support

With 65.6% more images and 83.2% more videos in its training data than Wan 2.1, Wan 2.2 minimizes frame flickering and supports 720p@24fps videos up to 5 seconds long. The TI2V-5B variant enables fast generation on budget hardware.

Multimodal Versatility

Wan 2.2 integrates text, images, and video seamlessly, supporting image-to-video transitions and style consistency. Features such as particle systems, lighting effects, and LoRA training optimizations make it well suited to diverse applications.

Wan 2.2 vs Wan 2.1 vs Other Video Models

Models compared: Wan 2.2, Wan 2.1, Kling AI (1.5/2.0), OpenAI Sora, and Luma AI Dream Machine.

  • Architecture:

    Wan 2.2: Mixture-of-Experts (MoE) with high-/low-noise experts, the first open-source MoE for video diffusion. Wan 2.1: standard diffusion model, no MoE. Kling AI: proprietary transformer-based, focused on temporal consistency. Sora: proprietary diffusion with an advanced transformer, emphasizing world simulation. Dream Machine: diffusion-based, emphasizing surreal and dynamic effects.

  • Parameters:

    Wan 2.2: 27B total (14B active per step), plus a 5B hybrid variant. Wan 2.1: ~11B (estimated; less efficient scaling). Kling AI: not disclosed (proprietary; likely 10B+). Sora: not disclosed (proprietary; rumored 10B+). Dream Machine: not disclosed (proprietary; mid-range).

  • Max resolution / FPS:

    Wan 2.2: 720p@24fps (native 1080p in some previews), videos up to 5 seconds. Wan 2.1: 480p/720p at lower frame rates, shorter clips with more artifacts. Kling AI: 1080p@30fps, videos up to 2 minutes. Sora: 1080p at variable FPS, up to 1 minute (based on demos). Dream Machine: 720p at variable FPS, clips up to 10 seconds.

  • Benchmark performance:

    Wan 2.2: tops Wan-Bench 2.0 with better convergence and lower loss than 2.1. Wan 2.1: solid but outperformed by 2.2, strong within the open-source category. Kling AI: strong in user tests against Sora/Luma, excels on temporal metrics. Sora: leads creative benchmarks (demos show superior coherence). Dream Machine: impressive in qualitative demos, no public benchmarks.

How to Use Wan 2.2

  • Install Dependencies:

    Clone the GitHub repo (git clone https://github.com/Wan-Video/Wan2.2.git) and run pip install -r requirements.txt (PyTorch >= 2.4.0 required).

  • Download Models:

    Use the Hugging Face CLI for T2V-A14B, I2V-A14B, or TI2V-5B (e.g., huggingface-cli download Wan-AI/Wan2.2-T2V-A14B --local-dir ./Wan2.2-T2V-A14B).

  • Generate Videos:

    For T2V: python generate.py --task t2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-T2V-A14B --prompt "Your detailed prompt". Add --offload_model True to reduce GPU memory use. Use ComfyUI for a user-friendly interface, or see the Diffusers-based sketch after this list for a Python alternative.

  • Advanced Tips:

    Enhance results with prompt extensions via the Dashscope API or local models; multi-GPU support speeds up processing.
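
If you prefer a Python API over the generate.py CLI, a minimal Diffusers sketch along the following lines should work. The Diffusers-format repository id, clip length, and guidance value shown are assumptions; check the Wan-AI organization on Hugging Face and the Diffusers documentation for the exact names and recommended settings.

```python
# Minimal Diffusers sketch, assuming a Diffusers-format Wan 2.2 checkpoint is
# published by the Wan-AI organization (the repo id below is an assumption).
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

model_id = "Wan-AI/Wan2.2-T2V-A14B-Diffusers"   # assumed repo id; verify on Hugging Face
pipe = WanPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()                 # low-VRAM option, in the spirit of --offload_model True

frames = pipe(
    prompt="A cinematic dolly shot through a rain-soaked neon alley at night",
    height=720,
    width=1280,
    num_frames=81,        # assumed clip length; pick what the checkpoint recommends
    guidance_scale=5.0,   # assumed default; tune per the model card
).frames[0]

export_to_video(frames, "wan22_t2v.mp4", fps=24)
```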

FAQs

  • What resolutions does Wan 2.2 support?

    Wan 2.2 supports 480p and 720p at 24fps, with the TI2V-5B model optimized for 1280x704 or 704x1280.

  • Is Wan 2.2 free to use?

    Yes, it's open-source under the Apache 2.0 license, available on Hugging Face and easy to integrate into a variety of tools.

  • How does Wan 2.2 handle hardware requirements?

    The TI2V-5B model generates a 5-second 720p video on a single RTX 4090 in under 9 minutes, making it accessible to non-enterprise users.

  • Can I fine-tune Wan 2.2 with LoRA?

    While not explicitly detailed in the release, its architecture supports style training, with community integrations emerging.

  • Where can I test Wan 2.2 demos?

    Explore demos on Hugging Face spaces or use ComfyUI for interactive testing and experimentation.

  • What types of video generation does Wan 2.2 support?

    Wan 2.2 supports text-to-video (T2V), image-to-video (I2V), and hybrid text-image-to-video (TI2V) modes, offering flexibility for diverse creative projects; an illustrative image-to-video sketch appears after these FAQs.

  • How does Wan 2.2 improve prompt adherence?

    Its curated training data and MoE architecture ensure high fidelity to text and image prompts, producing videos with accurate details and minimal errors.

  • Is multi-GPU support available for Wan 2.2?

    Yes, Wan 2.2 supports multi-GPU configurations, which can significantly speed up video generation for larger projects.
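
For the image-to-video mode mentioned in the FAQs above, a similarly hedged Diffusers sketch follows; the repository id and call parameters are assumptions, and the Wan 2.2 I2V pipeline may load different components than the Wan 2.1 pipelines documented in Diffusers.

```python
# Hedged image-to-video sketch using the Diffusers Wan pipeline family. The
# repo id and call parameters are assumptions; verify against the Wan-AI
# organization on Hugging Face and the Diffusers docs before relying on them.
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

model_id = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"   # assumed repo id; verify before use
pipe = WanImageToVideoPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

image = load_image("first_frame.png")           # placeholder path to your source image
frames = pipe(
    image=image,
    prompt="The camera slowly pushes in as autumn leaves drift across the frame",
    height=720,
    width=1280,
    num_frames=81,        # assumed clip length
    guidance_scale=5.0,
).frames[0]

export_to_video(frames, "wan22_i2v.mp4", fps=24)
```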