WAN 2.2 I2V-A14B Training Issues - Need Guidance on Correct Training Commands #410

hl2dm · 2025-08-05T20:46:47Z

hl2dm
Aug 5, 2025

I'm trying to use WAN 2.2 for I2V-A14B training, but both configurations I've tried are encountering errors. Since I haven't used musubi for a long time, I'm not sure about the correct usage.
Previous Working Setup (WAN 2.1)
When I was training WAN 2.1, I used the following scripts and they worked perfectly as long as I didn't exceed GPU memory limits or have naming errors:

Step 1: Cache text encoder outputs

bashwan_cache_text_encoder_outputs.py --dataset_config D:/musbi/musubi-tuner/dataset/dataset.toml --t5 D:/musbi/musubi-tuner/wan/models_t5_umt5-xxl-enc-bf16.pth --batch_size 16

Step 2: Cache latents

bashwan_cache_latents.py --dataset_config D:/musbi/musubi-tuner/dataset/dataset.toml --vae D:/musbi/musubi-tuner/wan/wan_2.1_vae2.safetensors --clip D:/musbi/musubi-tuner/wan/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth

Step 3: Train network

bashaccelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 wan_train_network.py --task i2v-14B --dit D:/musbi/musubi-tuner/wan/wan_2.1_vae2.safetensors --dataset_config D:/musbi/musubi-tuner/dataset/dataset.toml --sdpa --mixed_precision bf16 --fp8_base --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 16 --save_every_n_epochs 1 --seed 42 --output_dir D:/musbi/musubi-tuner/out --blocks_to_swap 36 --output_name jumpdance
Current Attempted Training Commands (WAN 2.2 - Both Failing)

Method 1: Using high_noise_model alone

bashaccelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py --task i2v-A14B --dit D:/musbi/b/musubi-tuner/wan/Wan2.2-I2V-A14B/high_noise_model/ --min_timestep 900 --max_timestep 1000 --preserve_distribution_shape --dataset_config D:/musbi/b/musubi-tuner/dataset/dataset.toml --sdpa --mixed_precision bf16 --fp8_base --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 16 --save_every_n_epochs 1 --seed 42 --output_dir D:/musbi/musubi-tuner/out99/low_noise --output_name low_noise_lora

Method 2: Using both dit and dit_high_noise

bashaccelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py --task i2v-A14B --dit D:/musbi/b/musubi-tuner/wan/wan_2.1_vae2.safetensors --dit_high_noise D:/musbi/b/musubi-tuner/wan/Wan2.2-I2V-A14B/high_noise_model --min_timestep 900 --max_timestep 1000 --preserve_distribution_shape --dataset_config D:/musbi/b/musubi-tuner/dataset/dataset.toml --sdpa --mixed_precision bf16 --fp8_base --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 16 --save_every_n_epochs 1 --seed 42 --output_dir D:/musbi/musubi-tuner/out99/high_noise --output_name high_noise_lora

Both methods are showing errors.
Background
Based on my previous experience with diffusion-pipe, WAN 2.2 training doesn't require using the 2.2 VAE unless it's a 5B model. So the issue should be elsewhere.
Environment Setup
To avoid issues with outdated projects, I've freshly cloned the current project, switched to the appropriate branch, and completed the installation following all the setup steps.
Request for Help

What should be the correct training command for WAN 2.2 I2V-A14B?
Are there any missing steps I need to execute?
Are there any adjustments needed for the accelerate launch parameters?
What are the key differences between WAN 2.1 and WAN 2.2 training procedures that I might be missing?

Since I haven't used musubi for a long time, I hope experienced users can provide guidance. Thank you!
Environment Information

Model: WAN 2.2 I2V-A14B
Task type: i2v-A14B
Text encoder and latents caching steps have been completed
Fresh clone with proper branch and installation

niceguy4 · 2025-08-13T04:41:40Z

Even though the documentation example shows --mixed-precision bf16 I don't think it's supported with the FP16 models that are linked from the same page.

Hi just saw your post, did you find a way to train properly ?

wzgrx Sep 29, 2025

Any results? Ask for help

wzgrx · 2025-09-29T14:04:47Z

wzgrx
Sep 29, 2025

Any results? Ask for help

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

WAN 2.2 I2V-A14B Training Issues - Need Guidance on Correct Training Commands #410

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Replies: 2 comments 3 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Uh oh!

WAN 2.2 I2V-A14B Training Issues - Need Guidance on Correct Training Commands #410

Uh oh!

Uh oh!

hl2dm Aug 5, 2025

Replies: 2 comments · 3 replies

Uh oh!

niceguy4 Aug 13, 2025

Uh oh!

Oruli Sep 8, 2025

Uh oh!

BenDes21 Sep 15, 2025

Uh oh!

wzgrx Sep 29, 2025

Uh oh!

wzgrx Sep 29, 2025

hl2dm
Aug 5, 2025

Replies: 2 comments 3 replies

niceguy4
Aug 13, 2025

wzgrx
Sep 29, 2025