Replies: 2 comments 3 replies
Copy and paste the error.
Any results? Ask for help.
I'm trying to train WAN 2.2 I2V-A14B, but both configurations I've tried fail with errors. Since I haven't used musubi-tuner in a long time, I'm not sure about the correct usage.
Previous Working Setup (WAN 2.1)
When I was training WAN 2.1, the following scripts worked perfectly as long as I didn't exceed GPU memory or make naming mistakes:
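Both cache scripts and the trainer read the same --dataset_config file. For context, my dataset.toml follows roughly the structure below; this is a simplified sketch from memory with placeholder paths and values, so the field names may differ slightly from the current dataset config docs.

```toml
# Simplified sketch of my dataset config -- placeholder paths/values, not the exact file
[general]
resolution = [960, 544]
caption_extension = ".txt"
batch_size = 1
enable_bucket = true

[[datasets]]
video_directory = "D:/musbi/musubi-tuner/dataset/videos"  # folder with clips and .txt captions
cache_directory = "D:/musbi/musubi-tuner/dataset/cache"   # where the latent / text-encoder caches go
target_frames = [1, 25, 45]
frame_extraction = "head"
```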
Step 1: Cache text encoder outputs
```bash
wan_cache_text_encoder_outputs.py --dataset_config D:/musbi/musubi-tuner/dataset/dataset.toml --t5 D:/musbi/musubi-tuner/wan/models_t5_umt5-xxl-enc-bf16.pth --batch_size 16
```
Step 2: Cache latents
```bash
wan_cache_latents.py --dataset_config D:/musbi/musubi-tuner/dataset/dataset.toml --vae D:/musbi/musubi-tuner/wan/wan_2.1_vae2.safetensors --clip D:/musbi/musubi-tuner/wan/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
```
Step 3: Train network (for WAN 2.2 I2V-A14B I've tried the two methods below)
Method 1: Using high_noise_model alone
```bash
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py --task i2v-A14B --dit D:/musbi/b/musubi-tuner/wan/Wan2.2-I2V-A14B/high_noise_model/ --min_timestep 900 --max_timestep 1000 --preserve_distribution_shape --dataset_config D:/musbi/b/musubi-tuner/dataset/dataset.toml --sdpa --mixed_precision bf16 --fp8_base --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 16 --save_every_n_epochs 1 --seed 42 --output_dir D:/musbi/musubi-tuner/out99/low_noise --output_name low_noise_lora
```
Method 2: Using both dit and dit_high_noise
```bash
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py --task i2v-A14B --dit D:/musbi/b/musubi-tuner/wan/wan_2.1_vae2.safetensors --dit_high_noise D:/musbi/b/musubi-tuner/wan/Wan2.2-I2V-A14B/high_noise_model --min_timestep 900 --max_timestep 1000 --preserve_distribution_shape --dataset_config D:/musbi/b/musubi-tuner/dataset/dataset.toml --sdpa --mixed_precision bf16 --fp8_base --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 16 --save_every_n_epochs 1 --seed 42 --output_dir D:/musbi/musubi-tuner/out99/high_noise --output_name high_noise_lora
```
Both methods are showing errors.
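For reference, here is my best guess at a corrected Method 2 command. The only substantive change is that --dit points at the low_noise_model directory instead of the wan_2.1_vae2.safetensors file, which I clearly pasted by mistake; everything else reuses the flags from my commands above. I'm not sure whether --dit accepts a sharded model directory or needs a single .safetensors file, so please treat this as an unverified sketch rather than a known-good command.

```bash
# Unverified guess: --dit = low-noise DiT, --dit_high_noise = high-noise DiT (the low_noise_model folder name is assumed from the repo layout)
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py --task i2v-A14B --dit D:/musbi/b/musubi-tuner/wan/Wan2.2-I2V-A14B/low_noise_model/ --dit_high_noise D:/musbi/b/musubi-tuner/wan/Wan2.2-I2V-A14B/high_noise_model/ --min_timestep 900 --max_timestep 1000 --preserve_distribution_shape --dataset_config D:/musbi/b/musubi-tuner/dataset/dataset.toml --sdpa --mixed_precision bf16 --fp8_base --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --max_data_loader_n_workers 2 --persistent_data_loader_workers --network_module networks.lora_wan --network_dim 32 --timestep_sampling shift --discrete_flow_shift 3.0 --max_train_epochs 16 --save_every_n_epochs 1 --seed 42 --output_dir D:/musbi/musubi-tuner/out99/high_noise --output_name high_noise_lora
```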
Background
Based on my previous experience with diffusion-pipe, WAN 2.2 training doesn't require the 2.2 VAE unless you're training the 5B model, so the issue should be elsewhere.
Environment Setup
To avoid problems from an outdated checkout, I freshly cloned the current project, switched to the appropriate branch, and completed the installation following all the setup steps.
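Roughly, the setup I ran was the following; the repository URL should be right, but the branch name is a placeholder since I'd rather not misquote it, and the install command is the standard editable install as I remember it from the README.

```bash
# Fresh clone and install (branch name below is a placeholder for the WAN 2.2 support branch;
# follow the current README if the install steps differ, e.g. installing torch first)
git clone https://github.com/kohya-ss/musubi-tuner.git
cd musubi-tuner
git checkout <wan2.2-support-branch>
pip install -e .
```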
Request for Help
What should be the correct training command for WAN 2.2 I2V-A14B?
Are there any missing steps I need to execute?
Are there any adjustments needed for the accelerate launch parameters?
What are the key differences between WAN 2.1 and WAN 2.2 training procedures that I might be missing?
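In case it helps answer the questions above, this is how I've been checking which WAN 2.2-related options my fresh clone actually exposes (assuming the trainer prints a standard argparse help; run from a bash-style shell such as Git Bash):

```bash
# Dump the trainer's supported options and filter for the dual-DiT / timestep flags
python src/musubi_tuner/wan_train_network.py --help | grep -i -E "dit|timestep|task"
```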
Since I haven't used musubi for a long time, I hope experienced users can provide guidance. Thank you!
Environment Information
Model: WAN 2.2 I2V-A14B
Task type: i2v-A14B
Text encoder and latents caching steps have been completed
Fresh clone with proper branch and installation