[feat] Refactor training framework into fastvideo/train #1156

Closed
alexzms wants to merge 226 commits into hao-ai-lab:main from FoundationResearch:distill1

@alexzms (Collaborator) commented Mar 6, 2026

Summary

Introduces fastvideo/train, a refactored training framework that replaces the monolithic training/distillation pipelines with a modular, YAML-driven architecture.

Key design changes

  • _target_-based instantiation: Models and methods are selected via _target_ keys in YAML (e.g., fastvideo.train.models.wan.WanModel,
    fastvideo.train.methods.distribution_matching.dmd2.DMD2Method), making it easy to add new models/methods without modifying framework code.
  • Separated concerns: Models (models/), methods (methods/), callbacks (callbacks/), and the training loop (trainer.py) are fully decoupled. The trainer calls
    method.train_one_step() without knowing which method is running.
  • Callback system: Gradient clipping, validation, and EMA are now callbacks (callbacks/) rather than hardcoded in the training loop. Configured via the callbacks:
    section in YAML.
  • Structured config with defaults: TrainingConfig dataclass (utils/training_config.py) provides typed defaults for all training parameters. The fully-resolved config
    (with defaults filled in) is logged to W&B.
  • Checkpoint management: DCP-based save/resume with CheckpointManager, plus dcp_to_diffusers.py for converting checkpoints to Diffusers format.
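The `_target_` mechanism described above can be illustrated with a minimal sketch. This is not the code in `fastvideo/train/utils/instantiate.py`; the helper below and its signature are assumptions, and a standard-library class stands in for a model or method class so the snippet runs without FastVideo installed.

```python
import importlib
from typing import Any


def instantiate(cfg: dict[str, Any], **extra: Any) -> Any:
    """Build an object from a mapping that carries a ``_target_`` key.

    Hypothetical helper for illustration only; the PR's own
    implementation may differ in name, signature, and behavior.
    """
    params = dict(cfg)  # copy so the caller's config is not mutated
    target = params.pop("_target_")
    # Split "package.module.ClassName" into module path and attribute.
    module_path, _, attr = target.rpartition(".")
    cls = getattr(importlib.import_module(module_path), attr)
    # Remaining keys become constructor keyword arguments.
    return cls(**params, **extra)


# Stand-in for a parsed YAML section. In the PR the target would be a
# dotted path such as fastvideo.train.models.wan.WanModel.
cfg = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate(cfg)
print(type(obj).__name__, obj)
```

Because the class is resolved from a dotted path at runtime, adding a new model or method only requires that the class be importable; no registry or dispatch table in the framework has to change.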

Supported models & methods

Models                    Methods
Wan 2.1 (T2V 1.3B)        DMD2 distillation
WanGame (incl. causal)    Self-forcing distillation
                          SFT finetuning
                          DFSFT (Diffusion Forcing SFT)

Bug fixes

  • CFG formula: Fixed real_score_guidance_scale in DMD2 and self-forcing to use the standard formula uncond + scale * (cond - uncond) instead of cond + scale * (cond - uncond) (which silently added +1 to the effective guidance scale).
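The off-by-one effect can be checked numerically. The sketch below uses plain floats in place of score tensors, and the function names are illustrative rather than taken from the PR; the same algebra holds elementwise for tensors.

```python
def cfg_standard(cond: float, uncond: float, scale: float) -> float:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return uncond + scale * (cond - uncond)


def cfg_legacy(cond: float, uncond: float, scale: float) -> float:
    """The previous formula: cond + scale * (cond - uncond)."""
    return cond + scale * (cond - uncond)


# Arbitrary example values chosen to be exactly representable as floats.
cond, uncond, scale = 2.0, 0.5, 3.0

standard = cfg_standard(cond, uncond, scale)
legacy = cfg_legacy(cond, uncond, scale)
shifted = cfg_standard(cond, uncond, scale + 1.0)

# The legacy result equals the standard formula evaluated at scale + 1,
# since cond = uncond + 1 * (cond - uncond).
print(standard, legacy, shifted)
```

A consequence worth keeping in mind: a config tuned against the old formula with guidance scale `s` corresponds to `s + 1` under the fixed formula.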

File structure

fastvideo/train/
trainer.py
models/{base, wan/, wangame/}
methods/{base, distribution_matching/, fine_tuning/}
callbacks/{callback, grad_clip, validation, ema}
entrypoint/{train, dcp_to_diffusers}
utils/{config, builder, training_config, checkpoint, dataloader, optimizer, tracking, ...}
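To make the separation between `trainer.py`, `methods/`, and `callbacks/` concrete, here is a toy sketch of the control flow. Only `train_one_step()` is named in this PR; the class names, the `on_step_end` hook, and the use of a batch mean as the "loss" are all assumptions made for illustration.

```python
class ToyCallback:
    """Hypothetical hook interface; subclasses override what they need."""

    def on_step_end(self, trainer, step, loss):
        pass


class LossHistory(ToyCallback):
    """Records the loss after every step (stand-in for EMA, validation, ...)."""

    def __init__(self):
        self.losses = []

    def on_step_end(self, trainer, step, loss):
        self.losses.append(loss)


class ToyMethod:
    """Stand-in for a training method; the 'loss' is just the batch mean."""

    def train_one_step(self, batch):
        return sum(batch) / len(batch)


class ToyTrainer:
    """Drives the loop without knowing which method or callbacks it holds."""

    def __init__(self, method, callbacks):
        self.method = method
        self.callbacks = callbacks

    def run(self, batches):
        for step, batch in enumerate(batches):
            loss = self.method.train_one_step(batch)
            for cb in self.callbacks:
                cb.on_step_end(self, step, loss)


history = LossHistory()
ToyTrainer(ToyMethod(), [history]).run([[1.0, 3.0], [2.0, 2.0]])
print(history.losses)
```

Swapping `ToyMethod` for another object with a `train_one_step()` method, or adding callbacks to the list, requires no change to the loop, which is the property the refactor is aiming for.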

Usage

torchrun --nproc_per_node=8 -m fastvideo.train.entrypoint.train \
    --config examples/distillation/refactor/distill_wan2.1_t2v_1.3B_dmd2.yaml

Test plan

- DMD2 8-step distillation on Wan 2.1 T2V 1.3B matches legacy training loss curves
- VSA finetuning on Wan produces equivalent results to legacy pipeline
- Self-forcing distillation on WanGame runs without errors
- DFSFT on WanGame runs without errors
- Checkpoint save/resume round-trips correctly
- W&B logging shows fully-resolved config with defaults

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the training infrastructure by introducing a new, highly modular framework. The primary goal is to enhance extensibility and maintainability by clearly separating concerns between core training logic, specific models, and various training methods. This change moves towards a more configurable and less coupled system, allowing for easier integration of new research and models.

Highlights

  • Refactored Training Framework: Introduced fastvideo/train, a new modular, YAML-driven architecture that replaces monolithic training/distillation pipelines. This framework decouples models, methods, callbacks, and the training loop, making it easier to extend and maintain.
  • Target-based Instantiation: Models and methods are now instantiated via _target_ keys in YAML configurations, allowing for flexible addition of new components without modifying core framework code.
  • Decoupled Components: Models, methods, and callbacks are fully separated. The trainer interacts with methods through a generic train_one_step() interface, unaware of the specific method's internal logic.
  • Callback System: Gradient clipping, validation, and EMA are implemented as configurable callbacks, moving them out of the main training loop.
  • Structured Configuration: A TrainingConfig dataclass provides typed defaults for all training parameters, and the fully-resolved configuration is logged to W&B for reproducibility.
  • Checkpoint Management: Implemented DCP-based save/resume with CheckpointManager, including a utility for converting checkpoints to Diffusers format.
  • Supported Models & Methods: The refactored framework supports Wan 2.1 (T2V 1.3B) and WanGame (including causal) models, with DMD2 distillation, self-forcing distillation, SFT finetuning, and DFSFT (Diffusion Forcing SFT) methods.
  • CFG Formula Fix: Corrected the real_score_guidance_scale formula in DMD2 and self-forcing to use the standard uncond + scale * (cond - uncond).

Changelog
  • .gitignore
    • Added new entries for *.npy and slurm_outputs/ to ignore generated files and Slurm logs.
  • dev/config.md
    • Added a new document detailing the YAML schema v2 for distillation configurations, including field meanings and design choices.
  • dev/design.md
    • Added a new design document outlining the refactoring of distillation, emphasizing the decoupling of roles, methods, and trainers, drawing inspiration from the FastGen architecture.
  • dev/fastgen.md
    • Added a new document summarizing the architecture and key design principles of the FastGen framework, serving as a reference for the current refactoring.
  • dev/phase_add_causal_wangame_dfsft.md
    • Added a new document detailing the plan to integrate causal WanGame and Diffusion-Forcing SFT (DFSFT) into the new distillation framework, including configuration semantics and validation strategies.
  • dev/phase_add_wangame.md
    • Added a new document outlining the process and current progress of integrating WanGame into the new distillation framework, covering finetuning and DMD2 support.
  • dev/phase_causal.md
    • Added a new document describing the phase of aligning causal/streaming capabilities with FastGen's design, focusing on managing preprocessors and causal variants.
  • dev/phase_causal2.md
    • Added a new document detailing further refinements to the causal design, emphasizing student ownership of shared components and fixing DCP checkpointing issues with activation checkpointing.
  • dev/phase_mode.md
    • Added a new document discussing the 'de-adapterization' of model plugins, aiming to simplify naming and remove intermediate ModelComponents.
  • dev/phase_refactor.md
    • Added a new document outlining the _target_ instantiate-first refactor, detailing the new class interfaces, dynamic config, and assembly flow.
  • dev/phases/phase_0.md
    • Added a new document detailing Phase 0 of the refactor, focusing on landing the framework skeleton and running Wan DMD2, while addressing implicit coupling.
  • dev/phases/phase_1.md
    • Added a new document detailing Phase 1 of the refactor, focusing on algorithm/model decoupling (DMD2Method + WanAdapter) and establishing a general entrypoint.
  • dev/phases/phase_2.md
    • Added a new document detailing Phase 2 of the refactor, aiming for the new distillation framework to run independently of the legacy pipeline.
  • dev/phases/phase_2_9.md
    • Added a new document detailing Phase 2.9 of the refactor, focusing on semantic convergence (operation-centric adapter, policy regression to method, elegant dispatch).
  • dev/phases/phase_3.md
    • Added a new document detailing Phase 3 of the refactor, covering config schema v2, pluggable ODE/SDE samplers, finetuning method integration, and naming/structure organization.
  • dev/refactor.md
    • Added a new document discussing the trade-offs and approaches for refactoring, specifically comparing 'config-driven loading' versus 'clear layering'.
  • docs/wangame/zero_init_fixes.md
    • Added a new document summarizing fixes for zero initialization issues in WanGame, addressing FSDP loader overwriting and attention mask shape mismatches.
  • examples/distill/SFWanGame2.1/distill_dmd.sh
    • Added a new shell script for Self-Forcing DMD distillation training.
  • examples/distill/SFWanGame2.1/distill_dmd.slurm
    • Added a new Slurm script for Self-Forcing DMD distillation training.
  • examples/distill/SFWanGame2.1/validation.json
    • Added a new JSON file for validation data in Self-Forcing WanGame 2.1 distillation.
  • examples/distill/WanGame2.1/distill_dmd.slurm
    • Added a new Slurm script for WanGame 2.1 DMD distillation training.
  • examples/distill/WanGame2.1/validation.json
    • Added a new JSON file for validation data in WanGame 2.1 distillation.
  • examples/distillation/phase0/distill_wan2.1_t2v_1.3B_dmd2_8steps.sh
    • Added a new shell script for Phase 0 Wan DMD2 distillation.
  • examples/distillation/phase0/temp.sh
    • Added a new temporary shell script for Phase 0 Wan DMD2 distillation.
  • examples/distillation/phase1/distill_wan2.1_t2v_1.3B_dmd2_8steps.sh
    • Added a new shell script for Phase 1 Wan DMD2 distillation.
  • examples/distillation/phase1/run.md
    • Added a new markdown file for optional W&B run links in Phase 1.
  • examples/distillation/phase1/temp.sh
    • Added a new temporary shell script for Phase 1 Wan DMD2 distillation.
  • examples/distillation/phase2/README.md
    • Added a new README for Phase 2 examples, detailing YAML-only entrypoints and resume functionality.
  • examples/distillation/phase2/distill_wan2.1_t2v_1.3B_dmd2_8steps.yaml
    • Added a new YAML configuration for Phase 2 Wan DMD2 distillation.
  • examples/distillation/phase2/run_wan2.1_t2v_1.3B_dmd2_8steps.sh
    • Added a new shell script to launch Phase 2 Wan DMD2 distillation.
  • examples/distillation/phase2/temp.sh
    • Added a new temporary shell script for Phase 2 Wan DMD2 distillation.
  • examples/distillation/phase2_9/distill_wan2.1_t2v_1.3B_dmd2_8steps_phase2.9.yaml
    • Added a new YAML configuration for Phase 2.9 Wan DMD2 distillation.
  • examples/distillation/phase2_9/temp.sh
    • Added a new temporary shell script for Phase 2.9 Wan DMD2 distillation.
  • examples/distillation/phase3_1/distill_wan2.1_t2v_1.3B_dmd2_8steps_phase3.1.yaml
    • Added a new YAML configuration for Phase 3.1 Wan DMD2 distillation.
  • examples/distillation/phase3_1/temp.sh
    • Added a new temporary shell script for Phase 3.1 Wan DMD2 distillation.
  • examples/distillation/phase3_2/distill_wan2.1_t2v_1.3B_dmd2_8steps_phase3.2.yaml
    • Added a new YAML configuration for Phase 3.2 Wan DMD2 distillation.
  • examples/distillation/phase3_2/temp.sh
    • Added a new temporary shell script for Phase 3.2 Wan DMD2 distillation.
  • examples/distillation/phase3_3/finetune_wan2.1_t2v_1.3B_phase3.3.yaml
    • Added a new YAML configuration for Phase 3.3 Wan finetuning.
  • examples/distillation/phase3_3/finetune_wan2.1_t2v_1.3B_vsa_phase3.3.yaml
    • Added a new YAML configuration for Phase 3.3 Wan finetuning with VSA.
  • examples/distillation/phase3_3/temp-vsa.sh
    • Added a new temporary shell script for Phase 3.3 VSA finetuning.
  • examples/distillation/phase3_3/temp.sh
    • Added a new temporary shell script for Phase 3.3 finetuning.
  • examples/distillation/phase3_4/distill-temp.sh
    • Added a new temporary shell script for Phase 3.4 distillation.
  • examples/distillation/phase3_4/distill_wan2.1_t2v_1.3B_dmd2_8steps_phase3.4.yaml
    • Added a new YAML configuration for Phase 3.4 Wan DMD2 distillation.
  • examples/distillation/phase3_4/finetune-vsa-temp.sh
    • Added a new temporary shell script for Phase 3.4 VSA finetuning.
  • examples/distillation/phase3_4/finetune_wan2.1_t2v_1.3B_phase3.4.yaml
    • Added a new YAML configuration for Phase 3.4 Wan finetuning.
  • examples/distillation/phase3_4/finetune_wan2.1_t2v_1.3B_vsa_phase3.4_0.7sparsity.yaml
    • Added a new YAML configuration for Phase 3.4 Wan finetuning with 0.7 VSA sparsity.
  • examples/distillation/phase3_4/finetune_wan2.1_t2v_1.3B_vsa_phase3.4_0.9sparsity.yaml
    • Added a new YAML configuration for Phase 3.4 Wan finetuning with 0.9 VSA sparsity.
  • examples/distillation/refactor/dfsft_wangame_causal_v3.yaml
    • Added a new YAML configuration for WanGame causal Diffusion-Forcing SFT (DFSFT).
  • examples/distillation/refactor/distill_wan2.1_t2v_1.3B_dmd2.yaml
    • Added a new YAML configuration for Wan 2.1 T2V 1.3B DMD2 distillation.
  • examples/distillation/refactor/example.yaml
    • Added a new YAML configuration example with full reference for fastvideo.train.
  • examples/distillation/refactor/rfc.md
    • Added a new markdown file detailing the refactoring discussion on complexity and layering.
  • examples/distillation/refactor/run.sh
    • Added a new shell script to launch distillation training from a v3 YAML config.
  • examples/distillation/refactor/self_forcing_wangame_causal_v3.yaml
    • Added a new YAML configuration for WanGame causal Self-Forcing distillation.
  • examples/distillation/wangame/dfsft_wangame2.1_i2v_1.3B_causal.yaml
    • Added a new YAML configuration for WanGame causal DFSFT.
  • examples/distillation/wangame/dfsft_wangame2.1_i2v_1.3B_causal_4n8g.slurm
    • Added a new Slurm script for WanGame causal DFSFT training on 4 nodes with 8 GPUs.
  • examples/distillation/wangame/dfsft_wangame2.1_i2v_1.3B_causal_4n8g.yaml
    • Added a new YAML configuration for WanGame causal DFSFT training on 4 nodes with 8 GPUs.
  • examples/distillation/wangame/distill_wangame2.1_i2v_1.3B_dmd2_4steps_actioncfg.yaml
    • Added a new YAML configuration for WanGame DMD2 distillation with action CFG.
  • examples/distillation/wangame/distill_wangame2.1_i2v_1.3B_dmd2_4steps_causal_teacher_ckpt22000_nocfg.yaml
    • Added a new YAML configuration for WanGame causal DMD2 distillation with a causal teacher checkpoint.
  • examples/distillation/wangame/distill_wangame2.1_i2v_1.3B_dmd2_4steps_causal_teacher_ckpt22000_nocfg_4n8g.slurm
    • Added a new Slurm script for WanGame causal DMD2 distillation with a causal teacher checkpoint.
  • examples/distillation/wangame/distill_wangame2.1_i2v_1.3B_dmd2_4steps_nocfg.yaml
    • Added a new YAML configuration for WanGame DMD2 distillation without CFG.
  • examples/distillation/wangame/distill_wangame2.1_i2v_1.3B_self_forcing_4steps_causal_teacher_ckpt22000_nocfg.yaml
    • Added a new YAML configuration for WanGame causal Self-Forcing distillation with a causal teacher checkpoint.
  • examples/distillation/wangame/distill_wangame2.1_i2v_1.3B_self_forcing_4steps_causal_teacher_ckpt22000_nocfg_4n8g.slurm
    • Added a new Slurm script for WanGame causal Self-Forcing distillation with a causal teacher checkpoint.
  • examples/distillation/wangame/finetune_wangame2.1_i2v_1.3B.yaml
    • Added a new YAML configuration for WanGame finetuning.
  • examples/distillation/wangame/finetune_wangame2.1_i2v_1.3B_bidirectional_4n8g.slurm
    • Added a new Slurm script for WanGame bidirectional finetuning on 4 nodes with 8 GPUs.
  • examples/distillation/wangame/finetune_wangame2.1_i2v_1.3B_bidirectional_4n8g.yaml
    • Added a new YAML configuration for WanGame bidirectional finetuning on 4 nodes with 8 GPUs.
  • examples/distillation/wangame/run.sh
    • Added a new shell script to launch WanGame finetuning.
  • examples/inference/basic/basic_causal_wangame.py
    • Added a new Python script for basic causal WanGame inference.
  • examples/inference/basic/basic_wangame.py
    • Added a new Python script for basic WanGame inference.
  • examples/inference/basic/basic_wangame_lingbot.py
    • Added a new Python script for basic WanGame LingBot inference.
  • examples/training/consistency_finetune/causal_wangame_ode_init/action/README.md
    • Added a new README file detailing action sequences for causal WanGame ODE initialization.
  • examples/training/consistency_finetune/causal_wangame_ode_init/ar_diff.slurm
    • Added a new Slurm script for causal WanGame ODE initialization with AR diffusion.
  • examples/training/consistency_finetune/causal_wangame_ode_init/finetune_ode_init.sh
    • Added a new shell script for causal WanGame ODE initialization finetuning.
  • examples/training/consistency_finetune/causal_wangame_ode_init/finetune_ode_init.slurm
    • Added a new Slurm script for causal WanGame ODE initialization finetuning.
  • examples/training/consistency_finetune/causal_wangame_ode_init/launch_preprocess_slurm.sh
    • Added a new shell script to launch Slurm preprocessing jobs for causal WanGame ODE initialization.
  • examples/training/consistency_finetune/causal_wangame_ode_init/ode_finetune_worker.slurm
    • Added a new Slurm script for causal WanGame ODE initialization finetuning worker.
  • examples/training/consistency_finetune/causal_wangame_ode_init/preprocess_data.sh
    • Added a new shell script for preprocessing data for causal WanGame ODE initialization.
  • examples/training/consistency_finetune/causal_wangame_ode_init/preprocess_worker.slurm
    • Added a new Slurm script for preprocessing worker in causal WanGame ODE initialization.
  • examples/training/consistency_finetune/causal_wangame_ode_init/validation.json
    • Added a new JSON file for validation data in causal WanGame ODE initialization.
  • examples/training/consistency_finetune/causal_wangame_ode_init/validation_same.json
    • Added a new JSON file for validation data with same samples in causal WanGame ODE initialization.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/actions/README.md
    • Added a new README file detailing action sequences for WanGame 2.1 I2V finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/actions_81/README.md
    • Added a new README file detailing action sequences for WanGame 2.1 I2V finetuning with 81 frames.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/finetune_i2v.sh
    • Added a new shell script for WanGame 2.1 I2V finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/finetune_i2v.slurm
    • Added a new Slurm script for WanGame 2.1 I2V finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/finetune_wangame.slurm
    • Added a new Slurm script for WanGame 2.1 I2V finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/finetune_wangame_freeze_action.slurm
    • Added a new Slurm script for WanGame 2.1 I2V finetuning with frozen action modules.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/preprocess_wangame_data_i2v.sh
    • Added a new shell script for preprocessing WanGame 2.1 I2V data.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/scripts/collect_samples_to_shao.py
    • Added a new Python script to collect samples for validation.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/scripts/generate_actions.py
    • Added a new Python script to generate action sequences.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/scripts/generate_validation.py
    • Added a new Python script to generate validation JSON files.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/scripts/generate_validation_static_w.py
    • Added a new Python script to generate validation JSON for static W actions.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/scripts/generate_validation_to_shao.py
    • Added a new Python script to generate validation JSON from collected samples.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/validation.json
    • Added a new JSON file for validation data.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/validation_random.json
    • Added a new JSON file for random validation data.
  • examples/training/finetune/WanGame2.1_1.3b_i2v/validation_zelda.json
    • Added a new JSON file for Zelda-themed validation data.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/action/README.md
    • Added a new README file detailing action sequences for WanGame 2.1 I2V LingBot finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/finetune_i2v.sh
    • Added a new shell script for WanGame 2.1 I2V LingBot finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/finetune_i2v.slurm
    • Added a new Slurm script for WanGame 2.1 I2V LingBot finetuning.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/generate_actions.py
    • Added a new Python script to generate action sequences for WanGame 2.1 I2V LingBot.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/launch_preprocess_slurm.sh
    • Added a new shell script to launch Slurm preprocessing jobs for WanGame 2.1 I2V LingBot.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/preprocess_wangame_data_i2v.sh
    • Added a new shell script for preprocessing WanGame 2.1 I2V LingBot data.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/preprocess_worker.slurm
    • Added a new Slurm script for preprocessing worker in WanGame 2.1 I2V LingBot.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/validation.json
    • Added a new JSON file for validation data in WanGame 2.1 I2V LingBot.
  • examples/training/finetune/WanGame2.1_1.3b_i2v_LingBot/validation_vizdoom.json
    • Added a new JSON file for VizDoom validation data in WanGame 2.1 I2V LingBot.
  • fastvideo/configs/models/dits/__init__.py
    • Modified to include new WanGame and WanLingBot video configurations.
  • fastvideo/configs/models/dits/wangamevideo.py
    • Added a new file defining WanGameVideoConfig and WanLingBotVideoConfig for DiT architectures, including FSDP shard conditions and parameter mappings.
  • fastvideo/configs/pipelines/__init__.py
    • Modified to include new WanGame and WanLingBot pipeline configurations.
  • fastvideo/configs/pipelines/base.py
    • Modified to add sampler_kind and ode_solver fields to PipelineConfig and corresponding CLI arguments.
  • fastvideo/configs/pipelines/wan.py
    • Modified to add WanLingBotI2V480PConfig, WanGameI2V480PConfig, and SelfForcingWanGameI2V480PConfig for various WanGame image-to-video pipelines.
  • fastvideo/configs/sample/wan.py
    • Modified Wan2_1_Fun_1_3B_InP_SamplingParam to adjust default height, width, num_frames, fps, guidance_scale, and num_inference_steps for WanGame.
  • fastvideo/dataset/dataloader/record_schema.py
    • Modified to add wangame_ode_record_creator for creating ODE trajectory records compatible with WanGame.
  • fastvideo/dataset/dataloader/schema.py
    • Modified to add pyarrow_schema_wangame, pyarrow_schema_wangame_lingbot, and pyarrow_schema_ode_trajectory_wangame for different WanGame data schemas.
  • fastvideo/dataset/parquet_dataset_map_style.py
    • Modified ParquetMapStyleSampler to include reshuffle_each_epoch and updated get_parquet_files_and_length to support comma-separated data paths with repeat counts and improved caching.
  • fastvideo/dataset/validation_dataset.py
    • Modified ValidationDataset to support limiting the number of samples and to correctly load action data from .npy files, handling both dictionary and array formats.
  • fastvideo/fastvideo_args.py
    • Modified TrainingArgs to add reshuffle_each_epoch, validation_num_samples, train_action_only, action_train_target, action_warmup_steps, and best_checkpoint_start_step for more granular control over training and validation.
  • fastvideo/models/dits/hyworld/pose.py
    • Modified to import generate_camera_trajectory_local from fastvideo.models.dits.hyworld.trajectory and added reformat_keyboard_and_mouse_tensors and process_custom_actions for handling custom action inputs.
  • fastvideo/models/dits/matrixgame/utils.py
    • Modified to uncomment and update drawing functions for video overlays, and added parse_npy_action, draw_mouse_on_frame, and process_video_with_npy for enhanced visualization of actions.
  • fastvideo/models/dits/wangame/__init__.py
    • Added a new __init__.py file to export WanGame transformer models and action modules.
  • fastvideo/models/dits/wangame/causal_model.py
    • Added a new file defining CausalWanGameActionTransformer3DModel, which extends BaseDiT with causal attention, KV caching, and PRoPE support for WanGame.
  • fastvideo/models/dits/wangame/hyworld_action_module.py
    • Added a new file defining WanGameActionTimeImageEmbedding for combined timestep and action embeddings, and WanGameActionSelfAttention for self-attention with RoPE and PRoPE.
  • fastvideo/models/dits/wangame/model.py
    • Added a new file defining WanGameActionTransformer3DModel, which extends BaseDiT with action conditioning and camera PRoPE attention for WanGame.
  • fastvideo/models/dits/wangame_lingbot/__init__.py
    • Added a new __init__.py file to export WanLingBotTransformer3DModel.
  • fastvideo/models/dits/wangame_lingbot/cam_utils.py
    • Added a new file containing utility functions for camera pose interpolation, SE3 inverse, and plucker embeddings, adapted from LingBot World.
  • fastvideo/models/dits/wangame_lingbot/model.py
    • Added a new file defining WanLingBotTransformer3DModel, which extends BaseDiT with camera control and action conditioning for LingBot World.
  • fastvideo/models/loader/component_loader.py
    • Modified PipelineComponentLoader to update handling of gemma_model_path and added specific checks for WanGame and WanLingBot models to prevent unintended behavior.
  • fastvideo/models/loader/fsdp_load.py
    • Modified maybe_load_fsdp_model to log incompatible/unexpected keys during loading and updated load_model_from_full_model_state_dict to selectively apply kaiming_uniform_ initialization for new parameters based on patterns.
  • fastvideo/models/registry.py
    • Modified MODEL_CLASS_REGISTRY to include new WanGame and WanLingBot transformer models, enabling their dynamic loading.
  • fastvideo/pipelines/basic/wan/wan_dmd_pipeline.py
    • Modified WanDMDPipeline to be a compatibility wrapper around WanPipeline, explicitly setting sampler_kind to 'sde' for SDE-style denoising.
  • fastvideo/pipelines/basic/wan/wan_i2v_dmd_pipeline.py
    • Modified WanI2VDMDPipeline to remove TimestepPreparationStage and update DenoisingStage arguments, streamlining the I2V DMD pipeline.
  • fastvideo/pipelines/basic/wan/wan_pipeline.py
    • Modified WanPipeline to dynamically build schedulers and conditionally add TimestepPreparationStage and DenoisingStage/SdeDenoisingStage based on the sampler_kind configuration.
  • fastvideo/pipelines/basic/wan/wangame_causal_dmd_pipeline.py
    • Added a new file defining WanGameCausalDMDPipeline, which extends LoRAPipeline and ComposedPipelineBase with causal DMD stages and action support.
  • fastvideo/pipelines/basic/wan/wangame_i2v_pipeline.py
    • Added a new file defining WanGameActionImageToVideoPipeline and WanLingBotImageToVideoPipeline, which extend LoRAPipeline and ComposedPipelineBase for image-to-video generation with action conditioning.
  • fastvideo/pipelines/pipeline_batch_info.py
    • Modified ForwardBatch to add sampling_timesteps, allowing for explicit control over denoising loop timesteps in sampler-specific contexts.
  • fastvideo/pipelines/preprocess/v1_preprocess.py
    • Modified main function to include wangame and wangame_ode_trajectory as valid preprocessing tasks, expanding the preprocessing capabilities.
  • fastvideo/pipelines/preprocess/wangame/wangame_preprocess_pipeline.py
    • Added a new file defining PreprocessPipeline_WanGame, an I2V preprocessing pipeline that extracts CLIP and VAE features from the first frame and handles action data.
  • fastvideo/pipelines/preprocess/wangame/wangame_preprocess_pipeline_ode_trajectory.py
    • Added a new file defining PreprocessPipeline_WanGame_ODE_Trajectory, a preprocessing pipeline for WanGame that generates ODE trajectories for camera embeddings.
  • fastvideo/pipelines/samplers/__init__.py
    • Added a new __init__.py file to manage samplers.
  • fastvideo/pipelines/samplers/wan.py
    • Added a new file containing utility functions build_wan_scheduler, get_wan_sampler_kind, and wan_use_btchw_layout for Wan pipeline schedulers and sampler kinds.
  • fastvideo/pipelines/stages/denoising.py
    • Modified DenoisingStage to handle sampling_timesteps and added SdeDenoisingStage for SDE-style denoising loops.
  • fastvideo/pipelines/stages/image_encoding.py
    • Modified ImageEncodingStage to add MatrixGameImageEncodingStage for MatrixGame-specific image encoding.
  • fastvideo/pipelines/stages/latent_preparation.py
    • Modified LatentPreparationStage to handle sampling_timesteps and use_btchw_layout, improving flexibility in latent preparation.
  • fastvideo/pipelines/stages/timestep_preparation.py
    • Modified TimestepPreparationStage to handle sampling_timesteps, allowing for explicit control over timesteps in the denoising process.
  • fastvideo/train/__init__.py
    • Added a new __init__.py file for the fastvideo.train package.
  • fastvideo/train/callbacks/__init__.py
    • Added a new __init__.py file for the fastvideo.train.callbacks package.
  • fastvideo/train/callbacks/callback.py
    • Added a new file defining Callback and CallbackDict for managing training callbacks.
  • fastvideo/train/callbacks/ema.py
    • Added a new file defining EMACallback for Exponential Moving Average during training.
  • fastvideo/train/callbacks/grad_clip.py
    • Added a new file defining GradNormClipCallback for gradient norm clipping.
  • fastvideo/train/callbacks/validation.py
    • Added a new file defining ValidationCallback for periodic validation during training.
  • fastvideo/train/entrypoint/__init__.py
    • Added a new __init__.py file for the fastvideo.train.entrypoint package.
  • fastvideo/train/entrypoint/dcp_to_diffusers.py
    • Added a new file for converting DCP checkpoints to Diffusers format.
  • fastvideo/train/entrypoint/train.py
    • Added a new file defining the main training entrypoint for the refactored framework.
  • fastvideo/train/methods/__init__.py
    • Added a new __init__.py file for the fastvideo.train.methods package.
  • fastvideo/train/methods/base.py
    • Added a new file defining DistillMethod as the base class for all training methods.
  • fastvideo/train/methods/distribution_matching/__init__.py
    • Added a new __init__.py file for the fastvideo.train.methods.distribution_matching package.
  • fastvideo/train/methods/distribution_matching/dmd2.py
    • Added a new file defining DMD2Method for Distribution Matching Distillation v2.
  • fastvideo/train/methods/distribution_matching/self_forcing.py
    • Added a new file defining SelfForcingMethod for self-forcing distillation.
  • fastvideo/train/methods/fine_tuning/__init__.py
    • Added a new __init__.py file for the fastvideo.train.methods.fine_tuning package.
  • fastvideo/train/methods/fine_tuning/dfsft.py
    • Added a new file defining DiffusionForcingSFTMethod for Diffusion Forcing Supervised Fine-Tuning.
  • fastvideo/train/methods/fine_tuning/finetune.py
    • Added a new file defining FineTuneMethod for supervised fine-tuning.
  • fastvideo/train/models/__init__.py
    • Added a new __init__.py file for the fastvideo.train.models package.
  • fastvideo/train/models/base.py
    • Added a new file defining ModelBase and CausalModelBase as base classes for model plugins.
  • fastvideo/train/models/wan/__init__.py
    • Added a new __init__.py file for the fastvideo.train.models.wan package.
  • fastvideo/train/models/wan/model.py
    • Added a new file defining WanModel as a model plugin for Wan.
  • fastvideo/train/models/wangame/__init__.py
    • Added a new __init__.py file for the fastvideo.train.models.wangame package.
  • fastvideo/train/models/wangame/model.py
    • Added a new file defining WanGameModel as a model plugin for WanGame.
  • fastvideo/train/models/wangame/model_causal.py
    • Added a new file defining WanGameCausalModel as a causal model plugin for WanGame.
  • fastvideo/train/trainer.py
    • Added a new file defining DistillTrainer as the core training loop orchestrator.
  • fastvideo/train/utils/__init__.py
    • Added a new __init__.py file for the fastvideo.train.utils package.
  • fastvideo/train/utils/builder.py
    • Added a new file defining build_from_config for constructing runtime components from configuration.
  • fastvideo/train/utils/checkpoint.py
    • Added a new file defining CheckpointManager for handling checkpoint saving and loading.
  • fastvideo/train/utils/config.py
    • Added a new file defining RunConfig and load_run_config for managing training configurations.
  • fastvideo/train/utils/dataloader.py
    • Added a new file defining utility functions for dataloader construction.
  • fastvideo/train/utils/instantiate.py
    • Added a new file defining instantiate for _target_-based object creation.
  • fastvideo/train/utils/module_state.py
    • Added a new file defining apply_trainable for managing module trainability.
  • fastvideo/train/utils/moduleloader.py
    • Added a new file defining load_module_from_path for dynamic module loading.
  • fastvideo/train/utils/optimizer.py
    • Added a new file defining utility functions for optimizer construction.
  • fastvideo/train/utils/tracking.py
    • Added a new file defining Tracker for logging and tracking training progress.
  • fastvideo/train/utils/training_config.py
    • Added a new file defining TrainingConfig dataclass for structured training settings.
  • fastvideo/train/utils/validation.py
    • Added a new file defining ValidationRequest and WanValidator for validation utilities.
  • fastvideo/train/utils/vsa_scheduler.py
    • Added a new file defining VSASparsityScheduler for managing VSA sparsity during training.
  • fastvideo/training/distillation.py
    • Modified main function to remove distill_model and distill_method CLI arguments, updating config loading and runtime building to use the new YAML-driven approach.
  • fastvideo/training/distillation_pipeline.py
    • Modified DistillationPipeline to remove simulate_generator_forward and update _dmd_forward to fix the CFG formula.
  • fastvideo/training/self_forcing_distillation_pipeline.py
    • Modified SelfForcingDistillationPipeline to remove simulate_generator_forward and update _dmd_forward to fix the CFG formula.
  • fastvideo/training/training_utils.py
    • Modified save_distillation_checkpoint to add get_best_checkpoint_path for saving the best checkpoint based on a metric.
  • fastvideo/training/wan_distillation_pipeline.py
    • Modified WanDistillationPipeline to remove simulate_generator_forward and update _dmd_forward to fix the CFG formula.
  • fastvideo/training/wan_i2v_distillation_pipeline.py
    • Modified WanI2VDistillationPipeline to remove simulate_generator_forward and update _dmd_forward to fix the CFG formula.
  • fastvideo/training/wan_self_forcing_distillation_pipeline.py
    • Modified WanSelfForcingDistillationPipeline to remove simulate_generator_forward and update _dmd_forward to fix the CFG formula.
  • fastvideo/training/wangame_ar_diffusion_pipeline.py
    • Added a new file defining WanGameARDiffusionPipeline for autoregressive diffusion training in WanGame.
  • fastvideo/training/wangame_distillation_pipeline.py
    • Added a new file defining WanGameDistillationPipeline for distillation training in WanGame.
  • fastvideo/training/wangame_ode_causal_pipeline.py
    • Added a new file defining WanGameODECausalPipeline for ODE causal training in WanGame.
  • fastvideo/training/wangame_self_forcing_distillation_pipeline.py
    • Added a new file defining WanGameSelfForcingDistillationPipeline for self-forcing distillation training in WanGame.
  • fastvideo/training/wangame_training_pipeline.py
    • Added a new file defining WanGameTrainingPipeline for general training in WanGame.
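
The CFG formula fix listed for the _dmd_forward changes above can be checked with a little arithmetic. The sketch below is not the PR's code: the function names cfg_standard and cfg_previous and the scalar inputs are hypothetical, chosen only to compare the two formulas quoted in the PR summary. It shows that the previous form, cond + scale * (cond - uncond), behaves like the standard form with the guidance scale raised by 1.

```python
import math


def cfg_standard(cond: float, uncond: float, scale: float) -> float:
    """Standard classifier-free guidance, anchored on the unconditional prediction."""
    return uncond + scale * (cond - uncond)


def cfg_previous(cond: float, uncond: float, scale: float) -> float:
    """Formula described as the pre-fix behavior, anchored on the conditional prediction."""
    return cond + scale * (cond - uncond)


# Hypothetical scalar stand-ins for the conditional and unconditional predictions.
cond, uncond, scale = 0.8, 0.2, 3.5

previous = cfg_previous(cond, uncond, scale)
equivalent = cfg_standard(cond, uncond, scale + 1.0)

# The previous formula matches the standard one at an effective scale of scale + 1.
print(math.isclose(previous, equivalent))

# With the standard formula, scale == 1 recovers the conditional prediction.
print(math.isclose(cfg_standard(cond, uncond, 1.0), cond))
```

If this reading is right, a run configured with a given real_score_guidance_scale before the fix would correspond to that value plus 1 under the standard formula, which may matter when comparing old and new runs.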
Activity
  • This pull request is newly created and has no human activity yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is a massive and impressive refactoring of the training framework, moving towards a more modular, decoupled, and YAML-driven architecture inspired by FastGen. The introduction of _target_-based instantiation, a callback system, and clear separation of concerns for models, methods, and the trainer are excellent design choices that will significantly improve maintainability and extensibility. The extensive documentation and phased migration plan are also highly commendable. I've found a few critical issues in the implementation that need to be addressed, primarily related to a syntax error in a package __init__, a logic bug in model type detection, and unreachable code in an attention module.

Note: Security Review did not run due to the size of the PR.

Comment on lines +846 to 855
) and not (
    cls_name.startswith("WanGame")
    or cls_name == "WanGameActionTransformer3DModel"
    or cls_name.startswith("CausalWan")
    or getattr(fastvideo_args.pipeline_config, "prefix", "") == "WanGame"
    or cls_name.startswith("WanLingBot")
    or cls_name == "WanLingBotTransformer3DModel"
    or getattr(fastvideo_args.pipeline_config, "prefix", "") == "WanLingBot"
    or cls_name.startswith("CausalWanGameActionTransformer3DModel")
)

high

The logic to determine is_wan_model appears to have a contradiction. The condition cls_name.startswith("CausalWan") is included in the initial positive check (line 846) and also in a negative exclusion block (line 849). This means the condition for CausalWan models will always evaluate to false, preventing them from being correctly identified as a wan_model.
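
The contradiction described above can be reproduced in isolation. This is a minimal sketch, not the code under review: the positive check is not shown in the quoted snippet, so its contents here are an assumption, and the function name and class names are illustrative only. It shows that any prefix appearing in both the positive check and the exclusion block can never yield True.

```python
def is_wan_model_as_reviewed(cls_name: str) -> bool:
    """Mimics the reviewed shape: positive check AND NOT exclusion block."""
    # Assumed positive check (not visible in the quoted snippet).
    positive = cls_name.startswith("Wan") or cls_name.startswith("CausalWan")
    # Subset of the exclusion block quoted in the review comment.
    excluded = (
        cls_name.startswith("WanGame")
        or cls_name.startswith("CausalWan")
        or cls_name.startswith("WanLingBot")
    )
    return positive and not excluded


# Illustrative class names, not taken from the repository.
for name in ("WanTransformer3DModel", "CausalWanTransformer3DModel", "WanGameModel"):
    print(name, is_wan_model_as_reviewed(name))
```

Under these assumptions only the plain Wan name passes. Whether the CausalWan exclusion is intentional is for the author to confirm. If it is, the matching term in the positive check would be dead code.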

@alexzms
Collaborator Author

alexzms commented Mar 7, 2026

Related Issue: #1158


5 participants