Release 0.1.7 · hao-ai-lab/FastVideo

What's Changed

[VSA] [STA] Fix directory structure for pypi publishing by @SolitaryThinker in #769
[bugfix] fix STA install setup.py import by @SolitaryThinker in #770
[bugfix] [VSA] [STA] Fix MANIFEST.in for VSA and STA; Move tk into both directories by @SolitaryThinker in #771
[misc] [VSA] [STA] fix tk_root in setup.py for VSA and STA by @SolitaryThinker in #772
[Preprocess][Feat] support torchvision to load video in new preprocessing by @Eigensystem in #761
[Preprocess][Fix] video quality issue by @Eigensystem in #773
[misc] Improve text encoding stage by @SolitaryThinker in #774
[CI] Add ssim test for causal inference by @SolitaryThinker in #784
[misc] Update Slack invite link by @SolitaryThinker in #786
fix: lora_B init zeros by @DataAIPlayer in #781
[Feature] Support Lora for DMD by @Edenzzzz in #755
[Backend][Vmoba] Add implementation of VMoba by @EricLina in #778
[Self-forcing] [1/n] Handle extra dim in time embedding and add timestep warping by @SolitaryThinker in #792
[preprocessing] [self-forcing] [2/n] Improve preprocessing and add ode trajectory dataset schema by @SolitaryThinker in #794
[bugfix] Fix delta calculation by @DataAIPlayer in #796
[bugfix] pin gradio version and set current_vsa_sparsity in TrainingPipeline by @SolitaryThinker in #798
[self-forcing] [3/n] Text embed only preprocessing by @SolitaryThinker in #797
[bugfix] Fix empty PipelineConfigs for Wan2.2 A14B by @SolitaryThinker in #800
[Bugfix] Fix VMoba requirements by @Edenzzzz in #802
[bugfix] Wan2.2 Boundary ratio by @SolitaryThinker in #804
[self-forcing] [4/n] Preprocessing for collecting ODE trajectory by @SolitaryThinker in #788
Update example files and readme by @BrianChen1129 in #809
[self-forcing] [5/n] Add Self-Forcing distillation pipeline by @JerryZhou54 in #808
[bugfix] Update learning rates for sparse distillation recipe by @SolitaryThinker in #812
[self-forcing] [6/n] Add Ode Init training by @SolitaryThinker in #811
Add Sage Attention 3 Backend by @RandNMR73 in #815
[Feature] Update count trainable param for FSDP2 by @BrianChen1129 in #820
[bugfix] Use training_state_checkpointing_steps instead of checkpointing_steps by @SolitaryThinker in #821
[self-forcing][8/n] Self-Forcing For Wan2.2-A14B + torch.compile training and distillation support by @RandNMR73 in #818
[bugfix] Allow overriding dit checkpoint for inference and Lower VSA LR in example scripts by @SolitaryThinker in #831
[feature] Add torch profiler by @SolitaryThinker in #827
Add wan2.1 functionality support for Ascend NPU platform by @zyang6 in #810
[Feature] Add video-to-video (V2V) pipeline by @Gary-ChenJL in #829
[feat] unified trainer logging by @Ohm-Rishabh in #841
[Feat] add ray support by @Eigensystem in #838
[bugfix] [misc] Use training_state_checkpointing_steps in scripts/ by @SolitaryThinker in #846
[bugfix] always force spawn instead of fork by @Eigensystem in #852
[feat] Add gradio local inference demo by @SolitaryThinker in #847
[ci] fix causal ssim test by @SolitaryThinker in #848
move STA_configuration.py to fastvideo/attention/backends by @H1yori233 in #856
[Feature] Add Cosmos2 i2v pipeline by @kevin314 in #837
[bugfix] Add Cosmos2 sampling params to registry by @kevin314 in #862
[docs] port to mkdocs by @MihirJagtap in #855
Improve FSDP loading with size-based filtering by @Ohm-Rishabh in #853
[Docs] add diagrams to docs by @H1yori233 in #863
[feat] prepare for wan2.2 SF by @SolitaryThinker in #861
[docs] Update Home Readme.md with fixed links by @MihirJagtap in #873
[misc] update wechat and slack invite links by @SolitaryThinker in #875
fix: incorrect dv in vsa Triton kernel causing test_vsa error by @Y-aang in #879
[docs] add favicon by @MihirJagtap in #878
Fix mp worker busy loop to handle all string RPC methods by @shaoxiongduan in #881
[Bugfix] [DMD Distillation] Each rank should have its own timestep sampled by @JerryZhou54 in #885
[bugfix] [lora] [CI] Fix LoRA alpha scaling factor & Fix LoRA Inference CI by @shaoxiongduan in #870
[feat] Add inference for MoE SF by @JerryZhou54 in #880
[readme] update link to inference code by @SolitaryThinker in #887
[Feat] [I2V] resize all image sizes to below 480*832 by @JerryZhou54 in #890
[misc] Update wechat link by @Edenzzzz in #893
Awesome work using FastVideo or our research projects by @jzhang38 in #898
[CI] fix VSA training CI by @SolitaryThinker in #900
[Bugfix] Minor bugfixes by @loaydatrain in #889
[docs] modified the .github/workflows/docs.yml file to include path filtering by @MihirJagtap in #906
Fix the docs by @eitanturok in #905
[feat] training mfu calculation scripts by @Ohm-Rishabh in #871
fix: correct mp backend GPU assignment on multi-GPU systems by @kuafou in #912
Use assert_close in tests by @Edenzzzz in #429
[docs] add docs for ssim testing by @SolitaryThinker in #918
[feat]: add COSMOS 2.5 DiT implementation by @KyleShao1016 in #897
[docs] fix testing.md visibility by @SolitaryThinker in #920
[bugfix] [distillation] Fix DMD inference pipeline noise initialization shape by @SolitaryThinker in #921
Add LoRA extraction, verification, and comparison scripts by @ShreejithSG in #865
[bugfix] [VSA] Fix block_size computation in backward kernel by @Chuge0335 in #925
[feat] Add fvd implementation by @ketakitank in #923
[misc] update wechat image by @SolitaryThinker in #931
[bugfix] [VSA] [distillation] Various bugfixes for VSA and distillation and nightly tests by @SolitaryThinker in #932
[bugfix] [lora] [distillation] Fix lora distillation bug by @SolitaryThinker in #933
[misc] upgrade pytorch version to 2.9.0 by @SolitaryThinker in #928
[CI] Fix CI tests by @SolitaryThinker in #935
[Feature] Support for Variable Q/KV Sequence Lengths in VSA ThunderKittens kernel by @alexzms in #911
[ci]: Use pre-built docker image & skip VSA compilation by @alexzms in #939
[misc] add schedule configurations to pytorch profiler by @Ohm-Rishabh in #934
[feat] Support sequence packing and shard after patchification for USP by @loaydatrain in #894
[docs] Minor Fixes by @loaydatrain in #942
[feat] Add Matrix-Game 2.0 by @H1yori233 in #938
[bugfix] Added VSA Padding logic by @loaydatrain in #944
[misc] Allow manual override of Pipeline class through override_pipeline_cls_name by @SolitaryThinker in #945
[New Model] Hunyuan1.5 by @JerryZhou54 in #943
[docs] small fixes by @RandNMR73 in #947
[feat] add sliding_tile attention triton kernel and ROCM support by @ZiguanWang in #916
[rocm] Add rocm fastvideo docker image by @SolitaryThinker in #952
[bugfix] [dmd2] allow dmd2 simulate_student_forward to use text-only dataset by @SolitaryThinker in #951
Add LongCat T2V (Base, Distillation and Refinement) Support to FastVideo by @alexzms in #883
feat: consolidate attention kernels into unified fastvideo-kernel package by @ShreejithSG in #946
[kernel] Reorg and fix fastvideo-kernel by @SolitaryThinker in #962
[kernel] Release fastvideo-kernel v0.2.1 by @SolitaryThinker in #963
[docs] refactor attention docs by @SolitaryThinker in #964
[kernel] Fix docker release build for kernel by @SolitaryThinker in #965
[ci] fix kernel tests by @SolitaryThinker in #955
[feat] Add new feature extractors for fvd by @ketakitank in #954
[fix]: fix sliding_tile_attn with sdpa(without flash_attn) by @ZiguanWang in #967
[fix]: fix fastvideo-kernel Rocm build and Dockerfile for Rocm by @ZiguanWang in #968
[fix]: fix STA triton kernel for AMD RDNA archs by @ZiguanWang in #969
[misc] Add util script to create diffuser HF repo from custom component weights by @SolitaryThinker in #970
[kernel] add turbodiffusion kernels by @SolitaryThinker in #972
[docs]: fix various broken links across the documentation by @kuafou in #979
[feat] Support absmax style quantization for FP8 by @XOR-op in #981
[New Model] Turbodiffusion by @loaydatrain in #971
[feat] support Matrix-Game 2.0 streaming generation by @H1yori233 in #957
[feat] Support text encoder weight override and quantization by @XOR-op in #983
Layer offloading by @Ohm-Rishabh in #966
[docs] Update docs and README by @SolitaryThinker in #975
[ci] increase ssim and lora inference test timeout by @SolitaryThinker in #985
[chore] release fastvideo-kernel 0.2.2 by @SolitaryThinker in #986
[chore] update wechat QR code by @SolitaryThinker in #988
Add LongCat-Video I2V and Video Continuation (Base, Distillation and Refinement) Support to FastVideo by @shaoxiongduan in #953
[misc] pin fastvideo-kernel in .toml file by @SolitaryThinker in #989
[feat] add Turbodiffusion I2V pipeline by @loaydatrain in #984
[misc] add pin_cpu_memory false for RTX 4090 by @SolitaryThinker in #990
[chore] release 0.1.7 (real) by @SolitaryThinker in #980

New Contributors

@DataAIPlayer made their first contribution in #781
@EricLina made their first contribution in #778
@zyang6 made their first contribution in #810
@Ohm-Rishabh made their first contribution in #841
@H1yori233 made their first contribution in #856
@MihirJagtap made their first contribution in #855
@Y-aang made their first contribution in #879
@shaoxiongduan made their first contribution in #881
@loaydatrain made their first contribution in #889
@eitanturok made their first contribution in #905
@kuafou made their first contribution in #912
@KyleShao1016 made their first contribution in #897
@ShreejithSG made their first contribution in #865
@Chuge0335 made their first contribution in #925
@ketakitank made their first contribution in #923
@alexzms made their first contribution in #911
@ZiguanWang made their first contribution in #916
@XOR-op made their first contribution in #981

Full Changelog: v0.1.6...v0.1.7
