Video Sparse Attention (VSA)

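VSA is a sparse attention mechanism: instead of attending to every key/value token, each query block attends only to its top-k highest-scoring key/value blocks.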
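To make the idea of top-k block selection concrete, here is a minimal pure-Python sketch. It is not the fused kernel shipped in fastvideo-kernel: the helper names `mean_pool` and `select_topk_blocks` are hypothetical, and scoring blocks by the dot product of mean-pooled queries and keys is an assumption made for illustration.

```python
def mean_pool(tokens, block_size):
    """Average each run of `block_size` token vectors into one vector per block."""
    blocks = []
    for start in range(0, len(tokens), block_size):
        chunk = tokens[start:start + block_size]
        dim = len(chunk[0])
        blocks.append([sum(t[d] for t in chunk) / len(chunk) for d in range(dim)])
    return blocks


def select_topk_blocks(q, k, block_size, topk):
    """For each query block, return the sorted indices of its `topk` best key blocks."""
    q_blocks = mean_pool(q, block_size)
    k_blocks = mean_pool(k, block_size)
    selected = []
    for qb in q_blocks:
        # Coarse score (assumption): dot product between pooled query and pooled key.
        scores = [sum(a * b for a, b in zip(qb, kb)) for kb in k_blocks]
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        selected.append(sorted(ranked[:topk]))
    return selected


# Toy input: 4 tokens with head_dim 2, grouped into blocks of 2 tokens.
q = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
k = [[2.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 3.0]]
selection = select_topk_blocks(q, k, block_size=2, topk=1)
print(selection)  # [[0], [1]]
```

Full attention would then be computed only between each query block and the key/value blocks listed for it, which is where the savings over dense attention come from.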

Installation

VSA is included in the fastvideo-kernel package. See the main Attention page for build instructions.

Usage

from fastvideo_kernel import video_sparse_attn

# q, k, v: tensors of shape [batch_size, num_heads, seq_len, head_dim]
# block_sizes: number of valid tokens per block. It is passed twice below,
#   as both variable_block_sizes (KV blocks) and q_variable_block_sizes
#   (query blocks), because q and k/v have the same length here. Pass
#   different values when q and k/v differ in length.
# topk: number of blocks to attend to

output = video_sparse_attn(
    q, k, v,
    block_sizes,
    block_sizes,
    topk=32
)
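The block-size arguments hold the number of valid (non-padding) tokens in each block. As a rough sketch of how such counts could be derived, the snippet below tiles a (T, H, W) latent grid and counts the tokens in each tile. The helper name `count_valid_tokens_per_block`, the default 4x4x4 tile shape, and the row-major tile ordering are all assumptions made for illustration; check the library for the layout, dtype, and device it actually expects before passing anything to `video_sparse_attn`.

```python
import math
from itertools import product


def count_valid_tokens_per_block(latent_shape, tile_shape=(4, 4, 4)):
    """Count the non-padding tokens in each tile of a (T, H, W) latent grid."""
    # Tiles per axis, rounded up so partial tiles at the edges are kept.
    n_tiles = [math.ceil(s / t) for s, t in zip(latent_shape, tile_shape)]
    # Valid extent of every tile along each axis; only the last one can be partial.
    extents = [
        [min(t, s - i * t) for i in range(n)]
        for s, t, n in zip(latent_shape, tile_shape, n_tiles)
    ]
    # One count per tile, ordered with T outermost and W innermost (assumption).
    return [a * b * c for a, b, c in product(*extents)]


sizes = count_valid_tokens_per_block((6, 8, 8))
print(sizes)  # [64, 64, 64, 64, 32, 32, 32, 32]
```

The counts sum to the number of real tokens (6 * 8 * 8 = 384), so padded positions in the partial tiles are never counted as valid.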

Citation

If you use Video Sparse Attention in your research, please cite:

@article{zhang2025vsa,
  title={{VSA}: Faster video diffusion with trainable sparse attention},
  author={Zhang, Peiyuan and Chen, Yongqi and Huang, Haofeng and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
  journal={arXiv preprint arXiv:2505.13389},
  year={2025}
}
