    Repositories list

    • diffusers

      Public
      🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
      Python
      6.5k000 Updated Nov 4, 2025
    • ImageReward

      Public
      [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
      Python
      81000 Updated Oct 30, 2025
    • longcat-video-fast

      Public
      🔥LongCat-Video 1.7x🎉 speedup: cache acceleration and 4/8-bits weight only.
      Python
      0600 Updated Oct 28, 2025
    • Python
      80000 Updated Oct 28, 2025
    • cache-dit

      Public
      A Unified, Flexible and Training-free Cache Acceleration Framework for 🤗 Diffusers.
      Python
      18400 Updated Oct 28, 2025
    • ComfyUI

      Public
      The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
      Python
      10k000 Updated Oct 27, 2025
    • qwen-image-fast

      Public
      ⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
      Python
      01520 Updated Oct 24, 2025
    • Kandinsky-5

      Public
      Kandinsky 5.0: A family of diffusion models for Video & Image generation
      Python
      11000 Updated Oct 22, 2025
    • LeetCUDA

      Public
      📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
      Cuda
      8228.3k40 Updated Oct 17, 2025
    • Wan2.1

      Public
      Wan: Open and Advanced Large-Scale Video Generative Models
      Python
      2.1k100 Updated Oct 17, 2025
    • Wan2.2

      Public
      Wan: Open and Advanced Large-Scale Video Generative Models
      Python
      1.2k000 Updated Oct 17, 2025
    • nunchaku

      Public
      [ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
      Python
      191200 Updated Oct 15, 2025
    • Enjoy the magic of Diffusion models!
      Python
      986000 Updated Oct 13, 2025
    • flux-fast

      Public
      A forked version of flux-fast that makes flux-fast even faster with cache-dit.
      Python
      15400 Updated Oct 11, 2025
    • HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
      Python
      101100 Updated Oct 4, 2025
    • cache-dit for comfyui
      Python
      0700 Updated Sep 27, 2025
    • HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
      Python
      49100 Updated Sep 10, 2025
    • Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
      Python
      36000 Updated Sep 9, 2025
    • Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
      Python
      321100 Updated Sep 3, 2025
    • 📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
      Python
      2143400 Updated Aug 19, 2025
    • 📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
      Python
      3174.7k00 Updated Aug 19, 2025
    • 🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
      C++
      7644.3k00 Updated Aug 19, 2025
    • Model Compression Toolbox for Large Language Models and Diffusion Models
      Python
      63000 Updated Aug 14, 2025
    • ffpa-attn

      Public
      🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
      Cuda
      1022700 Updated Aug 8, 2025
    • .github

      Public
      0100 Updated Aug 8, 2025
    • SpargeAttention: A training-free sparse attention that can accelerate any model inference.
      Cuda
      65600 Updated Aug 7, 2025
    • Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
      Cuda
      253000 Updated Aug 6, 2025
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      26k000 Updated Aug 5, 2025
    • flux

      Public
      A fast communication-overlapping library for tensor/expert parallelism on GPUs.
      C++
      82500 Updated Jul 30, 2025
    • A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20.
      Python
      02410 Updated Jul 18, 2025