xlite-dev

49 repositories

diffusers
Public
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Python • Apache License 2.0 • 6.5k • 0 • 0 • 0 • Updated Nov 4, 2025
ImageReward
Public
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Python • Apache License 2.0 • 81 • 0 • 0 • 0 • Updated Oct 30, 2025
longcat-video-fast
Public
🔥LongCat-Video 1.7x🎉 speedup: cache acceleration and 4/8-bits weight only.
longcat longcat-video
Python • 0 • 6 • 0 • 0 • Updated Oct 28, 2025
LongCat-Video
Public
Python • MIT License • 80 • 0 • 0 • 0 • Updated Oct 28, 2025
cache-dit
Public
A Unified, Flexible and Training-free Cache Acceleration Framework for 🤗 Diffusers.
Python • Other • 18 • 4 • 0 • 0 • Updated Oct 28, 2025
ComfyUI
Public
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Python • GNU General Public License v3.0 • 10k • 0 • 0 • 0 • Updated Oct 27, 2025
qwen-image-fast
Public
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
qwen-image qwen-image-lightning qwen-image-edit qwen-image-api qwen-image-lora
Python • Apache License 2.0 • 0 • 15 • 2 • 0 • Updated Oct 24, 2025
Kandinsky-5
Public
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Python • Apache License 2.0 • 11 • 0 • 0 • 0 • Updated Oct 22, 2025
LeetCUDA
Public
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
cuda cuda-kernels cuda-demo cuda-toolkit cuda-library cuda-kernel learn-cuda cuda-cpp hgemm flash-attention
Cuda • GNU General Public License v3.0 • 822 • 8.3k • 4 • 0 • Updated Oct 17, 2025
Wan2.1
Public
Wan: Open and Advanced Large-Scale Video Generative Models
Python • Apache License 2.0 • 2.1k • 1 • 0 • 0 • Updated Oct 17, 2025
Wan2.2
Public
Wan: Open and Advanced Large-Scale Video Generative Models
Python • Apache License 2.0 • 1.2k • 0 • 0 • 0 • Updated Oct 17, 2025
nunchaku
Public
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Python • Apache License 2.0 • 191 • 2 • 0 • 0 • Updated Oct 15, 2025
DiffSynth-Studio
Public
Enjoy the magic of Diffusion models!
Python • Apache License 2.0 • 986 • 0 • 0 • 0 • Updated Oct 13, 2025
flux-fast
Public
A forked version of flux-fast that makes flux-fast even faster with cache-dit.
Python • 15 • 4 • 0 • 0 • Updated Oct 11, 2025
HunyuanImage-3.0
Public
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Python • Other • 101 • 1 • 0 • 0 • Updated Oct 4, 2025
comfyui-cache-dit
Public
cache-dit for comfyui
Python • 0 • 7 • 0 • 0 • Updated Sep 27, 2025
HunyuanImage-2.1
Public
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
Python • Other • 49 • 1 • 0 • 0 • Updated Sep 10, 2025
Qwen-Image-Lightning
Public
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Python • Apache License 2.0 • 36 • 0 • 0 • 0 • Updated Sep 9, 2025
Qwen-Image
Public
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Python • Apache License 2.0 • 321 • 1 • 0 • 0 • Updated Sep 3, 2025
Awesome-DiT-Inference
Public
📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
flux wan diffusion dit sora stable-diffusion sdxl sd15 deepcache open-sora-plan
Python • GNU General Public License v3.0 • 21 • 434 • 0 • 0 • Updated Aug 19, 2025
Awesome-LLM-Inference
Public
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
mla vllm llm-inference awesome-llm flash-attention tensorrt-llm paged-attention deepseek flash-attention-3 deepseek-v3
Python • GNU General Public License v3.0 • 317 • 4.7k • 0 • 0 • Updated Aug 19, 2025
lite.ai.toolkit
Public
🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
tensorrt mnn ncnn onnx onnxruntime yolov5 tnn mnn-model yolox robustvideomatting
C++ • GNU General Public License v3.0 • 764 • 4.3k • 0 • 0 • Updated Aug 19, 2025
deepcompressor
Public
Model Compression Toolbox for Large Language Models and Diffusion Models
Python • Apache License 2.0 • 63 • 0 • 0 • 0 • Updated Aug 14, 2025
ffpa-attn
Public
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
cuda attention sdpa mla mlsys tensor-cores flash-attention deepseek deepseek-v3 deepseek-r1
Cuda • GNU General Public License v3.0 • 10 • 227 • 0 • 0 • Updated Aug 8, 2025
.github
Public
0 • 1 • 0 • 0 • Updated Aug 8, 2025
SpargeAttn
Public
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
Cuda • Apache License 2.0 • 65 • 6 • 0 • 0 • Updated Aug 7, 2025
SageAttention
Public
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Cuda • Apache License 2.0 • 253 • 0 • 0 • 0 • Updated Aug 6, 2025
pytorch
Public
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python • Other • 26k • 0 • 0 • 0 • Updated Aug 5, 2025
flux
Public
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
C++ • Apache License 2.0 • 82 • 5 • 0 • 0 • Updated Jul 30, 2025
flux-faster
Public
A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20.
flux flux-dev flux-fill flux-kontext flux-fast cache-dit
Python • 0 • 24 • 1 • 0 • Updated Jul 18, 2025