    Repositories list

    • MF2

      Public
      Python
      0600 Updated May 25, 2026
    • entmaxkv

      Public
      Python
      BSD 3-Clause "New" or "Revised" License
      0000 Updated May 20, 2026
    • adasplash

      Public
      AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention)
      Python
      BSD 3-Clause "New" or "Revised" License
      24310 Updated May 20, 2026
    • DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention
      0000 Updated May 11, 2026
    • SEQUOR

      Public
      Python
      0000 Updated May 8, 2026
    • asentmax

      Public
      Code for Long-Context Generalization with Sparse Attention.
      Python
      GNU General Public License v3.0
      0510 Updated Apr 21, 2026
    • olmes

      Public
      Reproducible, flexible LLM evaluations
      Python
      Apache License 2.0
      90000 Updated Apr 20, 2026
    • LLaVA-NeXT architecture
      Jupyter Notebook
      Apache License 2.0
      464300 Updated Feb 24, 2026
    • Jupyter Notebook
      0000 Updated Feb 20, 2026
    • EuroMoE

      Public
      0000 Updated Feb 15, 2026
    • Official code for the "Sparse Attention as Compact Kernel Regression" paper
      Python
      0100 Updated Feb 8, 2026
    • Jupyter Notebook
      1200 Updated Feb 5, 2026
    • Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
      Jupyter Notebook
      Apache License 2.0
      1.8k000 Updated Jan 30, 2026
    • Python
      52900 Updated Nov 14, 2025
    • Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation
      Jupyter Notebook
      MIT License
      0000 Updated Oct 6, 2025
    • Code for the paper "Instituto de Telecomunicações at IWSLT 2025: Aligning Small-Scale Speech and Language Models for Speech-to-Text Learning"
      Python
      Apache License 2.0
      0200 Updated Sep 30, 2025
    • lmms-eval

      Public
      Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
      Python
      Other
      593001 Updated Sep 26, 2025
    • Jupyter Notebook
      21100 Updated Sep 25, 2025
    • A package for sampling from Gibbs distributions during inference with LLMs.
      Python
      Apache License 2.0
      31010 Updated Aug 14, 2025
    • Python
      MIT License
      0100 Updated Jul 15, 2025
    • Ongoing research training transformer models at scale
      Python
      Other
      4k201 Updated Jun 20, 2025
    • From a+b to sparsemax(QK^T)V in Triton!
      Jupyter Notebook
      03400 Updated Jun 19, 2025
    • zsb

      Public
      Python
      0500 Updated Jun 9, 2025
    • treqa

      Public
      LLM-based QAG framework for MT Evaluation
      Python
      1511 Updated May 13, 2025
    • Repository containing code to reproduce results of the paper "Sparse Activations as Conformal Predictors".
      Jupyter Notebook
      1210 Updated Apr 27, 2025
    • A PyTorch native library for large model training
      Python
      BSD 3-Clause "New" or "Revised" License
      834000 Updated Apr 1, 2025
    • fy-vi

      Public
      Jupyter Notebook
      0000 Updated Mar 21, 2025
    • doce

      Public
      This is the repo of DOCE
      Python
      Apache License 2.0
      0200 Updated Mar 14, 2025
    • latim

      Public
      Jupyter Notebook
      MIT License
      0600 Updated Feb 24, 2025
    • CHM-Net

      Public
      Modern Hopfield Networks with Continuous-Time Memories
      Python
      MIT License
      1300 Updated Feb 21, 2025