    Repositories list

    • DUET-VLM

      Public
      DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
      Python
      0000 Updated Feb 7, 2026
    • Primus

      Public
      Python
      2573412 Updated Feb 7, 2026
    • Primus-SaFE

      Public
      Primus-SaFE (Stability and Fault Endurance)
      Go
      05006 Updated Feb 7, 2026
    • TraceLens

      Public
      Automating analysis from trace files
      Python
      958838 Updated Feb 6, 2026
    • Magpie

      Public
      A lightweight, general-purpose framework for evaluating GPU kernel correctness and performance.
      Python
      22600 Updated Feb 6, 2026
    • Python
      1060113 Updated Feb 6, 2026
    • GEAK

      Public
      An LLM-based AI agent that automatically writes correct and efficient GPU kernels.
      Python
      126121 Updated Feb 6, 2026
    • PARD

      Public
      PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation (ICLR 26)
      Python
      11210 Updated Feb 6, 2026
    • For developing and releasing world model code.
      Python
      11300 Updated Feb 6, 2026
    • Automating analysis from trace files
      Python
      9000 Updated Feb 5, 2026
    • Python
      01100 Updated Feb 2, 2026
    • AgentKernelArena provides an end-to-end siloed-benchmarking environment where different LLM-powered agents—such as Cursor Agent, Claude Code, Codex, SWE-agent, …
      Python
      26120 Updated Feb 2, 2026
    • axlearn

      Public
      An Extensible Deep Learning Library
      Python
      397100 Updated Jan 29, 2026
    • Nitro-E

      Public
      Python
      810920 Updated Jan 28, 2026
    • Repo containing artifacts for the NeurIPS 2025 tutorial: How to Build Agents to Generate Kernels for Faster LLMs (and Other Models!)
      Jupyter Notebook
      11100 Updated Jan 23, 2026
    • Python
      0500 Updated Jan 22, 2026
    • Repository for Showcasing DLRM v2 Functionality on AMD
      Python
      0000 Updated Jan 15, 2026
    • AMD 0.9B efficient text-to-video diffusion model
      Python
      64411 Updated Jan 12, 2026
    • This is a short course covering GPU optimization techniques for LLM inference
      Python
      0000 Updated Jan 11, 2026
    • Examples of training autodrive models in ROCm
      Python
      0200 Updated Jan 9, 2026
    • GEAK-eval

      Public
      Python
      61190 Updated Dec 24, 2025
    • Synthetic data generation pipeline, finetuning and evaluation scripts.
      Python
      1110 Updated Dec 24, 2025
    • Python
      1110 Updated Dec 16, 2025
    • monarch

      Public
      PyTorch Single Controller
      Rust
      136006 Updated Dec 2, 2025
    • A PyTorch native platform for training generative AI models
      Python
      6991400 Updated Nov 18, 2025
    • Python
      1700 Updated Nov 14, 2025
    • Instella

      Public
      Fully Open Language Models with Stellar Performance
      Python
      2931820 Updated Nov 14, 2025
    • PyTorch-native post-training at scale
      Python
      85000 Updated Nov 13, 2025
    • Official repo for AMD hybrid models training and inference workflow
      Python
      1930 Updated Nov 8, 2025
    • Python
      1310 Updated Nov 4, 2025