    Repositories list

    • Megatron-LM

      Public
      Ongoing research training transformer models at scale
      Python
      3.6k43618 Updated Feb 7, 2026
    • benchmark-image-tokenzier

      Public
      Jupyter Notebook
      2001 Updated Feb 7, 2026
    • multimodal-data

      Public
      Jupyter Notebook
      0001 Updated Feb 5, 2026
    • gh200-wheels

      Public
      Python wheels and images for GH200 GPUs
      Dockerfile
      0000 Updated Feb 4, 2026
    • tokenizer-intrinsic-evals

      Public
      A suite of intrinsic evaluation metrics for the Apertus tokenization team to use during tokenizer development
      Python
      8100 Updated Feb 2, 2026
    • benchmark-audio-tokenizer

      Public
      Python
      1001 Updated Feb 2, 2026
    • perf-check

      Public
      Perf-Check is a lightweight “canary” suite to verify the AI training stack before large runs. It quickly checks GPU compute, HBM bandwidth, NVLink/PCIe P2P, NCC…
      Cuda
      0100 Updated Jan 30, 2026
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      4.4k000 Updated Jan 29, 2026
    • verl

      Public
      verl: Volcano Engine Reinforcement Learning for LLMs
      Python
      3.2k000 Updated Jan 28, 2026
    • nanotron_climllama

      Public
      Minimalistic large language model 3D-parallelism training
      Python
      0100 Updated Jan 28, 2026
    • model-spinning

      Public archive
      Python
      2900 Updated Jan 27, 2026
    • The set of scripts was developed to process Rumantsch data for Apertus V1, the LLM created by the Swiss AI Initiative.
      Python
      1400 Updated Jan 26, 2026
    • mmore

      Public
      Massive Multimodal Open RAG & Extraction: a scalable multimodal pipeline for processing, indexing, and querying multimodal documents. Ever needed to take 8000 P…
      Python
      371861010 Updated Jan 23, 2026
    • lmms-eval

      Public
      One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
      Python
      512001 Updated Jan 21, 2026
    • Jupyter Notebook
      0000 Updated Jan 14, 2026
    • project2

      Public
      Jupyter Notebook
      1002 Updated Jan 14, 2026
    • newer llm service
      Python
      3100 Updated Jan 13, 2026
    • Code and notebooks for DSL #16 hallucination probe project
      Jupyter Notebook
      0200 Updated Jan 10, 2026
    • Python
      0410 Updated Dec 28, 2025
    • Final project for the course "Large-Scale AI Engineering". Porting Megatron-LM-based models to AMD GPU hardware.
      Python
      1000 Updated Dec 19, 2025
    • Python
      0000 Updated Dec 18, 2025
    • evals

      Public
      Python
      3601 Updated Dec 17, 2025
    • Shell
      0300 Updated Dec 16, 2025
    • parity-aware-bpe

      Public
      Parity-Aware Byte-Pair Encoding: Improving Cross-lingual Fairness in Tokenization [arXiv 2025]
      Python
      31810 Updated Dec 10, 2025
    • Response format to be used with apertus
      Python
      11100 Updated Dec 3, 2025
    • torrent

      Public
      Python
      1100 Updated Nov 27, 2025
    • The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
      Python
      225001 Updated Nov 4, 2025
    • Pretraining data reconstruction scripts for Apertus
      Python
      1011321 Updated Oct 27, 2025
    • Python
      132511 Updated Oct 22, 2025
    • A framework for few-shot evaluation of language models.
      Python
      3k702 Updated Oct 22, 2025