    Repositories list

    • spark-rapids-jni

      Public
      RAPIDS Accelerator JNI For Apache Spark
      Cuda
      7451776 Updated Nov 4, 2025
    • Fuser

      Public
      A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
      C++
      67359206195 Updated Nov 4, 2025
    • bionemo-framework

      Public
      BioNeMo Framework: For building and adapting AI models in drug discovery at scale
      Jupyter Notebook
      935605890 Updated Nov 4, 2025
    • cuda-quantum

      Public
      C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
      C++
      29583941883 Updated Nov 4, 2025
    • cccl

      Public
      CUDA Core Compute Libraries
      C++
      2852k1.1k186 Updated Nov 4, 2025
    • k8s-nim-operator

      Public
      An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
      Go
      33131631 Updated Nov 4, 2025
    • TensorRT-LLM

      Public
      TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
      C++
      1.8k12k744418 Updated Nov 4, 2025
    • TensorRT-Model-Optimizer

      Public
      A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.
      Python
      1901.5k5844 Updated Nov 4, 2025
    • VisRTX

      Public
      NVIDIA OptiX based implementation of ANARI
      C++
      3526590 Updated Nov 4, 2025
    • Megatron-LM

      Public
      Ongoing research training transformer models at scale
      Python
      3.2k14k317173 Updated Nov 4, 2025
    • NVSentinel

      Public
      NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments
      Go
      1463239 Updated Nov 4, 2025
    • NeMo-Agent-Toolkit

      Public
      The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
      Python
      4071.5k5327 Updated Nov 4, 2025
    • TensorRT-Incubator

      Public
      Experimental projects related to TensorRT
      MLIR
      181133717 Updated Nov 4, 2025
    • vgpu-device-manager

      Public
      NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes
      Go
      23145010 Updated Nov 4, 2025
    • NVFlare

      Public
      NVIDIA Federated Learning Application Runtime Environment
      Python
      2198221116 Updated Nov 4, 2025
    • GenerativeAIExamples

      Public
      Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
      Jupyter Notebook
      8953.6k4435 Updated Nov 4, 2025
    • nv-ingest

      Public
      NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
      Python
      2722.8k9836 Updated Nov 4, 2025
    • cuda-python

      Public
      CUDA Python: Performance meets Productivity
      Python
      2173k18111 Updated Nov 4, 2025
    • k8s-kata-manager

      Public
      Go
      1121411 Updated Nov 4, 2025
    • nccl

      Public
      Optimized primitives for collective multi-GPU communication
      C++
      1.1k4.2k15363 Updated Nov 4, 2025
    • aistore

      Public
      AIStore: scalable storage for AI applications
      Go
      2211.6k00 Updated Nov 4, 2025
    • TransformerEngine

      Public
      A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
      Python
      5372.9k23089 Updated Nov 4, 2025
    • numbast

      Public
      Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings.
      Python
      16513110 Updated Nov 3, 2025
    • NeMo-text-processing

      Public
      NeMo text processing for ASR and TTS
      Python
      13538315 Updated Nov 3, 2025
    • tilus

      Public
      Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
      Python
      839360 Updated Nov 3, 2025
    • cuDecomp

      Public
      An Adaptive Pencil Decomposition Library for NVIDIA GPUs
      C++
      117001 Updated Nov 3, 2025
    • spark-rapids

      Public
      Spark RAPIDS plugin - accelerate Apache Spark with GPUs
      Scala
      2619411.7k23 Updated Nov 3, 2025
    • spark-rapids-benchmarks

      Public
      Spark RAPIDS Benchmarks – benchmark sets and utilities for the RAPIDS Accelerator for Apache Spark
      Python
      3543275 Updated Nov 3, 2025
    • nvidia-container-toolkit

      Public
      Build and run containers leveraging NVIDIA GPUs
      Go
      4273.8k41333 Updated Nov 3, 2025
    • cuopt

      Public
      GPU accelerated decision optimization
      Cuda
      885287116 Updated Nov 3, 2025