NVIDIA repositories

spark-rapids-jni

Public

RAPIDS Accelerator JNI For Apache Spark

Cuda

•

Apache License 2.0

•74•51•77•6•Updated

Nov 4, 2025

Fuser

Public

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++

•

Other

•67•359•206•195•Updated

Nov 4, 2025

bionemo-framework

Public

BioNeMo Framework: For building and adapting AI models in drug discovery at scale

machine-learning gpu pytorchdrug-discovery

Jupyter Notebook

•93•560•58•90•Updated

Nov 4, 2025

cuda-quantum

Public

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

python cpp quantumquantum-computing hacktoberfest quantum-programming-language quantum-algorithms quantum-machine-learning unitaryhack

C++

•

Other

•295•839•418•83•Updated

Nov 4, 2025

cccl

Public

CUDA Core Compute Libraries

cpp hpc gpumodern-cpp parallel-computing cuda nvidia gpu-acceleration cuda-kernels gpu-computing

C++

•

Other

•285•2k•1.1k•186•Updated

Nov 4, 2025

k8s-nim-operator

Public

An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.

Go

•

Apache License 2.0

•33•131•6•31•Updated

Nov 4, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

cuda pytorch moeblackwell llm-serving

C++

•

Apache License 2.0

•1.8k•12k•744•418•Updated

Nov 4, 2025

TensorRT-Model-Optimizer

Public

A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.

Python

•

Apache License 2.0

•190•1.5k•58•44•Updated

Nov 4, 2025

VisRTX

Public

NVIDIA OptiX based implementation of ANARI

C++

•

Other

•35•265•9•0•Updated

Nov 4, 2025

Megatron-LM

Public

Ongoing research training transformer models at scale

transformers model-para large-language-models

Python

•

Other

•3.2k•14k•317•173•Updated

Nov 4, 2025

NVSentinel

Public

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go

•

Apache License 2.0

•14•63•23•9•Updated

Nov 4, 2025

NeMo-Agent-Toolkit

Public

The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.

Python

•

Apache License 2.0

•407•1.5k•53•27•Updated

Nov 4, 2025

TensorRT-Incubator

Public

Experimental projects related to TensorRT

MLIR

•18•113•37•17•Updated

Nov 4, 2025

vgpu-device-manager

Public

NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes

Go

•

Apache License 2.0

•23•145•0•10•Updated

Nov 4, 2025

NVFlare

Public

NVIDIA Federated Learning Application Runtime Environment

python decentralized petprivacy-protection federated-learning federated-analytics federated-computing

Python

•

Apache License 2.0

•219•822•11•16•Updated

Nov 4, 2025

GenerativeAIExamples

Public

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

microservice gpu-acceleration nemotensorrt rag triton-inference-server large-language-models llm llm-inference retrieval-augmented-generation

Jupyter Notebook

•

Apache License 2.0

•895•3.6k•44•35•Updated

Nov 4, 2025

nv-ingest

Public

NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.

Python

•

Apache License 2.0

•272•2.8k•98•36•Updated

Nov 4, 2025

cuda-python

Public

CUDA Python: Performance meets Productivity

Python

•

Other

•217•3k•181•11•Updated

Nov 4, 2025

k8s-kata-manager

Public

Go

•

Apache License 2.0

•11•21•4•11•Updated

Nov 4, 2025

nccl

Public

Optimized primitives for collective multi-GPU communication

deep-learning cpp gpucuda nvidia communications

C++

•

Other

•1.1k•4.2k•153•63•Updated

Nov 4, 2025

aistore

Public

AIStore: scalable storage for AI applications

kubernetes high-performance distributed-storagehigh-availability object-storage multi-cloud batch-jobs s3-compatible multipart-upload ml-training

Go

•

MIT License

•221•1.6k•0•0•Updated

Nov 4, 2025

TransformerEngine

Public

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

python machine-learning deep-learninggpu cuda pytorch jax fp8

Python

•

Apache License 2.0

•537•2.9k•230•89•Updated

Nov 4, 2025

numbast

Public

Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings.

cuda numba

Python

•

Apache License 2.0

•16•51•31•10•Updated

Nov 3, 2025

NeMo-text-processing

Public

NeMo text processing for ASR and TTS

text-normalization inverse-text-n

Python

•

Apache License 2.0

•135•383•1•5•Updated

Nov 3, 2025

tilus

Public

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

tile programming kernelcuda

Python

•

Apache License 2.0

•8•393•6•0•Updated

Nov 3, 2025

cuDecomp

Public

An Adaptive Pencil Decomposition Library for NVIDIA GPUs

fft pencil-decomposition

C++

•

Apache License 2.0

•11•70•0•1•Updated

Nov 3, 2025

spark-rapids

Public

Spark RAPIDS plugin - accelerate Apache Spark with GPUs

big-data spark gpurapids

Scala

•

Apache License 2.0

•261•941•1.7k•23•Updated

Nov 3, 2025

spark-rapids-benchmarks

Public

Spark RAPIDS Benchmarks – benchmark sets and utilities for the RAPIDS Accelerator for Apache Spark

Python

•

Apache License 2.0

•35•43•27•5•Updated

Nov 3, 2025

nvidia-container-toolkit

Public

Build and run containers leveraging NVIDIA GPUs

Go

•

Apache License 2.0

•427•3.8k•413•33•Updated

Nov 3, 2025

cuopt

Public

GPU accelerated decision optimization

gpu optimization cudalinear-programming

Cuda

•

Apache License 2.0

•88•528•71•16•Updated

Nov 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NVIDIA Corporation

All

All

621 repositories

spark-rapids-jni

Fuser

bionemo-framework

cuda-quantum

cccl

k8s-nim-operator

TensorRT-LLM

TensorRT-Model-Optimizer

VisRTX

Megatron-LM

NVSentinel

NeMo-Agent-Toolkit

TensorRT-Incubator

vgpu-device-manager

NVFlare

GenerativeAIExamples

nv-ingest

cuda-python

k8s-kata-manager

nccl

aistore

TransformerEngine

numbast

NeMo-text-processing

tilus

cuDecomp

spark-rapids

spark-rapids-benchmarks

nvidia-container-toolkit

cuopt

All

All

Repositories list

621 repositories