Add torch grouped gemm bf16 and mxfp8 support w/ cuda graphed + inference_optimized MoEs #3858

Merged
sidsingh-nvidia merged 56 commits into NVIDIA:main from sidsingh-nvidia:siddharth/torch-ggemm-mxfp8
Mar 17, 2026

Commits

Commits on Mar 13, 2026

Commits on Mar 16, 2026

Commits on Mar 17, 2026