[WIP] N*G Triton group gemm for MoE #960

lessw2020 · 2025-03-14T05:51:54Z

This PR adds a Triton group GEMM with full backward pass support, for integration with MoE training.
The forward pass comes from FBGEMM experimental:
https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/experimental/gemm/triton_gemm/grouped_gemm.py

1 - Numerics with BF16 have been verified on sample sizes and the core DeepSeek v3 shapes.

2025-03-13 16:40:08,294 - INFO - Gradient shapes - grad_x: torch.Size([1024, 256]), grad_w: torch.Size([10240, 256])
2025-03-13 16:40:08,294 - INFO - Running PyTorch reference implementation
2025-03-13 16:40:08,591 - INFO - Comparing gradients with PyTorch reference
2025-03-13 16:40:08,602 - INFO - Maximum gradient error - grad_x: 0.125, grad_w: 0.125
2025-03-13 16:40:08,641 - INFO - ✓ SUCCESS! grad_X matches the PyTorch reference (allclose check passed)
2025-03-13 16:40:08,641 - INFO - ✓ SUCCESS! grad_W matches the PyTorch reference (allclose check passed)
2025-03-13 16:40:08,641 - INFO - Gradients allclose check - grad_x: True, grad_w: True
2025-03-13 16:40:08,641 - INFO - ✓ SUCCESS: Gradients match the PyTorch reference (allclose check passed)
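
For readers who want to see what a gradient comparison like the one logged above checks, here is a minimal PyTorch sketch of a loop-based reference. It is not the code in this PR. It assumes the N*G layout implied by the logged shapes: `x` is `[M, K]`, `w` stacks G expert weight blocks into `[N*G, K]`, and a hypothetical `m_sizes` list gives the number of rows of `x` routed to each group. The function names are illustrative only.

```python
import torch


def grouped_gemm_reference(x, w, m_sizes):
    """Loop-based forward (assumed layout): x [M, K], w [N*G, K] -> out [M, N]."""
    G = len(m_sizes)
    N = w.shape[0] // G
    outs, row = [], 0
    for g, m in enumerate(m_sizes):
        w_g = w[g * N:(g + 1) * N]            # [N, K] weight block of group g
        outs.append(x[row:row + m] @ w_g.T)   # [m, N]
        row += m
    return torch.cat(outs, dim=0)


def grouped_gemm_backward_reference(grad_out, x, w, m_sizes):
    """Hand-written gradients for the forward above, one group at a time."""
    G = len(m_sizes)
    N = w.shape[0] // G
    grad_x = torch.zeros_like(x)
    grad_w = torch.zeros_like(w)
    row = 0
    for g, m in enumerate(m_sizes):
        go = grad_out[row:row + m]                        # [m, N]
        grad_x[row:row + m] = go @ w[g * N:(g + 1) * N]   # [m, K]
        grad_w[g * N:(g + 1) * N] = go.T @ x[row:row + m]  # [N, K]
        row += m
    return grad_x, grad_w
```

A kernel's `grad_x` and `grad_w` can then be compared against either the hand-written backward or autograd on the forward, using `torch.allclose` with tolerances suited to the dtype under test.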

2 - Todos:
a - fp8 support
b - TMA (was removed to focus on numerics)
c - WS
d - Perf and auto-tuning

3 - Integration - ready now for BF16, though we may want to do perf work first.

torchtitan/experiments/kernels/triton_group_gemm/tgroup_gemm_forward.py

triton group gemm

171de46

facebook-github-bot added the CLA Signed label Mar 14, 2025

lessw2020 mentioned this pull request Mar 14, 2025

DSV3 generation script #959

Open

refactor block size selection

51d7b23

lessw2020 self-assigned this Mar 14, 2025

GD06 reviewed Mar 15, 2025

torchtitan/experiments/kernels/triton_group_gemm/tgroup_gemm_forward.py

lessw2020 mentioned this pull request Mar 15, 2025

[WIP] M*G Triton group gemm for MoE training #967

Open

lessw2020 changed the title ~~[WIP] Triton group gemm for MoE~~ [WIP] N*G Triton group gemm for MoE Mar 15, 2025

Link to FBGemm

cc20ac5
