Refactor Cutlass MoE runner integration #12023

Draft
Jonahcb wants to merge 41 commits into sgl-project:main from Jonahcb:refactor/cutlass-moe-runner-integration

Contributor

@Jonahcb Jonahcb commented Oct 23, 2025

Motivation

Refactor Cutlass MoE runner integration into cutlass.py per #8715

Modifications

Implemented CutlassRunnerInput, CutlassRunnerOutput, CutlassMoeQuantInfo, CutlassRunnerCore, pre_permute_standard_to_cutlass, and post_permute_cutlass_to_standard
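
For readers unfamiliar with the pattern, here is a minimal pure-Python sketch of the pre-permute, run, post-permute flow that the names above suggest. It is not the code in this PR: ToyRunnerInput, ToyRunnerOutput, pre_permute, run_core, and post_permute are hypothetical stand-ins, scalars stand in for hidden-state tensors, and plain Python callables stand in for the Cutlass grouped GEMMs. The assumption is that the pre-permute step groups (token, expert) pairs by expert so each expert sees contiguous rows, and the post-permute step scatters results back to token order and applies the routing weights.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ToyRunnerInput:
    """Hypothetical stand-in for a runner input: rows grouped by expert."""
    rows: List[float]          # one scalar "hidden state" per (token, expert) pair
    expert_offsets: List[int]  # rows[offsets[e]:offsets[e + 1]] belong to expert e
    src_index: List[int]       # flat index (token * top_k + slot) of each row


@dataclass
class ToyRunnerOutput:
    """Hypothetical stand-in for a runner output, still in expert-grouped order."""
    rows: List[float]


def pre_permute(hidden: Sequence[float], topk_ids: Sequence[Sequence[int]],
                num_experts: int) -> ToyRunnerInput:
    """Group every (token, expert) pair by expert id."""
    top_k = len(topk_ids[0])
    pairs = [(topk_ids[t][k], t * top_k + k)
             for t in range(len(hidden)) for k in range(top_k)]
    pairs.sort()  # by expert id, then by original flat index
    counts = [0] * num_experts
    for expert, _ in pairs:
        counts[expert] += 1
    offsets = [0]
    for c in counts:
        offsets.append(offsets[-1] + c)
    return ToyRunnerInput(
        rows=[hidden[i // top_k] for _, i in pairs],
        expert_offsets=offsets,
        src_index=[i for _, i in pairs],
    )


def run_core(inp: ToyRunnerInput,
             experts: Sequence[Callable[[float], float]]) -> ToyRunnerOutput:
    """Apply each expert to its contiguous slice (stands in for a grouped GEMM)."""
    out: List[float] = []
    for e, fn in enumerate(experts):
        lo, hi = inp.expert_offsets[e], inp.expert_offsets[e + 1]
        out.extend(fn(x) for x in inp.rows[lo:hi])
    return ToyRunnerOutput(rows=out)


def post_permute(out: ToyRunnerOutput, inp: ToyRunnerInput,
                 topk_weights: Sequence[Sequence[float]]) -> List[float]:
    """Scatter rows back to token order and combine with routing weights."""
    top_k = len(topk_weights[0])
    combined = [0.0] * len(topk_weights)
    for row, i in zip(out.rows, inp.src_index):
        token, slot = divmod(i, top_k)
        combined[token] += topk_weights[token][slot] * row
    return combined


# Toy example: 3 tokens, 3 experts, top_k = 2.
hidden = [1.0, 2.0, 3.0]
topk_ids = [[0, 1], [1, 2], [2, 0]]
topk_weights = [[0.5, 0.5], [0.25, 0.75], [1.0, 0.0]]
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x]

runner_input = pre_permute(hidden, topk_ids, num_experts=3)
runner_output = run_core(runner_input, experts)
print(post_permute(runner_output, runner_input, topk_weights))  # [2.0, -0.5, -3.0]
```

Keeping src_index alongside the permuted rows is what lets the post-permute step undo the grouping without re-sorting; a real implementation would carry an equivalent index tensor on the GPU.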

Accuracy Tests

In progress

Benchmarking and Profiling

Not applicable

Checklist

@ch-wan ch-wan mentioned this pull request Oct 23, 2025
66 tasks
@b8zhong b8zhong added the run-ci label Oct 28, 2025
@github-actions github-actions bot added the quant LLM Quantization label Nov 6, 2025
@ch-wan ch-wan self-assigned this Jan 5, 2026
Collaborator

This file is lengthy. We can move some functions to cutlass_utils.py

Labels

quant LLM Quantization run-ci

3 participants
