Commit 832b75b

chore: update why not svqd nvfp4 for sm100 (#1043)
1 parent 5f5cfe4 commit 832b75b

1 file changed

Lines changed: 8 additions & 4 deletions

setup.py

@@ -30,10 +30,13 @@
 # tools/release_workspace.py.
 SPDLOG_SUBMODULE_PATH = ROOT_DIR / "csrc" / "third_party" / "spdlog"
 SPDLOG_HEADER_PATH = SPDLOG_SUBMODULE_PATH / "include" / "spdlog" / "spdlog.h"
-# Blackwell sm100 is not has been fully tested, so we don't include it in the
-# default target list.Users with sm100 devices can specify it explicitly with
-# `CACHE_DIT_CUDA_ARCH_LIST=100` or `TORCH_CUDA_ARCH_LIST=100` when building
-# the SVDQuant extension.
+# sm100 (Blackwell B100/B200, CC 10.0) is not included in the default
+# target list because the NVFP4 block-scaled MMA instruction used by the
+# SVDQuant FP4 kernel requires sm_120a or higher (PTX ISA, warp-level mma
+# Target ISA Notes: ".kind::mxf4nvf4 and .kind::mxf4 are supported on
+# sm_120a and sm_121a"). INT4 kernels work correctly on sm100. Users who
+# only need INT4 kernels can build for sm100 explicitly via
+# CACHE_DIT_CUDA_ARCH_LIST=100 or TORCH_CUDA_ARCH_LIST=100.
 CUDA_ARCH_ALIASES = {
     "maxwell": "50",
     "pascal": "60",
@@ -42,6 +45,7 @@
     "ampere": "80",
     "ada": "89",
     "hopper": "90",
+    "blackwell": "100",
 }
 
 
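To illustrate how an alias table like CUDA_ARCH_ALIASES could turn the two environment variables named in the comment into a list of compute capabilities, here is a minimal sketch. It is not the code in setup.py: the helper name resolve_arch_list, the separator handling, and the precedence of CACHE_DIT_CUDA_ARCH_LIST over TORCH_CUDA_ARCH_LIST are all assumptions, and the table only contains the entries visible in the diff.

```python
import os

# Only the aliases visible in the diff; the entries between the two hunks
# are not shown in the commit view and are deliberately left out here.
CUDA_ARCH_ALIASES = {
    "maxwell": "50",
    "pascal": "60",
    "ampere": "80",
    "ada": "89",
    "hopper": "90",
    "blackwell": "100",
}


def resolve_arch_list(env=None):
    """Hypothetical helper: map env-var tokens to compute-capability strings."""
    env = os.environ if env is None else env
    # Assumed precedence: the project-specific variable wins over the PyTorch one.
    raw = env.get("CACHE_DIT_CUDA_ARCH_LIST") or env.get("TORCH_CUDA_ARCH_LIST") or ""
    archs = []
    # Accept space-, comma-, or semicolon-separated tokens (an assumption).
    for token in raw.replace(";", " ").replace(",", " ").split():
        key = token.strip().lower()
        # Aliases map to their number; "8.9"-style values lose the dot.
        # Suffixes such as "+PTX" are not handled in this sketch.
        arch = CUDA_ARCH_ALIASES.get(key, key.replace(".", ""))
        if arch not in archs:
            archs.append(arch)
    return archs


if __name__ == "__main__":
    print(resolve_arch_list({"CACHE_DIT_CUDA_ARCH_LIST": "blackwell"}))
```

Under these assumptions, setting CACHE_DIT_CUDA_ARCH_LIST=blackwell and setting it to 100 select the same target, which is the explicit opt-in the new comment describes for users who only need the INT4 kernels.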
