Commit 6625856
Add support for CUDA architecture family codes (#27278)
This change extends CUDA architecture handling to support
family-specific codes (suffix 'f') introduced in CUDA 12.9, aligning
with updates made to Triton Inference Server repositories (backend and
onnxruntime_backend).
Changes:
1. Added CUDAARCHS environment variable support (standard CMake
environment variable)
- Allows users to override the architecture list via the environment
- Takes precedence over the default list when CMAKE_CUDA_ARCHITECTURES
is not set
2. Extended regex patterns to recognize family code suffix 'f'
- Supports codes like 100f, 110f, 120f for CC 10.x, 11.x, 12.x families
- Preserves 'f' suffix during parsing phase
3. Updated normalization logic to handle family codes
- Family codes (ending with 'f') preserved without adding -real suffix
- Traditional codes continue to receive -real or -a-real suffixes
- Architecture-specific codes (with 'a') remain unchanged
4. Extended architecture support lists
- Added SM 110 to ARCHITECTURES_WITH_KERNELS
- Added SM 110 to ARCHITECTURES_WITH_ACCEL
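The normalization rules in item 3 can be sketched in Python as follows. This is a minimal illustration, not the CMake code from this commit: `normalize_arch`, `normalize_arch_list`, and `ASSUMED_ARCHS_WITH_ACCEL` are hypothetical names, and the rule that a bare number receives `-a-real` only when it appears in an accel list is an assumption, since the commit message does not state how `-real` and `-a-real` are chosen.

```python
import re

# Illustrative stand-in: the real contents of ARCHITECTURES_WITH_ACCEL are
# not listed in the commit message.
ASSUMED_ARCHS_WITH_ACCEL = frozenset({"90", "100", "110", "120"})

# <digits>, optional 'a' or 'f' suffix, optional -real/-virtual qualifier.
_ARCH_RE = re.compile(r"(\d+)([af]?)(-real|-virtual)?")


def normalize_arch(arch, accel=ASSUMED_ARCHS_WITH_ACCEL):
    """Normalize one architecture entry (hypothetical helper)."""
    arch = arch.strip()
    match = _ARCH_RE.fullmatch(arch)
    if match is None:
        raise ValueError(f"unrecognized CUDA architecture: {arch!r}")
    number, suffix, qualifier = match.groups()
    if suffix or qualifier:
        # Family codes (100f), architecture-specific codes (90a), and entries
        # that already carry -real/-virtual pass through unchanged.
        return arch
    # Assumption: bare numbers in the accel list get -a-real, others -real.
    return f"{number}a-real" if number in accel else f"{number}-real"


def normalize_arch_list(value):
    """Split a semicolon-separated list and normalize each entry."""
    return [normalize_arch(item) for item in value.split(";") if item.strip()]
```

With these assumptions, `normalize_arch_list("75;90;100f")` leaves `100f` untouched while the bare numbers gain a `-real` or `-a-real` suffix.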
Family-specific codes (introduced in CUDA 12.9/Blackwell) enable forward
compatibility within a GPU family. For example, 100f runs on CC 10.0,
10.3, and future 10.x devices, using features common across the family.
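As a rough illustration of the family rule described above, the following Python sketch checks whether a family code covers a given compute capability. It assumes the simplified reading in this message (same major version, equal or higher minor version); `family_code_covers` is a hypothetical helper, and the authoritative family membership is defined by NVIDIA, which may exclude specific minor versions.

```python
import re

_FAMILY_RE = re.compile(r"(\d+)f")


def family_code_covers(code, compute_capability):
    """Return True if a family code such as '100f' covers (major, minor).

    Assumption: a family spans the same major version and any minor version
    at or above the code's own minor version.
    """
    match = _FAMILY_RE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a family code: {code!r}")
    # '100' -> (10, 0), '103' -> (10, 3), '120' -> (12, 0)
    base_major, base_minor = divmod(int(match.group(1)), 10)
    major, minor = compute_capability
    return major == base_major and minor >= base_minor
```

Under this reading, `family_code_covers("100f", (10, 3))` holds, while `family_code_covers("100f", (12, 0))` does not.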
Usage examples:
- CUDAARCHS="75;80;90;100f;110f;120f" cmake ..
- cmake -DCMAKE_CUDA_ARCHITECTURES="75-real;80-real;90-real;100f;120f" ..
- python build.py --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES="100f;110f"
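The lookup order behind these invocations can be sketched as follows. This is a hedged Python illustration rather than the actual CMake logic: `resolve_arch_list` is a hypothetical helper, and the order (explicit CMAKE_CUDA_ARCHITECTURES first, then the CUDAARCHS environment variable, then the defaults) is inferred from item 1 of the change list. The default string is the one quoted in the note in this message.

```python
# Defaults as quoted in the note in this commit message.
DEFAULT_ARCHS = "75;80;90;100;120"


def resolve_arch_list(cmake_value, environ, default=DEFAULT_ARCHS):
    """Pick the architecture list from the first non-empty source.

    cmake_value: value of CMAKE_CUDA_ARCHITECTURES, or None/"" if unset.
    environ: mapping of environment variables (for example os.environ).
    """
    for candidate in (cmake_value, environ.get("CUDAARCHS"), default):
        if candidate:
            return [arch for arch in candidate.split(";") if arch]
    return []
```

For example, `resolve_arch_list(None, {"CUDAARCHS": "100f;110f"})` uses the environment value, while passing a non-empty `cmake_value` ignores CUDAARCHS entirely.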
The implementation supports mixed formats in the same list:
- Traditional: 75-real, 80-real, 90-real
- Architecture-specific: 90a-real (CC 9.0 only)
- Family-specific: 100f, 110f, 120f (entire family)
Note: Current defaults still use bare numbers (75;80;90;100;120), which
normalize to architecture-specific codes with the 'a' suffix. Users who
want family-specific behavior should explicitly use the 'f' suffix via
the CUDAARCHS environment variable or CMAKE_CUDA_ARCHITECTURES.
References:
- NVIDIA Blackwell and CUDA 12.9 Family-Specific Architecture Features:
https://developer.nvidia.com/blog/nvidia-blackwell-and-nvidia-cuda-12-9-introduce-family-specific-architecture-features/
- Triton Inference Server backend updates (commit f5e901f)
1 parent 5645b77 commit 6625856
1 file changed: +15 -7 lines changed