Commit 20b6496
fix: arch 12.1 -> "sm120a" flag for Spark, CUDA 12.9 (#2839)
<!-- .github/pull_request_template.md -->
## 📌 Description
A bug was found in the nightly [Spark, 12.9] matrix
(https://gitlab-master.nvidia.com/dl/flashinfer/flashinfer-ci/-/jobs/285092631),
where Spark compiles to "120a" (see the "/tmp/.cache/flashinfer/0.6.6/120a/"
path in the log below).
```
E RuntimeError: Check failed: (status == cudaSuccess) is false: SingleDecodeWithKVCache kernel launch failed, error: no kernel image is available for execution on the device
/tmp/.cache/flashinfer/0.6.6/120a/generated/single_decode_with_kv_cache_dtype_q_f16_dtype_kv_f16_dtype_o_f16_head_dim_qk_128_head_dim_vo_128_posenc_2_use_swa_False_use_logits_cap_False/single_decode.cu:100: RuntimeError: Check failed: (status == cudaSuccess) is false: SingleDecodeWithKVCache kernel launch failed, error: no kernel image is available for execution on the device
```
The root cause was flashinfer-ai/flashinfer#2725,
where we added logic for compiling both Spark and Thor to 120f, but only
on the condition that the CUDA version is 13 or higher. Lower versions
(12.9) default to the 'a' suffix, i.e. 120a.
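
To make the failure mode concrete, here is a minimal sketch of the selection behavior described above and of why a "120a" image fails on Spark. The function names (`pre_fix_arch_tag`, `image_loads_on`) are hypothetical and do not come from the FlashInfer code base. The compatibility rule is a simplified model, assuming an 'a' (architecture-specific) image only loads on the exact compute capability, while an 'f' (family-specific) image loads on devices of the same major version with an equal or higher minor version.

```python
def _parse_version(text: str) -> tuple[int, int]:
    """Turn a version string such as "12.9" into a comparable (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def pre_fix_arch_tag(cuda_version: str) -> str:
    """Hypothetical model of the SM 12.x tag selection before this fix.

    CUDA 13 or higher selects the family target "120f"; anything lower
    silently falls back to the 'a' suffix, "120a".
    """
    if _parse_version(cuda_version) >= (13, 0):
        return "120f"
    return "120a"


def image_loads_on(tag: str, device_cc: tuple[int, int]) -> bool:
    """Simplified (assumed) rule for whether a compiled image runs on a device."""
    digits, suffix = tag[:-1], tag[-1]
    tag_major, tag_minor = int(digits[:-1]), int(digits[-1])
    dev_major, dev_minor = device_cc
    if suffix == "a":
        # Architecture-specific: exact compute capability match only.
        return (tag_major, tag_minor) == (dev_major, dev_minor)
    if suffix == "f":
        # Family-specific: same major version, equal or newer minor version.
        return tag_major == dev_major and dev_minor >= tag_minor
    raise ValueError(f"unsupported suffix in tag {tag!r}")


SPARK_CC = (12, 1)  # arch 12.1, as stated in the commit title

# CUDA 12.9 falls back to "120a", which does not load on a 12.1 device.
print(pre_fix_arch_tag("12.9"), image_loads_on(pre_fix_arch_tag("12.9"), SPARK_CC))
# CUDA 13.0 selects "120f", which does.
print(pre_fix_arch_tag("13.0"), image_loads_on(pre_fix_arch_tag("13.0"), SPARK_CC))
```

Under these assumptions, the CUDA 12.9 case reproduces the "no kernel image is available for execution on the device" condition from the log.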
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
  * Strengthened CUDA validation for SM 12.x GPUs: now requires CUDA 12.9
    or newer and emits a clear error if unmet, replacing the previous silent
    fallback behavior.
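
The validation summarized above could look roughly like the following. This is only a sketch: the function name `check_sm12_cuda` and the error message wording are hypothetical, and the actual change in this commit may be structured differently.

```python
def check_sm12_cuda(cuda_version: str) -> None:
    """Raise a clear error when SM 12.x is targeted with CUDA older than 12.9.

    Hypothetical helper: it fails loudly instead of silently falling back
    to a different architecture suffix.
    """
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    if (major, minor) < (12, 9):
        raise RuntimeError(
            f"SM 12.x requires CUDA >= 12.9, but found CUDA {cuda_version}"
        )


check_sm12_cuda("12.9")  # passes silently
check_sm12_cuda("13.0")  # passes silently
```

Raising at configuration time surfaces the problem before any kernel is compiled, rather than at kernel launch.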
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
1 parent 34263f1 commit 20b6496
1 file changed
Lines changed: 7 additions & 15 deletions