ROCM-21372: clamp rocSOLVER reset_info launch block counts #7914
Pull request overview
Addresses ROCM-21372 by introducing a shared reset_info_nblocks helper that clamps derived grid x-dimension to [1, 65535] (and threads to [32, 1024]) for all reset_info kernel launches in rocSOLVER, eliminating the possibility of zero-block or oversized grid configurations. Because reset_info is implemented as a strided 1D loop, the clamp is correctness-preserving.
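The correctness argument can be checked on the host with a small model of a strided 1D loop. This is only a sketch: `simulate_strided_reset` and `covers_exactly_once` are hypothetical names introduced for illustration, not rocSOLVER code, and the loop shape is an assumption about how `reset_info` iterates.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model of a 1D grid-stride kernel (assumed shape of reset_info):
// each (block, thread) pair starts at its global index and advances by the
// total number of launched threads. Returns the write count per element.
std::vector<int> simulate_strided_reset(std::int64_t n, std::int64_t nblocks, std::int64_t nthreads)
{
    std::vector<int> writes(static_cast<std::size_t>(n), 0);
    const std::int64_t stride = nblocks * nthreads;
    for(std::int64_t b = 0; b < nblocks; ++b)
        for(std::int64_t t = 0; t < nthreads; ++t)
            for(std::int64_t i = b * nthreads + t; i < n; i += stride)
                ++writes[static_cast<std::size_t>(i)];
    return writes;
}

// True when every element was written exactly once.
bool covers_exactly_once(const std::vector<int>& writes)
{
    return std::all_of(writes.begin(), writes.end(), [](int w) { return w == 1; });
}
```

With `n = 1000` and 32 threads per block, an unclamped launch would use 32 blocks. Capping the model at 2 blocks still writes every element exactly once, because each thread simply takes more strides.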
Changes:
- Added `reset_info_nblocks` helper (and `<algorithm>` include) in `lib_device_helpers.hpp`.
- Replaced open-coded ceil-division block counts at ~37 `reset_info` launch sites across LAPACK, auxiliary, and refactor paths.
- Removed the per-launch `hipGetDeviceProperties` clamp in `rocsolver_lacn2_template` in favor of the shared helper.
Reviewed changes
Copilot reviewed 49 out of 49 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| library/src/include/lib_device_helpers.hpp | Adds `reset_info_nblocks` helper and `<algorithm>` include. |
| library/src/include/lapack_device_functions.hpp | lacn2 switches from device-property clamp to helper. |
| library/src/auxiliary/rocauxiliary_bdsqr.hpp | Replaces `batch_count/BS1 + 1` with `reset_info_nblocks(batch_count + 1, BS1)`. |
| library/src/auxiliary/rocauxiliary_bdsvdx.hpp | Uses helper for both info and Dtgk resets. |
| library/src/auxiliary/rocauxiliary_gecon.hpp | Uses helper for rcond reset. |
| library/src/auxiliary/rocauxiliary_lasyf.hpp | Uses helper for kb/info reset. |
| library/src/auxiliary/rocauxiliary_stebz.hpp | Uses helper for reset grid. |
| library/src/auxiliary/rocauxiliary_stedc.hpp | Uses helper for reset grid. |
| library/src/auxiliary/rocauxiliary_stedcj.hpp | Uses helper for reset grid. |
| library/src/auxiliary/rocauxiliary_stedcx.hpp | Uses helper for reset grid. |
| library/src/auxiliary/rocauxiliary_stein.hpp | Uses helper for reset grid. |
| library/src/auxiliary/rocauxiliary_steqr.hpp | Uses helper for reset grid. |
| library/src/auxiliary/rocauxiliary_sterf.hpp | Uses helper for reset grid. |
| library/src/lapack/roclapack_gels.hpp / _outofplace.hpp | Uses helper for reset grid. |
| library/src/lapack/roclapack_gesdd.hpp | Uses helper in quick-return path. |
| library/src/lapack/roclapack_gesv.hpp / _outofplace.hpp | Uses helper for reset grid. |
| library/src/lapack/roclapack_gesvdj.hpp / _notransv.hpp | Uses helper in quick-return path. |
| library/src/lapack/roclapack_gesvdx.hpp / _notransv.hpp | Uses helper for info/nsv reset. |
| library/src/lapack/roclapack_getf2.hpp | Uses helper with `static_cast<I>(256)`. |
| library/src/lapack/roclapack_getrf.hpp | Uses helper in quick-return path. |
| library/src/lapack/roclapack_getri.hpp / _outofplace.hpp | Uses helper in quick-return path. |
| library/src/lapack/roclapack_posv.hpp | Uses helper for reset grid. |
| library/src/lapack/roclapack_potf2.hpp / roclapack_potrf.hpp | Uses helper for reset grid. |
| library/src/lapack/roclapack_potri.hpp | Uses helper in quick-return path. |
| library/src/lapack/roclapack_syev_heev.hpp and syevd/_dx/_dx_inplace/_dj/_jx_jx variants | Uses helper for reset grid. |
| library/src/lapack/roclapack_syevj_heevj.hpp | Uses helper at two sites; `+ 1` preserves prior `batch_count/BS1 + 1` semantics. |
| library/src/lapack/roclapack_sygv_hegv.hpp and sygvd/_dj/_dx/_dx_inplace/_j/_x variants | Uses helper for reset grid. |
| library/src/lapack/roclapack_sytf2.hpp / roclapack_sytrf.hpp | Uses helper in quick-return path. |
| library/src/lapack/roclapack_trtri.hpp | Uses helper for info reset. |
| library/src/refact/rocrefact_csrrf_splitlu.hpp | Uses helper for ptrU/ptrL/... resets. |
| library/src/refact/rocrefact_csrrf_sumlu.hpp | Uses helper with `n + 1` to match prior `n/BS1 + 1` semantics. |
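Several rows pass `count + 1` to the helper to preserve the old `count/BS1 + 1` result. The arithmetic behind that can be checked with the sketch below. It assumes the helper performs an ordinary ceil division before clamping, and `ceil_div`, `old_blocks`, and `same_for_range` are hypothetical names used only here.

```cpp
#include <cstdint>

// Ordinary ceiling division for non-negative a and positive b.
constexpr std::int64_t ceil_div(std::int64_t a, std::int64_t b)
{
    return (a + b - 1) / b;
}

// The open-coded formula named in the table: count / bs + 1.
constexpr std::int64_t old_blocks(std::int64_t count, std::int64_t bs)
{
    return count / bs + 1;
}

// Checks old_blocks(count, bs) == ceil_div(count + 1, bs) for 0 <= count <= max_count.
constexpr bool same_for_range(std::int64_t max_count, std::int64_t bs)
{
    for(std::int64_t count = 0; count <= max_count; ++count)
        if(old_blocks(count, bs) != ceil_div(count + 1, bs))
            return false;
    return true;
}

// Compile-time spot check for a block size of 32.
static_assert(same_for_range(4096, 32), "count/bs + 1 should equal ceil((count + 1)/bs)");
```

Writing `count = k*bs + r` with `0 <= r < bs`, the old formula gives `k + 1`, and `ceil((k*bs + r + 1)/bs)` is also `k + 1` because `1 <= r + 1 <= bs`. The two only diverge once the clamp to the grid limit takes effect.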
Codecov Report
❌ Patch coverage is
❌ Your project status has failed because the head coverage (77.83%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## develop #7914 +/- ##
========================================
Coverage 61.43% 61.43%
========================================
Files 2091 2091
Lines 358852 358857 +5
Branches 54266 54265 -1
========================================
+ Hits 220451 220457 +6
Misses 119672 119672
+ Partials 18729 18728 -1
Some rocSOLVER paths compute `reset_info` grid dimensions directly from problem size or batch count. A few of those launches can produce invalid HIP grid/workgroup configurations on AMD GPUs when the derived block count falls outside the valid range.

What changed
- Added a shared helper in `lib_device_helpers.hpp` to compute `reset_info` block counts with bounded thread/block dimensions:
  - threads: `[32, 1024]`
  - blocks: `[1, 64*1024 - 1]`
- Updated `reset_info` launch sites to use the helper instead of open-coded ceil-division formulas.

Where this is applied
- `lapack_device_functions.hpp` (`rocsolver_lacn2_template`)
- `reset_info` launch sites across the LAPACK, auxiliary, and refactor paths

Effect
- `reset_info` launches no longer request zero-block or oversized grid configurations.

Example
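A minimal sketch of what a clamping helper of this kind might look like, based only on the bounds stated above. The name `reset_info_nblocks_sketch`, its signature, and its return type are assumptions made for illustration; the actual helper in `lib_device_helpers.hpp` may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the shared helper (not the rocSOLVER implementation).
// Clamps the thread count to [32, 1024], ceil-divides the element count by it,
// then clamps the resulting block count to [1, 64*1024 - 1].
template <typename I>
constexpr std::uint32_t reset_info_nblocks_sketch(I count, I nthreads)
{
    const std::int64_t threads = std::clamp<std::int64_t>(nthreads, 32, 1024);
    // Treat negative counts as empty so the division below stays well defined.
    const std::int64_t n = std::max<std::int64_t>(count, 0);
    // Overflow-safe ceil division.
    const std::int64_t blocks = n / threads + (n % threads != 0 ? 1 : 0);
    return static_cast<std::uint32_t>(std::clamp<std::int64_t>(blocks, 1, 64 * 1024 - 1));
}
```

Under these assumptions a count of 0 yields 1 block rather than 0, and a very large count yields 65535 blocks, with the strided loop picking up the remaining elements. A call site would pass the result as the grid x-dimension of the `reset_info` launch.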
This pull request was created from Copilot chat.