Commit 9ed35ab
## Summary
This PR refactors the variadic device-side kernel helper `call_device`
using C++17 fold expressions and reorders its definition in
`Src/Base/AMReX_GpuLaunch.H`.
Specifically:
1. Moved the definition of `call_device` above `launch_global` so that
it is visible to ordinary unqualified lookup at the point where the
`launch_global` template is defined (the first phase of two-phase name
lookup).
2. Replaced the recursive template overloads of `call_device` with a
C++17 unary right fold over the comma operator (`(fs(), ...);`).
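To make the second change concrete, here is a minimal host-only sketch in plain C++17 (no `__device__` qualifiers) that contrasts the two styles. The names `call_recursive`, `call_fold`, `order_recursive`, and `order_fold` are hypothetical and are not the AMReX definitions; the sketch only illustrates that a comma fold runs the callables in the same left-to-right order as the recursive overloads.

```cpp
#include <string>
#include <utility>

namespace sketch {

// Recursive style: a base case plus an overload that peels off one callable.
template <class F>
void call_recursive(F&& f) { std::forward<F>(f)(); }

template <class F, class... Fs>
void call_recursive(F&& f, Fs&&... fs) {
    std::forward<F>(f)();
    call_recursive(std::forward<Fs>(fs)...);
}

// C++17 style: a unary right fold over the comma operator.
// The built-in comma operator evaluates its operands left to right.
template <class... Fs>
void call_fold(Fs&&... fs) {
    (std::forward<Fs>(fs)(), ...);
}

} // namespace sketch

// Hypothetical helpers that record the order in which the callables run.
inline std::string order_recursive() {
    std::string log;
    sketch::call_recursive([&] { log += 'a'; },
                           [&] { log += 'b'; },
                           [&] { log += 'c'; });
    return log;
}

inline std::string order_fold() {
    std::string log;
    sketch::call_fold([&] { log += 'a'; },
                      [&] { log += 'b'; },
                      [&] { log += 'c'; });
    return log;
}
```

Both helpers yield the same call order, so in this sketch the fold is a drop-in replacement for the pair of recursive overloads.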
## Additional background
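Currently, `launch_global` invokes `call_device(fs...)` before
`call_device` has been declared. Because the call is dependent, it is
resolved by **two-phase name lookup**: ordinary unqualified lookup sees
only the declarations visible at the template definition, so at
instantiation the name can be found only through its arguments.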
If a user passes multiple lambdas defined in the global or an anonymous
namespace (which is common in isolated unit tests or external mock
environments), their closure types are not associated with the namespace
that declares `call_device`, so Argument-Dependent Lookup (ADL) cannot
find it and compilation fails:
```cpp
auto f1 = [=] __device__ () { ... };
auto f2 = [=] __device__ () { ... };
auto f3 = [=] __device__ () { ... };
AMREX_LAUNCH_KERNEL_NOBOUND(1, 1, 0, 0, f1, f2, f3); // Fails to compile!
```
ERROR: call to function "call_device" that is neither visible in the
template definition nor found by argument-dependent lookup
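The ordering issue can be reproduced without CUDA. The following is a minimal host-only sketch under stated assumptions: `lib`, `call_all`, `launch`, and `run_three` are hypothetical stand-ins for the AMReX namespace, `call_device`, `launch_global`, and a user test, respectively. It shows the working arrangement, with the helper defined before its caller.

```cpp
namespace lib {

// Defined BEFORE its caller, so ordinary unqualified lookup at the
// template definition finds it, whatever the argument types are.
template <class... Fs>
void call_all(Fs... fs) { (fs(), ...); }

template <class F, class... Fs>
void launch(F f0, Fs... fs) {
    f0();
    // Dependent call. If call_all were declared only AFTER launch, this
    // name could be found solely by ADL, which searches the namespaces
    // associated with the argument types.
    call_all(fs...);
}

} // namespace lib

// The lambdas below are created in a function at global scope, so their
// closure types are associated with the global namespace, not with lib.
inline int run_three() {
    int count = 0;
    auto f1 = [&] { count += 1; };
    auto f2 = [&] { count += 10; };
    auto f3 = [&] { count += 100; };
    lib::launch(f1, f2, f3);
    return count;
}
```

Moving `call_all` below `launch` in this sketch should reproduce the same class of diagnostic on a conforming compiler, because ADL never looks inside `lib` for these closure types.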
This defect had not surfaced within AMReX because almost all internal
production code passes a **single** lambda to `AMREX_LAUNCH_KERNEL`, in
which case the recursive call to `call_device` is never instantiated and
the failing lookup never takes place.
The new C++17 fold expression handles both an empty parameter pack (the
fold evaluates to `void()`) and multiple lambdas, and it leaves the
signature of `launch_global` unchanged. The fix has been verified under
CUDA 12.0+.
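The empty-pack behaviour can be checked in isolation. This is a small host-only sketch, assuming C++17; `comma_fold_is_void` and `call_and_count` are hypothetical names introduced for illustration and do not appear in AMReX.

```cpp
#include <type_traits>
#include <utility>

// True when the comma fold over calls to the given callables has type void.
// For an empty pack, a unary fold over the comma operator is void().
template <class... Fs>
constexpr bool comma_fold_is_void =
    std::is_void_v<decltype((std::declval<Fs>()(), ...))>;

static_assert(comma_fold_is_void<>, "an empty comma fold has type void");

// Invokes every callable and counts the invocations.
// With no arguments the fold is a no-op and the count stays at zero.
template <class... Fs>
int call_and_count(Fs... fs) {
    int calls = 0;
    ((fs(), ++calls), ...);
    return calls;
}
```

For example, `call_and_count()` returns 0 and `call_and_count([] {}, [] {})` returns 2, so a single definition covers the case that previously needed a separate base overload.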
## Checklist
The proposed changes:
- [x] fix a bug or incorrect behavior in AMReX
- [ ] add new capabilities to AMReX
- [ ] change answers in the test suite to more than roundoff level
- [ ] are likely to significantly affect the results of downstream AMReX
users
- [ ] include documentation in the code and/or rst files, if appropriate
1 parent 61f7201 commit 9ed35ab
1 file changed
Lines changed: 9 additions & 8 deletions