BenWibking (Contributor) commented Sep 24, 2025

Summary

Adds an integer superaccumulator class that performs reproducible summation of non-negative single-precision floats on device. Inputs must be normal and finite (no subnormals, NaN, or Inf).
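To make the idea concrete, here is a minimal host-side sketch of an integer superaccumulator for inputs with those restrictions. It is an illustration under stated assumptions, not the class added by this PR: the name `FloatSuperAccumulator`, the 10-word layout, and the conversion in `value()` are all hypothetical. Each float is decomposed into its 24-bit significand and exponent, then added into a wide fixed-point integer whose least significant bit represents 2^-149. Integer addition is exact and order-independent, so the stored words are identical for any summation order.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical illustration; not the class introduced by this PR.
struct FloatSuperAccumulator {
    // 10 limbs carrying 32 payload bits each, stored in 64-bit words so that
    // carries can be deferred (safe for fewer than 2^31 additions).
    std::array<std::uint64_t, 10> limbs{};

    // Adds a non-negative, normal, finite float exactly.
    void add(float x) {
        std::uint32_t bits;
        std::memcpy(&bits, &x, sizeof bits);
        const std::uint32_t exponent = (bits >> 23) & 0xFFu;
        // Precondition: sign bit clear, exponent field in [1, 254].
        assert((bits >> 31) == 0 && exponent >= 1 && exponent <= 254);
        const std::uint64_t significand = (bits & 0x7FFFFFu) | 0x800000u;
        const unsigned shift = exponent - 1;  // x == significand * 2^(shift - 149)
        const std::uint64_t v = significand << (shift % 32);  // v < 2^55
        limbs[shift / 32] += v & 0xFFFFFFFFu;  // low 32 bits
        limbs[shift / 32 + 1] += v >> 32;      // remaining high bits
    }

    // Combines two partial sums; also exact and order-independent.
    void merge(const FloatSuperAccumulator& other) {
        for (std::size_t k = 0; k < limbs.size(); ++k) limbs[k] += other.limbs[k];
    }

    // Propagates carries, then converts to double. The result is deterministic
    // given the limbs, but this simple conversion is not guaranteed to be
    // correctly rounded in every case.
    double value() const {
        std::array<std::uint64_t, 10> n = limbs;
        std::uint64_t carry = 0;
        for (auto& w : n) {
            w += carry;
            carry = w >> 32;
            w &= 0xFFFFFFFFu;
        }
        double result = 0.0;
        for (int k = 9; k >= 0; --k) {
            result += std::ldexp(static_cast<double>(n[k]), 32 * k - 149);
        }
        return result;
    }
};
```

Calling `add` on the same values in any order leaves `limbs` bit-identical, which is the property that makes the sum reproducible. Supporting negative inputs or subnormals would need extra handling that this sketch omits.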

The GPU atomics code path is currently very slow when there are many summands.
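For readers unfamiliar with what an atomics-based accumulation path looks like, the following is a CPU-side analogue using `std::atomic`. It is a sketch under assumptions, not the PR's device code: `AtomicLimbs`, `atomic_add`, and `atomic_sum` are hypothetical names, and on a GPU the `fetch_add` calls would correspond to device atomic adds. Every thread updates the same small set of shared words, which is one plausible (unconfirmed) source of contention when the number of summands is large.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical shared accumulator: 10 words, each collecting 32-bit pieces.
using AtomicLimbs = std::array<std::atomic<std::uint64_t>, 10>;

// Adds one non-negative, normal, finite float. Integer fetch_add is
// commutative and associative, so the final words do not depend on the
// order in which threads arrive.
inline void atomic_add(AtomicLimbs& limbs, float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    assert((bits >> 31) == 0 && exponent >= 1 && exponent <= 254);
    const std::uint64_t significand = (bits & 0x7FFFFFu) | 0x800000u;
    const unsigned shift = exponent - 1;  // x == significand * 2^(shift - 149)
    const std::uint64_t v = significand << (shift % 32);
    limbs[shift / 32].fetch_add(v & 0xFFFFFFFFu, std::memory_order_relaxed);
    limbs[shift / 32 + 1].fetch_add(v >> 32, std::memory_order_relaxed);
}

// Splits the input across nthreads threads that all update the same words.
inline void atomic_sum(AtomicLimbs& limbs, const std::vector<float>& xs,
                       unsigned nthreads) {
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) {
        pool.emplace_back([&limbs, &xs, t, nthreads] {
            for (std::size_t i = t; i < xs.size(); i += nthreads) {
                atomic_add(limbs, xs[i]);
            }
        });
    }
    for (auto& th : pool) th.join();
}
```

A common way to reduce contention in designs like this is to accumulate into per-thread or per-block partial sums first and combine them afterwards; whether that applies to this PR is not stated in the conversation.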

Additional background

Partially addresses #4669.

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

BenWibking (Contributor, Author) commented

bwibking@moth:~/amrex/Tests/Numerics/IntSuperAccumulator> time ./main3d.hip.FLOAT.HIP.ex
Initializing AMReX (22.03-1419-gd364b6bd6475)...
Initializing HIP...
HIP initialized with 1 device.
AMReX (22.03-1419-gd364b6bd6475) initialized
Range [1.175494351e-38, 2.350988562e-38] with count 32768: reference=5.774813904e-34 accumulator=5.774813904e-34
Range [4.253529587e+37, 8.507059173e+37] with count 4: reference=2.543284763e+38 accumulator=2.543284763e+38
Range [9.99999996e-13, 0.001000000047] with count 32768: reference=16.3573494 accumulator=16.3573494
Range [9.999999747e-05, 1] with count 65536: reference=32755.99805 accumulator=32755.99805
Range [1, 1000000] with count 65536: reference=3.265710694e+10 accumulator=3.265710694e+10
Range [0.009999999776, 100000000] with count 131072: reference=6.547726926e+12 accumulator=6.547726926e+12
Constant accumulation: reference=25000 accumulator=25000
Total GPU global memory (MB): 65520
Free  GPU global memory (MB): 16162
[The         Arena] max space allocated (MB): 49140
[The         Arena] max space used      (MB): 39
[The Managed Arena] max space allocated (MB): 8
[The Managed Arena] max space used      (MB): 0
[The  Pinned Arena] max space allocated (MB): 8
[The  Pinned Arena] max space used      (MB): 0
AMReX (22.03-1419-gd364b6bd6475) finalized

real	0m14.501s
user	0m13.866s
sys	0m0.579s
bwibking@moth:~/amrex/Tests/Numerics/IntSuperAccumulator> time ./main3d.gnu.FLOAT.ex
Initializing AMReX (22.03-1419-gd364b6bd6475)...
AMReX (22.03-1419-gd364b6bd6475) initialized
Range [1.175494351e-38, 2.350988562e-38] with count 32768: reference=5.774813904e-34 accumulator=5.774813904e-34
Range [4.253529587e+37, 8.507059173e+37] with count 4: reference=2.543284763e+38 accumulator=2.543284763e+38
Range [9.99999996e-13, 0.001000000047] with count 32768: reference=16.3573494 accumulator=16.3573494
Range [9.999999747e-05, 1] with count 65536: reference=32755.99805 accumulator=32755.99805
Range [1, 1000000] with count 65536: reference=3.265710694e+10 accumulator=3.265710694e+10
Range [0.009999999776, 100000000] with count 131072: reference=6.547726926e+12 accumulator=6.547726926e+12
Constant accumulation: reference=25000 accumulator=25000
AMReX (22.03-1419-gd364b6bd6475) finalized

real	0m0.016s
user	0m0.010s
sys	0m0.005s
