Separate halo and MAC peer rank lists
Fully converge global tree when needed
Replace focusTransfer and macRefinement with multi-level LET updates and update LET until converged
Improve performance of segmented reductions with multiple threads per segment
Eliminate duplicated data structure for CPU and GPU particles data

Fixes

Add rocthrust and hipcub dependencies in CMake to fix HIP spack package build
Add custom MPI reduction to prevent overflows during initialization

06 Nov 13:54

sekelle

v0.95

5e12d29

v0.95 2025/10

LET improvements

Enforce global octree keys in LET
Find halos with tight interaction boxes
Unify MAC and halo flags

Performance improvements

Avoid duplication of particle data fields on the CPU and GPU
Find halos with a) tight and b) fp interaction boxes
Stackless DFS traversals
GPU-direct MPI communication for all LET parts, including globals
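
As an illustration of what a stackless depth-first traversal can look like, the following minimal sketch walks a tree in pre-order using only parent and sibling links instead of an explicit stack or recursion. The `FlatTree` layout and the `stacklessDfs` name are assumptions made for this example and are not taken from the library's actual data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flat tree layout (an assumption for this sketch).
struct FlatTree
{
    std::vector<int> firstChild;  // index of the first child, -1 for leaves
    std::vector<int> nextSibling; // index of the next sibling, -1 if none
    std::vector<int> parent;      // index of the parent, -1 for the root
};

// Visit nodes in depth-first pre-order without a stack or recursion.
// descend(node) decides whether the children of node are traversed.
// Node 0 is assumed to be the root. Returns the visited node indices.
template<class Descend>
std::vector<int> stacklessDfs(const FlatTree& tree, Descend&& descend)
{
    std::vector<int> visited;
    if (tree.parent.empty()) { return visited; }

    int node = 0;
    while (node != -1)
    {
        visited.push_back(node);

        // Go down if the node has children and the criterion allows it.
        if (tree.firstChild[node] != -1 && descend(node))
        {
            node = tree.firstChild[node];
            continue;
        }

        // Otherwise climb until a node with a next sibling is found.
        while (node != -1 && tree.nextSibling[node] == -1)
        {
            node = tree.parent[node];
        }
        if (node != -1) { node = tree.nextSibling[node]; }
    }
    return visited;
}
```

Because the traversal state is a single node index, no per-thread stack storage is needed, which is the usual motivation for this pattern on GPUs.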

Features and enhancements:

Replaced h5part with h5hut
Improved profiling framework, capturing additional information
Replaced gsl::span with std::span

13 Dec 08:58

sekelle

v0.93.1

4be3f10

HIP and Spack

HIP compatibility without requiring the source code to be hipified
CMake changes to allow easier integration with Spack

21 Nov 14:56

sekelle

v0.93

f56fc7c

Propagator library

Enhancements:

Separate library with a translation unit for each propagator to reduce compilation times

Fixes:

Prevent GPU kernel launches with 0 thread blocks, which became an issue with CUDA 12.6
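
A guard of this kind might look like the sketch below, written in plain C++ so that it runs without a GPU. The names `numBlocks` and `launchIfNonEmpty` are hypothetical; in real CUDA or HIP code the `launch` callable would wrap the actual `kernel<<<blocks, threads>>>(...)` call.

```cpp
#include <cassert>
#include <cstddef>

// Number of thread blocks needed to cover numElements
// with numThreads threads per block (rounded up).
constexpr std::size_t numBlocks(std::size_t numElements, std::size_t numThreads)
{
    return (numElements + numThreads - 1) / numThreads;
}

// Hypothetical launch wrapper: invokes launch(blocks, threads) only when
// the grid is non-empty. Returns true if the launch was performed.
template<class Launch>
bool launchIfNonEmpty(std::size_t numElements, std::size_t numThreads, Launch&& launch)
{
    assert(numThreads > 0);
    std::size_t blocks = numBlocks(numElements, numThreads);
    if (blocks == 0) { return false; } // nothing to do, skip the launch
    launch(blocks, numThreads);
    return true;
}
```

Centralizing the check in one wrapper means that empty inputs, for example a rank that temporarily owns no particles, never reach the runtime with a zero-sized grid.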

21 Nov 13:37

sekelle

v0.92

d0001fb

CUDA 12.5 compatibility

Fixes:

Full encapsulation of thrust::device_vector, because starting with CUDA 12.5 its header can no longer be included in .cpp files

21 Nov 13:34

sekelle

v0.91

1f29ba8

Dynamic LET surface refinement and node pruning

New features:

Refine LET resolution at surface after domain boundary changes
Prune LET nodes outside focus that exceed the LET resolution on the owning rank

20 Nov 17:33

sekelle

v0.90

26cbbc2

Hierarchical block time steps

New features:

Hierarchical block time stepping
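
For readers unfamiliar with the technique, the sketch below shows the generic idea behind power-of-two block time steps: each particle is assigned to a rung whose step is the largest `dtMax / 2^k` not exceeding its desired step, and a rung is only advanced on substeps that are multiples of its stride. This is a textbook-style illustration under assumed names (`timestepRung`, `isActive`), not a description of how this release implements it.

```cpp
#include <cassert>

// Smallest rung k such that dtMax / 2^k <= dt, capped at maxRung.
// Rung 0 takes the largest step, higher rungs take smaller steps.
int timestepRung(double dt, double dtMax, int maxRung)
{
    assert(dt > 0 && dtMax > 0 && maxRung >= 0);
    int    rung    = 0;
    double blockDt = dtMax;
    while (blockDt > dt && rung < maxRung)
    {
        blockDt *= 0.5;
        ++rung;
    }
    return rung;
}

// Substeps are counted in units of the smallest step dtMax / 2^maxRung.
// A rung is active when the substep index is a multiple of its stride
// 2^(maxRung - rung).
bool isActive(int rung, int substep, int maxRung)
{
    assert(rung >= 0 && rung <= maxRung);
    int stride = 1 << (maxRung - rung);
    return substep % stride == 0;
}
```

With `maxRung = 3`, one full cycle consists of 8 substeps: particles on rung 3 are updated on every substep, while particles on rung 0 are updated only on substep 0.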

20 Nov 17:17

sekelle

v0.82

61a9674

Ewald summation

New features:

Ewald summation on CPUs and GPUs for gravitational forces with periodic boundaries
New smoothing kernel for SPH: S49

Performance enhancements:

Improve tree refinement for remote LET nodes so that fewer remote nodes are needed to ensure a successful gravity traversal. This improves performance because less communication is required
injectKeys on GPUs. This tree resolution-enforcement mechanism is needed more frequently than previously thought, so it was ported to the GPU

Fixes:

Fix compilation issues with CUDA 12.4 related to thrust::device_vector

Releases: sphexa-org/sphexa

v0.96.2 2026/03

v0.96.1 2026/03

v0.96 2026/03
