Commit 20c77c5

Improve comments
1 parent 773da40 commit 20c77c5

1 file changed: +1 -3 lines changed
src/schemes/fluid/weakly_compressible_sph/rhs.jl

Lines changed: 1 addition & 3 deletions
@@ -161,13 +161,11 @@ end

 # Optimized version for WCSPH with `ContinuityDensity` in 3D,
 # which combines the velocity and density load into one wide load.
-# This is significantly faster on GPUs.
+# This is significantly faster on GPUs than the 4 individual loads of `extract_svector`.
 @inline function velocity_and_density(v, ::ContinuityDensity,
                                       ::WeaklyCompressibleSPHSystem{3}, particle)
     # Since `v` is stored as a 4 x N matrix, this aligned load extracts one column
     # of `v` corresponding to `particle`.
-    # As opposed to `extract_svector`, this will translate to a single wide load instruction
-    # on the GPU, which is faster than 4 separate loads.
     # Note that this doesn't work for 2D because it requires a stride of 2^n.
     vrho_particle = SIMD.vloada(SIMD.Vec{4, eltype(v)}, pointer(v, 4 * (particle - 1) + 1))
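
The offset `4 * (particle - 1) + 1` passed to `pointer` in the diff can be checked with plain Julia indexing. The following is only a minimal sketch, not code from the repository: the array contents, the `Float32` element type, and the helper `column_start` are assumptions made for illustration, and it uses ordinary scalar indexing instead of `SIMD.vloada`.

```julia
# Hypothetical stand-in for `v`: a 4 x N matrix whose columns hold
# (vx, vy, vz, rho) for one particle each. The values are made up.
N = 5
v = reshape(collect(1.0f0:Float32(4 * N)), 4, N)

# Linear index of the first entry of column `particle` (column-major, 1-based).
# `column_start` is an illustrative helper, not part of the package.
column_start(particle) = 4 * (particle - 1) + 1

particle = 3
i = column_start(particle)

# The four consecutive entries starting at `i` are exactly column `particle`.
vrho = ntuple(k -> v[i + k - 1], 4)
velocity = vrho[1:3]   # first three entries: velocity components
density = vrho[4]      # fourth entry: density

# With Float32 entries (an assumption), each column spans 16 bytes, so every
# column starts at a multiple of 16 bytes from the start of the array.
byte_offset = (i - 1) * sizeof(eltype(v))
```

Because each column is 4 elements long, every column start falls on a power-of-two boundary relative to the start of the array, which is what an aligned 4-wide load needs. If the 2D layout is 3 x N (two velocity components plus density, an assumption here), the stride is 3 elements and the column starts no longer share that alignment, which is consistent with the note in the diff that the trick does not work in 2D.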
