reduce seems slow #1107

Closed
@DiamonDinoia


Hi all! Thank you for developing this library. It is super useful in multiple projects!

I noticed something that might be an issue, but I am not sure.

I have code that multiplies a complex-valued array (a) by a real-valued array (ker).
Basically, each value of 'ker' is used twice: once for the real part and once for the imaginary part of an element of 'a'.
My code is as follows:

auto func(real a1, real a2, complex ker):
    // Zipping ker with itself halves the number of loads for ker; it is also why I pass a1 and a2 instead of a.
    const auto low = xsimd::zip_lo(ker, ker);
    const auto high = xsimd::zip_hi(ker, ker);
    const auto res0 = a1 * low;
    const auto res1 = a2 * high;
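
To make the zip trick concrete, here is a minimal scalar sketch of what the snippet above computes. It assumes that zip_lo(x, y) interleaves the lower halves of its arguments and zip_hi(x, y) the upper halves. `Batch`, `zip_lo_model`, `zip_hi_model`, and `mul_model` are hypothetical names used only for illustration; they are not xsimd types or functions.

```cpp
#include <array>
#include <cstddef>

// Scalar stand-in for a SIMD batch of N floats (hypothetical, not an xsimd type).
template <std::size_t N>
using Batch = std::array<float, N>;

// Model of zip_lo: interleave the lower halves, {x0, y0, x1, y1, ...}.
template <std::size_t N>
Batch<N> zip_lo_model(const Batch<N>& x, const Batch<N>& y) {
    Batch<N> out{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        out[2 * i]     = x[i];
        out[2 * i + 1] = y[i];
    }
    return out;
}

// Model of zip_hi: interleave the upper halves of x and y.
template <std::size_t N>
Batch<N> zip_hi_model(const Batch<N>& x, const Batch<N>& y) {
    Batch<N> out{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        out[2 * i]     = x[N / 2 + i];
        out[2 * i + 1] = y[N / 2 + i];
    }
    return out;
}

// Lane-wise product, standing in for `a1 * low`.
template <std::size_t N>
Batch<N> mul_model(const Batch<N>& a, const Batch<N>& b) {
    Batch<N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = a[i] * b[i];
    }
    return out;
}
```

With ker = {k0, k1, k2, k3}, zipping ker with itself gives {k0, k0, k1, k1} and {k2, k2, k3, k3}, so one load of ker covers the real and imaginary lanes of two batches of interleaved complex data.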

What I noticed is that the current implementation of reduce_add can be optimized on my machine. Is it possible to have a split function that returns the low and high halves of a batch? By doing split + add multiple times, my code is 7 times faster.
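
The "split + add" idea can be sketched in plain C++ as a halving reduction. This is only a scalar model under my own assumptions, not the code from the benchmarks and not an xsimd API: `reduce_add_model` is a hypothetical name, and each pass of the outer loop stands in for one "split into low and high halves, then add them" step on a register.

```cpp
#include <array>
#include <cstddef>

// Scalar model of a horizontal sum done by repeated split + add.
// Each pass adds the upper half of the active lanes onto the lower half,
// so N lanes are reduced in log2(N) passes.
template <std::size_t N>
float reduce_add_model(const std::array<float, N>& v) {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    std::array<float, N> work = v;
    for (std::size_t width = N / 2; width > 0; width /= 2) {
        for (std::size_t i = 0; i < width; ++i) {
            work[i] += work[i + width];  // low half += high half
        }
    }
    return work[0];
}
```

For 8 lanes this takes three passes (8 to 4, 4 to 2, 2 to 1). Note that the additions happen in a different order than a left-to-right loop, so floating-point results can differ slightly from a sequential sum.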

I have pushed the benchmarks here:
https://github.com/DiamonDinoia/cpp-learning/tree/master/xsimd

They give the following results:

ns/op            op/s  err%  ins/op  cyc/op    IPC  bra/op  miss%  total  benchmark
 6.96  143,690,879.59  0.6%   19.00   21.47  0.885    0.00   0.0%   0.01  add+store
 2.31  432,949,727.65  0.6%   24.00    7.11  3.374    0.00   0.0%   0.01  hsum
 3.81  262,211,901.24  0.1%   36.00   11.75  3.064    2.00   0.0%   0.01  reduce_add
 2.59  385,491,672.62  0.2%   20.00    7.99  2.503    0.00   0.0%   0.01  union pun
 1.18  846,618,297.70  0.9%   17.00    3.64  4.672    0.00   0.0%   0.01  double union pun

I tweaked master a bit in https://github.com/DiamonDinoia/xsimd/tree/hadd-tweaks
and I got:

ns/op            op/s  err%  ins/op  cyc/op    IPC  bra/op  miss%  total  benchmark
 7.00  142,933,991.35  0.9%   19.00   21.50  0.884    0.00   0.0%   0.01  add+store
 2.27  439,741,444.70  0.9%   24.00    6.99  3.434    0.00   0.0%   0.01  hsum
 2.99  334,267,996.40  1.5%   36.00    9.15  3.935    2.00   0.0%   0.01  reduce_add
 2.09  478,101,632.03  1.2%   28.00    6.44  4.346    2.00   0.0%   0.01  union pun
 1.05  956,625,856.43  1.6%   17.00    3.21  5.289    0.00   0.0%   0.01  double union pun

Thanks,
Marco
