wasm: __builtin_reduce_and does not optimize well #129441

Open
@folkertdev

Given this C code:

https://godbolt.org/z/YMo1qqccT

#include <stdbool.h>
#include <wasm_simd128.h>

bool foo(v128_t a) { return wasm_i8x16_all_true(a); }

bool bar(v128_t a) {
    v128_t zero = wasm_i8x16_splat(0);
    return __builtin_reduce_and(wasm_i8x16_ne(a, zero));
}

bool baz(v128_t a) {
    v128_t zero = wasm_i8x16_splat(0);
    return __builtin_reduce_and((a != zero));
}

I'd expect these all to optimize to

foo:
        local.get       0
        i8x16.all_true
        end_function

or some variation of it. However, the other two variants optimize much worse:

bar:
        local.get       0
        v128.const      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        i8x16.ne
        local.tee       0
        local.get       0
        local.get       0
        i8x16.shuffle   8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 0, 1, 2, 3
        v128.and
        local.tee       0
        local.get       0
        local.get       0
        i8x16.shuffle   4, 5, 6, 7, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3
        v128.and
        i32x4.extract_lane      0
        i32.const       0
        i32.ne  
        end_function

baz:
        local.get       0
        v128.const      0, 0, 0, 0
        i32x4.eq
        v128.any_true
        i32.const       -1
        i32.xor 
        i32.const       1
        i32.and 
        end_function

Binary size is especially important for wasm, and it looks like __builtin_reduce_and just does not optimize well (I suspect the same is true for __builtin_reduce_or).

s390x has the same limitation (#129434), so maybe some work can be shared between backends?

This came up while working on the Rust standard library, which would rather use the generic implementation of these operations than a target-specific one.
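
For reference, a minimal scalar sketch of the equivalence the backend would need to recognize, assuming the usual Wasm SIMD lane semantics (i8x16.ne yields 0xFF for a lane that differs and 0x00 otherwise, and the reduction result is converted to bool by comparing with zero). The `model_*` and `check_models` names are hypothetical helpers for illustration, not LLVM or wasm_simd128.h APIs. It only models the 8-bit-lane case of foo and bar; the baz listing uses i32x4.eq, so the comparison there appears to operate on 32-bit lanes instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 16 };

/* Model of foo: i8x16.all_true is true iff every 8-bit lane is non-zero. */
static bool model_all_true(const uint8_t a[LANES]) {
    for (int i = 0; i < LANES; i++) {
        if (a[i] == 0) return false;
    }
    return true;
}

/* Model of bar: lane-wise "ne zero" gives 0xFF or 0x00 per lane, the lanes
   are AND-reduced, and the reduced byte is converted to bool. */
static bool model_reduce_and_ne(const uint8_t a[LANES]) {
    uint8_t acc = 0xFF;
    for (int i = 0; i < LANES; i++) {
        acc &= (uint8_t)(a[i] != 0 ? 0xFF : 0x00);
    }
    return acc != 0;
}

/* The analogous pair for the OR case: v128.any_true vs. an OR-reduction. */
static bool model_any_true(const uint8_t a[LANES]) {
    for (int i = 0; i < LANES; i++) {
        if (a[i] != 0) return true;
    }
    return false;
}

static bool model_reduce_or_ne(const uint8_t a[LANES]) {
    uint8_t acc = 0x00;
    for (int i = 0; i < LANES; i++) {
        acc |= (uint8_t)(a[i] != 0 ? 0xFF : 0x00);
    }
    return acc != 0;
}

/* Asserts that each pair agrees on a handful of edge cases. */
static void check_models(void) {
    uint8_t v[LANES];

    memset(v, 0x00, sizeof v); /* all lanes zero */
    assert(!model_all_true(v) && !model_reduce_and_ne(v));
    assert(!model_any_true(v) && !model_reduce_or_ne(v));

    memset(v, 0x01, sizeof v); /* all lanes non-zero */
    assert(model_all_true(v) && model_reduce_and_ne(v));
    assert(model_any_true(v) && model_reduce_or_ne(v));

    for (int i = 0; i < LANES; i++) {
        memset(v, 0x01, sizeof v);
        v[i] = 0x00; /* exactly one zero lane */
        assert(!model_all_true(v) && !model_reduce_and_ne(v));

        memset(v, 0x00, sizeof v);
        v[i] = 0x80; /* exactly one non-zero lane */
        assert(model_any_true(v) && model_reduce_or_ne(v));
    }
}
```

Calling `check_models()` from a test harness exercises both pairs. This is a spot check on edge cases, not a proof, but it shows why folding the compare-plus-reduce pattern into a single all_true or any_true instruction would preserve behavior.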
