
Question about parallelization-strategy option #48

Open
@vmiheer

Description

  1. lapis seems to expect the default parallelization strategy; selecting any alternative option causes a lowering error.

Steps to reproduce:

gh gist clone https://gist.github.com/vmiheer/06c23a25e37e69f3de05c9d031e1512f lapis-bspmm
cmake --workflow --preset=default-kokkos
  2. Orthogonal question: the default value of the option seems to be "always parallel," which is the opposite of the default chosen by mlir-opt, which is "always serial." Always-serial is conservative but always produces correct code. The other options appear to act as directives rather than suggestions, so the generated code makes every loop parallel (even loops whose iterators are reduction iterators), which produces incorrect output due to data races. A compiler should be correct before it is performant, so perhaps the default parallelization strategy should be serial, and either (a) the user or (b) a heuristic in the compiler can selectively make some loops parallel.

To reproduce, download and extract the zip:

OMP_NUM_THREADS=1 ./localbuild/dump_partitions.part_tensor -nh 2 -dh 4 -i banded_11_r2.coordinates.bin --ntimes 0 --local-only # avoid data race
# In another shell open python with dgl installed
./a.py # The return code will be 0
for i in `seq 1 3`; do 
  # running 3 times so that data race would definitely happen
  ./localbuild/dump_partitions.part_tensor -nh 2 -dh 4 -i banded_11_r2.coordinates.bin --ntimes 0 --local-only # let omp do its thing
  ./a.py banded_11_r2.coordinates.bin 11 4 2 # The script will dump all tensors and difference between expected/observed
done

inputs.tar.gz
