Improve the alpaka-based prefix scan #47331

Open
@fwyzard

Description

Currently the prefix scan logic is scattered and duplicated across many files, e.g.

  • HeterogeneousCore/AlpakaInterface/interface/prefixScan.h (obviously)
  • RecoLocalTracker/SiPixelClusterizer/plugins/alpaka/ClusterChargeCut.h
  • RecoLocalTracker/SiPixelClusterizer/plugins/alpaka/SiPixelRawToClusterKernel.dev.cc
  • etc.

We should implement a single prefixscan(acc, ...) function that handles arbitrarily sized inputs and automatically splits the work into multiple passes, as needed.
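
To illustrate what "multiple passes" could mean here, below is a minimal host-only sketch in plain C++, not alpaka code and not the existing CMSSW implementation. The name `blockedInclusiveScan` and the `blockSize` parameter are hypothetical; `blockSize` stands in for the number of elements a single device block could scan in one go. The sketch scans each chunk independently, scans the per-chunk totals (recursing when there are too many totals for a single chunk), and then adds the resulting offsets back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CMSSW or alpaka): inclusive prefix sum of
// `in`, computed in chunks of `blockSize` elements to mimic a multi-pass scan.
inline std::vector<int> blockedInclusiveScan(const std::vector<int>& in, std::size_t blockSize) {
  assert(blockSize >= 2);  // a chunk size of 1 would never reduce the problem
  const std::size_t n = in.size();
  const std::size_t numBlocks = (n + blockSize - 1) / blockSize;
  std::vector<int> out(n);
  std::vector<int> blockTotals(numBlocks);

  // Pass 1: scan every chunk independently and record its total.
  for (std::size_t b = 0; b < numBlocks; ++b) {
    const std::size_t begin = b * blockSize;
    const std::size_t end = (begin + blockSize < n) ? begin + blockSize : n;
    int running = 0;
    for (std::size_t i = begin; i < end; ++i) {
      running += in[i];
      out[i] = running;
    }
    blockTotals[b] = running;
  }

  // A single chunk needs no further passes.
  if (numBlocks <= 1) {
    return out;
  }

  // Pass 2: scan the chunk totals with the same routine; this recursion is
  // what splits arbitrarily large inputs into as many passes as needed.
  const std::vector<int> scannedTotals = blockedInclusiveScan(blockTotals, blockSize);

  // Pass 3: add to each chunk the sum of all chunks that precede it.
  for (std::size_t b = 1; b < numBlocks; ++b) {
    const std::size_t begin = b * blockSize;
    const std::size_t end = (begin + blockSize < n) ? begin + blockSize : n;
    for (std::size_t i = begin; i < end; ++i) {
      out[i] += scannedTotals[b - 1];
    }
  }
  return out;
}
```

On a device back-end, passes 1 and 3 would presumably map to one block per chunk, while the recursion depth grows only logarithmically with the input size.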

It should also use the compile-time warp size (if available), and fall back to a simple loop for single-threaded back-ends.
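
A possible shape for the compile-time dispatch, again as a plain C++17 sketch and not alpaka code: the template parameter `WarpSize` is a hypothetical stand-in for whatever compile-time warp size the back-end exposes, with `1` modelling a single-threaded back-end. The name `warpInclusiveScan` is made up for this example, and the "lanes" are emulated with a host loop where a GPU back-end would use warp shuffles.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: in-place inclusive scan of up to WarpSize elements.
// WarpSize == 1 models a single-threaded back-end.
template <std::size_t WarpSize>
void warpInclusiveScan(int* data, std::size_t n) {
  if constexpr (WarpSize == 1) {
    // Single-threaded fallback: a simple sequential loop.
    for (std::size_t i = 1; i < n; ++i) {
      data[i] += data[i - 1];
    }
  } else {
    assert(n <= WarpSize);  // one element per emulated lane
    // Hillis-Steele scan: the stride doubles at each step. Lanes are visited
    // from high to low so that every lane reads its neighbour's value from
    // the previous step, as a warp shuffle would.
    for (std::size_t offset = 1; offset < WarpSize; offset *= 2) {
      for (std::size_t lane = n; lane > offset; --lane) {
        data[lane - 1] += data[lane - 1 - offset];
      }
    }
  }
}
```

Because the branch is selected with `if constexpr`, the serial instantiation contains no trace of the warp-level code, and the stride loop has a trip count known at compile time.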
