Implement cross-device reductions using Jet #57

@mlxd

Description

Feature description

Jet currently allows a given tensor network to be sliced so that its contraction can be distributed over multiple devices: each device contracts its part of the network, whether on CPU cores (BLAS-backed contractions) or on GPUs (cuTENSOR-backed contractions). However, no mechanism yet exists for reductions across different device types. The goal is to implement a reduction task that enables efficient CPU-GPU and GPU-GPU contractions.

  • Context: Enable reductions across mixed devices for tensor network contractions.

  • Factors: Performance should be equal to or better than that of a single device alone.

Tasks:

  • Implement a reduction task for CPU-GPU and GPU-GPU contractions based on the current main branch.
  • Add tests to verify the reduction is performed correctly.
  • Compare runtimes against the CPU-only (default) backend.
