Open
Description
We need to document the parallel systems that GraphBLAS must be able to support. This would include:
- Multi-core, multi-CPU in a shared address space. Explicit management of the NUMA features of a system is critical.
- Single GPU ... basic host/device model with disjoint host/device memories and Unified Shared Memory (USM)
- Multi-GPU ... host/device model with disjoint memories and USM
- Arbitrary accelerators instead of GPUs (aside: an accelerator is restricted to a fixed API, unlike a GPU, which is programmable)
- Shared-nothing distributed systems with nodes composed of the above
We need a platform model that appropriately abstracts systems composed of these components. It must handle the complexity of the various memory spaces and support arbitrary, dynamic partitioning of those systems.
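To make the discussion concrete, here is a rough sketch of what such a platform model might look like as a data structure. It is only an illustration, not a proposal for the GraphBLAS API: every name in it (`DeviceKind`, `MemorySpace`, `Device`, `Node`, `Platform`, `partition`, the `host_visible` flag) is hypothetical, and it assumes a partition can be described as a predicate over devices.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Tuple


class DeviceKind(Enum):
    """Hypothetical classification of the compute resources listed above."""
    CPU = auto()          # one NUMA domain of a multi-core, multi-CPU node
    GPU = auto()          # programmable device
    ACCELERATOR = auto()  # device restricted to a fixed API


@dataclass(frozen=True)
class MemorySpace:
    name: str
    host_visible: bool  # assumption: True for host memory and USM, False for device-only memory


@dataclass(frozen=True)
class Device:
    name: str
    kind: DeviceKind
    memory: MemorySpace


@dataclass
class Node:
    """One shared-address-space node of a shared-nothing system."""
    name: str
    devices: List[Device] = field(default_factory=list)


@dataclass
class Platform:
    nodes: List[Node] = field(default_factory=list)

    def all_devices(self) -> List[Tuple[str, Device]]:
        """Flatten the platform into (node name, device) pairs."""
        return [(n.name, d) for n in self.nodes for d in n.devices]

    def partition(self, keep: Callable[[Device], bool]) -> "Platform":
        """Return a new Platform holding only the devices for which keep() is true."""
        parts = []
        for n in self.nodes:
            kept = [d for d in n.devices if keep(d)]
            if kept:  # drop nodes that contribute nothing to this partition
                parts.append(Node(n.name, kept))
        return Platform(parts)


# Example: two nodes, each with one NUMA domain and one GPU with disjoint memory.
def make_node(i: int) -> Node:
    host = MemorySpace(f"node{i}-numa0", host_visible=True)
    gpu_mem = MemorySpace(f"node{i}-gpu0-mem", host_visible=False)
    return Node(f"node{i}", [
        Device(f"node{i}-cpu0", DeviceKind.CPU, host),
        Device(f"node{i}-gpu0", DeviceKind.GPU, gpu_mem),
    ])


example = Platform([make_node(0), make_node(1)])
gpus_only = example.partition(lambda d: d.kind is DeviceKind.GPU)
```

Because `partition` returns another `Platform`, partitions can be taken again at runtime, which is one possible reading of "arbitrary, dynamic" above. A real model would also need to capture NUMA distances and interconnects, which this sketch leaves out.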
Finally, we need a way to deal with nonblocking GraphBLAS operations as part of a larger execution context that supports asynchronous execution. I will add a separate issue for this topic.