[RFC] Cross-Platform Refactor: CPU-only implementation #1021

### Motivation

To make this library portable, the first step would be to make 100% of it run correctly on CPU only (i.e. without requiring CUDA for any part of the functionality). This would serve two purposes:

- Provide a baseline that contributors of ports can reference
- Provide a fallback for partially implemented hardware platforms


### Proposed solution

- [ ] Implement all the CUDA kernels in "normal" C++
- [ ] Make sure the unit tests all run on the CPU as well
- [ ] Make sure unit test coverage is satisfactory
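
To make the first item concrete, here is a rough sketch of what a CUDA kernel rewritten in plain C++ could look like. It is not taken from the library: the function name `quantize_blockwise_ref`, its signature, and the choice of blockwise absmax int8 quantization are all assumptions made for illustration. The idea is that the outer loop stands in for the CUDA block index and the inner loops stand in for the threads of one block.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical CPU reference for a blockwise absmax int8 quantization kernel.
// Names and signature are illustrative only, not the library's actual API.
void quantize_blockwise_ref(const float* input, std::size_t n,
                            std::size_t block_size,
                            std::int8_t* out, float* absmax) {
    assert(block_size > 0);
    const std::size_t num_blocks = (n + block_size - 1) / block_size;

    // Outer loop: plays the role of blockIdx.x in a CUDA launch.
    for (std::size_t b = 0; b < num_blocks; ++b) {
        const std::size_t begin = b * block_size;
        const std::size_t end = std::min(begin + block_size, n);

        // Reduction over the block (a shared-memory reduction on the GPU).
        float block_max = 0.0f;
        for (std::size_t i = begin; i < end; ++i) {
            block_max = std::max(block_max, std::fabs(input[i]));
        }
        absmax[b] = block_max;

        // Scale each element into [-127, 127]; an all-zero block maps to zeros.
        const float scale = block_max > 0.0f ? 127.0f / block_max : 0.0f;
        for (std::size_t i = begin; i < end; ++i) {
            out[i] = static_cast<std::int8_t>(std::lround(input[i] * scale));
        }
    }
}
```

A reference like this could double as the oracle in unit tests: run the same inputs through the CPU path and a device path and compare the results within a tolerance.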


### Open questions

- Which CPU architectures do we support (x86_64 and arm64 are givens, but are there any others)?
- How do we deal with SIMD intrinsics? Build separate libraries for each SIMD architecture? Or run-time selection based on CPU features?
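
For the SIMD question, the run-time selection option could look roughly like the sketch below. It is one possible shape, not a proposal for the final design: `add_f32`, `add_scalar`, `add_avx2`, `select_add_impl` and the `SKETCH_HAVE_AVX2_PATH` macro are hypothetical names, and the AVX2 path assumes GCC or Clang on x86_64 (it relies on the `target` function attribute and `__builtin_cpu_supports`). On any other compiler or architecture, only the scalar fallback is compiled.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the AVX2 path is only built with GCC/Clang on x86_64.
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAVE_AVX2_PATH 1
#else
#define SKETCH_HAVE_AVX2_PATH 0
#endif

// Portable scalar fallback, always available.
static void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if SKETCH_HAVE_AVX2_PATH
// Compiled with AVX2 enabled for this function only, so the rest of the
// library can still be built for a baseline x86_64 target.
__attribute__((target("avx2")))
static void add_avx2(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        const __m256 va = _mm256_loadu_ps(a + i);
        const __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  // scalar tail
}
#endif

using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Queries CPU features and picks the best available implementation.
static AddFn select_add_impl() {
#if SKETCH_HAVE_AVX2_PATH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return add_avx2;
#endif
    return add_scalar;
}

// Public entry point: the selection runs once, on the first call.
void add_f32(const float* a, const float* b, float* out, std::size_t n) {
    static const AddFn impl = select_add_impl();
    assert(impl != nullptr);
    impl(a, b, out, n);
}
```

The trade-off against building separate libraries per SIMD level is a single binary with one indirect call per kernel invocation, at the cost of per-function target attributes (or per-file compiler flags) in the build.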

@Titus-von-Koeller Feel free to edit this issue as you see fit, for example if you want a different structure for it.
