Open
Description
Would it make sense for this library to support platforms other than cuda on x64 Linux? I am specifically looking for Apple silicon support. Currently not even cpuonly works since it assumes SSE2 support (Even without Neon. Support).
i would guess that the first step would be a full cross platform compile (arm64), then ideally support for Metal Performance Shaders as an alternative to CUDA (assuming it is at all feasible).
I could probably contribute some towards support if there is interest for bitsandbytes to be multi platform. I have some experience setting up cross platform Python libraries.