refactor: unify CuPy/buffer-protocol path for array and sparse Csr #104
Open
rho-novatron wants to merge 1 commit into
…nstructors
Move the duplicated __cuda_array_interface__ / buffer-protocol conversion logic from array.cpp and matrix.cpp into a single gko_array_from_pyobject<T>() template in utils.hpp. Both the gko::array(Executor, py::object) constructor and the sparse-matrix (Executor, py::object) constructor now delegate to this helper, eliminating ~260 lines of duplicated CUDA/host branching.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR-A: refactor: unify CuPy/buffer-protocol path for array and sparse Csr
Branch (fork): rho-novatron:refactor/array-helper
Base (upstream): Helmholtz-AI-Energy:main
Commits: 1 (17b586d)
Summary
Pulls the __cuda_array_interface__ / Python-buffer-protocol input handling out of array.cpp and matrix.cpp and into a single helper in utils.hpp, the gko_array_from_pyobject<T>() template.
Both the gko::array(Executor, py::object) constructor and the sparse Csr(Executor, scipy/cupyx csr_matrix) constructor now go through it.
Why this isn't just a refactor
The pre-existing sparse-matrix constructor took two structurally different paths depending on input type:
- CuPy input on CudaExecutor: zero-copy via __cuda_array_interface__ (added in #98, "Add CuPy zero-copy interoperability via __cuda_array_interface__").
- NumPy/scipy input: a fallback that built gko::array<...>::view(...) on a ReferenceExecutor and handed those host arrays to Csr::create(exec, ...), forcing a second host→device copy when the executor was CUDA.
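For readers less familiar with CSR, the arrays handed to the constructor are presumably the usual three flat buffers: nonzero values, column indices, and row pointers (scipy exposes them as data, indices, and indptr). The sketch below builds them for a small matrix using only the standard library; dense_to_csr is a hypothetical helper written for this illustration and is not part of the bindings.

```python
import array


def dense_to_csr(rows):
    """Build CSR buffers (values, column indices, row pointers) from dense rows."""
    values = array.array("d")
    col_idxs = array.array("i")
    row_ptrs = array.array("i", [0])
    for row in rows:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_idxs.append(j)
        # row_ptrs[k + 1] marks where row k ends in values / col_idxs.
        row_ptrs.append(len(values))
    return values, col_idxs, row_ptrs


values, col_idxs, row_ptrs = dense_to_csr([
    [1, 0, 2],
    [0, 0, 3],
    [4, 5, 0],
])

# Each buffer supports the buffer protocol, so each one could be passed
# through a single conversion routine independently of the others.
views = [memoryview(buf) for buf in (values, col_idxs, row_ptrs)]
print([v.format for v in views])
```

Because every one of the three buffers goes through the same conversion, any extra copy in that conversion is paid three times per matrix.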
After this PR, both paths flow through gko_array_from_pyobject, which:
- honors exec for the buffer-protocol path (constructs the array directly on the requested executor instead of via a Reference detour),
- extends the typestr validation that previously only the CUDA branch performed to the host path as well, and
- gives array and Csr a single source of truth, so future fixes (e.g. the cross-platform integer-dtype-char fix in the follow-up MPI PR) automatically apply to both.
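One way to picture a shared element-type check is sketched below. It derives a typestr-like code (byte order, kind, item size) from a host buffer so that it can be compared the same way as the typestr field of __cuda_array_interface__. How the C++ helper actually does this is not shown here; typestr_from_view and typestr_matches are hypothetical names. The sketch uses the buffer's item size rather than trusting the format character alone, because a character such as "l" has different widths on different platforms.

```python
import array
import sys

# Map struct-style format characters to typestr kind codes.
_KIND = {**dict.fromkeys("bhilqn", "i"),
         **dict.fromkeys("BHILQN", "u"),
         **dict.fromkeys("efd", "f")}


def typestr_from_view(view):
    """Derive a typestr such as '<f8' from a memoryview of scalars."""
    fmt = view.format
    code = fmt.lstrip("@=<>!")
    if code not in _KIND:
        raise TypeError(f"unsupported buffer format {fmt!r}")
    if view.itemsize == 1:
        order = "|"  # byte order is irrelevant for single bytes
    elif fmt.startswith((">", "!")):
        order = ">"
    elif fmt.startswith("<"):
        order = "<"
    else:
        order = "<" if sys.byteorder == "little" else ">"
    return f"{order}{_KIND[code]}{view.itemsize}"


def typestr_matches(typestr, kind, itemsize):
    """Check a typestr against the expected kind and size in native byte order."""
    native = "<" if sys.byteorder == "little" else ">"
    return (typestr[0] in ("|", native)
            and typestr[1] == kind
            and typestr[2:] == str(itemsize))


doubles = memoryview(array.array("d", [1.0, 2.0]))
longs = memoryview(array.array("l", [1, 2]))
print(typestr_from_view(doubles), typestr_from_view(longs))
```

With a check like this, an int32 buffer offered where float64 is expected is rejected up front on both the host and the device path instead of being reinterpreted.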
Behavior change for sparse matrices on CUDA built from NumPy/scipy input: they are now allocated and copied directly on the target device executor instead of through a host ReferenceExecutor intermediate.
Diff
Why this is its own PR
- … feature work that follows in the stacked PR-B.
- … helper: much easier to review when the helper itself has already been reviewed in isolation.
Testing
The existing test suite covers all the input shapes: NumPy/scipy CSR on ReferenceExecutor, CuPy CSR zero-copy on CudaExecutor, and the dense-array constructors. No regressions observed locally.
Stacked PR
Not stacked on anything (based on main). PR-B for the distributed MPI bindings is stacked on top of this.