@ekatralis ekatralis commented Dec 2, 2025

Description

This pull request adds sparse solvers to xobjects via the xo.sparse module. This functionality is important for extending xfields to include Finite Difference solvers.

The xo.sparse module can be used to solve sparse linear systems of equations:
A*x = b
where A is a sparse matrix.

Currently only CPU and Cupy contexts are supported. This module contains a variety of solvers for different contexts, with consistent APIs. The intended use is to reuse the same LHS for many solves, so the solvers work as follows:

lu = solver(A)   # Performs decomposition/factorization
x = lu.solve(b)  # Solves A*x = b using the precomputed factors
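As an illustration of this factorize-once, solve-many pattern, here is a minimal sketch that uses SciPy's `splu` directly rather than the `xo.sparse` wrappers. The tridiagonal test matrix, its size, and the random right-hand sides are assumptions chosen only for the example.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Illustrative 1D finite-difference (Poisson-like) matrix, stored as CSC,
# which is the format SuperLU works with natively.
n = 6
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")

# Factorize once: this is the expensive step.
lu = splu(A)

# Reuse the factors for several right-hand sides.
rng = np.random.default_rng(0)
residuals = []
for _ in range(3):
    b = rng.standard_normal(n)
    x = lu.solve(b)  # cheap triangular solves only
    residuals.append(float(np.max(np.abs(A @ x - b))))
```

Each entry of `residuals` is the largest absolute component of `A*x - b` for one solve, so it should be close to machine precision.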

For best performance across backends, b should be a vector or a column-major (F-contiguous) array.
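A short NumPy sketch of what column-major layout means for a block of right-hand sides. The array shapes and values are arbitrary, and nothing here is specific to `xo.sparse`.

```python
import numpy as np

# Three right-hand sides of length 4, one per column.
B_c = np.arange(12, dtype=np.float64).reshape(4, 3)  # NumPy default: row-major (C order)
B_f = np.asfortranarray(B_c)                         # same values, column-major (F order)

c_is_f = bool(B_c.flags.f_contiguous)   # False: columns are strided in memory
f_is_f = bool(B_f.flags.f_contiguous)   # True: each column is one contiguous chunk

# In F order, a single right-hand side (one column) is contiguous in memory.
col_contiguous = bool(B_f[:, 0].flags.c_contiguous)

# A 1D vector is both C- and F-contiguous, so it needs no conversion.
b_vec = np.ones(4)
vec_is_f = bool(b_vec.flags.f_contiguous)

same_values = bool(np.array_equal(B_c, B_f))
```

`np.asfortranarray` copies only when the input is not already F-contiguous, so calling it on an array that is already column-major is cheap.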

The intended interface for this module is:

xo.sparse.factorized_sparse_solver()

The function's docstring covers usage in detail. In short, it returns the best-performing solver object for the given context and the modules that are available. If the context is not passed explicitly, it is inferred from the input matrix.

For development and convenience, the xo.sparse.solvers module provides the following aliases:

  • xo.sparse.solvers.CPU.
    • scipysplu : Alias for scipy SuperLU
    • KLUSuperLU : Alias for PyKLU
  • xo.sparse.solvers.GPU.
    • cuDSS : nvmath.sparse.advanced.DirectSolver wrapper with a SuperLU-like interface
    • cachedSpSM : Rewrite of cupy's SuperLU that caches the SpSM analysis step, offering massive speedups over cupy's splu when SpSM is the only available backend
    • cupysplu : Alias for cupy SuperLU

This pull request introduces new optional dependencies for xobjects:

  • nvmath-python: Provides the cuDSS bindings for fast solves on GPU
  • PyKLU: For fast LU solves on CPU (now published on PyPI)

cuDSS must be installed in the conda environment and supported by the installed driver version (CUDA 12.0+).

Additional changes:

  • Added ModuleNotAvailableError class
  • Raise ModuleNotAvailableError when OpenCL/Cupy is not installed and the corresponding context is instantiated
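The idea behind the last two items can be sketched as follows. This is not the code from this pull request: the class below is a stand-in that only shares the name, its `ImportError` base class is an assumption, and `require_module` is a hypothetical helper used only for this illustration.

```python
import importlib


class ModuleNotAvailableError(ImportError):
    """Stand-in for the new error class; the base class is an assumption."""


def require_module(name):
    """Hypothetical helper: import an optional dependency or fail with a clear error."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ModuleNotAvailableError(
            f"Optional module '{name}' is required for this context "
            "but is not installed."
        ) from exc


# An installed module is returned as usual.
json_mod = require_module("json")

# A missing module raises at the point of use instead of at package import time.
try:
    require_module("definitely_not_installed_module_xyz")
    raised = False
except ModuleNotAvailableError:
    raised = True
```

Deferring the failure to the point where the context is created keeps the package importable on machines without the optional backend, and the error message names the missing module.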

Checklist

Mandatory:

  • I have added tests to cover my changes
  • All the tests are passing, including my new ones
  • I described my changes in this PR description

Optional:

  • The code I wrote follows good style practices (see PEP 8 and PEP 20).
  • I have updated the docs in relation to my changes, if applicable (added examples)
  • I have tested also GPU contexts

@ekatralis
Author

Adding this comment to document my environment setup for creating a cupy environment with all the dependencies for the sparse functionality. These instructions worked on AlmaLinux 9.6 with CUDA 13.0 installed and a V100 GPU.

As of CUDA 13.0, nvrtc no longer supports older architectures (Volta, Pascal), so older GPUs need an older CUDA toolkit. The instructions below use micromamba, but conda works the same way.

Install cupy and cuda-toolkit:

micromamba install -c conda-forge \
    "cuda-toolkit=12.9" \
    "cupy" \
    -y

And for cuDSS bindings:

micromamba install nvmath-python

In the past, I had to manually install cuDSS using:

micromamba install -c nvidia libcudss0
