L1 Gauss Seidel preconditioner #191

Status: Draft pull request. Wants to merge 51 commits into base: main (changes shown from 50 commits).

Commits (51), all by Abdelrahman912:
2382a8f  init l1 smoother (Feb 28, 2025)
e8991ba  minor fix (Feb 28, 2025)
a127bc0  init cuda prec setup (Mar 23, 2025)
d216e34  init working cuda (Mar 24, 2025)
086ed78  fix partition limit indices (Mar 24, 2025)
99b1ab6  add comment (Mar 24, 2025)
64a247c  minor change (Mar 24, 2025)
343c99d  minor adjustment for csc (Mar 25, 2025)
ee66f73  add cuda csr (Mar 25, 2025)
fbb3338  check symmetry for csc (Mar 26, 2025)
2b2dc17  Merge branch 'main' into l1-gs-smoother (Mar 26, 2025)
5e1203a  rm unnecessary code (Mar 26, 2025)
bc1cec3  Merge branch 'main' into l1-gs-smoother (Mar 26, 2025)
c8cc291  add cpu version (Mar 27, 2025)
457c110  Merge branch 'add-multi-threading-l1-prec' into l1-gs-smoother (Mar 27, 2025)
7f17845  Merge branch 'main' into l1-gs-smoother (Mar 28, 2025)
7c9b474  Merge branch 'main' into l1-gs-smoother (Apr 1, 2025)
42026cd  init ka, working but buggy. (Apr 1, 2025)
1b3ce6f  Merge branch 'main' into l1-gs-smoother (Apr 1, 2025)
8a5675b  Merge branch 'ka-porting' into l1-gs-smoother (Apr 1, 2025)
476928b  Merge branch 'main' into l1-gs-smoother (Apr 2, 2025)
8bcca25  fix ka buggy code (Apr 2, 2025)
77f7148  add tests (Apr 2, 2025)
aecbf67  minor fix (Apr 2, 2025)
1aa8986  update manifest (Apr 2, 2025)
552f8b8  Merge branch 'main' into l1-gs-smoother (Apr 2, 2025)
dc8d4e4  merge cpu and gpu (Apr 8, 2025)
24b72ab  Merge branch 'main' into l1-gs-smoother (Apr 8, 2025)
a87c66f  minor fix (Apr 8, 2025)
5778fe9  add Preconditioners submodule (Apr 9, 2025)
b104cfb  remove unnecessary module reference (Apr 9, 2025)
0e93f05  add cpu symmetric test (Apr 9, 2025)
51fa896  add test path (Apr 9, 2025)
1f56bad  minor fix (Apr 9, 2025)
226f55a  set nparts to be ncores (Apr 9, 2025)
292aacc  precompute blocks (Apr 10, 2025)
36f5754  separate CPU GPU tests (Apr 10, 2025)
33f1de7  fix ci (Apr 10, 2025)
3b52869  minor fix (Apr 10, 2025)
a056a67  add symmetric test (Apr 10, 2025)
bf2cc96  rm dead code (Apr 10, 2025)
a84ab45  comment out adapt (Apr 10, 2025)
0794dce  rm direct solver (Apr 10, 2025)
2532179  add doc string (Apr 10, 2025)
8203cac  add gpu test examples (Apr 11, 2025)
7ef9e15  minor fix (Apr 14, 2025)
869d3db  elementwise operations refinement (Apr 14, 2025)
9609d59  add reference (Apr 14, 2025)
8e62678  add block partitioning to doc string + some comments for (CSC/CSR)Format (Apr 14, 2025)
4a6454c  rm piratical code (only those which were merged into CUDA.jl) + add w… (Apr 18, 2025)
cce6547  rm dead code (Apr 22, 2025)
Files changed:
6 changes: 5 additions & 1 deletion Project.toml
Review comment (Collaborator): Can you add a compat entry for the new CUDA release?

@@ -13,6 +13,7 @@ FerriteGmsh = "4f95f4f8-b27c-4ae5-9a39-ea55e634e36b"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GPUArraysCore = "46192b85-c4d5-4398-a991-12ede77f4527"
JLD2 = "033835bb-8acc-5ee8-8aae-3f567f8a3819"
KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
@@ -47,6 +48,7 @@ FastBroadcast = "0.3.5"
Ferrite = "1"
ForwardDiff = "0.10.38"
JET = "0.9"
KernelAbstractions = "0.9.34"
LinearSolve = "2"
Logging = "1.10"
ModelingToolkit = "9"
@@ -64,6 +66,8 @@ Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
Tensors = "48a634ad-e948-5137-8d70-aa71f2a747f4"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
MatrixDepot = "b51810bb-c9f3-55da-ae3c-350fc1fbce05"
ThreadPinning = "811555cd-349b-4f26-b7bc-1f208b848042"

[targets]
test = ["Aqua", "DelimitedFiles", "JET", "Pkg", "StaticArrays", "Tensors", "Test"]
test = ["Aqua", "DelimitedFiles", "JET", "Pkg", "StaticArrays", "Tensors", "Test","MatrixDepot","ThreadPinning"]
Review comment (Collaborator): Do we need ThreadPinning for testing?

47 changes: 44 additions & 3 deletions docs/Manifest.toml
@@ -1,6 +1,6 @@
# This file is machine-generated - editing it directly is not advised

julia_version = "1.11.4"
julia_version = "1.11.3"
manifest_format = "2.0"
project_hash = "e3f0483fad38a42c2ece39dd647c14ff0c29b8f8"

@@ -122,6 +122,24 @@ weakdeps = ["SparseArrays"]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
version = "1.11.0"

[[deps.Atomix]]
deps = ["UnsafeAtomics"]
git-tree-sha1 = "b5bb4dc6248fde467be2a863eb8452993e74d402"
uuid = "a9b6321e-bd34-4604-b9c9-b65b8de01458"
version = "1.1.1"

[deps.Atomix.extensions]
AtomixCUDAExt = "CUDA"
AtomixMetalExt = "Metal"
AtomixOpenCLExt = "OpenCL"
AtomixoneAPIExt = "oneAPI"

[deps.Atomix.weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Metal = "dde4c033-4e86-420c-a63e-0dd931031962"
OpenCL = "08131aa3-fb12-5dee-8b74-c09406e224a2"
oneAPI = "8f75cd03-7ff8-4ecb-9b8f-daf728133b1b"

[[deps.Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
version = "1.11.0"
@@ -1021,6 +1039,18 @@ git-tree-sha1 = "07649c499349dad9f08dde4243a4c597064663e9"
uuid = "ef3ab10e-7fda-4108-b977-705223b18434"
version = "0.6.0"

[[deps.KernelAbstractions]]
deps = ["Adapt", "Atomix", "InteractiveUtils", "MacroTools", "PrecompileTools", "Requires", "StaticArrays", "UUIDs"]
git-tree-sha1 = "80d268b2f4e396edc5ea004d1e0f569231c71e9e"
uuid = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
version = "0.9.34"
weakdeps = ["EnzymeCore", "LinearAlgebra", "SparseArrays"]

[deps.KernelAbstractions.extensions]
EnzymeExt = "EnzymeCore"
LinearAlgebraExt = "LinearAlgebra"
SparseArraysExt = "SparseArrays"

[[deps.Krylov]]
deps = ["LinearAlgebra", "Printf", "SparseArrays"]
git-tree-sha1 = "b29d37ce30fa401a4563b18880ab91f979a29734"
@@ -1563,7 +1593,7 @@ version = "0.3.27+1"
[[deps.OpenLibm_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "05823500-19ac-5b8b-9628-191a04bc5112"
version = "0.8.1+4"
version = "0.8.1+2"

[[deps.OpenMPI_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "MPIPreferences", "TOML"]
@@ -2238,7 +2268,7 @@ uuid = "8290d209-cae3-49c0-8002-c8c24d57dab5"
version = "0.5.2"

[[deps.Thunderbolt]]
deps = ["Adapt", "BlockArrays", "DataStructures", "DiffEqBase", "FastBroadcast", "Ferrite", "FerriteGmsh", "ForwardDiff", "GPUArraysCore", "JLD2", "LinearAlgebra", "LinearSolve", "Logging", "ModelingToolkit", "OrderedCollections", "OrdinaryDiffEqCore", "Polyester", "Preferences", "ReadVTK", "Reexport", "SciMLBase", "SparseArrays", "SparseMatricesCSR", "StaticArrays", "Tensors", "TimerOutputs", "UnPack", "Unrolled", "WriteVTK"]
deps = ["Adapt", "BlockArrays", "DataStructures", "DiffEqBase", "FastBroadcast", "Ferrite", "FerriteGmsh", "ForwardDiff", "GPUArraysCore", "JLD2", "KernelAbstractions", "LinearAlgebra", "LinearSolve", "Logging", "ModelingToolkit", "OrderedCollections", "OrdinaryDiffEqCore", "Polyester", "Preferences", "ReadVTK", "Reexport", "SciMLBase", "SparseArrays", "SparseMatricesCSR", "StaticArrays", "Tensors", "TimerOutputs", "UnPack", "Unrolled", "WriteVTK"]
path = ".."
uuid = "909927c2-98d5-4a67-bba9-79f03a9ad49b"
version = "0.0.1"
@@ -2330,6 +2360,17 @@ git-tree-sha1 = "6cc9d682755680e0f0be87c56392b7651efc2c7b"
uuid = "9602ed7d-8fef-5bc8-8597-8f21381861e8"
version = "0.1.5"

[[deps.UnsafeAtomics]]
git-tree-sha1 = "b13c4edda90890e5b04ba24e20a310fbe6f249ff"
uuid = "013be700-e6cd-48c3-b4a1-df204f14c38f"
version = "0.3.0"

[deps.UnsafeAtomics.extensions]
UnsafeAtomicsLLVM = ["LLVM"]

[deps.UnsafeAtomics.weakdeps]
LLVM = "929cbde3-209d-540e-8aea-75f648917ca0"

[[deps.VTKBase]]
git-tree-sha1 = "c2d0db3ef09f1942d08ea455a9e252594be5f3b6"
uuid = "4004b06d-e244-455f-a6ce-a5f9919cc534"
(diff truncated)
8 changes: 8 additions & 0 deletions docs/src/api-reference/solver.md
@@ -10,6 +10,14 @@ DocTestSetup = :(using Thunderbolt)
SchurComplementLinearSolver
```

## Preconditioners

```@docs
Thunderbolt.Preconditioners.L1GSPreconditioner
Thunderbolt.Preconditioners.BlockPartitioning
Thunderbolt.Preconditioners.L1GSPrecBuilder
```
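
Editor's note: the implementation file behind these docstrings is not part of this diff excerpt. For orientation, LinearSolve preconditioners are supplied through the `Pl` keyword of `solve`; below is a minimal sketch using a stand-in Jacobi (diagonal) preconditioner and only documented LinearSolve API. An `L1GSPreconditioner` built by `L1GSPrecBuilder` would be passed the same way, but its constructor is not visible in this diff, so it is not shown.

```julia
using LinearSolve, LinearAlgebra, SparseArrays

# SPD tridiagonal test system
n = 100
A = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(4.0, n), 1 => fill(-1.0, n - 1))
b = rand(n)

prob = LinearProblem(A, b)
Pl = Diagonal(Vector(diag(A)))        # stand-in preconditioner (Jacobi)
sol = solve(prob, KrylovJL_CG(); Pl)  # an L1GSPreconditioner would go in Pl
```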

## Nonlinear

```@docs
(diff truncated)
16 changes: 16 additions & 0 deletions docs/src/assets/references.bib
@@ -345,3 +345,19 @@ @inproceedings{PonVerEldGarEnnPer:2019:mlv
year={2019},
organization={Springer},
}

@article{BakFalKolYan:2011:MSU,
author = {Allison H. Baker and Robert D. Falgout and Tzanio V. Kolev and Ulrike Meier Yang},
title = {Multigrid Smoothers for Ultraparallel Computing},
journal = {SIAM Journal on Scientific Computing},
volume = {33},
number = {5},
pages = {2864--2887},
year = {2011},
doi = {10.1137/100798806},
url = {https://doi.org/10.1137/100798806},
eprint = {https://doi.org/10.1137/100798806},
abstract = {This paper investigates the properties of smoothers in the context of algebraic multigrid (AMG) running on parallel computers with potentially millions of processors. The development of multigrid smoothers in this case is challenging, because some of the best relaxation schemes, such as the Gauss–Seidel (GS) algorithm, are inherently sequential. Based on the sharp two-grid multigrid theory from Falgout and Vassilevski (2004) and Falgout, Vassilevski, and Zikatanov (2005), we characterize the smoothing properties of a number of practical candidates for parallel smoothers, including several C-F, polynomial, and hybrid schemes. We show, in particular, that the popular hybrid GS algorithm has multigrid smoothing properties which are independent of the number of processors in many practical applications, provided that the problem size per processor is large enough. This is encouraging news for the scalability of AMG on ultraparallel computers. We also introduce the more robust ℓ₁ smoothers, which are always convergent and have already proven essential for the parallel solution of some electromagnetic problems.}
}
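
Editor's note (orientation only, not part of the diff): the ℓ₁ smoother this entry refers to augments each row's diagonal with the ℓ₁ norm of that row's off-block entries, which the paper shows makes the sweep always convergent for SPD matrices. Schematically, with the unknowns partitioned into blocks Ω_k:

```math
M_{\ell_1\mathrm{GS}} = \mathrm{blockdiag}\bigl(D_k + L_k + D_k^{\ell_1}\bigr),
\qquad
\bigl(D_k^{\ell_1}\bigr)_{ii} = \sum_{j \notin \Omega_k} \lvert a_{ij} \rvert,
```

where D_k and L_k are the diagonal and strictly lower triangular parts of the k-th diagonal block of A.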


20 changes: 19 additions & 1 deletion ext/CuThunderboltExt.jl
@@ -1,6 +1,11 @@
module CuThunderboltExt

using Thunderbolt
using LinearSolve
using KernelAbstractions
using SparseMatricesCSR

import SparseArrays

import CUDA:
CUDA, CuArray, CuVector, CUSPARSE,blockDim,blockIdx,gridDim,threadIdx,
@@ -10,7 +15,7 @@ import CUDA:
import Thunderbolt:
UnPack.@unpack,
SimpleMesh,
- SparseMatrixCSR, SparseMatrixCSC,
+ SparseMatrixCSR, SparseMatrixCSC, AbstractSparseMatrix,
Review comment (Collaborator): Can we import these types here from their respective packages instead of Thunderbolt? We might hit weird bugs otherwise.

AbstractSemidiscreteFunction, AbstractPointwiseFunction, solution_size,
AbstractPointwiseSolverCache,assemble_element!,
LinearOperator,QuadratureRuleCollection,
@@ -26,6 +31,9 @@ import Thunderbolt.FerriteUtils:
FeMemShape, KeMemShape, KeFeMemShape, DeviceCellIterator,DeviceOutOfBoundCellIterator,DeviceCellCache,
FeCellMem, KeCellMem, KeFeCellMem,NoCellMem,AbstractMemShape

import Thunderbolt.Preconditioners:
sparsemat_format_type, CSCFormat, CSRFormat


import Ferrite:
AbstractDofHandler,get_grid,CellIterator,get_node_coordinate,getcoordinates,get_coordinate_eltype,getcells,
@@ -80,9 +88,19 @@ function Thunderbolt.adapt_vector_type(::Type{<:CuVector}, v::VT) where {VT <: V
return CuVector(v)
end

const __cuda_version__ = pkgversion(CUDA)
@info("CuThunderboltExt.jl: CUDA version: ", __cuda_version__)

include("cuda/cuda_utils.jl")
include("cuda/cuda_operator.jl")
include("cuda/cuda_memalloc.jl")
include("cuda/cuda_adapt.jl")
include("cuda/cuda_iterator.jl")

if __cuda_version__ >= v"5.7.3" #TODO: better way? support back compatibility?
include("cuda/cuda_preconditioner.jl")
else
@warn("CuThunderboltExt.jl: CUDA version is too old <$__cuda_version__, skipping CUDA preconditioner.")
end

end
35 changes: 35 additions & 0 deletions ext/cuda/cuda_preconditioner.jl
(Codecov: added lines 15, 31-32, and 34-35 in this file were not covered by tests.)
@@ -0,0 +1,35 @@
#########################################
## CUDA L1 Gauss Seidel Preconditioner ##
#########################################

# PIRACY ALERT: this code commits type piracy, because both `adapt` and its arguments are foreign objects.
# Therefore, the behavior of `adapt` differs depending on whether `Thunderbolt` and `CuThunderboltExt` are loaded.
# Reference: https://juliatesting.github.io/Aqua.jl/stable/piracies/
# Note: the problem is with `AbstractSparseMatrix`, since the default behavior of `adapt` is to return the same object, whatever the backend is.
# Adapt.adapt(::CUDABackend, A::CUSPARSE.AbstractCuSparseMatrix) = A
# Adapt.adapt(::CUDABackend,A::AbstractSparseMatrix) = A |> cu
# Adapt.adapt(::CUDABackend, x::Vector) = x |> cu # not needed
# Adapt.adapt(::CUDABackend, x::CuVector) = x # not needed

# TODO: remove this function if back compatibility is not needed
Review comment (Collaborator): Backward compat with what?

Preconditioners.convert_to_backend(::CUDABackend, A::AbstractSparseMatrix) = adapt(CUDABackend(), A)

# For some reason, these properties are not automatically defined for Device Arrays,
# TODO: remove the following code when https://github.com/JuliaGPU/CUDA.jl/pull/2738 is merged
#SparseArrays.rowvals(A::CUSPARSE.CuSparseDeviceMatrixCSC{Tv,Ti,1}) where {Tv,Ti} = A.rowVal
#SparseArrays.getcolptr(A::CUSPARSE.CuSparseDeviceMatrixCSC{Tv,Ti,1}) where {Tv,Ti} = A.colPtr
#SparseArrays.getnzval(A::CUSPARSE.CuSparseDeviceMatrixCSC{Tv,Ti,1}) where {Tv,Ti} = A.nzVal
#SparseMatricesCSR.getnzval(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.nzVal

# PIRACY ALERT: the following code is commented out to avoid piracy
# SparseMatricesCSR.colvals(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.colVal
# SparseMatricesCSR.getrowptr(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.rowPtr
Review comment on lines +6 to +7 (Collaborator, PR author): these two lines should be added in JuliaGPU/CUDA.jl#2720, right?

Reply (Collaborator): Actually I am not sure about this one. Probably not, because SparseMatricesCSR is not an interface package. I would discuss this in the linked PR.

Review comment on lines +5 to +7 (Collaborator), suggested change (remove these lines):
# PIRACY ALERT: the following code is commented out to avoid piracy
# SparseMatricesCSR.colvals(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.colVal
# SparseMatricesCSR.getrowptr(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.rowPtr

# workaround for the issue with SparseMatricesCSR
# TODO: find a more robust solution to dispatch the correct function
Preconditioners.colvals(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.colVal
Preconditioners.getrowptr(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.rowPtr

Preconditioners.sparsemat_format_type(::CUSPARSE.CuSparseDeviceMatrixCSC{Tv,Ti,1}) where {Tv,Ti} = CSCFormat
Preconditioners.sparsemat_format_type(::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = CSRFormat

20 changes: 20 additions & 0 deletions ext/cuda/cuda_utils.jl
Review comment (Collaborator): Can we remove these here for now and add a comment to the tests that prior to 2720 we need to add these dispatches? I do not want to artificially postpone this PR further due to missing optional stuff in the dependencies.

(Codecov: added lines 2, 7, 10, 13, 16, and 19 in this file were not covered by tests.)
@@ -0,0 +1,20 @@
# remove the following code once PR (https://github.com/JuliaGPU/CUDA.jl/pull/2720) is merged ##
CUDA.CUSPARSE.CuSparseMatrixCSR{T}(Mat::SparseMatrixCSR) where {T} =

CUDA.CUSPARSE.CuSparseMatrixCSR{T}(CuVector{Cint}(Mat.rowptr), CuVector{Cint}(Mat.colval),
CuVector{T}(Mat.nzval), size(Mat))


CUSPARSE.CuSparseMatrixCSC{T}(Mat::SparseMatrixCSR) where {T} =

CUSPARSE.CuSparseMatrixCSC{T}(CUSPARSE.CuSparseMatrixCSR(Mat))

SparseMatricesCSR.SparseMatrixCSR(A::CUSPARSE.CuSparseMatrixCSR) =

SparseMatrixCSR(CUSPARSE.SparseMatrixCSC(A)) # no direct conversion (gpu_CSR -> cpu_CSC -> cpu_CSR)

Adapt.adapt_storage(::Type{CuArray}, xs::SparseMatrixCSR) =

CUSPARSE.CuSparseMatrixCSR(xs)

Adapt.adapt_storage(::Type{CuArray{T}}, xs::SparseMatrixCSR) where {T} =

CUSPARSE.CuSparseMatrixCSR{T}(xs)

Adapt.adapt_storage(::Type{Array}, mat::CUSPARSE.CuSparseMatrixCSR) =

SparseMatrixCSR(mat)
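
Editor's note: a hypothetical round trip through the conversions above (assumes a CUDA-capable device; `sparsecsr` is SparseMatricesCSR.jl's coordinate-style constructor):

```julia
using CUDA, SparseArrays, SparseMatricesCSR, Adapt

A  = sparsecsr([1, 2, 3], [1, 2, 3], [4.0, 5.0, 6.0])  # 3x3 diagonal CSR matrix
dA = adapt(CuArray, A)   # -> CUSPARSE.CuSparseMatrixCSR via adapt_storage above
A2 = adapt(Array, dA)    # -> back to a CPU SparseMatrixCSR (through CSC internally)
```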
4 changes: 3 additions & 1 deletion src/Thunderbolt.jl
@@ -42,7 +42,7 @@ import ForwardDiff

import ModelingToolkit
import ModelingToolkit: @variables, @parameters, @component, @named,
- compose, ODESystem, Differential
+ compose, ODESystem, Differential (indentation change only)

# Accelerator support libraries
import GPUArraysCore: AbstractGPUVector, AbstractGPUArray
@@ -83,6 +83,8 @@ include("solver/interface.jl")
include("solver/linear.jl")
include("solver/nonlinear.jl")
include("solver/time_integration.jl")
include("solver/linear/preconditioners/Preconditioners.jl")
@reexport using .Preconditioners


include("modeling/electrophysiology/ecg.jl")
(diff truncated)
48 changes: 48 additions & 0 deletions src/solver/linear/preconditioners/Preconditioners.jl
@@ -0,0 +1,48 @@
module Preconditioners

using SparseArrays, SparseMatricesCSR
using LinearSolve
import LinearSolve: \
using Adapt
using UnPack
import KernelAbstractions: Backend, @kernel, @index, @ndrange, @groupsize, @print, functional,
CPU,synchronize
import SparseArrays: getcolptr,getnzval
import SparseMatricesCSR: getnzval
import LinearAlgebra: Symmetric

## Generic Code #

# For symmetric matrices, CSR and CSC storage are exactly the same, so we keep symmetry info
# so that it can be exploited in cases where one format has a better access pattern than the other.
abstract type AbstractMatrixSymmetry end
struct SymmetricMatrix <: AbstractMatrixSymmetry end
struct NonSymmetricMatrix <: AbstractMatrixSymmetry end

abstract type AbstractMatrixFormat end
struct CSRFormat <: AbstractMatrixFormat end
struct CSCFormat <: AbstractMatrixFormat end


# Why use these traits?
# We target multiple backends, but unfortunately the CSC/CSR sparse matrix types on different
# backends don't share a common supertype (e.g. AbstractSparseMatrixCSC/AbstractSparseMatrixCSR):
#   e.g. CUSPARSE.CuSparseDeviceMatrixCSC <: SparseArrays.AbstractSparseMatrixCSC → false
# So we need to define our own traits to identify the format of a sparse matrix.
sparsemat_format_type(::SparseMatrixCSC) = CSCFormat
sparsemat_format_type(::SparseMatrixCSR) = CSRFormat

#TODO: remove once https://github.com/JuliaGPU/CUDA.jl/pull/2740 is merged
convert_to_backend(backend::Backend, A::AbstractSparseMatrix) =
adapt(backend, A) # fallback; backend-specific methods are added in their corresponding extensions.

# Why? Because we want to circumvent piracy when extending these functions for device backends (e.g. CuSparseDeviceMatrixCSR).
# TODO: find a more robust solution to dispatch the correct function
colvals(A::SparseMatrixCSR) = SparseMatricesCSR.colvals(A)
getrowptr(A::SparseMatrixCSR) = SparseMatricesCSR.getrowptr(A)

include("l1_gauss_seidel.jl")

export L1GSPrecBuilder

end
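
Editor's note: l1_gauss_seidel.jl itself is not shown in this diff. As a reader aid, here is a minimal, self-contained sketch of one ℓ₁ Gauss-Seidel preconditioner application in the style of Baker et al. (2011). All names are illustrative; none of this is the PR's actual API. It also illustrates the symmetry trick motivating the traits above: for symmetric A, the stored column i of a CSC matrix doubles as row i.

```julia
using SparseArrays

# Editor's sketch, NOT the PR's implementation. Applies z = M \ r with
#   M = blockdiag(D_k + L_k + D_k^l1),  (D_k^l1)_ii = sum_{j outside block k} |a_ij|.
# Assumes A is symmetric so column i of the CSC matrix is also row i.
function l1gs_apply!(z::Vector, A::SparseMatrixCSC, r::Vector, parts)
    rows = rowvals(A)
    vals = nonzeros(A)
    for part in parts                           # blocks are mutually independent
        lo, hi = first(part), last(part)
        for i in part                           # forward sweep within the block
            acc = r[i]
            dii = 0.0
            l1 = 0.0
            for k in nzrange(A, i)              # column i == row i by symmetry
                j = rows[k]; a = vals[k]
                if j == i
                    dii = a                     # diagonal entry
                elseif lo <= j <= hi
                    j < i && (acc -= a * z[j])  # strict lower part, in-block
                else
                    l1 += abs(a)                # l1 compensation, off-block
                end
            end
            z[i] = acc / (dii + l1)
        end
    end
    return z
end

# Toy check: SPD tridiagonal system split into two blocks.
n = 8
A = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(4.0, n), 1 => fill(-1.0, n - 1))
z = l1gs_apply!(zeros(n), A, ones(n), [1:4, 5:8])
```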