L1 Gauss Seidel preconditioner #191
base: main
Conversation
Welcome to Codecov 🎉 Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️
Unsymmetric problem:
- Unpreconditioned stats
- Preconditioned stats, nparts = 8 (same as cores)

Symmetric problem:
- Unpreconditioned stats
- Preconditioned stats
Looks great already. Some last fine-tuning and we are ready to go.
```julia
# SparseMatricesCSR.colvals(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.colVal
# SparseMatricesCSR.getrowptr(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.rowPtr
```
These two lines should be added in JuliaGPU/CUDA.jl#2720, right?
Actually I am not sure about this one. Probably not, because SparseMatricesCSR is not an interface package. I would discuss this in the linked PR.
Can you add a compat entry for the new CUDA release?
```diff
 [targets]
-test = ["Aqua", "DelimitedFiles", "JET", "Pkg", "StaticArrays", "Tensors", "Test"]
+test = ["Aqua", "DelimitedFiles", "JET", "Pkg", "StaticArrays", "Tensors", "Test", "MatrixDepot", "ThreadPinning"]
```
Do we need ThreadPinning for testing?
ext/cuda/cuda_preconditioner.jl (outdated)
```julia
# Adapt.adapt(::CUDABackend, x::Vector) = x |> cu # not needed
# Adapt.adapt(::CUDABackend, x::CuVector) = x # not needed

# TODO: remove this function if back compatibility is not needed
```
Backward compat with what?
I think we are missing a test for the ThreadedSparseCSR format here.
```julia
@doc raw"""
    L1GSPreconditioner{Partitioning, VectorType}

The ℓ₁ Gauss–Seidel preconditioner is a robust and parallel-friendly preconditioner, particularly effective for sparse and ill-conditioned systems.
```
> ill-conditioned systems

Where did you get that statement from?
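For context, the construction behind the docstring can be sketched as follows. This is a hypothetical illustration, not the PR's implementation: in the ℓ₁ Gauss–Seidel method the matrix is row-partitioned, and each row's diagonal is augmented by the ℓ₁ sum of its off-partition entries, so that the cheap block-local solve remains convergent without cross-partition communication.

```julia
using SparseArrays

# Hypothetical sketch: compute the ℓ₁-augmented diagonal d with
#   d_i = a_ii + Σ_{j ∉ partition(i)} |a_ij|.
# `partition[i]` gives the partition index of row i.
function l1gs_diagonal(A::SparseMatrixCSC, partition::Vector{Int})
    n = size(A, 1)
    d = zeros(eltype(A), n)
    rows = rowvals(A)
    vals = nonzeros(A)
    for j in 1:n, k in nzrange(A, j)   # iterate stored entries column by column
        i = rows[k]
        v = vals[k]
        if i == j
            d[i] += v                  # diagonal entry a_ii
        elseif partition[i] != partition[j]
            d[i] += abs(v)             # ℓ₁ compensation for off-partition coupling
        end
    end
    return d
end
```

For a 1D Laplacian-like matrix split into two partitions, only the rows touching the partition boundary get the extra ℓ₁ term.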
```diff
@@ -10,7 +15,7 @@ import CUDA:
 import Thunderbolt:
     UnPack.@unpack,
     SimpleMesh,
-    SparseMatrixCSR, SparseMatrixCSC,
+    SparseMatrixCSR, SparseMatrixCSC, AbstractSparseMatrix,
```
Can we import these types here from their respective packages instead of Thunderbolt? We might hit weird bugs otherwise.
```julia
# PIRACY ALERT: the following code is commented out to avoid piracy
# SparseMatricesCSR.colvals(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.colVal
# SparseMatricesCSR.getrowptr(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.rowPtr
```
Suggested change (the suggestion removes these lines):

```julia
# PIRACY ALERT: the following code is commented out to avoid piracy
# SparseMatricesCSR.colvals(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.colVal
# SparseMatricesCSR.getrowptr(A::CUSPARSE.CuSparseDeviceMatrixCSR{Tv,Ti,1}) where {Tv,Ti} = A.rowPtr
```
Can we remove these here for now and add a comment to the tests that prior to JuliaGPU/CUDA.jl#2720 we need to add these dispatches? I do not want to artificially postpone this PR further due to missing optional stuff in the dependencies.
```julia
if __cuda_version__ >= __min_cuda_version__
    include("cuda/cuda_preconditioner.jl")
else
    @warn("CuThunderboltExt.jl: CUDA.jl version is too old $__cuda_version__ < $__min_cuda_version__, skipping `cuda_preconditioner.jl`.")
end
```
Suggested change:

```julia
include("cuda/cuda_preconditioner.jl")
```

- add a compat entry to the Project.toml.
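The requested entry would look something like the fragment below. The `"5"` bound is a placeholder, not taken from this PR; the actual lower bound should be the CUDA.jl release that ships the needed functionality (see JuliaGPU/CUDA.jl#2720).

```toml
[compat]
CUDA = "5"  # placeholder lower bound; pin to the release containing the required dispatches
```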
```julia
# y: preconditioned residual
@unpack partitioning, B, D = P
@unpack backend = partitioning
@. y = x / (B + D)
```
Can we precompute `1 / (B + D)`? This should give us a speedup, as multiplication is usually much faster than division.

Also, I think we are missing the Gauss–Seidel part on the local block?
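A minimal sketch of the precompute idea, under the assumption that `B` and `D` are plain vectors as in the snippet above (the `DiagPrecond` name and `apply!` signature are hypothetical, not from this PR): store the reciprocal once at construction, so every application is a broadcast multiply instead of a divide.

```julia
# Hypothetical sketch: cache invBD = 1/(B + D) at setup time.
struct DiagPrecond{V <: AbstractVector}
    invBD::V   # precomputed reciprocal of the augmented diagonal
end

DiagPrecond(B::AbstractVector, D::AbstractVector) = DiagPrecond(1 ./ (B .+ D))

# Application is now a single broadcast multiply per entry.
function apply!(y, P::DiagPrecond, x)
    @. y = x * P.invBD
    return y
end
```

The same trick carries over to the GPU path, since the broadcast lowers to one kernel either way; the saving is per-application, amortizing the one-time division over all solver iterations.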
Rough ideas for the data structures that incorporate partitioned arrays, with a CPU example of Gauss–Seidel with and without partitions 📚
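The CPU example mentioned above could be sketched roughly like this (a hypothetical dense-matrix illustration, not the PR's code): a plain forward Gauss–Seidel sweep over all rows, next to an ℓ₁ variant whose partitions are independent of each other, so each could run on its own thread, with off-partition couplings dropped from the sweep and compensated by their ℓ₁ row sums on the diagonal.

```julia
# One forward Gauss–Seidel sweep over all rows (sequential).
function gs_sweep!(x, A::AbstractMatrix, b)
    n = length(b)
    for i in 1:n
        s = b[i]
        for j in 1:n
            j == i && continue
            s -= A[i, j] * x[j]   # uses already-updated x[j] for j < i
        end
        x[i] = s / A[i, i]
    end
    return x
end

# ℓ₁ Gauss–Seidel sweep: partitions are decoupled and could run in parallel.
function l1gs_sweep!(x, A::AbstractMatrix, b, parts)
    n = length(b)
    for p in unique(parts)               # each partition is independent
        for i in findall(==(p), parts)
            s = b[i]
            d = A[i, i]
            for j in 1:n
                j == i && continue
                if parts[j] == p
                    s -= A[i, j] * x[j]  # local Gauss–Seidel coupling
                else
                    d += abs(A[i, j])    # ℓ₁ compensation for remote entries
                end
            end
            x[i] = s / d
        end
    end
    return x
end
```

With a single partition the ℓ₁ variant has no remote entries and reduces to ordinary Gauss–Seidel; with one partition per core it degrades gracefully toward an ℓ₁-Jacobi-like method at the partition boundaries.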