# ThreadedDenseSparseMul.jl
`ThreadedDenseSparseMul.jl` is a Julia package that provides a threaded implementation of dense-sparse matrix multiplication, built on top of `Polyester.jl`.
This package addresses the need for fast computation of `C ← C + D × S`, where `D` and `S` are dense and sparse matrices, respectively. It differs from existing solutions in the following ways:
- SparseArrays.jl doesn't support threaded multiplication.
- MKLSparse.jl doesn't support dense × sparsecsc multiplication directly.
- ThreadedSparseCSR.jl and ThreadedSparseArrays.jl support only sparse × dense multiplication.
ThreadedDenseSparseMul.jl shows significant performance improvements over base Julia implementations, especially for large matrices.
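As a minimal usage sketch: the 5-argument `(C, D, S, α, β)` calling convention shown here mirrors `LinearAlgebra.mul!` and matches the docstrings below, but the exact argument names are illustrative; see the API reference for the authoritative form.

```julia
using SparseArrays
using ThreadedDenseSparseMul

D = rand(100, 200)           # dense matrix
S = sprand(200, 300, 0.05)   # sparse (CSC) matrix
C = zeros(100, 300)

# Compute C ← β*C + α*D*S in place, threaded over the columns of S.
fastdensesparsemul_threaded!(C, D, S, 1.0, 0.0)
```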
## Performance
#### [`fastdensesparsemul_threaded!`](@ref) outperforms `MKLSparse` by 2x:

#### [`fastdensesparsemul_outer_threaded!`](@ref) outperforms `SparseArrays` by 4x:

```@autodocs
Modules = [ThreadedDenseSparseMul]
```
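The package also provides a threaded rank-1 (outer-product) update, `fastdensesparsemul_outer_threaded!(C, a, b, α, β)`. Assuming `a` is a dense vector and `b` a `SparseVector` (an assumption; the docstrings above are authoritative), a usage sketch looks like:

```julia
using SparseArrays
using ThreadedDenseSparseMul

a = rand(100)            # dense vector
b = sprand(300, 0.05)    # sparse vector
C = zeros(100, 300)

# Rank-1 update C ← β*C + α*(a*b'), threaded.
fastdensesparsemul_outer_threaded!(C, a, b, 1.0, 1.0)
```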
## Implementation
The core implementation is quite simple, leveraging `Polyester.jl` for threading. It boils down to something similar to:
```julia
function fastdensesparsemul_threaded!(C::AbstractMatrix, A::AbstractMatrix, B::SparseMatrixCSC,
                                      α::Number, β::Number)
    rows = rowvals(B)
    vals = nonzeros(B)
    # Each column of C depends only on the corresponding column of B,
    # so columns can be processed independently by Polyester's @batch.
    @batch for j in axes(B, 2)
        C[:, j] .*= β
        for idx in nzrange(B, j)
            @views C[:, j] .+= (α * vals[idx]) .* A[:, rows[idx]]
        end
    end
    return C
end
```
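A quick way to sanity-check the threaded kernel is to compare it against the reference dense × sparse product from `SparseArrays` (variable names and sizes here are illustrative):

```julia
using SparseArrays
using ThreadedDenseSparseMul

A = rand(50, 80)
B = sprand(80, 60, 0.1)
C = rand(50, 60)
C_ref = 0.5 .* C .+ 2.0 .* (A * B)   # reference result via SparseArrays

fastdensesparsemul_threaded!(C, A, B, 2.0, 0.5)
C ≈ C_ref                             # should hold up to floating-point error
```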
## Contributing
Contributions to ThreadedDenseSparseMul.jl are welcome! Please feel free to submit issues or pull requests on the GitHub repository.