
Commit aca7b35: Update documentation.

1 parent ba63378 commit aca7b35

File tree: 1 file changed, +19 -25 lines changed

docs/src/index.md

Lines changed: 19 additions & 25 deletions
@@ -1,15 +1,23 @@
 # ThreadedDenseSparseMul.jl
 
-ThreadedDenseSparseMul.jl is a Julia package that provides a threaded implementation of dense-sparse matrix multiplication, built on top of `Polyester.jl`.
+`ThreadedDenseSparseMul.jl` is a Julia package that provides a threaded implementation of dense-sparse matrix multiplication, built on top of `Polyester.jl`.
 
-## Installation
+This package addresses the need for fast computation of `C ← C + D × S`, where `D` and `S` are dense and sparse matrices, respectively. It differs from existing solutions in the following ways:
 
-You can install ThreadedDenseSparseMul.jl using Julia's package manager:
+- SparseArrays.jl doesn't support threaded multiplication.
+- MKLSparse.jl doesn't support dense × sparsecsc multiplication directly.
+- ThreadedSparseCSR.jl and ThreadedSparseArrays.jl support only sparse × dense multiplication.
 
-```julia
-using Pkg
-Pkg.add("ThreadedDenseSparseMul")
-```
+ThreadedDenseSparseMul.jl shows significant performance improvements over base Julia implementations, especially for large matrices.
+
+## Performance
+#### [`fastdensesparsemul_threaded!`](@ref) outperforms `MKLSparse` by 2x:
+
+![`fastdensesparsemul!` outperforms MKLSparse by 2x.](assets/main.svg)
+
+#### [`fastdensesparsemul_outer_threaded!`](@ref) outperforms `SparseArrays` by 4x:
+
+![`fastdensesparsemul_outer_threaded!` outperforms SparseArrays for outer product by 4x.](assets/main_outer.svg)
 
 ## Usage
 
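To make the `C ← C + D × S` computation described above concrete, a usage sketch of the package's 5-parameter interface might look as follows. This is a hedged illustration: the call signatures follow the API reference entries removed later in this diff (`fastdensesparsemul_threaded!(C, A, B, α, β)` computes `C = β*C + α*A*B`; `fastdensesparsemul_outer_threaded!(C, a, b, α, β)` computes `C = β*C + α*a*b'`), while the matrix sizes and sparsity are arbitrary.

```julia
# Hedged usage sketch; signatures follow the API reference entries in this diff,
# matrix sizes and sparsity are arbitrary.
using SparseArrays
using ThreadedDenseSparseMul

D = rand(100, 200)           # dense factor D
S = sprand(200, 300, 0.05)   # sparse factor S (SparseMatrixCSC)
C = zeros(100, 300)          # dense output, updated in place

# C ← 1*C + 1*(D*S), i.e. accumulate the dense × sparse product into C
fastdensesparsemul_threaded!(C, D, S, 1, 1)

# Outer-product variant: C ← 0*C + 1*(d*s'), with a dense and a sparse vector
d = rand(100)
s = sprand(300, 0.1)
fastdensesparsemul_outer_threaded!(C, d, s, 1, 0)
```

The trailing `α, β` pair mirrors `mul!(C, A, B, α, β)`, the 5-parameter BLAS-style convention the hunk below refers to.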
@@ -32,23 +40,13 @@ fastdensesparsemul_outer_threaded!(buf, @view(A[:, 1]), B[1,:], 1, 0)
 The interface is adapted from the 5-parameter definition used by `mul!` and BLAS.
 
 ## API Reference
-
-- `fastdensesparsemul!(C, A, B, α, β)`: Computes `C = β*C + α*A*B`
-- `fastdensesparsemul_threaded!(C, A, B, α, β)`: Threaded version of `fastdensesparsemul!`
-- `fastdensesparsemul_outer!(C, a, b, α, β)`: Computes `C = β*C + α*a*b'`
-- `fastdensesparsemul_outer_threaded!(C, a, b, α, β)`: Threaded version of `fastdensesparsemul_outer!`
-
-## Rationale
-
-This package addresses the need for fast computation of `C ← C - D × S`, where `D` and `S` are dense and sparse matrices, respectively. It differs from existing solutions in the following ways:
-
-- SparseArrays.jl doesn't support threaded multiplication.
-- IntelMKL.jl doesn't support dense × sparsecsc multiplication directly.
-- ThreadedSparseCSR.jl and ThreadedSparseArrays.jl support only sparse × dense multiplication.
+```@autodocs
+Modules = [ThreadedDenseSparseMul]
+```
 
 ## Implementation
 
-The core implementation is quite simple, leveraging `Polyester.jl` for threading:
+The core implementation is quite simple, leveraging `Polyester.jl` for threading; it is essentially similar to the following:
 
 ```julia
 function fastdensesparsemul_threaded!(C::AbstractMatrix, A::AbstractMatrix, B::SparseMatrixCSC, α::Number, β::Number)
@@ -60,10 +58,6 @@ function fastdensesparsemul_threaded!(C::AbstractMatrix, A::AbstractMatrix, B::S
 end
 ```
 
-## Performance
-
-ThreadedDenseSparseMul.jl shows significant performance improvements over base Julia implementations, especially for large matrices. Benchmark results comparing against SparseArrays.jl and MKLSparse.jl are available in the repository.
-
 ## Contributing
 
 Contributions to ThreadedDenseSparseMul.jl are welcome! Please feel free to submit issues or pull requests on the GitHub repository.
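Since the body of `fastdensesparsemul_threaded!` falls between the hunks above and is not shown, the following is only a generic sketch of the kind of Polyester-threaded, column-parallel kernel the surrounding text describes. The name `dense_sparse_mul_sketch!` is hypothetical; this is not the package's actual implementation.

```julia
# Hedged sketch only: the commit elides the real function body between hunks,
# so this is a generic Polyester-style kernel illustrating the idea.
using Polyester, SparseArrays

function dense_sparse_mul_sketch!(C::AbstractMatrix, A::AbstractMatrix,
                                  B::SparseMatrixCSC, α::Number, β::Number)
    # Each column of C depends only on the corresponding column of B,
    # so columns can be processed independently by Polyester tasks.
    @batch for j in axes(B, 2)
        @views C[:, j] .*= β
        for k in nzrange(B, j)
            i = rowvals(B)[k]
            v = nonzeros(B)[k]
            @views C[:, j] .+= (α * v) .* A[:, i]
        end
    end
    return C
end
```

Parallelising over the columns of the CSC factor keeps each task's writes confined to a single column of `C`, so a scheme like this needs no locking.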
