Commit 7cf4562

committed
Update readme to reflect new interface.
1 parent b63c352 commit 7cf4562

1 file changed

+9
-7
lines changed

Readme.md

@@ -6,11 +6,15 @@
66
Just install and import this package, and launch Julia with some threads, e.g. `julia --threads=auto`! Then e.g. any of these will be accelerated:
77
```julia
88
A = rand(1_000, 2_000); B = sprand(2_000, 30_000, 0.05); buf = similar(A, size(A,1), size(B,2)) # prealloc
9-
res = A*B
10-
buf .= A * B
11-
buf .+= 2 .* A * B
9+
fastdensesparsemul!(buf, A, B, 1, 0)
10+
fastdensesparsemul_threaded!(buf, A, B, 1, 0)
11+
fastdensesparsemul_outer!(buf, @view(A[:, 1]), B[1,:], 1, 0)
12+
fastdensesparsemul_outer_threaded!(buf, @view(A[:, 1]), B[1,:], 1, 0)
1213
```
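
The `_outer` calls above take a dense vector and a sparse vector. Assuming they compute the rank-one update $C \leftarrow \beta C + \alpha \, a b^\top$ (the excerpt does not spell this out), the idea can be sketched as a loop over the stored entries of the sparse vector. `outer_sketch!` is a hypothetical name used only for illustration; it is single-threaded and is not the package's implementation.

```julia
using SparseArrays: SparseVector, sparsevec, findnz

# Hypothetical helper for illustration only, not part of the package.
# Assumed semantics: C ← β*C + α * a * b', with `a` dense and `b` sparse.
function outer_sketch!(C::AbstractMatrix, a::AbstractVector, b::SparseVector, α::Number, β::Number)
    C .*= β                            # scale the existing contents
    inds, vals = findnz(b)             # stored indices and values of b
    for (j, v) in zip(inds, vals)
        @views C[:, j] .+= (α * v) .* a   # only columns with a stored entry change
    end
    return C
end

a  = [1.0, 2.0, 3.0, 4.0]
b  = sparsevec([2, 5], [1.0, -3.0], 6)   # length-6 sparse vector with two stored entries
C0 = ones(4, 6)
C  = copy(C0)
outer_sketch!(C, a, b, 2, -1)
```

Only the columns of `C` that correspond to stored entries of `b` receive an update, so the work scales with the number of nonzeros rather than with the full width of `C`.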
1314

15+
The interface is adapted from the 5-parameter definition used by `mul!` and also BLAS.
16+
Previously we tried to overload the `mul!` function, which comes with some nice syntactic convenience, but it led to some downstream problems with package precompilation and made it hard to check whether our implementation is actually used.
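
For reference, the 5-argument form of `mul!` from the standard library computes $C \leftarrow \alpha A B + \beta C$ in place. The snippet below is a small check of that convention using only `LinearAlgebra` and `SparseArrays`; the matrix sizes and values are arbitrary choices for illustration. Passing `1, 0` as the last two arguments, as in the calls above, therefore corresponds to a plain `C = A * B`, assuming the package follows the same convention.

```julia
using LinearAlgebra: mul!
using SparseArrays: sparse

A  = reshape(collect(1.0:20.0), 4, 5)                            # dense 4×5
B  = sparse([1, 3, 5], [2, 4, 6], [1.0, 2.0, 3.0], 5, 6)         # sparse 5×6
C0 = ones(4, 6)
C  = copy(C0)

α, β = 2.0, -1.0
mul!(C, A, B, α, β)        # in place: C ← α*A*B + β*C

D = zeros(4, 6)
mul!(D, A, B, 1, 0)        # with α = 1, β = 0 this is just D ← A*B
```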
17+
1418
## Rationale
1519
I want to do $C \leftarrow C - D \times S$ fast, where $D$ and $S$ are dense and sparse matrices, respectively.
1620
Notice how this is different from $C \leftarrow C - S \times D$, i.e. dense $\times$ sparse vs sparse $\times$ dense.
@@ -25,7 +29,7 @@ I haven't found an implementation for that, so made one myself. In fact, the pac
2529
import SparseArrays: SparseMatrixCSC, mul!; import SparseArrays
2630
import Polyester: @batch
2731

28-
function SparseArrays.mul!(C::AbstractMatrix, A::AbstractMatrix, B::SparseMatrixCSC, α::Number, β::Number)
32+
function fastdensesparsemul_threaded!(C::AbstractMatrix, A::AbstractMatrix, B::SparseMatrixCSC, α::Number, β::Number)
2933
@batch for j in axes(B, 2)
3034
C[:, j] .*= β
3135
C[:, j] .+= A * (α .* B[:, j])
@@ -34,9 +38,7 @@ function SparseArrays.mul!(C::AbstractMatrix, A::AbstractMatrix, B::SparseMatrix
3438
end
3539
```
3640

37-
Julia will automatically use this 5-parameter definition to generate `mul!(C, A, B)` and calls like `C .+= A*B` and so forth.
38-
39-
Notice that this approach doesn't make sense for matrix-vector multiplication (the loop would just have one element), so that case is not considered in this package.
41+
Notice that this approach doesn't make sense for matrix-vector multiplication (the loop would just have one element), so that case is not considered in this package. It does, however, make sense for outer products.
4042

4143

4244
### Note on column-major and CSC vs CSR
