Dear BLIS Team,
I'm currently working on improving NumPy's matmul for the strided case, and I ran a large grid search with different BLAS frameworks, see
Here is a repost of the plots:
The plots show the speedup of the respective BLAS framework plus copying, relative to naïve matrix multiplication.
In the case of BLIS, there is a red hyperbola for very small matrices (red means performance degradation instead of speedup). It is more intense than in other frameworks on the same machine, e.g. OpenBLAS or AOCL, where the degradation shows up in a small triangular area. Maybe there is still some room for improvement. If you are interested, I can run more benchmarks, produce more plots like these, and also provide some code.
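To make the measured quantity concrete, here is a minimal sketch of a single grid point. It is not the actual benchmark code. All function names are hypothetical, `np.einsum` with its default `optimize=False` stands in for the naïve loop, and whether `np.matmul` dispatches to BLIS, OpenBLAS, or AOCL depends on how NumPy was built.

```python
import timeit

import numpy as np


def strided_operands(m, k, n, step=2, seed=0):
    """Return non-contiguous (m, k) and (k, n) views into larger arrays."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((m * step, k * step))[::step, ::step]
    b = rng.standard_normal((k * step, n * step))[::step, ::step]
    return a, b


def naive_product(a, b):
    # Assumption: einsum with the default optimize=False uses its own
    # loops rather than BLAS, so it serves as the naive baseline here.
    return np.einsum("ik,kj->ij", a, b)


def copy_then_blas(a, b):
    # Copy both operands into contiguous buffers, then let matmul call
    # whatever BLAS NumPy is linked against (if any).
    return np.matmul(np.ascontiguousarray(a), np.ascontiguousarray(b))


def speedup(m, k, n, repeat=5, number=20):
    """Ratio t_naive / t_copy; values below 1 mean a degradation."""
    a, b = strided_operands(m, k, n)
    t_naive = min(timeit.repeat(lambda: naive_product(a, b),
                                repeat=repeat, number=number))
    t_copy = min(timeit.repeat(lambda: copy_then_blas(a, b),
                               repeat=repeat, number=number))
    return t_naive / t_copy


if __name__ == "__main__":
    # Small square sizes, where the copy overhead matters most.
    for size in (2, 4, 8, 16, 32, 64):
        print(size, speedup(size, size, size))
```

Taking the minimum over several repeats reduces timer noise, which matters for very small matrices where a single call takes well under a microsecond of actual arithmetic.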
Best from Berlin, Michael
