Rework/simplify the Matrix Multiplication algorithm.#11365
Open
hexawyz wants to merge 2 commits intodotnet:mainfrom
Author: @dotnet-policy-service agree
Description
Similar to my other PR #11364, this seeks to improve the performance of Matrix operations.
While attempting to vectorize the Matrix code to improve performance, I tried various approaches to changing the shape of the code, when it occurred to me that the structure of the code itself could be made more straightforward.
While this change in itself did little to help with performance in the vectorization case (it changed nothing), it does have a much greater impact when applied to the current non-vectorized code.
The approach I took here was:
As a consequence, the final assembly code ends up significantly smaller.
However, benchmarking showed that the JIT is now very keen on inlining the matrix multiplication code, even in the absence of any specific hint. This may or may not be a good thing. It does mean, though, that performance is improved in this instance.
Customer Impact
Regression
Testing
If I force the method to not be inlined, it seems that there could be a very slight performance regression:
However, without any hint, the performance gains can be huge:
Benchmark code: https://github.com/hexawyz/WpfVectorBenchmark/blob/6513f1eb2cf4a6beb797df5ccc4f515a376c57d8/WpfVectorBenchmark/Benchmarks.cs
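For readers who want to reproduce the inlined versus non-inlined comparison, the following is a minimal sketch of how a method can be kept out of the JIT's inlining decisions with `MethodImplOptions.NoInlining`. The `InliningSketch` class and its two methods are hypothetical stand-ins for illustration only; they are not taken from the linked benchmark or from this PR.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical helper class, used only to illustrate the attribute.
public static class InliningSketch
{
    // The JIT is told never to inline this method, so every call
    // pays the cost of a real call instruction.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static double MultiplyAddNoInline(double a, double b, double c)
        => a * b + c;

    // No hint at all: the JIT decides on its own whether to inline,
    // which is the "without any hint" case described above.
    public static double MultiplyAddDefault(double a, double b, double c)
        => a * b + c;
}
```

Both methods compute the same value; only the call overhead and the optimizations available at the call site may differ, which is what a benchmark of the two variants would measure.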
Risk
It will change the performance characteristics of matrix multiplication. Identity special cases will be marginally slower due to losing their special treatment, but the reduction in the number of conditional jumps should be helpful overall.
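To make the trade-off concrete, here is a rough sketch contrasting a branch-free multiplication with one that special-cases identity. It assumes a 3x2 affine matrix using the row-vector convention, with field names borrowed from the public properties of `System.Windows.Media.Matrix`. The `AffineMatrix` type and both methods are hypothetical; this is not the code from this PR nor the existing WPF implementation.

```csharp
using System;

// Hypothetical 3x2 affine matrix (implicit third column is 0, 0, 1).
public readonly struct AffineMatrix
{
    public readonly double M11, M12, M21, M22, OffsetX, OffsetY;

    public AffineMatrix(double m11, double m12, double m21, double m22,
                        double offsetX, double offsetY)
    {
        M11 = m11; M12 = m12;
        M21 = m21; M22 = m22;
        OffsetX = offsetX; OffsetY = offsetY;
    }

    public bool IsIdentity =>
        M11 == 1 && M12 == 0 && M21 == 0 && M22 == 1 && OffsetX == 0 && OffsetY == 0;

    // Branch-free version: always performs the full set of multiplications.
    public static AffineMatrix Multiply(in AffineMatrix a, in AffineMatrix b) =>
        new AffineMatrix(
            a.M11 * b.M11 + a.M12 * b.M21,
            a.M11 * b.M12 + a.M12 * b.M22,
            a.M21 * b.M11 + a.M22 * b.M21,
            a.M21 * b.M12 + a.M22 * b.M22,
            a.OffsetX * b.M11 + a.OffsetY * b.M21 + b.OffsetX,
            a.OffsetX * b.M12 + a.OffsetY * b.M22 + b.OffsetY);

    // Special-cased version: skips the arithmetic for identity operands,
    // at the cost of extra conditional jumps on every call.
    public static AffineMatrix MultiplyWithIdentityCheck(in AffineMatrix a, in AffineMatrix b)
    {
        if (a.IsIdentity) return b;
        if (b.IsIdentity) return a;
        return Multiply(a, b);
    }
}
```

In this sketch the branch-free method is small and has no control flow, which is the kind of shape a JIT tends to find easier to inline; the special-cased method saves work only when one operand happens to be identity.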
Performance gains from the method becoming inlinable can be impressive, but this could lead to code size increasing significantly in some scenarios. It will largely depend on common usage patterns of each program.
Microsoft Reviewers: Open in CodeFlow