Rework/simplify the Matrix Multiplication algorithm. #11365

Open
hexawyz wants to merge 2 commits into dotnet:main from hexawyz:matrix-multiply

hexawyz commented Jan 17, 2026

Description

Similar to my other PR #11364, this seeks to improve the performance of Matrix operations.
While attempting to vectorize the Matrix code, I tried various approaches to changing the shape of the code, when it occurred to me that the structure of the code itself could be made more straightforward.
This change did nothing for performance in the vectorized case, but it has a much greater impact when applied to the current non-vectorized code.

The approach I took here was:

  • Remove local variables (avoid hinting the runtime at using the stack)
  • Remove reliance on multiple conditional jumps, in favor of a single switch
  • Pack the switch cases densely (the previous algorithm was leaving large gaps between cases)
  • Rely on case fallthrough to reduce code duplication
  • Do not change the computation logic itself (i.e. for M11, M12, M21, …)
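To illustrate the shape described in the list above, here is a minimal sketch. It is not the code from this PR: the `Affine` struct, the `Kind` flags, and the `(a << 2) | b` key are all hypothetical, and the sketch only models translation and scaling (no rotation or skew), so a real implementation would also need a general case. It shows a single switch over a dense key (every value from 0 to 15 is handled), with `goto case` used to share the scaling step, since C# does not allow implicit fallthrough between non-empty cases.

```csharp
using System;

// Hypothetical type flags; the bit values are an assumption made for this sketch.
[Flags]
public enum Kind { Identity = 0, Translation = 1, Scaling = 2 }

// Hypothetical stand-in for an axis-aligned affine matrix (not the WPF Matrix type).
public struct Affine
{
    public double M11, M22, OffsetX, OffsetY;
    public Kind Kind;

    public static Affine Translate(double x, double y) =>
        new Affine { M11 = 1, M22 = 1, OffsetX = x, OffsetY = y, Kind = Kind.Translation };

    public static Affine Scale(double x, double y) =>
        new Affine { M11 = x, M22 = y, Kind = Kind.Scaling };

    // Computes a = a * b in place (apply a first, then b), without temporaries.
    public static void Multiply(ref Affine a, in Affine b)
    {
        // Dense key: 2 bits per operand, so all values 0..15 are covered below.
        int key = ((int)a.Kind << 2) | (int)b.Kind;
        a.Kind |= b.Kind;

        switch (key)
        {
            case 0: case 4: case 8: case 12: // b is identity: nothing to do
                return;
            case 1: case 2: case 3:          // a is identity: result is b
                a = b;
                return;
            case 5: case 13:                 // b is a pure translation, a has offsets
                a.OffsetX += b.OffsetX;
                a.OffsetY += b.OffsetY;
                return;
            case 9:                          // a is a pure scaling, b a pure translation
                a.OffsetX = b.OffsetX;
                a.OffsetY = b.OffsetY;
                return;
            case 6: case 14:                 // b is a pure scaling, a has offsets
                a.OffsetX *= b.M11;
                a.OffsetY *= b.M22;
                goto case 10;
            case 7: case 15:                 // b scales and translates, a has offsets
                a.OffsetX = a.OffsetX * b.M11 + b.OffsetX;
                a.OffsetY = a.OffsetY * b.M22 + b.OffsetY;
                goto case 10;
            case 11:                         // a is a pure scaling, b scales and translates
                a.OffsetX = b.OffsetX;
                a.OffsetY = b.OffsetY;
                goto case 10;
            case 10:                         // shared scaling step
                a.M11 *= b.M11;
                a.M22 *= b.M22;
                return;
        }
    }
}
```

For example, starting from `Affine.Scale(2, 3)` and multiplying by `Affine.Translate(10, 20)` takes the `case 9` path and leaves the scale factors untouched while copying the offsets. Because the key is dense, a compiler or JIT can in principle lower the switch to a single jump table rather than a chain of comparisons, though whether it does so is up to the runtime.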

As a consequence, the final assembly code ends up significantly smaller.
However, benchmarking showed that the JIT is now much more willing to inline the matrix multiplication code, even without any explicit hint. This may or may not be desirable, but in these benchmarks it does improve performance.

Customer Impact

Regression

Testing

If I force the method not to be inlined, there appears to be a very slight performance regression (about 2%):

| Method             | Mean     | Error    | StdDev   | Code Size |
|------------------- |---------:|---------:|---------:|----------:|
| MultiplyMatrix_Ref | 13.19 ns | 0.073 ns | 0.065 ns |   1,191 B |
| MultiplyMatrix_A   | 13.44 ns | 0.029 ns | 0.024 ns |     961 B |
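As a point of reference for the measurement above, one common way to prevent inlining in .NET is the `MethodImplOptions.NoInlining` attribute. The sketch below only illustrates that attribute; the `InliningDemo` class and its `MultiplyAdd` method are hypothetical stand-ins, and the linked benchmark code may force non-inlining differently.

```csharp
using System.Runtime.CompilerServices;

public static class InliningDemo
{
    // Hypothetical method standing in for the code under test.
    // NoInlining asks the JIT to always emit a real call to this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static double MultiplyAdd(double a, double b, double c) => a * b + c;
}
```

Measuring with and without such an attribute separates the cost of the method body itself from the gains that come from the JIT inlining it into the caller.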

However, without any inlining hint, the gains are large: roughly 20x on Rotate, 6x on TranslateRotateTranslate, and 2.6x on MultiplyMatrix:

| Method                       | Mean      | Error     | StdDev    | Code Size |
|----------------------------- |----------:|----------:|----------:|----------:|
| Rotate_Ref                   | 20.237 ns | 0.0669 ns | 0.0626 ns |   1,463 B |
| Rotate_A                     |  1.023 ns | 0.0138 ns | 0.0129 ns |     149 B |
| TranslateRotateTranslate_Ref | 11.288 ns | 0.0307 ns | 0.0287 ns |   1,757 B |
| TranslateRotateTranslate_A   |  1.796 ns | 0.0227 ns | 0.0212 ns |     195 B |
| MultiplyMatrix_Ref           | 13.068 ns | 0.0335 ns | 0.0313 ns |   1,191 B |
| MultiplyMatrix_A             |  4.940 ns | 0.0187 ns | 0.0146 ns |     719 B |

Benchmark code: https://github.com/hexawyz/WpfVectorBenchmark/blob/6513f1eb2cf4a6beb797df5ccc4f515a376c57d8/WpfVectorBenchmark/Benchmarks.cs

Risk

This changes the performance characteristics of matrix multiplication. Identity special cases will be marginally slower because they lose their special treatment, but the reduction in the number of conditional jumps should help overall.
Performance gains from the method becoming inlinable can be impressive, but inlining could also increase code size significantly in some scenarios. The outcome will largely depend on the usage patterns of each program.

hexawyz requested a review from a team as a code owner January 17, 2026 19:55
dotnet-policy-service bot added the "PR metadata" (Label to tag PRs, to facilitate with triage) and "Community Contribution" (A label for all community Contributions) labels Jan 17, 2026
hexawyz commented Jan 18, 2026

@dotnet-policy-service agree
