-
Notifications
You must be signed in to change notification settings - Fork 5k
Improve Math.BigMul to fix Decimal compare perf regression #115182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR aims to fix a regression in Decimal comparisons by improving the worst-case performance of Math.BigMul through reducing extra stack spilling.
- Removed unused TARGET_32BIT branch code in BigMul(uint, uint).
- Added conditional logic for non-ARM64 platforms and refactored BigMul methods to remove unsafe code while leveraging BMI2 support.
- Updated comments to reflect the current implementation and performance considerations.
I've updated the benchmark since it seems like i had mixed up the corelib.dll for the pr and my "old" branch. |
converting to draft since I want to run a quick test against calling the basic x86 mul as instrinct instead of bmi2 mulx |
Tagging subscribers to this area: @dotnet/area-system-runtime |
Improve the worst case performance of Math.BigMul which will hopefully resolve #112432 (regression i Decimal comparision)
This also allows Math.BigMul to do partial constant propagation (of the low part)
The regression seems to be caused by extra stack spilling caused by usage of MultiplyNoFlags
This PR applies the workaround suggested by @tevador in #11782 (comment) (se link for benchmarks) to Math.BigMul, which removes the extra push/pop which will hopefully restore performance.
See sharplab.io for codegen (Change BigMul2 to BigMul for code in main)
New

Main

Benchmark
Se in Bmi2.MultiplyNoFlags issues #11782 (comment)
Benchmark for orderby with and without this pr (main is commit where decimal code was merged)
builting x86 multiply approach (x86_multiply branch)
I've updated the code forther to make Math.BigMul a single mul instruction on all x64 platforms (not just those with BMI2) and the code
Benchmarks
currently rerunning after rebase ...
Exampels of new code
Produces the following (code from just before rebase)
Current code according to goodbolt goodbolt
Further code samples with array access