[DO NOT MERGE] perf run for rustc-hash candidate (folded multiply) #136095

orlp · 2025-01-26T14:40:22Z

rustbot · 2025-01-26T14:40:29Z

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-01-26T14:40:32Z

rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing src/librustdoc/json/conversions.rs; otherwise, make sure you bump the FORMAT_VERSION constant.

cc @CraftSpider, @aDotInTheVoid, @Enselic, @obi1kenobi

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

Some changes occurred in exhaustiveness checking

cc @Nadrieril

Noratrieb · 2025-01-26T14:44:10Z

@bors try @rust-timer queue

bors · 2025-01-26T14:45:25Z

⌛ Trying commit 7051d3c with merge bf4c342...

…, r=<try> [DO NOT MERGE] perf run for rustc-hash candidate (folded multiply) See rust-lang/rustc-hash#55.

Noratrieb · 2025-01-26T15:31:40Z

@bors try @rust-timer queue


bors · 2025-01-26T15:32:52Z

⌛ Trying commit 3a6da61 with merge 3b20532...

bors · 2025-01-26T17:08:27Z

💔 Test failed - checks-actions

Noratrieb · 2025-01-26T17:12:04Z

this should also allow you to do your own try builds (but not perf I think)
@bors delegate+ try @rust-timer queue

bors · 2025-01-26T17:12:08Z

✌️ @orlp, you can now approve this pull request!

If @Noratrieb told you to "r=me" after making some further change, please make that change, then do @bors r=@Noratrieb

bors · 2025-01-26T17:12:09Z

📋 Looks like this PR is still in progress, ignoring approval.

Hint: Remove [DO NOT MERGE] from this PR's title when it is ready for review.

Noratrieb · 2025-01-26T17:12:58Z

sorry bors it looks like you're not up to the task
@bors try


bors · 2025-01-26T17:13:17Z

⌛ Trying commit b0fe3f1 with merge d694429...

rust-log-analyzer · 2025-01-26T17:17:25Z

The job mingw-check-tidy failed! Check out the build log.

Possible cause of the failure (guessed by this bot):

info: removing rustup binaries
info: rustup is uninstalled
##[group]Image checksum input
mingw-check-tidy
# We use the ghcr base image because ghcr doesn't have a rate limit
# and the mingw-check-tidy job doesn't cache docker images in CI.
FROM ghcr.io/rust-lang/ubuntu:22.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
  g++ \
  make \
---

COPY host-x86_64/mingw-check/validate-toolstate.sh /scripts/
COPY host-x86_64/mingw-check/validate-error-codes.sh /scripts/

# NOTE: intentionally uses python2 for x.py so we can test it still works.
# validate-toolstate only runs in our CI, so it's ok for it to only support python3.
ENV SCRIPT TIDY_PRINT_DIFF=1 python2.7 ../x.py test \
           --stage 0 src/tools/tidy tidyselftest --extra-checks=py,cpp
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
#    pip-compile --allow-unsafe --generate-hashes reuse-requirements.in
---
#12 2.791 Building wheels for collected packages: reuse
#12 2.792   Building wheel for reuse (pyproject.toml): started
#12 3.002   Building wheel for reuse (pyproject.toml): finished with status 'done'
#12 3.003   Created wheel for reuse: filename=reuse-4.0.3-cp310-cp310-manylinux_2_35_x86_64.whl size=132719 sha256=be6760d5849de4a58bbe52b85ca57a55f2b32b518b17029a5ad2e530db0d4303
#12 3.003   Stored in directory: /tmp/pip-ephem-wheel-cache-afzvb1t9/wheels/3d/8d/0a/e0fc6aba4494b28a967ab5eaf951c121d9c677958714e34532
#12 3.006 Installing collected packages: boolean-py, binaryornot, tomlkit, reuse, python-debian, markupsafe, license-expression, jinja2, chardet, attrs
#12 3.403 Successfully installed attrs-23.2.0 binaryornot-0.4.4 boolean-py-4.0 chardet-5.2.0 jinja2-3.1.4 license-expression-30.3.0 markupsafe-2.1.5 python-debian-0.1.49 reuse-4.0.3 tomlkit-0.13.0
#12 3.403 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
#12 3.930 Collecting virtualenv
#12 3.971   Downloading virtualenv-20.29.1-py3-none-any.whl (4.3 MB)
#12 4.089      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.3/4.3 MB 36.8 MB/s eta 0:00:00
#12 4.128 Collecting distlib<1,>=0.3.7
#12 4.133   Downloading distlib-0.3.9-py2.py3-none-any.whl (468 kB)
#12 4.144      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 469.0/469.0 KB 50.7 MB/s eta 0:00:00
#12 4.176 Collecting platformdirs<5,>=3.9.1
#12 4.181   Downloading platformdirs-4.3.6-py3-none-any.whl (18 kB)
#12 4.218 Collecting filelock<4,>=3.12.2
#12 4.224   Downloading filelock-3.17.0-py3-none-any.whl (16 kB)
#12 4.304 Installing collected packages: distlib, platformdirs, filelock, virtualenv
#12 4.488 Successfully installed distlib-0.3.9 filelock-3.17.0 platformdirs-4.3.6 virtualenv-20.29.1
#12 DONE 4.6s

#13 [7/8] COPY host-x86_64/mingw-check/validate-toolstate.sh /scripts/
#13 DONE 0.0s
---
DirectMap4k:      114624 kB
DirectMap2M:     8273920 kB
DirectMap1G:    10485760 kB
##[endgroup]
Executing TIDY_PRINT_DIFF=1 python2.7 ../x.py test            --stage 0 src/tools/tidy tidyselftest --extra-checks=py,cpp
+ TIDY_PRINT_DIFF=1 python2.7 ../x.py test --stage 0 src/tools/tidy tidyselftest --extra-checks=py,cpp
    Finished `dev` profile [unoptimized] target(s) in 0.05s
##[endgroup]
WARN: currently no CI rustc builds have rustc debug assertions enabled. Please either set `rust.debug-assertions` to `false` if you want to use download CI rustc or set `rust.download-rustc` to `false`.
downloading https://static.rust-lang.org/dist/2025-01-08/rustfmt-nightly-x86_64-unknown-linux-gnu.tar.xz
---
   Compiling build_helper v0.1.0 (/checkout/src/build_helper)
   Compiling regex v1.11.1
   Compiling termcolor v1.4.1
   Compiling miropt-test-tools v0.1.0 (/checkout/src/tools/miropt-test-tools)
   Compiling rustc-hash v2.2.0 (https://github.com/orlp/rustc-hash?rev=2ccde1ec8a948b5463011d3c7363c273fc2bb80e#2ccde1ec)
   Compiling tidy v0.1.0 (/checkout/src/tools/tidy)
    Finished `release` profile [optimized] target(s) in 31.20s
##[endgroup]
fmt check
fmt: checked 5812 files
tidy check
tidy error: invalid source: "git+https://github.com/orlp/rustc-hash?rev=2ccde1ec8a948b5463011d3c7363c273fc2bb80e#2ccde1ec8a948b5463011d3c7363c273fc2bb80e"

thread 'deps (.)' panicked at src/tools/tidy/src/deps.rs:590:24:
cmd.exec() failed with `cargo metadata` exited with an error:     Updating crates.io index
    Updating git repository `https://github.com/orlp/rustc-hash`
error: the lock file /checkout/src/tools/rust-analyzer/Cargo.lock needs to be updated but --locked was passed to prevent this
If you want to try to generate the lock file without accessing the network, remove the --locked flag and use --offline instead.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread 'main' panicked at src/tools/tidy/src/main.rs:60:49:
called `Result::unwrap()` on an `Err` value: Any { .. }
Command has failed. Rerun with -v to see more details.
  local time: Sun Jan 26 17:17:13 UTC 2025
  network time: Sun, 26 Jan 2025 17:17:13 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

bors · 2025-01-26T18:57:43Z

☀️ Try build successful - checks-actions
Build commit: d694429 (d6944297e3c614ac689e05c9829840bd153ad632)

rust-timer · 2025-01-26T20:13:13Z

Finished benchmarking commit (d694429): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.1%, 0.7%]	71
Regressions ❌ (secondary)	0.3%	[0.1%, 1.1%]	56
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.4%, -0.4%]	2
All ❌✅ (primary)	0.3%	[0.1%, 0.7%]	71

Max RSS (memory usage)

Results (primary -0.6%, secondary 2.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.6%	[1.0%, 3.7%]	5
Improvements ✅ (primary)	-0.6%	[-0.6%, -0.6%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.6%	[-0.6%, -0.6%]	1

Cycles

Results (primary 1.4%, secondary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.4%	[0.8%, 1.9%]	6
Regressions ❌ (secondary)	2.4%	[1.5%, 3.5%]	12
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.4%	[0.8%, 1.9%]	6

Binary size

Results (secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.0%	[-0.0%, -0.0%]	6
All ❌✅ (primary)	-	-	0

Bootstrap: 774.799s -> 773.882s (-0.12%)
Artifact size: 328.16 MiB -> 325.45 MiB (-0.83%)
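
The percentages on the two summary lines above are plain relative changes. A minimal sketch of that arithmetic follows; `percent_change` is a hypothetical helper name, not part of the perf tooling:

```rust
/// Relative change from `before` to `after`, expressed in percent.
/// A negative result means the value went down.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

Formatting `percent_change(774.799, 773.882)` and `percent_change(328.16, 325.45)` with `{:.2}` reproduces the two percentages reported above.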

steffahn · 2025-01-26T21:53:24Z

I wonder whether the reordering to self.hash.wrapping_mul(K).wrapping_add(i) turns out to be relevant for any surprising performance differences.

As far as I understand, the idea behind the wrapping_mul(K).wrapping_add(i) order is that the initial 0 * K calculation can be removed by the optimizer; then you do the same as before (alternating add(i_n) and mul(K)), and in the end you have saved one mul(K) multiplication, which "makes up for" some of the cost of the bit-extending multiplication+XOR.
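
The algebra behind that reasoning can be checked directly. The sketch below is not the rustc-hash source: `K` is an odd 64-bit constant picked for illustration, and `old_order` / `new_order` are hypothetical helper names. Starting from a zero state, the old state after n words should equal the new state multiplied by K once more, which is the single multiplication the new ordering leaves for the finalizer.

```rust
/// Illustrative odd multiplier. This is an assumption for the sketch,
/// not necessarily the constant the crate uses.
const K: u64 = 0xf135_7aea_2e62_a9c5;

/// Old ordering: add the word, then multiply, once per step.
fn old_order(words: &[u64]) -> u64 {
    words
        .iter()
        .fold(0u64, |h, &i| h.wrapping_add(i).wrapping_mul(K))
}

/// New ordering: multiply, then add. The very first step computes 0 * K,
/// which an optimizer can in principle fold away.
fn new_order(words: &[u64]) -> u64 {
    words
        .iter()
        .fold(0u64, |h, &i| h.wrapping_mul(K).wrapping_add(i))
}
```

For any input, `old_order(words)` equals `new_order(words).wrapping_mul(K)`, because wrapping arithmetic is ordinary arithmetic modulo 2^64.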

But what if the initial 0 * K isn't actually free? And/or the difference in order per hashing "step", wrapping_mul(K).wrapping_add(i) vs. wrapping_add(i).wrapping_mul(K), could perhaps, even with inlining, happen to produce code that's somehow harder to optimize well…

I suppose we can test the performance of a variant of this change in which the new wrapping_mul(K).wrapping_add(i) in add_to_hash is kept, but the finalizer is just wrapping_mul(K) followed by the old rotate_left(ROTATE) finalizer. By the reasoning above, that version of the code should in theory make no difference to the status quo, provided optimization is doing its job effectively.
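
A rough sketch of that experiment, under stated assumptions: the struct names `StatusQuo` and `ReorderedOnly`, the helper `hash_both`, and the values of `K` and `ROTATE` are all illustrative rather than taken from the crate. Both hashers should return the same value for the same input, so any measured difference between them would come from code generation rather than from the hash function itself.

```rust
/// Illustrative constants. These are assumptions for the sketch,
/// not necessarily the values the crate uses.
const K: u64 = 0xf135_7aea_2e62_a9c5;
const ROTATE: u32 = 26;

/// Status quo: add-then-multiply per step, rotate in the finalizer.
#[derive(Default)]
struct StatusQuo {
    hash: u64,
}

impl StatusQuo {
    fn add_to_hash(&mut self, i: u64) {
        self.hash = self.hash.wrapping_add(i).wrapping_mul(K);
    }

    fn finish(&self) -> u64 {
        self.hash.rotate_left(ROTATE)
    }
}

/// Test variant: multiply-then-add per step, with the deferred
/// multiplication performed in the finalizer before the old rotate.
#[derive(Default)]
struct ReorderedOnly {
    hash: u64,
}

impl ReorderedOnly {
    fn add_to_hash(&mut self, i: u64) {
        self.hash = self.hash.wrapping_mul(K).wrapping_add(i);
    }

    fn finish(&self) -> u64 {
        self.hash.wrapping_mul(K).rotate_left(ROTATE)
    }
}

/// Feeds the same words to both hashers and returns both outputs.
fn hash_both(words: &[u64]) -> (u64, u64) {
    let mut a = StatusQuo::default();
    let mut b = ReorderedOnly::default();
    for &w in words {
        a.add_to_hash(w);
        b.add_to_hash(w);
    }
    (a.finish(), b.finish())
}
```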

…-changed-ordering-perf, r=<try>
[DO NOT MERGE] perf run for only mul/add-reordering parts of rust-lang#136095

See rust-lang#136095 (comment)

For good comparability with the performance of rust-lang#136095, I'm keeping the `rustc-hasher` in `rustc_type_ir` at the "newly modified 2.*"… just noting this down in case it might be a difference from master that matters much.

(link to the `rustc-hash` commits for context: rust-lang/rustc-hash@43e1790...1028035)


rustbot assigned Mark-Simulacrum Jan 26, 2025