Key hashing in const, remove atomics#669
Draft
hoxxep wants to merge 3 commits into metrics-rs:main from
Conversation
tobz (Member):

@hoxxep Just wanted to say that I haven't gone over this entirely yet, but from the surface-level description, this sounds like a nice win. 🎉
hoxxep (Contributor, Author):
@tobz seems to be! I want to do some more benchmarking to confirm it when I find the time, and then the two big decisions that will need your input are:
I can also optimise
This is a work in progress, but I thought I'd share an update for comments. Apologies that it's become a bit of an essay, but I think the direction is promising.
This PR:
- Builds `Key` in const, always caching `hash` and removing the need for atomics.
- Replaces `impl Hash for Key` with an implementation that simply hashes the cached hash value, via `write_u64(hash)`.
- Turns `KeyHasher` into a NoHashHasher-style hasher for the new `impl Hash for Key` implementation, while `DefaultHashable` instead uses `RapidHasher<'static>`, as nohash isn't suitable for generic hashing.
- Changes `recency.rs` to use hashbrown with `KeyHasher` (nohash) instead of std `HashMap` with SipHash. I need to benchmark the change to hashbrown here; if there are no performance gains I'll revert some of my changes to instead use std `HashMap` with `KeyHasher` (nohash).
- Adds a `const_cow` module inside `cow.rs`; this is a copy of the Deref implementation, but allows us to Deref in const.

`Key` hashing implementation

Continuing on from our chat on #651, I've completely overhauled the hashing of
`Key` to remove the atomics, pursuing absolute speed for the 99% case and assuming inputs are trusted (HashDoS is not a concern). I've opted for `rapidhash_v3_nano_inline` to hash each individual part of the key, assuming names and labels are most often under 48 bytes in length.

The hash value of the name is reused as the seed value for the labels, and the hash of the label key as the seed for the label values. This disambiguates names, label keys, and label values, so that swapping values between them produces different hash values, but it also creates a data dependency; we could go faster still if we didn't disambiguate (but then a user swapping a label key/value around would produce the same hash value). Let me know if you care about the disambiguation here?
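A minimal sketch of the pieces described above: a const-constructible `Key` with a cached hash, a `Hash` impl that forwards it via `write_u64`, and a NoHashHasher-style `KeyHasher` that passes it straight through. The struct layout and the FNV-1a placeholder hash are my assumptions; the real PR hashes with rapidhash in const.

```rust
use std::hash::{Hash, Hasher};

/// Sketch only: layout assumed, not the PR's actual code.
pub struct Key {
    name: &'static str,
    hash: u64, // computed in const at construction; no atomics/OnceCell
}

impl Key {
    pub const fn from_static_name(name: &'static str) -> Self {
        // FNV-1a as a const-evaluable placeholder; the PR uses
        // rapidhash_v3_nano_inline here.
        let bytes = name.as_bytes();
        let mut h = 0xcbf29ce484222325u64;
        let mut i = 0;
        while i < bytes.len() {
            h = (h ^ bytes[i] as u64).wrapping_mul(0x0000_0100_0000_01b3);
            i += 1;
        }
        Key { name, hash: h }
    }
}

/// `Hash` simply forwards the cached value.
impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash);
    }
}

/// NoHashHasher-style `KeyHasher`: stores the pre-computed u64 unchanged.
#[derive(Default)]
pub struct KeyHasher(u64);

impl Hasher for KeyHasher {
    fn finish(&self) -> u64 {
        self.0
    }
    fn write_u64(&mut self, n: u64) {
        self.0 = n;
    }
    fn write(&mut self, _bytes: &[u8]) {
        // Hashing arbitrary bytes is unsupported; this is the runtime-panic
        // risk mentioned in the TODOs for users hashing other types.
        panic!("KeyHasher only supports pre-hashed u64 values");
    }
}
```

This also illustrates the trade-off flagged in the TODOs: anything other than a `Key` fed through `KeyHasher` hits the `write` panic.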
Label hashes are combined by summing them. This ensures hashing labels is order-agnostic as per the previous implementation, but avoids having to sort them. Addition is used instead of XOR, as XOR would mean two identical labels would cancel each other out.
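The seeding and combining structure described above can be sketched as follows. `generate_key_hash` is named in this PR, but this body is my reconstruction, and `hash_with_seed` is a stand-in for `rapidhash_v3_nano_inline` (FNV-1a with a seed mixed in); only the chaining/summing shape matters.

```rust
/// Placeholder seeded hash; the real implementation uses rapidhash.
const fn hash_with_seed(bytes: &[u8], seed: u64) -> u64 {
    let mut h = seed ^ 0xcbf29ce484222325;
    let mut i = 0;
    while i < bytes.len() {
        h = (h ^ bytes[i] as u64).wrapping_mul(0x0000_0100_0000_01b3);
        i += 1;
    }
    h
}

/// Chained-seed scheme: the name hash seeds each label key, and each
/// label-key hash seeds its label value, so moving a string between
/// positions changes the result. Label hashes are combined with wrapping
/// addition: order-agnostic without sorting, and (unlike XOR) two
/// identical labels don't cancel out.
pub fn generate_key_hash(name: &str, labels: &[(&str, &str)]) -> u64 {
    let name_hash = hash_with_seed(name.as_bytes(), 0);
    let mut hash = name_hash;
    for (key, value) in labels {
        let key_hash = hash_with_seed(key.as_bytes(), name_hash);
        let label_hash = hash_with_seed(value.as_bytes(), key_hash);
        hash = hash.wrapping_add(label_hash);
    }
    hash
}
```

Dropping the chaining (hashing every part with the same seed) would remove the data dependency and go faster, at the cost of the key/value disambiguation.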
TODO
- Benchmark the `recency.rs` hashbrown change.
- Check the `const key` benchmark assembly without `const { ... }` wrappers.
- Decide on disambiguation in `generate_key_hash`; we could improve performance further if we're willing to sacrifice disambiguating between names, label keys, and label values.
- Decide whether to change `KeyHasher` to a no-hash hasher. Any users using `KeyHasher` on other objects might face panics at runtime.
- Review `#[inline]` and `#[inline(always)]` usage.

Bench profile
I've changed the default bench profile to ensure consistency between benchmarking runs.
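The settings named in the results section (`codegen-units = 1` and `lto = true`) would look like this in `Cargo.toml`; this is a sketch, and the PR's actual profile may set more than these two keys:

```toml
[profile.bench]
lto = true
codegen-units = 1
```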
Decrease in performance for `const key` benchmarks

I seem to have to wrap the benchmarks in `const { ... }` to convince the compiler to make the Key a simple load operation. I haven't checked what assembly it otherwise generates.

This applies more to the old bench profile, so it doesn't appear in the full benchmarks below, but I was seeing a slight decrease in performance (2ns instead of 1ns) in the `const key` benchmarks (the new bench profile is always around 2ns). The main difference comes from the new code loading the compile-time generated Key from static memory, while the old code constructed the Key inline with immediate zeros. It doesn't appear to affect the performance of other benchmarks.

Old code (keyhasher-rapidhash branch):

New code (keyhasher-const branch):
Assembly comparison
Old inner loop:
New inner loop:
Relevant benchmark results
Run on an M1 Max with the new bench profile (`codegen-units = 1` and `lto = true`). This compares my `keyhasher-rapidhash` branch (PR #651) to `keyhasher-const` (this PR), so the improvement from switching to rapidhash is already accounted for.