Raise bucket weights to the power four in the historical model
Using the results of probes sent once a minute to a random node
in the network for a random amount (within a reasonable range), we
analyzed the accuracy of our resulting success probability
estimates under various PDFs across both the historical and
live-bounds models.
For each candidate PDF (as well as other parameters, including the
histogram bucket weight), we used the
`min_zero_implies_no_successes` fudge factor in
`success_probability`, as well as a total probability multiple
fudge factor, to tune both the historical success model and the a
priori model to be neither too optimistic nor too pessimistic (as
measured by the relative log-loss between succeeding and failing
hops in our sample data).
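
For reference, the log-loss metric here is the standard average
negative log-likelihood over observed hop outcomes. A minimal
sketch of it in Rust follows; the function name and sample format
are assumptions for illustration, not the actual evaluation
harness:

```rust
/// Average negative log-likelihood over (predicted probability, outcome) pairs.
fn log_loss(samples: &[(f64, bool)]) -> f64 {
	let total: f64 = samples.iter().map(|&(prob, succeeded)| {
		// Clamp to avoid ln(0) on fully-confident mispredictions.
		let p = prob.clamp(1e-9, 1.0 - 1e-9);
		if succeeded { -p.ln() } else { -(1.0 - p).ln() }
	}).sum();
	total / samples.len() as f64
}
```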
We then compared the resulting log-loss for the historical success
model and selected the candidate PDF with the lowest log-loss,
skipping a few candidates that achieved similar log-loss but
required more extreme constants (such as a power of 11 with a
higher `min_zero_implies_no_successes` penalty).
Somewhat surprisingly (to me at least), the (fairly strongly)
preferred model was one where the bucket weights in the historical
histograms are raised to a higher power. In the current design, the
weights are effectively squared, as we multiply the minimum- and
maximum-histogram bucket weights together before summing the
weight*probability products.
Here we multiply the weights together once more before the
addition, raising them to the 4th power overall. While the
simulation runs seemed to prefer slightly stronger weighting than
the 4th power used here, the difference wasn't substantial
(log-loss 0.5058 vs. 0.4941), so we do the simpler single extra
multiply here.
Note that if we did this naively we'd run out of bits in our
arithmetic operations: we have 16-bit buckets, which, when raised
to the 4th power, can fully fill a 64-bit integer. Additionally,
when looking at the 0th min-bucket we occasionally add up to 32
weights together before multiplying by the probability, requiring
an additional five bits.
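
The bit-width argument can be checked directly. The following
standalone snippet is only a demonstration of the overflow, not
code from the scorer:

```rust
fn main() {
	// A maximally-full 16-bit bucket weight...
	let w = u16::MAX as u64; // 65535
	// ...raised to the 4th power just barely fits in 64 bits.
	let w4 = w.pow(4); // ~1.8446e19, where u64::MAX is ~1.8447e19
	assert!(w4 > u64::MAX / 2);
	// Summing 32 such weights needs ~5 more bits and overflows a u64.
	assert!(w4.checked_mul(32).is_none());
}
```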
Instead, we move to using floats during our histogram walks, which
further avoids some float -> int conversions by letting us retain
the floats we're already using to calculate probability.
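
Roughly, the resulting walk looks like the following sketch. The
bucket layout, names, and the `bucket_probability` callback are
assumptions for illustration; the real implementation is structured
differently:

```rust
// A sketch of 4th-power bucket weighting with float accumulation.
fn historical_success_probability(
	min_buckets: &[u16; 32],
	max_buckets: &[u16; 32],
	// Success probability given liquidity bounds falling in buckets
	// (min_idx, max_idx), derived from the selected PDF (assumed helper).
	bucket_probability: impl Fn(usize, usize) -> f64,
) -> Option<f64> {
	let mut weighted_prob = 0.0f64;
	let mut total_weight = 0.0f64;
	for (min_idx, min_count) in min_buckets.iter().enumerate() {
		if *min_count == 0 { continue; }
		for (max_idx, max_count) in max_buckets.iter().enumerate() {
			if *max_count == 0 { continue; }
			// The min*max pair weight was "effectively squared"; multiplying
			// it by itself once more yields the 4th power described above.
			let pair = *min_count as f64 * *max_count as f64;
			let weight = pair * pair;
			weighted_prob += weight * bucket_probability(min_idx, max_idx);
			total_weight += weight;
		}
	}
	if total_weight > 0.0 { Some(weighted_prob / total_weight) } else { None }
}
```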
Across the last handful of commits, the increased pessimism more
than makes up for the increased runtime complexity, leading to a
40-45% pathfinding speedup on a Xeon Silver 4116 and a 25-45%
speedup on a Xeon E5-2687W v3.
Thanks to @twood22 for being a sounding board and helping analyze
the resulting PDF.