bimodal pathfinding probability improvements #8330
Conversation
Force-pushed from 320291c to ff273aa
Looking good. Will do a second round to reload the bimodal model and understand the implications. Meanwhile I left some questions; I also think this is missing a rebase (checked out locally and ignored files popped up).
@@ -40,6 +40,12 @@
 a graceful shutdown of LND during the main chain backend sync check in certain
 cases.

* [Bimodal pathfinding probability
is this targeting 19 or 20?
I changed it to target 20, but I think it could be included earlier since it's basically a bug fix PR.
largeAmount = lnwire.MilliSatoshi(5_000_000)
capacity = lnwire.MilliSatoshi(10_000_000)
scale = lnwire.MilliSatoshi(400_000)
smallAmount = lnwire.MilliSatoshi(400_000_000)
Q: is the commit msg saying it doesn't matter what values to set here?
Not entirely different values, but one should be able to multiply those values by the same factor (here 1000) and the tests should stay invariant, because that factor cancels out if you look at `primitive`. I changed those values because the default config of the bimodal scale uses 300000 sat, and 400000 sat is closer to that.
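To illustrate that invariance, a minimal sketch of such a check, assuming the `probabilityFormula` signature shown in the fuzz test later in this thread; the test name and the concrete amounts here are made up:

```go
// TestScaleInvariance is a hypothetical check that multiplying the
// capacity, the success/fail amounts, the payment amount and the scale
// by the same factor leaves the probability unchanged.
func TestScaleInvariance(t *testing.T) {
	const k = 1000

	base := BimodalEstimator{
		BimodalConfig: BimodalConfig{BimodalScaleMsat: 400_000},
	}
	scaled := BimodalEstimator{
		BimodalConfig: BimodalConfig{BimodalScaleMsat: 400_000 * k},
	}

	p1, err := base.probabilityFormula(
		lnwire.MilliSatoshi(10_000_000),
		lnwire.MilliSatoshi(1_000_000),
		lnwire.MilliSatoshi(8_000_000),
		lnwire.MilliSatoshi(2_000_000),
	)
	require.NoError(t, err)

	p2, err := scaled.probabilityFormula(
		lnwire.MilliSatoshi(10_000_000*k),
		lnwire.MilliSatoshi(1_000_000*k),
		lnwire.MilliSatoshi(8_000_000*k),
		lnwire.MilliSatoshi(2_000_000*k),
	)
	require.NoError(t, err)

	// The common factor k cancels out in primitive, so both
	// probabilities should agree up to float tolerance.
	require.InDelta(t, p1, p2, 1e-9)
}
```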
routing/probability_bimodal_test.go
@@ -309,6 +309,20 @@ func TestIntegral(t *testing.T) {
	}
}

// TestSpecialCase tests a value combination found by fuzz tests.
func TestSpecialCase(t *testing.T) {
This name is too generic and gives very little info about what's being tested here... maybe add a link to #9085?
Totally, I changed it and added some description.
Force-pushed from ff273aa to 8d96893
Pull Request Overview
This PR improves the bimodal pathfinding probability calculations by fixing normalization bugs and optimizing performance when handling outdated mission control values. Key changes include:
- Updated constants and tolerance handling in the test suite.
- Revised the normalization formula in the primitive function and modified fallback logic in probabilityFormula.
- Added new tests for small scale values and a fuzz-triggered edge case.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
File | Description |
---|---|
routing/probability_bimodal_test.go | Updated amount constants, removed per-test tolerance in favor of a default value, and added new tests. |
routing/probability_bimodal.go | Revised normalization in the primitive function and updated fallback logic when success exceeds fail. |
docs/release-notes/release-notes-0.20.0.md | Added release notes and contributor details related to the bimodal improvements. |
Comments suppressed due to low confidence (2)
routing/probability_bimodal.go:516
- [nitpick] Consider updating the log message to improve grammar (e.g., 'fail amount (%v) is smaller than or equal to the success amount (%v) for capacity (%v)').
log.Tracef("fail amount (%v) is smaller than or equal the " + "success amount (%v) for capacity (%v)", failAmountMsat, successAmountMsat, capacityMsat)
routing/probability_bimodal.go:450
- The updated primitive function now includes an additional term 'x/(c*s)' in its numerator; please verify that the revised indefinite integral and normalization are mathematically correct and consistent with the intended bimodal distribution behavior.
return (-exs + excs + x/(c*s)) / norm
	// We end up with the primitive function of the normalized P(x).
-	return (-exs + excs) / norm
+	return (-exs + excs + x/(c*s)) / norm
This is a very interesting mix of bimodal and uniform distribution - can `c` here ever be 1? just thinking about edge cases here.
Still curious about the `c`, what happens if it is 1?
I think my comment got lost here. In practice that would mean that the combined distribution would be dominated by a uniform one. We could use a small constant as well here, but I wanted to avoid adding that. The `1/c` term just represents a number that should be small with respect to the bimodal peaks, and it cancels out in the norm calculation such that we only have `1/s` left.
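For reference, a sketch of the regularized model as it is described in this thread and in the commit message further below (s the scale, c the capacity; this is my reading of the thread, not the verbatim source):

```latex
% Regularized balance distribution on [0, c]:
P(x) \propto e^{-x/s} + e^{(x-c)/s} + \frac{1}{c}

% The 1/c summand integrates to exactly 1 over [0, c], so:
\int_0^c P(x)\,\mathrm{d}x = 2s\left(1 - e^{-c/s}\right) + 1
                           = s\left(-2e^{-c/s} + 2 + \tfrac{1}{s}\right)

% Primitive of the normalized P(x), matching the code line
% return (-exs + excs + x/(c*s)) / norm:
F(x) = \frac{-e^{-x/s} + e^{(x-c)/s} + \frac{x}{cs}}
            {-2e^{-c/s} + 2 + \frac{1}{s}}
```

The `1/c` term only dominates where the exponential peaks have underflowed to zero; if `c` were comparable to `s`, the peaks stay finite everywhere and the distribution shifts toward uniform, as discussed above.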
Force-pushed from 8d96893 to 6c1a32b
For the itest failures, the neutrino one is a known issue; the other one is weird and looks like it's related to the `listunspent` bug. Could you rebase to the latest master to include the fix?
	successAmount, capacity)

-	successAmount = capacity
+	// Mission control may have some outdated values with regard to the
nice doc💯
"%s to capacity %s", successAmountMsat, | ||
failAmount, capacityMsat) | ||
|
||
successAmount = capacity - 1 |
hmm what's the reason for minus 1?
Agree, it's not that elegant, but it's to make both the success amount and the fail amount (where the fail amount must be one unit larger than the success amount) fall into the range [0, c] together. In principle the fail amount could be allowed to be c+1, but I decided to always use c as the upper bound in the bimodal model, since it makes formulas and code nicer and we never expect to be able to send the full capacity due to the balance reserve anyhow.
so the successAmt needs to be -1 smaller for the reNorm calculation to not be 0, basically?
> so the successAmt needs to be -1 smaller for the reNorm calculation to not be 0, basically?
right, I think that's another way of saying that it would be a logical contradiction to claim you can send the same amount that you failed to send
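A sketch of why, using the primitive F from above and the conditional renormalization visible in the code fragments later in this thread (notation assumed):

```latex
% Conditional success probability given learned success/fail amounts
% s_a < f_a ("prob" and "reNorm" in the code):
P(a \mid s_a, f_a) = \frac{F(f_a) - F(a)}{F(f_a) - F(s_a)}

% For f_a = s_a the denominator (reNorm) is zero, so we require
% f_a \ge s_a + 1, and with f_a \le c this forces s_a \le c - 1.
```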
successAmount = 0
In this case should we keep the `failAmount` the same and only let it be changed under `if failAmount > capacity {`?
This expresses that we don't know which of the fail/success amounts has the true info, and overriding both expresses that we are unsure about both, which is why I prefer to have it in a symmetric way. An idea here is that we could only override the more outdated observation with the bounds, but I think we should not encounter that line anyhow because mission control enforces the condition, so that would be over-engineered.
Force-pushed from 6c1a32b to 8131b8a
LGTM🌹 Since this is a bug fix, I wonder if it's possible to include it in 0.19? cc @saubyk
@bitromortac, remember to re-request review from reviewers when ready
Very nice documentation of the problem. Had some minor comments; if they are resolved it's good to go.
routing/probability_bimodal_test.go
// balance lies somewhere in the middle of the channel, a surprising result for
// the bimodal model, which predicts two distinct modes at the edges and
// therefore has numerical issues in the middle.
func TestBimodalFuzz9085(t *testing.T) {
LND has the possibility to add known problematic seeds (rather than fuzz-generated inputs) into the concrete fuzz function:

```go
// FuzzProbability checks that we don't encounter errors related to NaNs.
func FuzzProbability(f *testing.F) {
	// Predefined seed values resulting from
	// https://github.com/lightningnetwork/lnd/issues/9085.
	f.Add(
		uint64(1000000000),
		uint64(300000000),
		uint64(400000000),
		uint64(300000000),
	)

	estimator := BimodalEstimator{
		BimodalConfig: BimodalConfig{BimodalScaleMsat: scale},
	}

	f.Fuzz(func(t *testing.T, capacity, successAmt, failAmt, amt uint64) {
		if capacity == 0 {
			return
		}

		_, err := estimator.probabilityFormula(
			lnwire.MilliSatoshi(capacity),
			lnwire.MilliSatoshi(successAmt),
			lnwire.MilliSatoshi(failAmt), lnwire.MilliSatoshi(amt),
		)

		require.NoError(t, err, "c: %v s: %v f: %v a: %v", capacity,
			successAmt, failAmt, amt)
	})
}
```
or we add a new file in the lnd-fuzz directory. I think we should find a good strategy for how we handle those cases resulting from fuzzing. cc @morehouse
Nice suggestion, TIL 🎉. I have added the seed. We also don't lose coverage since the other test I introduced tests the same.
@@ -78,7 +81,6 @@ func TestSuccessProbability(t *testing.T) {
	failAmount: capacity,
	amount: smallAmount,
	expectedProbability: 0.684,
Q: May I ask how you came up with these values? Did you calculate the probability via other means and then just compare?
Yeah, not too great, I didn't calculate the number independently; it serves more as a way to pin the behavior and to spot changes. The number itself is a sanity check however, since it's larger than the 0.5 of the "no info, large amount" case. This is expected because for an amount that is similar to the scale you expect to find some leftovers, increasing the success probability.
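For what it's worth, the value can be checked by hand with the regularized model under the test constants, assuming c = 10,000,000, s = 400,000 and a = 400,000 (all msat), no prior success and a fail amount equal to the capacity:

```latex
P(a) = \frac{\int_a^c P(x)\,\mathrm{d}x}{\int_0^c P(x)\,\mathrm{d}x}
     = \frac{s\left(e^{-a/s} - e^{-c/s}\right)
           + s\left(1 - e^{(a-c)/s}\right) + \frac{c-a}{c}}
            {2s\left(1 - e^{-c/s}\right) + 1}
     \approx \frac{147{,}152 + 400{,}000 + 0.96}{800{,}001}
     \approx 0.684
```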
routing/probability_bimodal_test.go
_, err := estimator.probabilityFormula(
	capacity, successAmount, failAmount, amtCloseSuccess,
)
require.ErrorContains(t, err, "normalization factor is zero")
Interesting that it only fails in the reNorm calculation, not already when calculating the prob: `prob := p.integral(capacity, amount, failAmount)`
I think both calculations for `prob` and `reNorm` don't really "fail", but they return `float64(0)`, not some non-zero small amount, because `ecs := math.Exp(-c / s)` and `exs := math.Exp(-x / s)` both give `float64(0)`, since `c/s` and `x/s` are very large. So in that sense both calculations are almost correct, but due to the division by plain zero we fail.
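A minimal, self-contained illustration of that underflow (the numbers are arbitrary, just chosen so the exponent is far below float64 range):

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// A capacity much larger than the scale pushes the exponent far
	// below what float64 can represent, so Exp returns an exact zero.
	c, s := 1e9, 1.0

	fmt.Println(math.Exp(-c / s))      // 0
	fmt.Println(math.Exp(-c/s) == 0.0) // true: exact float64 zero
}
```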
require.NoError(t, err)
require.InDelta(t, 0.0, p, defaultTolerance)

// In the region where the bimodal model doesn't give good forecasts, we
Q: could you describe where the region is in which the bimodal model is not good? I mean, it is very dependent on the constants we use for the capacity and scale, or? I also wonder why it scales linearly; I thought it's more uniform now where the bimodal falls off too sharply?
ahh ok, given the integral the uniform distribution becomes linear, understood. But still interesting that at 25% into the channel-balance uncertainty we already have almost only the uniform part of the equation having the most effect. Probably needs some time to get an intuition for these values.
right, it depends on the parameter combination: if you choose a huge `s`, every bimodal distribution will be a uniform one, since it doesn't matter how large the capacity is, and this smears out the bimodal peaks.

If `s` is (very) small compared to `c`, the probability will drop quickly. If we then learn that the balance range is within the success/fail amounts range where the non-normalized probability is ~zero, we would still be able to retain the bimodal distribution within the newly learned range, now starting at the success/fail amounts, if we had full numerical precision. Because the numerical values of the exponentials drop to zero if the scale is very small, at amounts >> `s` we get the renormalization error. I think uncovering this error was nice, because keeping the bimodal distribution for the learned amounts isn't great and falling back to a uniform distribution is better (since it was demonstrated that a bimodal model doesn't hold).
"%s to capacity %s", successAmountMsat, | ||
failAmount, capacityMsat) | ||
|
||
successAmount = capacity - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the successAmt needs to be -1 smaller for the reNorm calculation to not be 0 basically ?
The bimodal model doesn't depend on the unit, which is why updating to more realistic values doesn't require changes in tests. We pin the scale in the fuzz test to not invalidate the corpus.
This test demonstrates an error found in a fuzz test by adding a previously found seed, which will be fixed in an upcoming commit. The following fuzz test is expected to fail: go test -v -fuzz=Prob ./routing/
This test demonstrates that the current bimodal model leads to numerical inaccuracies in a certain regime of successes and failures.
If the success and fail amounts indicate that a channel doesn't obey a bimodal distribution, we fall back to a uniform/linear success probability model. This also helps to avoid numerical normalization issues with the bimodal model. This is achieved by adding a very small summand, 1/c, to the balance distribution P(x) ~ exp(-x/s) + exp((x-c)/s), which helps to regularize the probability distribution. The distribution becomes finite for intermediate balances where the exponentials would otherwise be evaluated to an exact (float) zero. This regularization is effective in edge cases and leads to falling back to a uniform model should the bimodal model fail. This changes the normalization to s * (-2 * exp(-c/s) + 2 + 1/s) and adds an extra term x/(c*s) to the primitive function. The previously added fuzz seed is expected to be resolved with this.
We skip the evaluation of probabilities when the amount is lower than the last success amount, as the probability would be evaluated to 1 in that case.
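A minimal sketch of that shortcut, with the variable names assumed from the diff fragments above; the actual guard in the patch may look different:

```go
// If the amount is not larger than an amount that previously
// succeeded, the success probability is 1 and the (more expensive)
// integral evaluation can be skipped.
if amt <= successAmount {
	return 1.0, nil
}
```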
If we encounter invalid mission control data, we fall back to no knowledge about the node pair.
Mission control may have outdated success/failure amounts for node pairs that have channels with differing capacities. In that case we assume to still find the liquidity as before and rescale the amounts to the according range.
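Putting the fragments from this thread together, the fallback could be sketched as follows; the names and the exact ordering are assumptions based on the diff snippets above, not the literal patch:

```go
// Clamp possibly outdated mission control amounts into [0, capacity].
if successAmount >= capacity {
	// Stay one unit below the capacity so that failAmount can remain
	// strictly larger than successAmount within the range.
	successAmount = capacity - 1
}
if failAmount > capacity {
	failAmount = capacity
}

// If the data is contradictory (a failure at or below a success),
// fall back to no knowledge about the node pair.
if failAmount <= successAmount {
	successAmount = 0
	failAmount = capacity
}
```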
Force-pushed from 8131b8a to 328b28d
LGTM
Thanks for the reviews @ziggie1984 and @yyforyongyu, I'm still performing some tests.
I'm done with testing, this is ready. (Still should maybe decide if in 19.x or 20).
Force-pushed from 328b28d to 86249fb
Updated to address the 0.19 release notes.
Change Description
Fixes normalization bugs with respect to large channels and situations where the channel doesn't obey a bimodal distribution (fixes #8263 and #9085).
The probability calculation is avoided for amounts smaller than the last success amount in order to speed things up. This also fixes a case where mission control has outdated values (fixes #7553).