Don't bump the `next_node_counter` when using a removed counter #3367
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #3367 +/- ##
==========================================
+ Coverage 89.61% 89.64% +0.03%
==========================================
Files 127 127
Lines 103533 104218 +685
Branches 103533 104218 +685
==========================================
+ Hits 92778 93426 +648
- Misses 8056 8107 +51
+ Partials 2699 2685 -14
Wow, great find!
If we manage to pull a `node_counter` from `removed_node_counters` for reuse, `add_channel_between_nodes` would `unwrap_or` with the `next_node_counter`-incremented value. This visually looks right, except that the argument to `unwrap_or` is always evaluated, causing us to always increment `next_node_counter` even if we don't use the result.
This will result in the `node_counter`s always growing any time we add a new node to our graph, leading to somewhat larger memory usage when routing and a debug assertion failure in `test_node_counter_consistency`.
The fix is trivial: this is what `unwrap_or_else` is for.
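For readers less familiar with the distinction, here is a minimal standalone sketch of the eager-versus-lazy behavior described above. The function names (`assign_eager`, `assign_lazy`, `check_eager_vs_lazy`) are hypothetical and the free-standing `Vec` and `AtomicUsize` only stand in for `removed_node_counters` and `next_node_counter`; this is not the actual `NetworkGraph` code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Buggy shape: the argument to `unwrap_or` is evaluated before the call,
/// so `fetch_add` runs even when `pop()` returns `Some`.
fn assign_eager(removed: &mut Vec<u32>, next: &AtomicUsize) -> u32 {
    removed.pop().unwrap_or(next.fetch_add(1, Ordering::Relaxed) as u32)
}

/// Fixed shape: the closure only runs when `pop()` returns `None`.
fn assign_lazy(removed: &mut Vec<u32>, next: &AtomicUsize) -> u32 {
    removed.pop().unwrap_or_else(|| next.fetch_add(1, Ordering::Relaxed) as u32)
}

/// Hypothetical helper that exercises both variants from the same state.
fn check_eager_vs_lazy() {
    // Eager: counter 2 is reused, yet the next counter is still bumped.
    let next = AtomicUsize::new(5);
    let mut removed = vec![2u32];
    assert_eq!(assign_eager(&mut removed, &next), 2);
    assert_eq!(next.load(Ordering::Relaxed), 6);

    // Lazy: counter 2 is reused and the next counter is left alone.
    let next = AtomicUsize::new(5);
    let mut removed = vec![2u32];
    assert_eq!(assign_lazy(&mut removed, &next), 2);
    assert_eq!(next.load(Ordering::Relaxed), 5);

    // Lazy with an empty pool: falls back to a fresh counter.
    assert_eq!(assign_lazy(&mut removed, &next), 5);
    assert_eq!(next.load(Ordering::Relaxed), 6);
}
```

Calling `check_eager_vs_lazy()` from a `main` or a unit test runs the assertions. Both variants return the same counter; they differ only in the side effect on the shared atomic.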
8913192 to 0c0cb6f
Grr, rustfmt was mad.
- **chan_info_node_counter = removed_node_counters
- .pop()
- .unwrap_or(self.next_node_counter.fetch_add(1, Ordering::Relaxed) as u32);
+ **chan_info_node_counter = removed_node_counters.pop().unwrap_or_else(|| {
Could we add a call to `test_node_counter_consistency` to the end of this method? From local testing it looks like this would've caught the bug. Not sure if other methods could use it as well.
We don't appear to have any test coverage that removes and then adds a channel, so even with a new check in this method none of our tests fail. Ultimately we end up panicking immediately when the next message comes in, though, so running a real node with debug assertions on hit this pretty quickly :)
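To illustrate why only a remove-then-add sequence exposes the problem, here is a toy model, not the real graph code. `ToyGraph`, its methods, and the invariant in `is_consistent` (live nodes plus pooled counters equals the next counter) are assumptions made for this sketch; the real `test_node_counter_consistency` may check something different.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the graph's node-counter bookkeeping.
struct ToyGraph {
    nodes: HashMap<u64, u32>,
    removed_node_counters: Vec<u32>,
    next_node_counter: u32,
}

impl ToyGraph {
    fn new() -> Self {
        ToyGraph { nodes: HashMap::new(), removed_node_counters: Vec::new(), next_node_counter: 0 }
    }

    /// Hands out a fresh counter and advances the next counter.
    fn bump(&mut self) -> u32 {
        let counter = self.next_node_counter;
        self.next_node_counter += 1;
        counter
    }

    /// `eager == true` mimics the buggy `unwrap_or` shape.
    fn add_node(&mut self, id: u64, eager: bool) {
        if self.nodes.contains_key(&id) {
            return;
        }
        let counter = if eager {
            let fresh = self.bump(); // always bumps, even if unused
            self.removed_node_counters.pop().unwrap_or(fresh)
        } else {
            match self.removed_node_counters.pop() {
                Some(reused) => reused,
                None => self.bump(),
            }
        };
        self.nodes.insert(id, counter);
    }

    fn remove_node(&mut self, id: u64) {
        if let Some(counter) = self.nodes.remove(&id) {
            self.removed_node_counters.push(counter);
        }
    }

    /// Assumed invariant: every counter handed out is either live or pooled.
    fn is_consistent(&self) -> bool {
        self.nodes.len() + self.removed_node_counters.len() == self.next_node_counter as usize
    }
}

/// Adds two nodes, removes one, then adds another; returns the invariant.
fn remove_then_add(eager: bool) -> bool {
    let mut graph = ToyGraph::new();
    graph.add_node(1, eager);
    graph.add_node(2, eager);
    graph.remove_node(1);
    graph.add_node(3, eager);
    graph.is_consistent()
}
```

In this model `remove_then_add(true)` returns `false` and `remove_then_add(false)` returns `true`, while a sequence that only adds nodes stays consistent either way, which matches the observation that add-only tests do not fail.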
FWIW, I get two test failures with this diff (i.e. reverting the fix and adding the check):
diff --git a/lightning/src/routing/gossip.rs b/lightning/src/routing/gossip.rs
index 9d85a6c58..f761e6a63 100644
--- a/lightning/src/routing/gossip.rs
+++ b/lightning/src/routing/gossip.rs
@@ -2077,9 +2077,9 @@ where
},
IndexedMapEntry::Vacant(node_entry) => {
let mut removed_node_counters = self.removed_node_counters.lock().unwrap();
- **chan_info_node_counter = removed_node_counters.pop().unwrap_or_else(|| {
+ **chan_info_node_counter = removed_node_counters.pop().unwrap_or(
self.next_node_counter.fetch_add(1, Ordering::Relaxed) as u32
- });
+ );
node_entry.insert(NodeInfo {
channels: vec![short_channel_id],
announcement_info: None,
@@ -2088,6 +2088,9 @@ where
},
};
}
+ core::mem::drop(channels);
+ core::mem::drop(nodes);
+ self.test_node_counter_consistency();
Ok(())
}
Lol, I'm an idiot, I ran the test without reverting the diff. I'll add this in a followup, anyway.
Left a suggestion but I'm happy to land this
This was included in the 0.0.125 release.