-
Notifications
You must be signed in to change notification settings - Fork 405
Add default implementation for tracking metadata #636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add default implementation for tracking metadata #636
Conversation
Adds NetworkTracker trait that allows users to implement interface between their metadata and their view of NetworkGraph. Users can implement their own metadata structs, but at a minimum will have to implement the UpdateMetadata trait (even if the functions do nothing). By default, this trait uses a new struct called DefaultMetadata which just tracks previously failed routes, and can find out if a route has previously failed.
Codecov Report
@@ Coverage Diff @@
## master #636 +/- ##
==========================================
- Coverage 91.26% 91.22% -0.04%
==========================================
Files 35 35
Lines 20918 20927 +9
==========================================
+ Hits 19090 19091 +1
- Misses 1828 1836 +8
Continue to review full report at Codecov.
|
Also add route_succeeded() function which removes the successful route off the failed_route Vec.
Track the number of successes a failed channel needs to get off the "failed" list. If it is on the list, it penalty fee is u64::MAX. If route_success is called and the channel ID is on the route, its score is decremented until it hits 0, at which point it is removed from the failed list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're doing a lot of back-and-forth on this scoring interface without explicitly stating its requirements:
- We do want an interface flexible to encompass a lot different cases (see Add ChannelScore struct (WIP) #626 (comment)). Ideally this different cases may be combined by client to avoid each of them to reinvent the wheel
- It's hard to predict whole flexibility desired right now without going a bit further, and actually implementing a least a default one
- Beyond, a channel scoring algorithm, we want to be sure failed routes aren't submitted again (Channel fee penalty (WIP) #635 (comment))
- Moving forward, how design here (Channel fee penalty (WIP) #635 (review)) or its refinement is incompatible with the "I literally just tried that route and it failed", you may just pass a high severity score
score_payment_success
?
Does anyone see issues not stated ?
@@ -428,6 +428,70 @@ impl Readable for NodeInfo { | |||
} | |||
} | |||
|
|||
/// Allows for updating user's network metadata. | |||
pub trait UpdateMetadata { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to me this trait is trying to encompass both inputs consuming to compute score (add_failed_route
) and scores consumption (get_channel_fee_penalty
). Second is likely to be consumed by Router and any-kind of peer selection, but first-one is really tied about the origin of data consumed.
If it's event-based, like receiving a PaymentSuccess or PaymentFailed, client is going to call a add_failed_route
method-like. If it's self-running, like a thread probing/tracking peer liveness,, you just call something like spawn_liveness_thread
. So I would favor dissociate those traits even if their are implemented on the same default struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can disassociate these traits, I think that makes sense. Maybe the NetworkGraph <---> User Metadata connection should be called UpdatesMetadata, and the User metadata <---> router should be called ScoresChannels. The user would implement both traits, and the network would, by default, send certain data to the traits, and the router would by, default, ask for channel scores when conducting its A* search.
} | ||
|
||
/// A default metadata object that is used to implement the default functionality for the | ||
/// NetworkTracker trait. A user could make their own Metadata object and extend the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a question of code sharing here, if people start to work on fancy scoring logic, they may want to combine them, and having each of them implement a custom UpdateMetadata they won't be able to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why? You could write a combiner, but ultimately we have to have a way to linearize any kind of fancy scoring logic users want into one parameter (namely fee paid) so that we can calculate a route based on it. No matter what users do, they have to be able to linearize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think we agree on the need to linearize it, my point was more on dissociating linearization method (get_channel_fee_penalty
) from score-feeding ones (route_succeeded
/route_failed
). Linearization one should be on such combiner. I don't think we have to do this now, but we may have to introduce such combiner in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool. Can you add a parameter to the get_route function to take this and use it?
I added the route_succeeded method to the RoutingFeePenalty trait and added a parameter to get_route that must implement the RoutingFeePenalty trait. I was also curious to hear what you think of the idea to separate the data feeding (eg methods used to update user metadata) and the data consumption methods (eg getting routing_fee from the user). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think API is good enough for now. Just few code improvements suggestions. Reusing tests from previous PRs could be great, but they can be added as a follow-up. It's new code there is no risk of regressions due to scoring methods not yet used.
Also, as a next step, we treat uniformly channels which have never being used and ones who have known successes, we should bookmark good ones to favor them with a decreasing fee penalty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good, I think. We should have ChannelManager call route_succeeded and route_failed, though.
fn route_failed(&mut self, route: Vec<RouteHop>, failed_hop: RouteHop) { | ||
for route_hop in route { | ||
if route_hop == failed_hop { | ||
*self.failed_channels.entry(failed_hop.short_channel_id).or_insert(5) += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH this feels a bit weird and arbitrary - you need 5 successes to un-fail a channel but once we have even 1 failure we refuse to route through a channel at all. I think maybe a more interesting approach would be to consider the time of the last failure and de-prioritize exponentially less as a failure ages, removing it entirely if we succeed. Because we cannot call current-time functions in the core library, we can't do that exactly, however something that may approximate would be to fail a channel until the next success - ie if we fail a channel, we wont use it the next time we route, but we'll use it any time after that. A user could, of course, do something better themselves by re-implementing the API. Separately, we could maybe add a feature flag for "we can call gettimeofday()".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree that it feels weird/arbitrary. I like the idea of having the exponential backoff too, but as mentioned we do not have access to time. For now, I am implementing "the opposite" where on success routing, all channels are removed from the failed_channels list, and on failure, channels have a "consecutive fails since last successful payment" count incremented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but because we fully-refuse to route through a channel after its failed once, we should never be able to send through it ever again if it fails once (because we'll never succeed routing through it if we never try...).
On success: all channels on the route are removed from the failed channels list. On failure: all channels are incremented a count tracking consecutive failed payments executated.
This needs rebase. |
Adds NetworkTracker trait that allows users to implement interface
between their metadata and their view of NetworkGraph. Users can
implement their own metadata structs, but at a minimum will have to
implement the UpdateMetadata trait (even if the functions do nothing).
By default, this trait uses a new struct called DefaultMetadata
which just tracks previously failed routes, and can find out if a
route has previously failed.