Read `ChannelManager` even if we have no-peer post-update actions #3790

TheBlueMatt · 2025-05-22T02:39:47Z

In 93b4479 we fixed an issue which could cause a ChannelMonitorUpdate to get marked as blocked on itself, leading to an eventual force-closure.

One potential side-effect of that issue, however, is that any further ChannelMonitorUpdates to the same channel while it is blocked will not have any post-update actions processed (as there is a pending blocked ChannelMonitorUpdate sitting in the channel).

This can leave a dangling MonitorUpdateCompletionAction sitting around even after the channel is closed.

In 0.1, because ChannelMonitorUpdates to closed channels were finally fully tracked, we started enforcing that any post-update completion action we had on startup corresponded to a peer entry, while at the same time no longer creating peer entries just because we had serialized one in the data we were loading (only creating them if we had channel(s) or a ChannelMonitor).

This can cause some ChannelManager to no longer deserialize on 0.1 as we might have a left-over dangling
MonitorUpdateCompletionAction and will no longer always have a peer entry just because of it.

Here we fix this issue by specifically checking for dangling MonitorUpdateCompletionAction::PaymentClaim entries and dropping them if there is no corresponding channel or peer state entry. We only check for PaymentClaimed actions rather than allowing for any dangling actions as 93b4479 was only triggerable with MPP claims, so dangling
MonitorUpdateCompletionActions for forwarded payments should be exceedingly rare.

This also adds an upgrade test to test a slightly convoluted version of this scenario.

ldk-reviews-bot · 2025-05-22T02:39:50Z

I've assigned @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

lightning/src/ln/channelmanager.rs

joostjager · 2025-05-22T13:24:47Z

lightning/src/ln/channelmanager.rs

+								let logger =
+									WithContext::from(&args.logger, Some(node_id), None, None);
+								log_error!(logger, "Got blocked actions {:?} without a per-peer-state for {}", actions, node_id);
+								return Err(DecodeError::InvalidValue);


This is all new code that adds a new error condition.

Here we fix this issue by specifically checking for dangling
MonitorUpdateCompletionAction::PaymentClaim entries and dropping
them if there is no corresponding channel or peer state entry.

How are they dropped now, compared to how it happens in main?

On main in the no-peer-state case we always return an error, now we sometimes do not and ignore the completion actions. For the have-peer-state case, we used to store the completion actions even if we have no channel or monitor the completion actions are based on, but now we do not.

ldk-reviews-bot · 2025-05-22T13:24:53Z

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

joostjager · 2025-05-22T13:29:02Z

lightning/src/ln/channelmanager.rs

-				log_error!(WithContext::from(&args.logger, Some(node_id), None, None), "Got blocked actions without a per-peer-state for {}", node_id);
-				return Err(DecodeError::InvalidValue);
+				for actions in monitor_update_blocked_actions.values() {
+					for action in actions.iter() {


Repeated code block

Yea, I wasn't quite sure it was worth DRYing it up any given its so short (and the comments are slightly different)

TheBlueMatt · 2025-05-22T14:36:26Z

Oops, the first set of checks were too eager, luckily test_reload_mon_update_completion_actions caught it -

$ git diff-tree -U1 710560554 bb2e29e97
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index 84251ca33b..54e453d56b 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -14179,6 +14179,14 @@ where
 				let mut max_in_flight_update_id = 0;
+				let starting_len =  $chan_in_flight_upds.len();
 				$chan_in_flight_upds.retain(|upd| upd.update_id > $monitor.get_latest_update_id());
+				if $chan_in_flight_upds.len() < starting_len {
+					log_debug!(
+						$logger,
+						"{} ChannelMonitorUpdates completed after ChannelManager was last serialized",
+						starting_len - $chan_in_flight_upds.len()
+					);
+				}
 				let funding_txo = $monitor.get_funding_txo();
 				for update in $chan_in_flight_upds.iter() {
-					log_trace!($logger, "Replaying ChannelMonitorUpdate {} for {}channel {}",
+					log_debug!($logger, "Replaying ChannelMonitorUpdate {} for {}channel {}",
 						update.update_id, $channel_info_log, &$monitor.channel_id());
@@ -14746,24 +14754,7 @@ where
 					}
-					let peer_state = peer_state.lock().unwrap();
-					let channel_id_matches = |updates: &(_, Vec<ChannelMonitorUpdate>)| {
-						updates.1.iter().any(|update| update.channel_id == Some(*channel_id))
-					};
-					if !peer_state.in_flight_monitor_updates.values().any(channel_id_matches) {
-						// If there are no pending `ChannelMonitorUpdate`s for this channel but we
-						// have pending post-update actions, its possible that one was left over
-						// from pre-0.1 payment claims where MPP claims led to a channel blocked on
-						// itself and later `ChannelMonitorUpdate`s didn't get their post-update
-						// actions run.
-						// This should only have happened for `PaymentClaimed` post-update actions,
-						// which we check here.
-						for action in actions.iter() {
-							if let MonitorUpdateCompletionAction::PaymentClaimed { .. } = action {
-							} else {
-								let logger =
-									WithContext::from(&args.logger, Some(node_id), None, None);
-								log_error!(logger, "Got blocked actions {:?} without a per-peer-state for {}", actions, node_id);
-								return Err(DecodeError::InvalidValue);
-							}
-						}
-					}
+					// Note that we may have a post-update action for a channel that has no pending
+					// `ChannelMonitorUpdate`s, but unlike the no-peer-state case, it may simply be
+					// because we had a `ChannelMonitorUpdate` complete after the last time this
+					// `ChannelManager` was serialized. In that case, we'll run the post-update
+					// actions as soon as we get going.
 				}

codecov · 2025-05-22T14:44:34Z

Codecov Report

Attention: Patch coverage is 0% with 14 lines in your changes missing coverage. Please review.

Project coverage is 89.78%. Comparing base (02b5564) to head (bb2e29e).

Files with missing lines	Patch %	Lines
lightning/src/ln/channelmanager.rs	0.00%	14 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3790      +/-   ##
==========================================
- Coverage   89.80%   89.78%   -0.03%     
==========================================
  Files         159      159              
  Lines      128691   128703      +12     
  Branches   128691   128703      +12     
==========================================
- Hits       115573   115552      -21     
- Misses      10446    10472      +26     
- Partials     2672     2679       +7

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

lightning/src/ln/channelmanager.rs

ldk-reviews-bot · 2025-05-22T15:29:57Z

✅ Added second reviewer: @valentinewallace

TheBlueMatt · 2025-05-22T16:04:20Z

rustfmt'd (and used match instead of if let)

In 93b4479 we fixed an issue which could cause a `ChannelMonitorUpdate` to get marked as blocked on itself, leading to an eventual force-closure. One potential side-effect of that issue, however, is that any further `ChannelMonitorUpdate`s to the same channel while it is blocked will not have any post-update actions processed (as there is a pending blocked `ChannelMonitorUpdate` sitting in the channel). This can leave a dangling `MonitorUpdateCompletionAction` sitting around even after the channel is closed. In 0.1, because `ChannelMonitorUpdate`s to closed channels were finally fully tracked, we started enforcing that any post-update completion action we had on startup corresponded to a peer entry, while at the same time no longer creating peer entries just because we had serialized one in the data we were loading (only creating them if we had channel(s) or a `ChannelMonitor`). This can cause some `ChannelManager` to no longer deserialize on 0.1 as we might have a left-over dangling `MonitorUpdateCompletionAction` and will no longer always have a peer entry just because of it. Here we fix this issue by specifically checking for dangling `MonitorUpdateCompletionAction::PaymentClaim` entries and dropping them if there is no corresponding channel or peer state entry. We only check for `PaymentClaimed` actions rather than allowing for any dangling actions as 93b4479 was only triggerable with MPP claims, so dangling `MonitorUpdateCompletionAction`s for forwarded payments should be exceedingly rare. This also adds an upgrade test to test a slightly convoluted version of this scenario.

joostjager · 2025-05-22T16:14:06Z

lightning/src/ln/channelmanager.rs

-				return Err(DecodeError::InvalidValue);
+				for actions in monitor_update_blocked_actions.values() {
+					for action in actions.iter() {
+						if matches!(action, MonitorUpdateCompletionAction::PaymentClaimed { .. }) {


Hm I meant !matches! to avoid the else.

Yea, I liked the comment in the block that's ignoring the PaymentClaimeds. I mean I can move it outside if you prefer but I often like comments in empty if blocks 🤷‍♂️

valentinewallace

Code LGTM

valentinewallace · 2025-05-22T22:34:14Z

lightning-tests/src/upgrade_downgrade_tests.rs

+
+	// Finally, reload the node in the latest LDK. This previously failed.
+	let config = test_default_channel_config();
+	reload_node!(nodes[3], config, &node_d_ser, &[&mon_ser], persister, chain_mon, node);


Test LGTM, just took a slightly closer look. It also fails as expected on main.

TheBlueMatt · 2025-05-23T14:07:34Z

Backported in #3794

TheBlueMatt added the backport 0.1 label May 22, 2025

ldk-reviews-bot requested a review from joostjager May 22, 2025 02:50

joostjager reviewed May 22, 2025

View reviewed changes

TheBlueMatt force-pushed the 2025-05-0.1-upgrade-fail branch from 7105605 to bb2e29e Compare May 22, 2025 14:35

joostjager reviewed May 22, 2025

View reviewed changes

lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved

ldk-reviews-bot requested a review from valentinewallace May 22, 2025 15:29

TheBlueMatt force-pushed the 2025-05-0.1-upgrade-fail branch from bb2e29e to 5ca1019 Compare May 22, 2025 16:04

TheBlueMatt force-pushed the 2025-05-0.1-upgrade-fail branch from 5ca1019 to 70b5552 Compare May 22, 2025 16:05

joostjager reviewed May 22, 2025

View reviewed changes

valentinewallace approved these changes May 22, 2025

View reviewed changes

valentinewallace reviewed May 22, 2025

View reviewed changes

wpaulino approved these changes May 22, 2025

View reviewed changes

TheBlueMatt merged commit 63a5e03 into lightningdevkit:main May 23, 2025
28 checks passed

TheBlueMatt mentioned this pull request May 23, 2025

[0.1] Backports for 0.1.4 #3794

Merged

TheBlueMatt removed the backport 0.1 label May 23, 2025

Read ChannelManager even if we have no-peer post-update actions #3790

Read ChannelManager even if we have no-peer post-update actions #3790

Uh oh!

Conversation

TheBlueMatt commented May 22, 2025

Uh oh!

ldk-reviews-bot commented May 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ldk-reviews-bot commented May 22, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt commented May 22, 2025

Uh oh!

codecov bot commented May 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

ldk-reviews-bot commented May 22, 2025

Uh oh!

TheBlueMatt commented May 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

valentinewallace left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

TheBlueMatt commented May 23, 2025

Uh oh!

Uh oh!

Read `ChannelManager` even if we have no-peer post-update actions #3790

Read `ChannelManager` even if we have no-peer post-update actions #3790

ldk-reviews-bot commented May 22, 2025 •

edited

Loading

codecov bot commented May 22, 2025 •

edited

Loading

TheBlueMatt commented May 22, 2025 •

edited

Loading