Skip to content

Drop debug_assert that would have us panic for replayed PaymentClaimeds #511

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

tnull
Copy link
Collaborator

@tnull tnull commented Mar 27, 2025

Fixes #506

In cd8958f we changed the internal API behavior of PaymentStore::update. While previously it would return Ok(true) the to-be-updated entry was found in the store, it now returns Ok(true) if not only the entry was found but it was actually updated and re-persisted. This was an improvement as it allows us to avoid unnecessary re-persists if nothing changed.

However, there were 1-2 places that implicitly relied on that behavior for logging purposes which we didn't correctly update to the new behavior. Unfortunately, one instance in handling PaymentClaimed events actually even debug_asserted on the return value, which lead to some unnecessary panics in debug in case PaymentClaimed got replayed. Here we rectify this by dropping the debug_assert.

return Err(Error::InvalidPaymentHash);
match self.payment_store.update(&update) {
Ok(true) => (),
Ok(false) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly confused: how did we determine that it wasn't a hash for which fail_for_hash has already been called hence nothing changed?
iiuc, we will get Ok(false) in that case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's indeed the case, in which case the logging will be slightly off. I considered possible mitigations to this, but all I could think of were rather invasive just to fix this small edge case. Let me know if you think it's very important.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at the payment_store.update function, and I think it would make sense to return an error with a message like "payment hash not found" in an else block when the payment is not found in the store. This way, returning false would only indicate that no new update was made that is the payment exists but no changes were applied, while a missing payment would result in a clear error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just a logging edge case, or also that the user gets an InvalidPaymentHash error when calling this function twice indeed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just a logging edge case, or also that the user gets an InvalidPaymentHash error when calling this function twice indeed?

I think this is correct: previously, calling it twice would succeed and have us call fail_htlc_backwards twice if the payment was known at all. Now, we'd only have it succeed once, and would error out on the second attempt and/or if the payment had previously failed already. Added some clarifications to the docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is optimal API design. I've seen that callers oftentimes also need to implement idempotency, and that is difficult now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, now added a ternary enum that represents all update results.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(now also fixed CI)

@tnull tnull removed the request for review from joostjager March 28, 2025 08:57
@tnull tnull added this to the 0.5 milestone Mar 28, 2025
@tnull tnull requested a review from G8XSU March 31, 2025 13:26
Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT, but not sure how the update info need to be return Ok(false) and not just a specific error?

Looks like to me that this will make the state machine a little bit more complex?

@tnull tnull requested a review from joostjager April 17, 2025 08:18
@tnull tnull force-pushed the 2025-03-fix-paymentdetails-panic-debug-assert branch from def24fd to 6af64b6 Compare April 17, 2025 08:19
src/event.rs Outdated
);
debug_assert!(false);
},
Ok(_) => (),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps document why we can ignore this result and mention the replay possbility?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if you prefer, although the comment is a bit redundant tbh. Added a fixup.

Copy link
Contributor

@joostjager joostjager Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually meant to suggest to explain why this function is called multiple times and that that is the reason for ignoring the result value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason for ignoring it is that not doing so is a bug stemming from the slight shift in our internal API semantics. Seems weird to document something that changed by now?

Copy link
Contributor

@joostjager joostjager Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone might still wonder why the return value is ignored, and to understand this they need to know that the method can be called multiple times. Perhaps it can work too as a comment on the method itself.

It is a question that I had while reviewing, so others might too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, now added a fixup that expands the comment and also discerns between the new PaymentStoreUpdateResult variants.

return Err(Error::InvalidPaymentHash);
match self.payment_store.update(&update) {
Ok(true) => (),
Ok(false) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just a logging edge case, or also that the user gets an InvalidPaymentHash error when calling this function twice indeed?

@tnull
Copy link
Collaborator Author

tnull commented Apr 17, 2025

CI failure is unrelated and should be fixed by #524

@tnull tnull requested a review from joostjager April 18, 2025 09:20
@tnull tnull force-pushed the 2025-03-fix-paymentdetails-panic-debug-assert branch from ebccab7 to 3711ed1 Compare April 18, 2025 09:25
@tnull tnull force-pushed the 2025-03-fix-paymentdetails-panic-debug-assert branch from 58ba5bf to 8195d2e Compare April 18, 2025 09:37
…imed`s

In cd8958f we changed the internal API behavior of
`PaymentStore::update`. While previously it would return `Ok(true)` the
to-be-updated entry was found in the store, it now returns `Ok(true)` if
not only the entry was found but it was actually updated and
re-persisted. This was an improvement as it allows us to avoid
unnecessary re-persists if nothing changed.

However, there were 1-2 places that implicitly relied on that behavior
for logging purposes which we didn't correctly update to the new
behavior. Unfortunately, one instance in handling `PaymentClaimed`
events actually even `debug_assert`ed on the return value, which lead to
some unnecessary panics in `debug` in case `PaymentClaimed` got
replayed. Here we rectify this by dropping the `debug_assert`.
@tnull tnull force-pushed the 2025-03-fix-paymentdetails-panic-debug-assert branch from 8195d2e to 4c2e5f3 Compare April 18, 2025 09:40
@tnull
Copy link
Collaborator Author

tnull commented Apr 18, 2025

Squashed fixups without further changes.

@tnull tnull merged commit 273106d into lightningdevkit:main Apr 18, 2025
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Debug panic at event.rs:896
5 participants