Improve privacy for Blinded Message Paths using Dummy Hops #3726
match next_hop {
	NextMessageHop::NodeId(ref id) if id == &our_node_id => {
		peel_onion_message(&onion_message, secp_ctx, node_signer, logger, custom_handler)
Is there any risk somehow of infinite recursion?
Thanks so much for pointing this out!
I took some time to sit with it, and here's what I gathered:
Since blinded paths have a fixed upper limit on their size (the number of `hop_payloads` is bounded), there can only be a finite number of `ForwardTlvs`. And at each step, we're peeling off a layer and moving forward, not looping back, so the processing always progresses toward completion.
Given that, I believe there's no risk of infinite recursion in this setup.
That said, if there's a subtle case I've missed or something you see differently, I'd really appreciate hearing your thoughts. Thanks again!
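The termination argument above can be sketched as follows. This is a simplified model and not LDK's actual `peel_onion_message`: the `Layer` and `Peeled` enums, `MAX_HOPS`, and `peel_all` are hypothetical names, and the bound of 20 is an illustrative value only. Each step consumes one layer of a finite packet, so the loop runs at most once per layer.

```rust
/// Hypothetical stand-in for one decrypted onion layer.
#[derive(Debug, Clone, PartialEq)]
enum Layer {
    /// A forward addressed to ourselves (a dummy hop): keep peeling.
    ForwardToSelf,
    /// A forward to some other peer: hand the rest of the packet on.
    ForwardToPeer(u8),
    /// The final payload, addressed to us.
    Receive(String),
}

#[derive(Debug, PartialEq)]
enum Peeled {
    Forward { peer: u8, remaining_layers: usize },
    Receive { payload: String, layers_peeled: usize },
    Malformed,
}

/// Assumed upper bound on layers per packet (illustrative value only).
const MAX_HOPS: usize = 20;

/// Peels layers front to back. Every iteration consumes exactly one layer,
/// so the work is bounded by `packet.len()`, which is itself capped.
fn peel_all(packet: &[Layer]) -> Peeled {
    if packet.len() > MAX_HOPS {
        return Peeled::Malformed;
    }
    for (idx, layer) in packet.iter().enumerate() {
        match layer {
            // Dummy hop: nothing to deliver yet, move on to the next layer.
            Layer::ForwardToSelf => continue,
            Layer::ForwardToPeer(peer) => {
                return Peeled::Forward {
                    peer: *peer,
                    remaining_layers: packet.len() - idx - 1,
                };
            }
            Layer::Receive(payload) => {
                return Peeled::Receive {
                    payload: payload.clone(),
                    layers_peeled: idx + 1,
                };
            }
        }
    }
    // Ran out of layers without reaching a receive payload.
    Peeled::Malformed
}
```

Writing the peel as a loop over a bounded slice, rather than as a recursive call, makes the bound visible in the code itself, which is roughly what a code comment on the recursive version would need to explain.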
I hadn't spotted a specific problem; I was just curious how certain it is that infinite recursion cannot happen. If the explanation isn't straightforward, I'd add it as a comment to the code and refer to the point where the limitation is enforced, something with `new_packet_bytes` I think?
It can't be infinite, but with an extra-long onion message ISTM it could still be a lot of processing time. If someone created a blinded path terminating at us with a bunch of dummy hops and we only find out that it's an inauthentic path when we get to the final layer, seems like that might be an issue.
@shaavan would it be possible to explore adding a new `ControlTlvs::Dummy` variant that contains an HMAC and nonce, similar to how we authenticate blinded receive payloads elsewhere? And authenticate those fields before peeling the onion further?
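A minimal sketch of the "authenticate before peeling further" control flow suggested above. Everything here is an assumption made for illustration: `DummyHop`, `Hop`, `Outcome`, `toy_tag`, and `process` are hypothetical names, and the tag is computed with the standard library's `DefaultHasher`, which is NOT a cryptographic MAC. It stands in for a real HMAC only so the example has no external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Toy keyed tag over (key, nonce, data). NOT cryptographically secure:
/// it only stands in for an HMAC so the sketch needs no external crates.
fn toy_tag(key: &[u8; 32], nonce: &[u8; 16], data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(key);
    hasher.write(nonce);
    hasher.write(data);
    hasher.finish()
}

/// Hypothetical dummy-hop payload carrying its own authentication fields.
struct DummyHop {
    nonce: [u8; 16],
    tag: u64,
    data: Vec<u8>,
}

/// Hypothetical decrypted layers of a path that terminates at us.
enum Hop {
    Dummy(DummyHop),
    Receive(Vec<u8>),
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Received { payload: Vec<u8>, layers_processed: usize },
    Rejected { layers_processed: usize },
}

fn process(key: &[u8; 32], hops: &[Hop]) -> Outcome {
    for (idx, hop) in hops.iter().enumerate() {
        match hop {
            Hop::Dummy(dummy) => {
                // Authenticate this hop before touching any later layer.
                // (Real code should compare tags in constant time.)
                if toy_tag(key, &dummy.nonce, &dummy.data) != dummy.tag {
                    return Outcome::Rejected { layers_processed: idx + 1 };
                }
            }
            Hop::Receive(payload) => {
                return Outcome::Received {
                    payload: payload.clone(),
                    layers_processed: idx + 1,
                };
            }
        }
    }
    // Ran out of layers without reaching a receive payload.
    Outcome::Rejected { layers_processed: hops.len() }
}
```

With this shape, a path whose first dummy hop carries a bad tag is rejected after one layer, however many bogus layers follow it.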
Sure! Let me give it a shot! 🚀
Hello! I've updated the approach in pr3728.04 to introduce `ControlTlvs::Dummy` with authentication.
Thanks so much, @valentinewallace, for all your guidance offline!
) -> Result<Vec<BlindedHop>, secp256k1::Error> {
	let pks = intermediate_nodes
		.iter()
		.map(|node| node.node_id)
		.chain((0..dummy_hops_count).map(|_| recipient_node_id))
I looked up the BOLT spec and read:
"MAY add additional "dummy" hops at the end of the path (which it will ignore on receipt) to obscure the path length."
What does "ignore" mean exactly? It seems from the next commit that it means to keep peeling? Using the recipient node id for all the dummy hops isn't really described in the BOLT, I think. I may be mistaken.
Padding is also mentioned:
"The padding field can be used to ensure that all encrypted_recipient_data have the same length. It's particularly useful when adding dummy hops at the end of a blinded route, to prevent the sender from figuring out which node is the final recipient"
Not sure if that is done now too?
> What does "ignore" mean exactly? It seems in the next commit that it means to keep peeling? Using the recipient node id for all the dummy hops isn't really described in the bolt I think. Maybe mistaking.
The thinking behind this approach was that if dummy hops were added after the `ReceiveTlvs`, it could open up timing-based attacks, where an attacker might estimate the position of the actual recipient based on how quickly a response is returned.
To avoid that, I added the dummy hops just before the final node. This way, even after receiving a dummy hop (with `ForwardTlvs` directed to self), the node still has to keep peeling until it reaches the actual `ReceiveTlvs`. This helps make response timing more uniform and avoids leaking information about path length.
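The hop ordering described above (real intermediate hops, then dummy hops addressed to the recipient, then the final receive hop) can be sketched like this. The sketch mirrors the `chain` call in the diff under review, but `NodeId`, `HopKind`, and `hop_layout` are hypothetical names, and a plain integer stands in for a public key.

```rust
/// Hypothetical stand-in for a node's public key.
type NodeId = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
enum HopKind {
    /// A real forward through an intermediate node.
    Forward,
    /// A forward addressed to the recipient itself, carrying no routing purpose.
    Dummy,
    /// The final hop holding the receive payload.
    Receive,
}

/// Lays out the hops of a blinded path: intermediate nodes first, then
/// `dummy_hops_count` dummy hops keyed to the recipient, then the receive hop.
fn hop_layout(
    intermediate: &[NodeId],
    recipient: NodeId,
    dummy_hops_count: usize,
) -> Vec<(NodeId, HopKind)> {
    intermediate
        .iter()
        .map(|id| (*id, HopKind::Forward))
        .chain((0..dummy_hops_count).map(|_| (recipient, HopKind::Dummy)))
        .chain(std::iter::once((recipient, HopKind::Receive)))
        .collect()
}
```

For example, `hop_layout(&[1, 2], 9, 2)` yields five hops, the last three of which all belong to node 9, so the recipient has to peel through both dummy layers before it reaches the receive payload.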
> Also a mention of padding is made
Yes! In PR #3177, we added support for padding in both `BlindedMessagePaths` and `BlindedPaymentPaths`, ensuring all payloads are a multiple of `PADDING_ROUND_OFF`.
Since the `MESSAGE_PADDING_ROUND_OFF` buffer is large enough, every payload, whether it's a `ForwardTlvs`, dummy hop, or `ReceiveTlvs`, ends up with the same total length. This helps prevent the sender from inferring the number of hops based on packet size.
I've also updated the padding tests to use `new_with_dummy_hops`, so we make sure even dummy hops are padded the same way as real ones.
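A rough illustration of the round-off idea, assuming zero-byte padding appended to the serialized payload. `ROUND_OFF`, `padded_len`, and `pad_payload` are hypothetical names, and the value 100 is picked for the example only, since the thread does not state the actual `MESSAGE_PADDING_ROUND_OFF`. The sketch also ignores the TLV type and length overhead that a real padding field would carry.

```rust
/// Illustrative round-off only; the real `MESSAGE_PADDING_ROUND_OFF` value
/// is not stated in this thread.
const ROUND_OFF: usize = 100;

/// Smallest multiple of `round_off` that is >= `serialized_len`.
fn padded_len(serialized_len: usize, round_off: usize) -> usize {
    assert!(round_off > 0, "round_off must be non-zero");
    ((serialized_len + round_off - 1) / round_off) * round_off
}

/// Appends zero bytes until the payload length is a multiple of `round_off`.
fn pad_payload(mut payload: Vec<u8>, round_off: usize) -> Vec<u8> {
    let target = padded_len(payload.len(), round_off);
    payload.resize(target, 0);
    payload
}
```

As long as every unpadded payload is non-empty and no longer than the round-off, forward, dummy, and receive payloads all pad out to the same length, which is the property the padding tests are meant to check.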
Thanks so much again for the super helpful feedback!
> To avoid that, I added the dummy hops just before the final node.
If this is strictly better, it could be worth a PR to the bolt spec? At the minimum it might get you some feedback on this line of thinking.
I am not sure if the timing attack is avoided though, and worth the extra complexity. Peeling seems to be so much faster than an actual hop with network latency etc. Some random delay might be more effective?
The code also still works for blinded paths where dummy hops are added after the ReceiveTlvs right? Just making sure.
> The code also still works for blinded paths where dummy hops are added after the ReceiveTlvs right? Just making sure.
There shouldn't be a need to support that because we only support receiving to blinded paths that we create.
The timing attack does seem like a potential issue though. Not sure how to address that without adding some kind of `ProcessPendingHtlcsForwardable` event for onion messages, which seems like overkill. I think we can maybe document it on the issue and push to follow-up? @TheBlueMatt do you have any thoughts on how to simulate a fake onion message forward when processing dummy hops?
Updated from pr3728.01 to pr3728.02 (diff): Changes:
Force-pushed from 80a4810 to 14e1e0d
Updated from pr3728.02 to pr3728.03 (diff): Changes:
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##             main    #3726     +/-   ##
=========================================
+ Coverage   89.35%   89.39%   +0.03%
=========================================
  Files         157      157
  Lines      124079   124350    +271
  Branches   124079   124350    +271
=========================================
+ Hits       110876   111159    +283
+ Misses      10485    10479      -6
+ Partials     2718     2712      -6
Just re-request review when this PR is ready again
Updated from pr3728.03 to pr3728.04 (diff):
This prepares the trait for use in dummy hop verification and Offer messages. Renaming helps generalize its purpose ahead of upcoming changes.
Force-pushed from e3a5820 to c7f158c
Updated from pr3728.04 to pr3728.05 (diff): Changes:
Left a few loose comments.
The main thing I'm trying to understand is how the HMAC authentication of the dummy hops helps with a timing attack. Isn't this accomplishing the exact opposite? Peeling is stopped early when an HMAC doesn't check out, but I thought you wanted to always process all the hops to make it look real?
I'd also add as much explanation and rationale in code comments as you can.
lightning/src/offers/signer.rs (outdated)
	Hmac::from_engine(hmac)
}

pub(crate) fn verify_dummy_tlvs(
Shouldn't this be inside of `impl Verification`?
I introduced it within the `signer` module to keep all the actual signing and verifying logic centralized in `signer.rs`, while still invoking it from within the `Verification` impl.
This follows the structure we've used for other structs, where the `Verification` impl acts as a bridge and the core logic stays with the signer.
Let me know what you think. Happy to tweak it if you feel it fits better within `Verification`!
Didn't know that. If that's the way it is done, it's fine with me.
	Payload::Dummy(DummyControlTlvs::Unblinded(DummyTlvs { dummy_tlvs, authentication })),
	Some((next_hop_hmac, new_packet_bytes)),
)) => {
	let expanded_key = node_signer.get_inbound_payment_key();
Just a side question. I am not familiar enough with onion messages I think. But why is a payment key retrieved here for an onion message which is not a payment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! You're right, this isn't a payment per se.
We're using the payment key here primarily as a readily available signing key, not to imply a payment is being made.
A few reasons for using it:
- This key is already available in the signer and commonly used in the codebase for message-level authentication (e.g., verifying `invoice_request`s).
- It offers a consistent and secure way to generate and verify authentication data, without introducing a separate signing key just for dummy TLVs.
- We're not treating it as a "payment key" functionally here; it's just being used to sign and verify the TLV structure.
- That said, the naming could be misleading. Happy to explore renaming it or clarifying usage in the comments if needed.
Let me know what you think!
Anything that makes this more clear is helpful I think. Perhaps at least update the `get_inbound_payment_key` docs.
Adds new `Dummy` variant to `ControlTlvs`, allowing insertion of arbitrary dummy hops before the final `ReceiveTlvs`. This increases the length of the blinded path, making it harder for a malicious actor to infer the position of the true final hop.
Adds a new constructor for blinded paths that allows specifying the number of dummy hops. This enables users to insert arbitrary hops before the real destination, enhancing privacy by making it harder to infer the sender–receiver distance or identify the final destination. Lays the groundwork for future use of dummy hops in blinded path construction.
Co-authored-by: valentinewallace <[email protected]>
Applies dummy hops by default when constructing blinded paths via `DefaultMessageRouter`, enhancing privacy by obscuring the true path length. Uses a predefined `DUMMY_HOPS_COUNT` to apply dummy hops consistently without requiring explicit user input.
Introduces a test to verify correct handling of dummy hops in constructed blinded paths. Ensures that the added dummy hops are properly included and do not interfere with the real path.
Co-authored-by: valentinewallace <[email protected]>
Updated from pr3728.05 to pr3728.06 (diff): Changes:
Yeah, you're absolutely right: authenticating dummy hops does reveal how many hops are real based on timing, since we stop peeling once the HMAC fails. That can open up timing-based attacks.
That said, I believe @valentinewallace was pointing to a different risk: if the dummy TLVs are left unauthenticated, an attacker could craft a fake blinded path with a large number of bogus hops and arbitrary
So we're kind of stuck between two concerns:
Open to any suggestions you might have on how to walk this line. Happy to explore if there's a middle ground that avoids both leakage and performance abuse.
@joostjager not sure how authenticating dummy hops creates a timing attack? We'll fail on the first hop if an attacker creates a bogus path that fails authentication, which seems correct because in this case the attacker already knows who we are, otherwise they wouldn't have been able to create the path 🤔
There may still be a timing attack here in the sense that dummy hops take a different amount of processing time to "regular" forward hops, though. IMO it's okay to punt on that though.
Okay, indeed, I see that an attacker crafting dummy hops with a bad authentication code can't learn anything from that. It will fail immediately, and they already knew it would. Do you also see it this way @shaavan, or is there still something there?
Regarding the DoS-style attack, it isn't immediately clear to me that the hop processing time is significant. But I guess it is not worth spending the cycles if not necessary.
Definitely appreciate the docs that you added. In my opinion, we should do that a lot in pull requests. Explain it extensively when everything is still fresh.
Resolves #3252
This PR improves privacy in blinded path construction by introducing support for dummy hops.
While blinded paths obscure node identities, they still might reveal the number of hops—potentially leaking information about the distance between sender and receiver.
To mitigate this, we now prepend dummy hops for `BlindedMessagePath`s that serve no routing purpose but act as decoys. This makes it significantly harder to estimate the true position of the destination node based on path length.