[lightning-net-tokio] Fix race-y unwrap fetching peer socket address #1449
Conversation
I recently saw the following panic on one of my test nodes:

```
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 107, kind: NotConnected, message: "Transport endpoint is not connected" }', rust-lightning/lightning-net-tokio/src/lib.rs:250:38
```

Presumably what happened is that the connection was somehow closed between us accepting it and us starting to process it. While this is a somewhat surprising race, it's clearly reachable. The fix proposed here is quite trivial: simply don't `unwrap` when fetching our peer's socket address; instead, treat the peer address as `None` and discover the disconnection later when we go to read.
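As a minimal sketch of the approach described above (the helper name is hypothetical and this is not the actual lightning-net-tokio code), the idea is to fall back to `None` instead of panicking when `peer_addr()` fails:

```rust
use std::net::SocketAddr;
use tokio::net::TcpStream;

// Hypothetical helper illustrating the fix: if the peer already disconnected
// between accept() and setup, peer_addr() returns an Err. Rather than calling
// unwrap() (and panicking), map the error to None and let the later read
// discover the disconnection.
fn peer_addr_or_none(stream: &TcpStream) -> Option<SocketAddr> {
    stream.peer_addr().ok()
}
```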
Codecov Report
```
@@            Coverage Diff             @@
##             main    #1449      +/-   ##
==========================================
+ Coverage   90.88%   91.63%    +0.74%
==========================================
  Files          75       75
  Lines       41474    46120     +4646
  Branches    41474    46120     +4646
==========================================
+ Hits        37695    42260     +4565
- Misses       3779     3860       +81
```
Continue to review full report at Codecov.
Yeah, good refactoring! `unwrap()` can be messy when it comes to catching crashes.
LGTM
I don't imagine there's any reasonable way to add a unit test simulating this exact scenario, considering you'd need an actual TCP connection?
9490393
Hmmm, yea, I tried and wrote a test, but it didn't reproduce the issue. I still committed it because maybe there's a world where it does manage to reproduce the issue on some hosts, dunno.
Two minor comments.
464175e
ACK 464175e
Sadly this does not reproduce the issue fixed in the previous commit.
373dfcc to 3f22d81

Squashed fixup commits.
```rust
// This attempts to find other similar races by opening connections and shutting them down
// while connecting. Sadly in testing this did *not* reproduce the previous issue.
```
Where does the "and shutting them down while connecting" happen?
We drop the other `TcpStream`, which should close it.
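For illustration, here is a minimal sketch (not the test added in this PR) of how dropping one side of a freshly accepted connection can make `peer_addr()` fail on the other side, which is the situation the fix handles without panicking:

```rust
// Hypothetical test sketch, assuming tokio with the "net" and "macros" features.
#[tokio::test]
async fn peer_addr_after_client_drop_does_not_panic() {
    let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await.unwrap();
    let addr = listener.local_addr().unwrap();

    let client = tokio::net::TcpStream::connect(addr).await.unwrap();
    let (accepted, _) = listener.accept().await.unwrap();

    // Dropping the client TcpStream closes it; this is the "shutting them
    // down while connecting" part discussed above.
    drop(client);

    // With the fix, a failing peer_addr() is mapped to None rather than
    // being unwrap()ed, so nothing here can panic.
    let _maybe_addr: Option<std::net::SocketAddr> = accepted.peer_addr().ok();
}
```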