net/transport: fix yamux dial fd-reuse race (H1) and outgoing sub-stream leak (H2) #707
Merged
H1: yamux Dial installed a net.Dialer.Control that, on ctx cancellation,
called Shutdown+Close on the raw socket fd. DialContext(ctx) already aborts
the connect and closes its own fd, so the two raced to close the same fd
number (classic fd-reuse hazard). Drop the Control/abortOnCancel machinery
and the abort_{unix,windows,other}.go files; rely on net.Dialer.Timeout +
DialContext(ctx), which handle timeout and cancellation safely.
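The approach described for H1 can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper names dialWithContext and cancelledDialFails, the address, and the timeout value are assumptions made for the example. The point is that no Control hook is installed, so the only code that ever closes the socket is the net package itself.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// dialWithContext is a hypothetical helper. It installs no Control hook:
// net.Dialer.Timeout bounds the connect, and DialContext aborts the
// connect and releases its own fd when ctx is cancelled.
func dialWithContext(ctx context.Context, addr string, timeout time.Duration) (net.Conn, error) {
	d := net.Dialer{Timeout: timeout}
	return d.DialContext(ctx, "tcp", addr)
}

// cancelledDialFails dials with an already-cancelled context and reports
// whether the dial returned an error. The address is arbitrary (assumed
// for the example); the dial is rejected before any connect is attempted.
func cancelledDialFails() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	conn, err := dialWithContext(ctx, "127.0.0.1:1", time.Second)
	if conn != nil {
		conn.Close()
	}
	return err != nil
}

func main() {
	fmt.Println("cancelled dial failed:", cancelledDialFails())
}
```

Because the caller never touches the raw descriptor, there is no second close that could hit an fd number the runtime has already handed to another connection.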
H2: peer.openDrpcConn opened a sub-stream then ran OutgoingProtoHandshake,
returning the error without closing the stream. The handshake does not close
the conn on three protocol-level error paths (incompatible/declined/
unexpected proto), so a version-skewed peer leaked streams unboundedly. Close
the sub-stream on any handshake error, matching the incoming serve() path.
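The H2 fix pattern can be sketched as follows. This is an illustration under stated assumptions, not the real peer.openDrpcConn: openStream, trackedConn, errIncompatibleProto, and the open/handshake callbacks are hypothetical stand-ins, and net.Pipe substitutes for a yamux/quic sub-stream.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errIncompatibleProto is a hypothetical stand-in for a protocol-level
// handshake failure that does not close the conn on its own.
var errIncompatibleProto = errors.New("incompatible proto")

// trackedConn records whether Close was called, so the example can
// observe that the sub-stream was released.
type trackedConn struct {
	net.Conn
	closed bool
}

func (c *trackedConn) Close() error {
	c.closed = true
	return c.Conn.Close()
}

// openStream is a hypothetical analogue of the fixed flow: open a
// sub-stream, run the handshake, and close the stream on any error.
func openStream(open func() (net.Conn, error), handshake func(net.Conn) error) (net.Conn, error) {
	conn, err := open()
	if err != nil {
		return nil, err
	}
	if err := handshake(conn); err != nil {
		_ = conn.Close() // release the sub-stream on every error path
		return nil, err
	}
	return conn, nil
}

// streamClosedOnHandshakeError reports whether a failed handshake both
// propagates the error and leaves the stream closed.
func streamClosedOnHandshakeError() bool {
	a, b := net.Pipe()
	defer b.Close()
	tc := &trackedConn{Conn: a}
	_, err := openStream(
		func() (net.Conn, error) { return tc, nil },
		func(net.Conn) error { return errIncompatibleProto },
	)
	return errors.Is(err, errIncompatibleProto) && tc.closed
}

func main() {
	fmt.Println("closed on handshake error:", streamClosedOnHandshakeError())
}
```

Closing unconditionally in the caller means the handshake's own cleanup behaviour no longer matters: a second Close on an already-closed stream is harmless, while a missing one leaks.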
Refs GO-7313
039c05d to b47706c
cheggaaa approved these changes on Jun 11, 2026
Audit of the net transport stack (net/transport/{yamux,quic}, net/peer, net/pool, net/peerservice, net/secureservice + handshake, net/connutil), cross-checked against hashicorp/yamux 0.1.2, quic-go 0.60.0, libp2p tls 0.48.0, storj.io/drpc, and app/ocache.

Fixes in this PR
H1: yamux dial closed the raw socket fd (fd-reuse / double-close)
Dial installed a net.Dialer.Control that, on ctx cancellation, ran unix.Shutdown(fd); unix.Close(fd) (and a Windows Closesocket variant) on a descriptor the Go netpoller owns. DialContext(ctx) already aborts the connect and closes its own fd, so the two raced to close the same fd number: the classic fd-reuse hazard, where a concurrently opened fd can be torn down. Removed the Control/abortOnCancel machinery and the abort_{unix,windows,other}.go files; Dial now relies on net.Dialer.Timeout + DialContext(ctx), which handle timeout and cancellation safely. TestDialContextCancellation still aborts in ~100ms.

H2: outgoing drpc sub-stream leaked on proto-handshake errors
peer.openDrpcConn opened a yamux/quic sub-stream then ran OutgoingProtoHandshake, returning the error without closing the stream. The handshake closes the conn on I/O errors and ctx cancellation, but not on three protocol-level paths (ErrRemoteIncompatibleProto, non-null ack, ErrUnexpectedPayload), so a version-skewed or misbehaving peer caused unbounded stream accumulation. openDrpcConn now closes the sub-stream on any handshake error, matching the incoming side's defer conn.Close().

Verification
go build ./net/... and go vet ./net/transport/yamux/... ./net/peer/... : clean
go test ./net/transport/yamux/... ./net/peer/... ./net/pool/... ./net/peerservice/... : pass
go test -race -run 'TestDialContextCancellation|TestYamuxTransport_Dial' ./net/transport/yamux/... : pass, no races

Refs GO-7313