Skip to content

Commit 61c0c34

Browse files
Branimir Rakiccursoragent
authored andcommitted
fix(network): preserve relay-safe + handler-error semantics in LibP2PNetwork (Codex PR #494)
LibP2PNetwork's first cut silently dropped two behaviours that the existing pre-RFC-07 dial path (ProtocolRouter) relied on. The first consumer migrated to Network.dialProtocol() / Network.handle() would have regressed on relay-only links and on async handler rejections. Two fixes: 1. dialProtocol() and handle() now both pass `runOnLimitedConnection: true` like ProtocolRouter.send / ProtocolRouter.register already do. V10 edge↔core traffic regularly traverses circuit-relay limited connections (RFC 04); without this flag, libp2p refuses to use a limited-only connection for protocol streams. Codex called this out on dialProtocol; same fix applies to inbound handle(). 2. handle() now awaits the user-supplied handler in a try/catch and stream.abort()s on failure. The previous wrapper fire-and-forgot the promise, so an async handler rejection became an unhandledRejection AND left the inbound stream half-open until libp2p's own timeout fired. Mirrors the try/catch+abort pattern in ProtocolRouter.register exactly. New test (handler-rejects/1.0.0) is a regression guard: a handler that intentionally throws after a 50ms barrier; the dialer's drain must terminate in well under 5s. Without the fix it hangs to the 10s vitest test timeout. 8/8 LibP2PNetwork tests pass. Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent 5b2aa78 commit 61c0c34

2 files changed

Lines changed: 89 additions & 4 deletions

File tree

packages/core/src/network/libp2p-network.ts

Lines changed: 34 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -60,13 +60,43 @@ export class LibP2PNetwork implements Network {
6060
// caller supplied one, honour it as-is; otherwise mint one off the
6161
// timeout so a runaway dial doesn't pin a stream slot indefinitely.
6262
const signal = opts?.signal ?? AbortSignal.timeout(timeoutMs);
63-
return this.node.libp2p.dialProtocol(pid, protocolId, { signal });
63+
// V10 edge/core traffic regularly traverses circuit-relay limited
64+
// connections (RFC 04). The pre-RFC-07 dial path
65+
// (ProtocolRouter.send) sets runOnLimitedConnection: true for that
66+
// reason; preserve the same default here so that the first
67+
// consumer migrated to Network.dialProtocol() doesn't regress on
68+
// relay-only links. Codex PR #494 round 1.
69+
return this.node.libp2p.dialProtocol(pid, protocolId, {
70+
signal,
71+
runOnLimitedConnection: true,
72+
});
6473
}
6574

6675
async handle(protocolId: string, handler: ProtocolHandler): Promise<void> {
67-
await this.node.libp2p.handle(protocolId, (stream, connection) => {
68-
void handler(stream, connection.remotePeer.toString());
69-
});
76+
// Mirror ProtocolRouter.register semantics so the wrapper doesn't
77+
// silently regress two behaviours the existing dial path relies on:
78+
// 1. await the handler so an async rejection isn't swallowed as
79+
// an unhandled promise — abort the stream on failure instead
80+
// of leaving it half-open.
81+
// 2. pass runOnLimitedConnection: true so inbound protocols keep
82+
// working when the only available connection is a relay-limited
83+
// one (RFC 04 / V10 edge↔core via circuit-relay).
84+
// Codex PR #494 round 1.
85+
await this.node.libp2p.handle(
86+
protocolId,
87+
async (stream, connection) => {
88+
try {
89+
await handler(stream, connection.remotePeer.toString());
90+
} catch (err) {
91+
try {
92+
stream.abort(err instanceof Error ? err : new Error('handler error'));
93+
} catch {
94+
// stream already closed/aborted — nothing to do.
95+
}
96+
}
97+
},
98+
{ runOnLimitedConnection: true },
99+
);
70100
}
71101

72102
async unhandle(protocolId: string): Promise<void> {

packages/core/test/libp2p-network.test.ts

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -139,6 +139,61 @@ describe('LibP2PNetwork', () => {
139139
await expect(netA.addKnownAddresses(b.peerId, [])).resolves.toBeUndefined();
140140
});
141141

142+
it('handle aborts the stream when an async handler rejects (Codex PR #494)', async () => {
143+
// Regression guard: the wrapper used to fire-and-forget the handler
144+
// promise, so an async rejection became an unhandledRejection AND
145+
// left the inbound stream half-open. The dialer would only notice
146+
// when its own read timeout fired, by which time minutes might have
147+
// passed. The fix awaits the handler in try/catch and aborts the
148+
// stream on failure.
149+
//
150+
// Test strategy: handler awaits a barrier, THEN throws. Dialer
151+
// sends, half-closes, then drains. If the fix works, the drain
152+
// either returns clean EOF (handler aborted before any reply was
153+
// sent) or throws a stream error — both well under the 5s deadline
154+
// we assert on. Without the fix, the drain hangs until vitest's
155+
// 10s test timeout.
156+
const a = spawn();
157+
const b = spawn();
158+
const netA = new LibP2PNetwork(a);
159+
const netB = new LibP2PNetwork(b);
160+
await netA.start();
161+
await netB.start();
162+
await netA.addKnownAddresses(b.peerId, b.multiaddrs);
163+
164+
const protocol = '/test/handler-rejects/1.0.0';
165+
let handlerEntered = false;
166+
await netB.handle(protocol, async () => {
167+
handlerEntered = true;
168+
// Brief async barrier so the dialer-side send/close completes
169+
// before the handler aborts the stream — gives us a clean
170+
// observation point for the drain timing assertion.
171+
await new Promise((r) => setTimeout(r, 50));
172+
throw new Error('intentional handler failure');
173+
});
174+
175+
const stream = await netA.dialProtocol(b.peerId, protocol);
176+
try {
177+
stream.send(new TextEncoder().encode('ping'));
178+
await stream.close();
179+
} catch {
180+
// Stream may already be aborted by the time we send — that's OK,
181+
// it's the same root cause we're testing for.
182+
}
183+
184+
const drainStart = Date.now();
185+
try {
186+
for await (const _chunk of stream) { /* discard */ }
187+
} catch {
188+
// Aborted-stream error is one of the two acceptable outcomes.
189+
}
190+
const drainMs = Date.now() - drainStart;
191+
expect(handlerEntered).toBe(true);
192+
// Without the fix, the drain hangs until libp2p's own timeout (>>5s).
193+
// With the fix, abort propagates immediately.
194+
expect(drainMs).toBeLessThan(5000);
195+
}, 10000);
196+
142197
it('unhandle removes a previously-registered protocol', async () => {
143198
const a = spawn();
144199
const b = spawn();

0 commit comments

Comments
 (0)