fix(streaming): detach vmConn lifetime from per-RPC Stream context#177
Draft
ndeloof wants to merge 1 commit intocontainerd:mainfrom
Draft
fix(streaming): detach vmConn lifetime from per-RPC Stream context#177ndeloof wants to merge 1 commit intocontainerd:mainfrom
ndeloof wants to merge 1 commit intocontainerd:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the shim streaming bridge so the VM-side stream connection is no longer tied directly to the per-RPC ttrpc context, aiming to stop normal end-of-exec context cancellation from cascading into a full VM teardown on Windows.
Changes:
- Starts the VM stream with
context.WithoutCancel(ctx)instead of the per-RPC context. - Removes the handler’s early return on
ctx.Done()and waits only for the VM-to-client bridge to finish. - Expands inline documentation around the stream-closing protocol and the Windows-specific shutdown cascade.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
On Windows, vmConn.Close() traverses the AF_UNIX <-> vsock proxy and arrives at vminitd as a vsock SHUTDOWN packet. When the per-RPC ctx cancels mid-stream, the deferred Close() fires while the VM may still have in-flight data; the SHUTDOWN cascades into Task.Kill -> Delete -> Shutdown and tears down the entire VM, surfacing as state_error="rpc error: code = Unknown desc = ttrpc: closed". Linux/macOS use native vsock without the AF_UNIX intermediate, so the cascade does not occur there. Empirical evidence (docker/sandboxes#2529): across 10 parallel CI runs of identical content, every PASS produced exactly 1 EOF event on the proxy AF_UNIX socket while every FAIL produced 13-30. Zero overlap. Fix: in the <-ctx.Done() branch, instead of returning immediately (which lets the deferred Close() race with an in-flight bridge read), call vmConn.SetReadDeadline(time.Now()) to interrupt binary.Read via the Go runtime poller -- a purely runtime construct that emits nothing on the wire. Then drain v2t so the bridge has fully exited before Close() runs. Two regression tests pin the behaviour: - TestStreamReturnsAfterContextCancelOnIdleStream: cancel ctx while neither side is sending; handler must drain and return promptly. - TestStreamPreservesStartStreamCancel: cancel ctx during a slow StartStream; abort must propagate (regression guard against passing context.WithoutCancel into StartStream). Refs: docker/sandboxes#2529 Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
ba85164 to
338339c
Compare
Comment on lines
146
to
+152
| select { | ||
| case err := <-v2t: | ||
| if err != nil && !errors.Is(err, io.EOF) { | ||
| log.G(ctx).WithError(err).WithField("stream", i.ID).Debug("server->client bridge ended") | ||
| } | ||
| case <-ctx.Done(): | ||
| // On Windows, the AF_UNIX <-> vsock proxy turns vmConn.Close() |
Comment on lines
+159
to
+160
| if dc, ok := vmConn.(interface{ SetReadDeadline(time.Time) error }); ok { | ||
| _ = dc.SetReadDeadline(time.Now()) |
Comment on lines
+334
to
+339
| // Without the SetReadDeadline-driven exit path, bridgeVMToTTRPC blocks | ||
| // indefinitely in binary.Read(vmConn) because vmConn's lifetime was | ||
| // detached from ctx. The c2v goroutine exits cleanly when srv.Recv() | ||
| // observes ctx.Err(), but its zero-length EOF marker is written into | ||
| // the shim->VM half of the pipe; v2t reads from the VM->shim half and | ||
| // therefore never sees it. |
Comment on lines
+389
to
+394
| // TestStreamPreservesStartStreamCancel covers Copilot's second concern | ||
| // on #177: passing context.WithoutCancel(ctx) into StartStream strips the | ||
| // caller's deadline and cancellation, so the libkrun retry loop polls | ||
| // for the full ~50s window even after the RPC has been torn down. The | ||
| // handler must instead pass the original ctx to StartStream so a | ||
| // timed-out or cancelled RPC aborts connection establishment promptly. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bug
On Windows,
vmConn.Close()traverses the AF_UNIX ↔ vsock proxy and arrives atvminitdas a vsockSHUTDOWN. When the per-RPC ctx cancels mid-stream, the deferredClose()races with in-flight bridge reads; theSHUTDOWNcascades intoTask.Kill -> Delete -> Shutdown, tearing down the VM (state_error="ttrpc: closed"). Linux/macOS use native vsock without the proxy, so no cascade.Empirical evidence in docker/sandboxes#2529: 10 parallel CI runs of identical content — every PASS = 1 EOF event on the proxy socket, every FAIL = 13–30. Zero overlap.
Fix
In the
<-ctx.Done()branch, callvmConn.SetReadDeadline(time.Now())to interrupt the bridge'sbinary.Readvia the Go runtime poller (no wire-level packet) and drainv2tbefore the deferredClose()runs.Tests
TestStreamReturnsAfterContextCancelOnIdleStream— cancel ctx with no traffic; handler must drain.TestStreamPreservesStartStreamCancel— cancel ctx during a slowStartStream; abort must propagate.Both fail on baseline, pass on this PR. The 4 pre-existing tests still pass.