
fix(streaming): detach vmConn lifetime from per-RPC Stream context #177

Draft

ndeloof wants to merge 1 commit into containerd:main from ndeloof:fix-streaming-vmconn-lifetime

Conversation


@ndeloof ndeloof commented May 5, 2026

Bug

On Windows, vmConn.Close() traverses the AF_UNIX ↔ vsock proxy and arrives at vminitd as a vsock SHUTDOWN. When the per-RPC ctx cancels mid-stream, the deferred Close() races with in-flight bridge reads; the SHUTDOWN cascades into Task.Kill -> Delete -> Shutdown, tearing down the VM (state_error="ttrpc: closed"). Linux/macOS use native vsock without the proxy, so no cascade.
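A reduced sketch of the racy baseline shape (hypothetical except `bridgeVMToTTRPC` and `v2t`, which appear in the diff excerpts further down; not the verbatim plugin code):

```go
// Reduction of the baseline handler shape, for illustration only.
func handleStream(ctx context.Context, vmConn net.Conn) error {
	// On Windows this deferred Close() traverses the proxy and reaches
	// vminitd as a vsock SHUTDOWN.
	defer vmConn.Close()

	v2t := make(chan error, 1)
	go func() { v2t <- bridgeVMToTTRPC(vmConn) }() // blocks in binary.Read(vmConn)

	select {
	case err := <-v2t:
		return err
	case <-ctx.Done():
		// Bug: returning here lets the deferred Close() fire while the
		// bridge goroutine may still be mid-read.
		return ctx.Err()
	}
}
```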

Empirical evidence in docker/sandboxes#2529: 10 parallel CI runs of identical content — every PASS = 1 EOF event on the proxy socket, every FAIL = 13–30. Zero overlap.

Fix

In the <-ctx.Done() branch, call vmConn.SetReadDeadline(time.Now()) to interrupt the bridge's binary.Read via the Go runtime poller (no wire-level packet) and drain v2t before the deferred Close() runs.
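Sketched as a fragment, the `<-ctx.Done()` branch becomes (the type assertion matches the diff excerpt further down; the bare drain of `v2t` is an assumed shape):

```go
case <-ctx.Done():
	// Interrupt the bridge's blocked binary.Read without emitting anything
	// on the wire: read deadlines are enforced locally by the Go runtime.
	if dc, ok := vmConn.(interface{ SetReadDeadline(time.Time) error }); ok {
		_ = dc.SetReadDeadline(time.Now())
	}
	<-v2t // wait for the bridge to fully exit before the deferred Close() runs
```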

Tests

  • TestStreamReturnsAfterContextCancelOnIdleStream — cancel ctx with no traffic; handler must drain.
  • TestStreamPreservesStartStreamCancel — cancel ctx during a slow StartStream; abort must propagate.

Both fail on baseline, pass on this PR. The 4 pre-existing tests still pass.
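For illustration, the idle-stream test asserts roughly the following, reusing the hypothetical `handleStream` from the sketch above (`newIdleVMConn` is a hypothetical helper; the real tests live next to the plugin):

```go
func TestStreamReturnsAfterContextCancelOnIdleStream(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go func() { done <- handleStream(ctx, newIdleVMConn(t)) }() // hypothetical helper

	cancel() // no traffic in flight on either side
	select {
	case <-done:
		// handler drained the bridge and returned promptly
	case <-time.After(5 * time.Second):
		t.Fatal("handler did not return after context cancellation")
	}
}
```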

Copilot AI review requested due to automatic review settings May 5, 2026 20:38

Copilot AI left a comment


Pull request overview

This PR updates the shim streaming bridge so the VM-side stream connection is no longer tied directly to the per-RPC ttrpc context, aiming to stop normal end-of-exec context cancellation from cascading into a full VM teardown on Windows.

Changes:

  • Starts the VM stream with context.WithoutCancel(ctx) instead of the per-RPC context.
  • Removes the handler’s early return on ctx.Done() and waits only for the VM-to-client bridge to finish.
  • Expands inline documentation around the stream-closing protocol and the Windows-specific shutdown cascade.
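For context, `context.WithoutCancel` (Go 1.21+) keeps the parent's values but drops its deadline and cancellation entirely, which is why a later review round flagged passing it into StartStream. A stdlib-only illustration:

```go
package main

import (
	"context"
	"fmt"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	detached := context.WithoutCancel(ctx)
	cancel()

	fmt.Println(ctx.Err())       // context canceled
	fmt.Println(detached.Err())  // <nil>: the detached context never cancels
	fmt.Println(detached.Done()) // <nil>: no Done channel at all
}
```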


Three review comment threads on plugins/shim/streaming/plugin.go, now marked outdated.
@ndeloof ndeloof marked this pull request as ready for review May 6, 2026 05:09
@ndeloof ndeloof marked this pull request as draft May 6, 2026 06:13
Copilot AI review requested due to automatic review settings May 6, 2026 06:44
On Windows, vmConn.Close() traverses the AF_UNIX <-> vsock proxy and
arrives at vminitd as a vsock SHUTDOWN packet. When the per-RPC ctx
cancels mid-stream, the deferred Close() fires while the VM may still
have in-flight data; the SHUTDOWN cascades into Task.Kill -> Delete ->
Shutdown and tears down the entire VM, surfacing as
state_error="rpc error: code = Unknown desc = ttrpc: closed".

Linux/macOS use native vsock without the AF_UNIX intermediate, so the
cascade does not occur there.

Empirical evidence (docker/sandboxes#2529): across 10 parallel CI runs
of identical content, every PASS produced exactly 1 EOF event on the
proxy AF_UNIX socket while every FAIL produced 13-30. Zero overlap.

Fix: in the <-ctx.Done() branch, instead of returning immediately
(which lets the deferred Close() race with an in-flight bridge read),
call vmConn.SetReadDeadline(time.Now()) to interrupt binary.Read via
the Go runtime poller -- a purely runtime construct that emits nothing
on the wire. Then drain v2t so the bridge has fully exited before
Close() runs.

Two regression tests pin the behaviour:

- TestStreamReturnsAfterContextCancelOnIdleStream: cancel ctx while
  neither side is sending; handler must drain and return promptly.
- TestStreamPreservesStartStreamCancel: cancel ctx during a slow
  StartStream; abort must propagate (regression guard against passing
  context.WithoutCancel into StartStream).

Refs: docker/sandboxes#2529

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
@ndeloof ndeloof force-pushed the fix-streaming-vmconn-lifetime branch from ba85164 to 338339c on May 6, 2026 06:45

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Comment on lines 146 to +152

    select {
    case err := <-v2t:
        if err != nil && !errors.Is(err, io.EOF) {
            log.G(ctx).WithError(err).WithField("stream", i.ID).Debug("server->client bridge ended")
        }
    case <-ctx.Done():
        // On Windows, the AF_UNIX <-> vsock proxy turns vmConn.Close()

Comment on lines +159 to +160

    if dc, ok := vmConn.(interface{ SetReadDeadline(time.Time) error }); ok {
        _ = dc.SetReadDeadline(time.Now())

Comment on lines +334 to +339

    // Without the SetReadDeadline-driven exit path, bridgeVMToTTRPC blocks
    // indefinitely in binary.Read(vmConn) because vmConn's lifetime was
    // detached from ctx. The c2v goroutine exits cleanly when srv.Recv()
    // observes ctx.Err(), but its zero-length EOF marker is written into
    // the shim->VM half of the pipe; v2t reads from the VM->shim half and
    // therefore never sees it.

Comment on lines +389 to +394

    // TestStreamPreservesStartStreamCancel covers Copilot's second concern
    // on #177: passing context.WithoutCancel(ctx) into StartStream strips the
    // caller's deadline and cancellation, so the libkrun retry loop polls
    // for the full ~50s window even after the RPC has been torn down. The
    // handler must instead pass the original ctx to StartStream so a
    // timed-out or cancelled RPC aborts connection establishment promptly.
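The deadline-interrupt mechanism itself is plain stdlib behavior and can be demonstrated in isolation. A self-contained sketch, independent of the vsock proxy (net.Pipe honors deadlines like poller-backed sockets):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	done := make(chan error, 1)
	go func() {
		buf := make([]byte, 1)
		_, err := server.Read(buf) // blocks: the peer never writes
		done <- err
	}()

	time.Sleep(100 * time.Millisecond)
	// A past read deadline unblocks the Read locally; nothing is sent to
	// the peer, unlike Close(), which the Windows proxy would forward as
	// a vsock SHUTDOWN.
	_ = server.SetReadDeadline(time.Now())

	fmt.Println(<-done) // "read pipe: i/o timeout" (wraps os.ErrDeadlineExceeded)
}
```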
