[codex] Fix libxev poller state leaks #432

Merged
heartwilltell merged 5 commits into main from codex/runtime-poller on Apr 26, 2026

@heartwilltell (Contributor)

Summary

  • keep libxev fd poll completions armed until explicit close, then cancel and drain them before fd reuse
  • clear C-side poll descriptor waiter pointers after a parked G resumes so close cannot wake freed Gs
  • ignore stale dead kqueue events in the vendored libxev backend and re-enable the gated poller tests, including a reverse-order regression
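
The second bullet describes a lifetime rule, so a small model may help. The sketch below is in C and is not the runtime's code: `G`, `PollDesc`, `pd_park`, `pd_resume`, and `pd_close` are hypothetical names chosen for illustration. It assumes one waiter slot per direction and shows why clearing the slot when a parked G resumes keeps a later close from reaching a G that may already have been freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a runtime task ("G"). */
typedef struct G {
    int wake_count; /* times this G has been made runnable */
} G;

enum { PD_READ = 0, PD_WRITE = 1 };

/* Hypothetical C-side poll descriptor: one optional waiter per direction. */
typedef struct PollDesc {
    int fd;
    G *waiter[2];
} PollDesc;

/* Park g on one direction of pd. */
static void pd_park(PollDesc *pd, int dir, G *g) {
    assert(pd->waiter[dir] == NULL); /* at most one waiter per direction */
    pd->waiter[dir] = g;
}

/* Wake the parked G, if any, and clear the slot so pd no longer points at it.
 * Returns the G that was woken, or NULL if nothing was parked. */
static G *pd_resume(PollDesc *pd, int dir) {
    G *g = pd->waiter[dir];
    pd->waiter[dir] = NULL;
    if (g != NULL) {
        g->wake_count++;
    }
    return g;
}

/* Close wakes only waiters that are still parked; returns how many it woke. */
static int pd_close(PollDesc *pd) {
    int woken = 0;
    for (int dir = PD_READ; dir <= PD_WRITE; dir++) {
        if (pd_resume(pd, dir) != NULL) {
            woken++;
        }
    }
    pd->fd = -1;
    return woken;
}
```

In this model, a G that was parked and then resumed is no longer reachable through the descriptor, so `pd_close` has nothing to wake for it. If the slot were left set after resume, close would dereference a pointer the descriptor no longer owns.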

Issues

Fixes #397
Fixes #416
Fixes #424
Fixes #426

Audited #395 and #396: the libxev adapter and default poller wiring are already present on origin/main; this PR fixes the remaining state leak/test gating around that implementation.

Validation

  • zig build
  • zig build test (passes; emits the expected negative-test diagnostic for missing())
  • zig build test-runtime
  • zig build test-runtime -Dsanitize=true could not run on this macOS host because libasan and libubsan are not available in the SDK/Homebrew library paths.

@cloudflare-workers-and-pages (bot) commented Apr 25, 2026

Deploying with Cloudflare Workers

Status: ✅ Deployment successful!
Name: runlang
Latest commit: 6e1ef12
Updated (UTC): Apr 26 2026, 10:39 AM

@heartwilltell heartwilltell marked this pull request as ready for review April 26, 2026 10:04

The drain loop in run_xev_close_fd waited only for the original read/write
completions to retire and ignored the cancel completions submitted to
libxev. When a cancel CQE was still pending in the io_uring ring, the
function reset slot.read_cancel to .{}, which sets its op to .noop. When
the kernel eventually delivered that CQE, libxev's Completion.invoke()
hit `.noop => unreachable`, crashing with SIGSEGV in
test_poller_close_while_waiting.
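
The failure mode can be reproduced in a toy model. The C sketch below is an illustration only and does not reproduce libxev: `Completion`, `completion_invoke`, and `completion_reset` are hypothetical, and the only facts taken from the text are that a default-initialised completion has op `.noop` and that dispatching a `.noop` completion is treated as unreachable. Here the unreachable branch is modelled as a `-1` return so the example stays safe to run.

```c
#include <assert.h>
#include <string.h>

/* The zero value of each enum is what a default-initialised completion gets. */
typedef enum { OP_NOOP = 0, OP_POLL, OP_CANCEL } Op;
typedef enum { STATE_DEAD = 0, STATE_ACTIVE } State;

typedef struct Completion {
    Op op;
    State state;
} Completion;

/* Models dispatch on a delivered CQE. Returns a positive tag for a known op
 * and -1 where the described code would reach `unreachable`. */
static int completion_invoke(const Completion *c) {
    switch (c->op) {
    case OP_POLL:
        return 1;
    case OP_CANCEL:
        return 2;
    case OP_NOOP:
        return -1;
    }
    return -1;
}

/* Models assigning `.{}` to the completion's storage. */
static void completion_reset(Completion *c) {
    memset(c, 0, sizeof *c);
}
```

Resetting a completion whose CQE has not arrived yet leaves the kernel holding a reference to storage that now says "noop", which is the state the later dispatch cannot handle.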

Three layered fixes:

1. run_xev_bridge.zig: Drain loop now also waits for read_cancel and
   write_cancel completions to reach .dead before resetting their
   storage. Reset is now conditional: any completion that is somehow
   still .active (drain exhausted) is left alone so its eventual CQE
   finds a valid op.

2. libxev io_uring.zig: Mirror the existing kqueue.zig defensive guard by
   skipping CQEs for completions whose flags.state is no longer .active.
   This prevents the unreachable crash even if a future caller resets a
   completion mid-flight.

3. libxev io_uring.zig: Handle .CANCELED in the .poll case so the
   cancelled poll's CQE no longer prints "unexpected errno: 125" to
   stderr.
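
The three fixes above can be sketched together in a simplified C model. This is a sketch under stated assumptions, not the code in run_xev_bridge.zig or io_uring.zig: `Slot`, `handle_cqe`, `slot_reset_retired`, `slot_close_drain`, and `fake_tick` are hypothetical names, and `fake_tick` stands in for one event-loop tick by retiring at most one active completion. The model relies on two facts: io_uring reports errors as a negative errno in the CQE result, and `ECANCELED` is 125 on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_NOOP = 0, OP_POLL, OP_CANCEL } Op;
typedef enum { STATE_DEAD = 0, STATE_ACTIVE } State;
typedef struct { Op op; State state; } Completion;
typedef struct { Completion read, write, read_cancel, write_cancel; } Slot;
typedef enum { CQE_SKIPPED, CQE_CANCELED, CQE_READY, CQE_ERROR } CqeResult;

/* Fixes 2 and 3: ignore CQEs for completions that are no longer active,
 * and treat -ECANCELED as an expected outcome rather than an error. */
static CqeResult handle_cqe(Completion *c, int res) {
    if (c->state != STATE_ACTIVE) return CQE_SKIPPED;
    c->state = STATE_DEAD;
    if (res == -ECANCELED) return CQE_CANCELED;
    if (res < 0) return CQE_ERROR;
    return CQE_READY;
}

/* True once the originals and both cancel completions have retired. */
static bool slot_drained(const Slot *s) {
    return s->read.state == STATE_DEAD && s->write.state == STATE_DEAD &&
           s->read_cancel.state == STATE_DEAD &&
           s->write_cancel.state == STATE_DEAD;
}

/* Fix 1, second half: reset only completions that have retired. Anything
 * still active keeps its op so a late CQE finds valid storage. */
static int slot_reset_retired(Slot *s) {
    Completion *all[4] = { &s->read, &s->write,
                           &s->read_cancel, &s->write_cancel };
    int reset = 0;
    for (size_t i = 0; i < 4; i++) {
        if (all[i]->state == STATE_DEAD) {
            *all[i] = (Completion){ OP_NOOP, STATE_DEAD };
            reset++;
        }
    }
    return reset;
}

/* Hypothetical event-loop tick: delivers a cancelled CQE to the first
 * completion that is still active, if any. */
static void fake_tick(Slot *s) {
    Completion *all[4] = { &s->read, &s->write,
                           &s->read_cancel, &s->write_cancel };
    for (size_t i = 0; i < 4; i++) {
        if (all[i]->state == STATE_ACTIVE) {
            (void)handle_cqe(all[i], -ECANCELED);
            return;
        }
    }
}

/* Fix 1, first half: drain until all four completions are dead or the tick
 * budget runs out, then reset what retired. Returns the number reset. */
static int slot_close_drain(Slot *s, int max_ticks) {
    for (int t = 0; t < max_ticks && !slot_drained(s); t++) {
        fake_tick(s);
    }
    return slot_reset_retired(s);
}
```

With a budget too small to retire everything, the still-active cancel completion keeps its op, so its CQE can be handled normally when it arrives, while a late CQE for an already-reset completion is skipped by the state guard.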

https://claude.ai/code/session_01CSYWLMHrkzjcwCbEwHtCLT
@heartwilltell heartwilltell self-assigned this Apr 26, 2026
@heartwilltell heartwilltell merged commit 4d3f794 into main Apr 26, 2026
14 checks passed
@heartwilltell heartwilltell deleted the codex/runtime-poller branch April 26, 2026 10:43