hostinet: drain recv buffer before EOF #1368

Open
russellromney wants to merge 1 commit into QuarkContainer:main from russellromney:codex/recv-eof-drain

@russellromney

Summary

  • remove the early RClosed() EOF return in hostinet RecvMsg
  • let the socket buffer read path return buffered bytes before EOF
  • add a TCP regression client for data-plus-FIN close timing

Fixes #1367.

Root cause

When a peer sends data and FIN together, qvisor can process the file-read data completion and then the EOF completion before the guest application enters recv(). At that point the receive buffer already contains bytes, but RecvMsg checked buf.RClosed() before calling ReadFromBuf, so it returned a clean EOF without ever consulting the buffer. Retrying on the same fd hit the same early-EOF shortcut, so the buffered bytes were never delivered.

SocketBuffIntern::Readv already has the correct ordering: copy buffered data first, then report EOF only when the buffer is empty and the read side is closed. This patch routes closed sockets through that path.
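To make the ordering concrete, here is a minimal standalone sketch of the two behaviors. It is not Quark's code: `RecvBuf`, `RecvResult`, `recv_early_eof`, and `recv_drain_first` are hypothetical names that only model "check the closed flag first" versus "drain the buffer first".

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a socket receive buffer (not Quark's type).
struct RecvBuf {
    data: VecDeque<u8>,
    /// Set once the peer's FIN has been processed.
    read_closed: bool,
}

#[derive(Debug, PartialEq)]
enum RecvResult {
    Data(Vec<u8>),
    Eof,
    WouldBlock,
}

impl RecvBuf {
    /// Remove and return up to `max` buffered bytes.
    fn drain(&mut self, max: usize) -> Vec<u8> {
        let n = max.min(self.data.len());
        self.data.drain(..n).collect()
    }
}

/// Models the old ordering: the closed flag wins over buffered bytes.
fn recv_early_eof(buf: &mut RecvBuf, max: usize) -> RecvResult {
    if buf.read_closed {
        return RecvResult::Eof; // buffered bytes are never surfaced
    }
    let bytes = buf.drain(max);
    if bytes.is_empty() {
        RecvResult::WouldBlock
    } else {
        RecvResult::Data(bytes)
    }
}

/// Models the intended ordering: buffered bytes first, EOF only when
/// the buffer is empty and the read side is closed.
fn recv_drain_first(buf: &mut RecvBuf, max: usize) -> RecvResult {
    let bytes = buf.drain(max);
    if !bytes.is_empty() {
        return RecvResult::Data(bytes);
    }
    if buf.read_closed {
        RecvResult::Eof
    } else {
        RecvResult::WouldBlock
    }
}
```

With three bytes buffered and `read_closed` set, `recv_early_eof` returns `Eof` on every call, while `recv_drain_first` returns the bytes and only then `Eof`.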

Verification

On the DigitalOcean KVM repro host:

runc    close=immediate ok=16 empty=0 other=0
runc    close=delay50ms ok=16 empty=0 other=0
runsc   close=immediate ok=16 empty=0 other=0
runsc   close=delay50ms ok=16 empty=0 other=0
quark   close=immediate ok=16 empty=0 other=0
quark   close=delay50ms ok=16 empty=0 other=0

Additional immediate-close checks on fixed Quark with other images, using the same static repro client:

debian:bookworm-slim ok=16 empty=0 other=0
ubuntu:24.04         ok=16 empty=0 other=0

Regression test binary under fixed Quark:

tcp_fin_recv: 32 iterations ok
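For readers who want to see the shape of a data-plus-FIN check, the following is a rough loopback sketch using only `std::net`. It is not the `tcp_fin_recv` binary from this PR: the function name `data_then_fin_once`, the payload, and the single-process loopback setup are assumptions made for illustration. The peer writes a payload and closes immediately, and the client sleeps before its first read so that both the data and the FIN have likely arrived by then.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Illustrative payload; the PR's actual test data is not shown here.
const PAYLOAD: &[u8] = b"data-before-fin";

/// Run one data-plus-FIN exchange over loopback and return what the
/// client read before EOF. `client_delay` postpones the first read so
/// data and FIN can both be pending when the client starts reading.
fn data_then_fin_once(client_delay: Duration) -> io::Result<Vec<u8>> {
    // Port 0 lets the OS pick a free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let server = thread::spawn(move || -> io::Result<()> {
        let (mut conn, _) = listener.accept()?;
        conn.write_all(PAYLOAD)?;
        // `conn` is dropped on return, closing the socket so the FIN
        // follows right behind the data.
        Ok(())
    });

    let mut client = TcpStream::connect(addr)?;
    thread::sleep(client_delay);

    // read_to_end keeps reading until EOF; a correct stack yields the
    // whole payload first, then EOF.
    let mut received = Vec::new();
    client.read_to_end(&mut received)?;

    server.join().expect("server thread panicked")?;
    Ok(received)
}
```

A caller would loop over this and treat an empty result as the failure mode described in #1367, for example by comparing the returned bytes against `PAYLOAD` on each iteration.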

Build note: the repro host needed the existing local qkernel compile shims for drivers::tee / sync_unsafe_cell, unrelated to this networking change.

A TCP peer can send data and FIN close enough together that qvisor marks the socket read side closed while bytes are already buffered. RecvMsg returned EOF from an early RClosed check before consulting the receive buffer, so the buffered bytes were never surfaced and retries kept returning EOF.

Let the existing Readv path decide the result instead. It already returns buffered data before EOF and only reports EOF once the read side is closed and the buffer is empty. Add a small TCP regression client for the data-plus-FIN case.

Development

Successfully merging this pull request may close these issues.

recv() returns EOF and silently drops received data when the peer's data and FIN are coalesced