fix: add configurable timeout_keep_alive to prevent ReadError on keep-alive connection reuse #468

Open
nihongye wants to merge 2 commits into agentscope-ai:main from nihongye:fix/uvicorn-timeout-keep-alive

@nihongye
Problem

The SandboxManager client uses a long-lived httpx.AsyncClient with connection pooling to access the sandbox manager server.

uvicorn's default timeout_keep_alive is 5 seconds, and httpx's default keepalive_expiry is also 5 seconds. Because both sides time out at almost the same moment, the client can reuse a connection just as the server closes it:

  1. Client sends request, server responds, connection goes idle
  2. After ~5s, server sends FIN to close the idle connection
  3. Client, which has not yet seen the FIN, reuses the connection for a new request
  4. Server answers with RST because its side of the connection is already closed
  5. Client receives httpx.ReadError instead of a response

Timing pattern observed:
  T+0.000s: Request/Response completes, connection idle
  T+5.000s: Server sends FIN (timeout_keep_alive=5s default)
  T+5.001s: Client sends new request on same connection
  T+5.002s: Server sends RST → httpx.ReadError
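
The timing pattern can be illustrated with a small model. This is only a sketch under stated assumptions, not code from this PR: the class name KeepAliveModel, the outcome labels, and the 2 ms latency are hypothetical, and the model assumes the client's idle timer starts slightly later than the server's because the response takes time to arrive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepAliveModel:
    """Toy model of the keep-alive race (hypothetical, for illustration only)."""

    server_timeout: float   # uvicorn timeout_keep_alive, in seconds
    client_expiry: float    # httpx keepalive_expiry, in seconds
    latency: float = 0.002  # assumed delay before the client starts its idle timer

    def outcome(self, send_at: float) -> str:
        """Classify a request sent `send_at` seconds after the server responded."""
        # The server closes the idle connection once its own timer runs out.
        server_closed = send_at >= self.server_timeout
        # The client still treats the connection as reusable until its timer,
        # which started `latency` seconds later, runs out.
        client_reuses = send_at < self.latency + self.client_expiry
        if client_reuses and server_closed:
            return "read_error"      # reuse of a connection the server closed
        if client_reuses:
            return "reused"          # both sides still consider it open
        return "new_connection"      # client expired it and reconnects
```

With both timeouts at 5 seconds, KeepAliveModel(5.0, 5.0).outcome(5.001) falls in the window where the server has closed but the client still reuses. With a 120 second server timeout the same request is classified as a normal reuse.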

Solution

Set timeout_keep_alive to 120 seconds by default (configurable) so the server keeps idle connections open longer than the client's keepalive_expiry. The client then expires and discards an idle connection well before the server would close it, so it does not reuse a connection that the server has already started closing.

Key principle: Server timeout_keep_alive > Client keepalive_expiry
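
One way to make this principle explicit is a startup check. The helper below is a hypothetical sketch and is not part of this PR; the function name and the one second safety margin are assumptions chosen for illustration.

```python
def check_keep_alive(
    server_timeout_keep_alive: float,
    client_keepalive_expiry: float,
    margin: float = 1.0,
) -> None:
    """Raise ValueError unless the server outlives the client's idle expiry.

    `margin` (an assumed safety buffer, in seconds) absorbs latency and
    timer skew between the two sides.
    """
    if server_timeout_keep_alive < client_keepalive_expiry + margin:
        raise ValueError(
            f"server timeout_keep_alive ({server_timeout_keep_alive}s) must exceed "
            f"client keepalive_expiry ({client_keepalive_expiry}s) "
            f"by at least {margin}s"
        )
```

Here client_keepalive_expiry stands for the value the client would pass to httpx.Limits(keepalive_expiry=...). Calling check_keep_alive(120, 5) passes silently, while check_keep_alive(5, 5) raises.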

Changes

  • config.py: Add TIMEOUT_KEEP_ALIVE: int = 120 to Settings (configurable via env var)
  • sandbox manager server app.py: Pass timeout_keep_alive=settings.TIMEOUT_KEEP_ALIVE to uvicorn.run()
  • agent_app.py: Add timeout_keep_alive=120 parameter to AgentApp.run() for clients using connection pools
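
A rough sketch of how the first two changes could fit together is shown below. It is not the PR's diff: the Settings class here is a simplified stand-in built on a plain dataclass, the HOST and PORT fields, the _env_int helper, and the serve function are hypothetical, and the environment variable name is assumed to match the setting name. Only the timeout_keep_alive argument of uvicorn.run() is taken from the description above.

```python
import os
from dataclasses import dataclass, field


def _env_int(name: str, default: int) -> int:
    """Read an integer from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw else default


@dataclass(frozen=True)
class Settings:
    # Simplified, hypothetical stand-in for the sandbox manager Settings.
    HOST: str = "127.0.0.1"
    PORT: int = 8000
    # Evaluated when Settings() is created, so the env var can override it.
    TIMEOUT_KEEP_ALIVE: int = field(
        default_factory=lambda: _env_int("TIMEOUT_KEEP_ALIVE", 120)
    )


def serve(app, settings: Settings) -> None:
    """Start uvicorn with the configured keep-alive timeout."""
    # Imported lazily so Settings can be used without uvicorn installed.
    import uvicorn

    uvicorn.run(
        app,
        host=settings.HOST,
        port=settings.PORT,
        timeout_keep_alive=settings.TIMEOUT_KEEP_ALIVE,
    )
```

Reading the value when Settings is instantiated, rather than at import time, keeps the override testable: setting TIMEOUT_KEEP_ALIVE=30 in the environment before creating Settings() yields a 30 second timeout.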

@nihongye nihongye requested a review from a team March 21, 2026 09:25
@cla-assistant
cla-assistant bot commented Mar 21, 2026

CLA assistant check
All committers have signed the CLA.

…-alive connection reuse

The SandboxManager client uses a long-lived httpx.AsyncClient with connection
pooling to access the sandbox manager server. uvicorn's default timeout_keep_alive
is 5 seconds, while httpx's default keepalive_expiry is also 5 seconds. When the
client reuses a connection that the server has already started closing (FIN sent),
the client receives a RST packet instead of a response, causing httpx.ReadError.

Changes:
- Add TIMEOUT_KEEP_ALIVE setting (default 120s) to sandbox manager Settings
- Pass timeout_keep_alive to uvicorn.run() in sandbox manager server
- Add timeout_keep_alive parameter to AgentApp.run() (default 120s)

By setting server timeout_keep_alive > client keepalive_expiry, the server
keeps idle connections alive longer than clients expect, eliminating the
race condition.
@nihongye nihongye force-pushed the fix/uvicorn-timeout-keep-alive branch from 86f9a02 to a1d859b Compare March 21, 2026 11:34