Connection rotation / max lifetime #3033

@adriangb

We run a lot of bursty applications on Kubernetes. They tend to look like a low-replica coordinator (1–3 pods) plus anywhere from 4 to 128 worker pods, depending on load. A problem I've consistently encountered across systems, made worse by this setup, is that connection pools don't respond to scale-up events: the reqwest pool on the coordinator pods keeps sending work to the same initial 4 worker pods even when those are saturated and another 124 pods are idle waiting for work.

The reason is that DNS resolution happens once per connection establishment. The k8s Service round-robins pod IPs across DNS responses, but as long as the existing connections stay alive and keep getting reused out of the pool, the client never asks DNS again and never sees the new pods. To redistribute load we need to close connections periodically so that new ones get established against freshly resolved (and round-robined) endpoints.
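
To make the mechanism concrete, here is a toy simulation (not reqwest code). `RoundRobinDns`, `ToyPool`, and `distribution` are hypothetical names invented for this sketch, and it uses a request-count cap rather than an age cap so the outcome is deterministic. It assumes a single pooled connection and a resolver that hands out pod IPs in strict rotation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the Service's DNS: each lookup returns the
/// next pod IP in round-robin order.
struct RoundRobinDns {
    pods: Vec<&'static str>,
    next: usize,
}

impl RoundRobinDns {
    fn resolve(&mut self) -> &'static str {
        let ip = self.pods[self.next % self.pods.len()];
        self.next += 1;
        ip
    }
}

/// Toy single-connection pool: remembers which pod it is connected to
/// and how many requests that connection has served.
struct ToyPool {
    conn: Option<(&'static str, u32)>,
    /// `None` models today's behaviour: reuse the connection forever.
    max_requests_per_connection: Option<u32>,
}

impl ToyPool {
    /// Sends one request and returns the pod that handled it.
    fn send(&mut self, dns: &mut RoundRobinDns) -> &'static str {
        // DNS is consulted only when there is no pooled connection to reuse.
        let (pod, served) = match self.conn.take() {
            Some(existing) => existing,
            None => (dns.resolve(), 0),
        };
        let served = served + 1;
        let retire = self
            .max_requests_per_connection
            .map_or(false, |max| served >= max);
        if !retire {
            // Check the connection back in for the next request.
            self.conn = Some((pod, served));
        }
        pod
    }
}

/// Counts requests per pod over `requests` sequential sends.
fn distribution(max: Option<u32>, requests: u32) -> HashMap<&'static str, u32> {
    let mut dns = RoundRobinDns {
        pods: vec!["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
        next: 0,
    };
    let mut pool = ToyPool { conn: None, max_requests_per_connection: max };
    let mut hits = HashMap::new();
    for _ in 0..requests {
        *hits.entry(pool.send(&mut dns)).or_insert(0) += 1;
    }
    hits
}
```

With `distribution(None, 100)` all 100 requests land on `10.0.0.1`, because the resolver is asked exactly once. With `distribution(Some(25), 100)` the connection is retired every 25 requests and each of the four pods receives 25.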

There are two complementary places to fix this:

  1. Server side — the server pressures or forces clients to rotate (HTTP/1 Connection: close, HTTP/2 GOAWAY). I've opened a sibling issue against axum-server for that. Tonic already supports this via Server::max_connection_age.
  2. Client side — the client retires pooled connections after a bounded lifetime or request count, even if the server hasn't asked it to.

The server-side fix only works if you control the server and it has the feature. The client-side fix is the one you reach for when you're talking to a server you don't control (or one that doesn't bother). Database connection pools have had this for years; see max_lifetime in sqlx.

reqwest today exposes pool_idle_timeout and pool_max_idle_per_host, but neither of these helps if the connections are actively being used — which is exactly the case where load redistribution matters most.

Proposed API

use std::time::Duration;

let client = reqwest::Client::builder()
    // Soft cap on total connection lifetime. Once a pooled connection
    // reaches this age, it is not returned to the pool after its current
    // request completes; it is closed instead. This forces the next
    // request to that host to re-resolve DNS and establish a new
    // connection (which is the mechanism that lets k8s Service
    // round-robin route traffic to newly scaled-up pods).
    .pool_max_connection_age(Duration::from_secs(10 * 60))

    // Per-connection random jitter added to `pool_max_connection_age`.
    // Without jitter, a burst of connections established at the same
    // moment all expire at the same moment, producing a synchronized
    // reconnect storm.
    .pool_max_connection_age_jitter(Duration::from_secs(60))

    // Optional: retire a connection after it has served this many
    // requests, regardless of age. Useful when traffic is uneven —
    // an age cap alone may not rotate a heavily-used connection
    // often enough.
    .pool_max_requests_per_connection(10_000)

    .build()?;

Design notes / open questions

  • Naming. I've prefixed with pool_ to match the existing pool_idle_timeout / pool_max_idle_per_host naming.
  • Where to enforce. The natural place is the pool's checkin path: when a connection comes back to the pool, check its age and request count and drop it instead of reinserting. Repeating the age check at checkout would also catch connections that passed their limit while sitting idle. Neither check interrupts in-flight requests, and the implementation stays simple. It may belong upstream in hyper-util's pool rather than purely in reqwest; happy to file there instead / additionally if that's preferred.
  • HTTP/2 multiplexing. For HTTP/2 there's typically one connection per host carrying many concurrent streams. The age cap still works (close the connection once all streams finish, or after a grace period), but pool_max_requests_per_connection is more ambiguous — do we count streams? I'd suggest "yes, streams count as requests" and call it out in the doc, but it's worth a design decision.
  • Interaction with pool_idle_timeout. These are orthogonal: idle timeout retires unused connections, age cap retires used ones. Both should be settable independently.
  • Defaults. I'd leave all of these None by default to preserve current behavior. Opt-in only.
  • Relationship to retries / failures. Closing a connection at the pool layer should be invisible to callers — the next request just establishes a new one. No retry semantics needed.
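
As a rough sketch of the checkin-time decision described in the notes above, the following uses only the standard library. `RotationPolicy`, `ConnMeta`, `random_jitter`, and `on_checkin` are hypothetical names for illustration and are not part of reqwest or hyper-util. It assumes the jittered age limit is fixed once, when the connection is created, and it treats every checkin as one completed request.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::time::{Duration, Instant};

/// Hypothetical settings mirroring the proposed builder options.
/// `None` means "no limit", which preserves current behaviour.
#[derive(Clone, Copy, Default)]
pub struct RotationPolicy {
    pub max_age: Option<Duration>,
    pub max_age_jitter: Option<Duration>,
    pub max_requests: Option<u64>,
}

/// Per-connection bookkeeping the pool would need to keep.
pub struct ConnMeta {
    created: Instant,
    /// `max_age` plus this connection's jitter, fixed at creation.
    age_limit: Option<Duration>,
    requests: u64,
}

/// Returns a pseudo-random duration in `[0, max]` using only std:
/// each `RandomState` carries fresh random keys. A real implementation
/// would more likely use a proper RNG.
fn random_jitter(max: Duration) -> Duration {
    let max_nanos = u64::try_from(max.as_nanos()).unwrap_or(u64::MAX);
    if max_nanos == 0 {
        return Duration::ZERO;
    }
    let raw = RandomState::new().build_hasher().finish();
    Duration::from_nanos(raw % max_nanos.saturating_add(1))
}

impl ConnMeta {
    pub fn new(policy: &RotationPolicy, now: Instant) -> Self {
        let age_limit = policy.max_age.map(|age| {
            let jitter = policy.max_age_jitter.map_or(Duration::ZERO, random_jitter);
            age.saturating_add(jitter)
        });
        ConnMeta { created: now, age_limit, requests: 0 }
    }

    /// Called when a request finishes and the connection is about to be
    /// returned to the pool. Returns `true` if the connection may be
    /// reused and `false` if it should be dropped instead.
    pub fn on_checkin(&mut self, policy: &RotationPolicy, now: Instant) -> bool {
        self.requests += 1;
        let age = now.saturating_duration_since(self.created);
        let too_old = self.age_limit.map_or(false, |limit| age >= limit);
        let worn_out = policy.max_requests.map_or(false, |max| self.requests >= max);
        !(too_old || worn_out)
    }
}
```

With a 10 minute age cap and 60 seconds of jitter, each connection in this sketch gets its own limit somewhere between 600 and 660 seconds, so connections opened in the same burst do not all expire together. Because the check runs only at checkin, an in-flight request is never cut off.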
