perf: enable TLS session resumption and 0-RTT in ingress nginx #17
Evrard-Nil wants to merge 1 commit
Flip the existing ssl_session_tickets and ssl_early_data directives from off to on in both nginx config generators (setup_nginx_conf in entrypoint.sh and generate-nginx-upstream.sh). The session cache size (shared:SSL:50m) and timeout (1d) were already configured.

For repeat clients that don't keep TLS connections alive (curl, mobile, some SDKs), this eliminates ~1 RTT per reconnect: tickets give 1-RTT TLS 1.3 resumption, ssl_early_data gives 0-RTT. From a typical client at ~100ms RTT to a CPU CVM, that is ~100ms saved per cold handshake.

0-RTT replay risk is mitigated by forwarding the Early-Data header to backends in every proxied location, so cloud-api / chat-api / inference-proxy can reject Early-Data on non-idempotent methods. None of them act on this header today; follow-up audits are tracked in the PR description.

Scope is intentionally narrow: no base-image bump, no UDP/QUIC listen directives, no layout changes. There is a parallel PR adding HTTP/3 + Alt-Svc; this one only touches SSL session / ticket / early-data directives to minimize merge conflicts.

Verified: docker build succeeds; nginx -t passes for all four config-generation paths (single-target +/- rate-limit, upstream LB +/- rate-limit) using the project's own pinned base image.
Review: TLS session resumption + 0-RTT

Thanks for the thorough write-up; the risk framing in the PR description is what made this easy to review. Most of the comments below are nits or follow-up suggestions; nothing here blocks merge in my opinion.

Higher-priority concerns

1. Ticket-key rotation / forward secrecy is overstated by the inline comment. The comment says:
Two things to tighten here:
Given these are CVMs, the threat model is interesting: the worker memory is inside the TEE, so the surface is narrower than a generic VM. But it's worth either (a) softening the comment to be accurate, or (b) following up with a cron-driven `kill -HUP $(pidof nginx)` or an external `ssl_session_ticket_key` file rotated daily. Probably a separate PR.

2. Half-deployed mitigation window. The PR description acknowledges that none of the current backends inspect `Early-Data` yet, and lists the audits as separate follow-ups. That's fine as a sequencing choice, but the practical implication is: between merge and the backend-audit PRs landing, the deployed system has `ssl_early_data on` with no replay rejection at all. The mitigation hook (`Early-Data` header) is wired but no one is reading it. For `chat-api` in particular, which the PR itself flags as the highest replay risk, it might be worth landing the backend `425 Too Early` check first, then this PR, rather than the other way around. Or at minimum, link the follow-up tickets here so the sequencing is visible.

3. WebSocket upgrades + 0-RTT. The `location ~ ^/(ws|socket.io)/` block also gets `Early-Data` forwarded, but a WebSocket upgrade handshake replayed in 0-RTT data would open a duplicate connection (and potentially bypass any one-shot auth-token-in-upgrade-request flow). I do not see WS auth here, but if any backend behind this ingress uses upgrade-time tokens, it is a real consideration. RFC 8470 explicitly calls out that protocol switches should not be allowed in 0-RTT. Easiest local fix: `if ($ssl_early_data) { return 425; }` inside the WS location, since the `ssl_early_data` directive itself is only valid in `http` and `server` context and cannot be turned off per location.

Smaller things

4. gRPC mode. `PROXY_CMD=grpc` is set when `TARGET_ENDPOINT` starts with `grpc://`. All gRPC calls are POSTs over HTTP/2, and per-call idempotency is service-defined, so the same Early-Data forwarding rationale applies but backend-side mitigation is harder (most gRPC frameworks do not expose request headers to service code in a uniform way). Worth adding to the follow-up audit list if any gRPC backends sit behind this ingress.

5. The "strictly speaking idempotent" claim in the PR description. `POST /v1/chat/completions` is not idempotent per HTTP semantics: RFC 9110 §9.2.2 defines PUT, DELETE, and the safe methods as idempotent, and POST is not among them. I think you mean "logically idempotent in the application sense": the underlying inference is a pure function, but billing, audit logging, and rate-limit counters are side effects. Worth phrasing precisely in the PR description so a reviewer does not conclude it is safe to replay.

6. Verification step naming. In the test plan, "Session-ID reused" is a TLS 1.2 concept (Session ID from RFC 5246). TLS 1.3 does not have a session ID in the same sense; it uses session tickets / PSK identity. `openssl s_client -reconnect` against a TLS 1.3 server will show `Reused, TLSv1.3` (which the next bullet already calls out correctly). Minor wording fix only.

7. Empty-value header behavior: confirm this is intentional. `proxy_set_header Early-Data $ssl_early_data;` will set `Early-Data: 1` when 0-RTT is used and omit the header entirely when it is not (nginx drops empty proxy_set_header values). That matches RFC 8470 §5.1, which is the right behavior, but it is worth a one-line code comment because the alternative ("always set, value empty when not 0-RTT") would look identical in the diff and be wrong. The existing comment says what the header is for but not that it intentionally relies on nginx's empty-value-elision behavior.

Test coverage

The repo does not have an automated test path for the generated nginx configs (the only checks are `bash -n` and a manual `nginx -t` matrix per the PR description). Pre-existing limitation, not introduced here. For a TLS-config PR specifically, a regression test that:
would be cheap and prevent quiet regressions. Not in scope for this PR but worth a follow-up if these scripts keep accumulating SSL config.

Things this PR does well
Overall: I'd be comfortable merging this, with a preference for landing the `chat-api` Early-Data check first if that's possible without much extra effort. Otherwise the follow-up audit tickets should be filed before this lands so the sequencing is tracked.

🤖 Generated with Claude Code
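The regression test suggested under "Test coverage" could be approximated with a small shell check run against each generated config. This is only a sketch: the function name `check_tls_conf` is hypothetical, the specific assertions are an assumption about what such a test would cover, and comparing the number of `proxy_pass` lines with the number of `Early-Data` header lines is a heuristic, not a parser.

```shell
#!/bin/sh
# Hypothetical regression check for a generated nginx config file.
# Assumes one directive per line, as a generator would normally emit.
check_tls_conf() {
  conf=$1

  # Both resumption directives must be switched on.
  for directive in ssl_session_tickets ssl_early_data; do
    if ! grep -Eq "^[[:space:]]*${directive}[[:space:]]+on;" "$conf"; then
      echo "FAIL: ${directive} is not 'on' in ${conf}" >&2
      return 1
    fi
  done

  # Heuristic: every proxy_pass should be paired with the Early-Data header.
  # grep -c exits non-zero on a zero count, hence the '|| true'.
  passes=$(grep -Ec '^[[:space:]]*proxy_pass[[:space:]]' "$conf" || true)
  headers=$(grep -Ec '^[[:space:]]*proxy_set_header[[:space:]]+Early-Data[[:space:]]+\$ssl_early_data;' "$conf" || true)
  if [ "$passes" -ne "$headers" ]; then
    echo "FAIL: ${passes} proxy_pass vs ${headers} Early-Data headers in ${conf}" >&2
    return 1
  fi

  echo "OK: ${conf}"
}
```

It could be called once per config-generation path after the generator has written its output, alongside the existing `nginx -t` run. gRPC locations are not covered by this heuristic.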
Scope note (post-investigation): this PR affects chat-api ( |
Summary
Flip `ssl_session_tickets` and `ssl_early_data` from `off` to `on` in the two nginx config generators (`scripts/entrypoint.sh` and `scripts/generate-nginx-upstream.sh`). The session cache (`shared:SSL:50m`) and timeout (`1d`) were already in place; only tickets and 0-RTT needed enabling.
Also forward the `Early-Data` header to all proxied locations (`proxy_set_header Early-Data $ssl_early_data;`) so downstream backends can reject 0-RTT requests on non-idempotent paths if they choose.
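For orientation, the directives described above can be sketched as two heredoc helpers of the kind a shell config generator might use. The function names are hypothetical and the real `setup_nginx_conf` layout is not shown on this page; only the directive values come from the description.

```shell
#!/bin/sh
# Hypothetical helpers; the real generators may structure their output differently.

# Server-level TLS session settings. Cache size and timeout were already
# configured; tickets and early data are the two values flipped to "on".
emit_ssl_session_directives() {
  cat <<'EOF'
    ssl_session_cache shared:SSL:50m;
    ssl_session_timeout 1d;
    ssl_session_tickets on;
    ssl_early_data on;
EOF
}

# Per-location header. $ssl_early_data is "1" for a request carried in
# early data and empty otherwise, and nginx does not pass a header whose
# value is empty.
emit_early_data_header() {
  cat <<'EOF'
        proxy_set_header Early-Data $ssl_early_data;
EOF
}
```

The quoted heredoc delimiter (`'EOF'`) keeps the shell from expanding `$ssl_early_data`, so the nginx variable reaches the config file literally.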
Why
For repeat clients that do not keep their TLS connection alive (`curl`, mobile apps, some SDKs that close sockets between requests), each new request currently pays a fresh TLS handshake on top of TCP. Session tickets give 1-RTT TLS 1.3 resumption, and with `ssl_early_data` a resuming client can send the request along with the ClientHello.
From a developer machine at ~100ms RTT to cpu01, that's ~100ms saved per cold reconnect. The ingress sits on every CVM (cloud-api, chat-api, inference CVMs), so the win applies to every public endpoint behind it: `cloud-api.near.ai`, `cloud.near.ai`, `agent.near.ai`, `*.completions.near.ai`.
0-RTT replay risk
RFC 8446 §8 documents that 0-RTT data can be replayed by an attacker
who captures the early-data payload. The mitigation surface is:
- nginx forwards `Early-Data: 1` to the upstream when the request was carried in 0-RTT. Each location block in the generated configs now sets `proxy_set_header Early-Data $ssl_early_data;`.
- Backends can reject Early-Data on side-effectful methods. None of the backends currently behind this ingress check that header.
Follow-ups to track (separate PRs):
- `cloud-api` (`cloud-api.near.ai`): `POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/embeddings`, etc. Strictly speaking these are idempotent in spec, but billing and audit logging mean we'd rather not replay them. Audit needed.
- `inference-proxy`: similar; POST inference + GPU attestation.
- `chat-api` (`agent.near.ai`): POST chat / agent state mutations. Replay risk is highest here.
Until those land, the worst case for a single replay is: an attacker who captured a ciphertext within the `ssl_session_timeout` window (1d) replays it; the backend processes it again. For LLM completions and stateless inference reads, the user-observable damage is minimal: a duplicate billing event at worst. For chat-api the audit is more nuanced and is the most pressing follow-up.
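As a stopgap while the backend audits are pending, the ingress itself could refuse early data on the riskiest locations, as the review suggests for the WebSocket block. The sketch below is not part of this PR; the helper name is hypothetical and the location body is abbreviated. Per RFC 8470, a client that receives `425 Too Early` is expected to retry once the handshake has completed.

```shell
#!/bin/sh
# Hypothetical helper emitting a WebSocket location that rejects requests
# carried in TLS 1.3 early data. Not part of the PR as submitted.
emit_ws_location_guard() {
  cat <<'EOF'
    location ~ ^/(ws|socket.io)/ {
        # $ssl_early_data is "1" only while the request arrived as early
        # data and the handshake is not yet complete; empty otherwise.
        if ($ssl_early_data) {
            return 425;
        }
        # ... existing proxy_pass and Upgrade/Connection headers unchanged ...
    }
EOF
}
```

Using `return` inside `if` keeps to the narrow set of `if` usages that nginx handles predictably inside a location.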
The conservative alternative is `ssl_early_data off;` and accept the ~100ms loss per cold reconnect. I'm choosing `on` here because (a) TLS 1.3 0-RTT is widely deployed (Cloudflare, Google, AWS ALB default it on), (b) Early-Data is propagated downstream so the mitigation hook is in place, and (c) we can flip back to `off` in a one-line follow-up if any of the follow-up audits surface a real risk.
Scope
Intentionally narrow:
- Only touches `ssl_session_tickets`, `ssl_early_data`, and the `Early-Data` proxy header.
- No base-image bump, no `listen 443 quic`, no `Alt-Svc`, no layout changes.
There is a parallel HTTP/3 PR in flight that bumps the base nginx
image and adds QUIC. To minimize merge conflict, this PR steers clear
of the listen blocks and the base image. Either PR can land first; the
other rebases trivially.
Deployment
Requires rebuilding the `dstack-ingress-vpc` image and rolling it out to every CVM that runs this ingress (cpu01/cpu02 cloud-api prod+stg, agent0/agent1 CVMs, all inference CVMs). This PR does not initiate rollout; the image build job runs on merge to `main`, and the actual CVM updates go through compose-manager / cvm-compose-files as usual.
Verification
- `bash -n` clean on both modified scripts.
- `docker build .` against the pinned base image (`nginx@sha256:b6653fca…`, which is nginx 1.27.4) succeeds.
- `nginx -t` clean for all four config-generation paths, using the project's own built image: single-target and upstream LB, each with and without `RATE_LIMIT_PATHS`.
- Check on staging (`cpu01:9450`) before rolling fleet-wide: confirm `Session-ID` reused and `0-RTT` indicated via `openssl s_client -reconnect` and `curl --tls-max 1.3 -v` over two separate connections.
Test plan
- … Build & Deploy) succeeds
- `openssl s_client -connect cloud-stg-api.near.ai:443 -reconnect` shows `Reused, TLSv1.3` on second-and-later handshakes
- `curl -v https://cloud-stg-api.near.ai/health` over a fresh connection shows `Early data was accepted by the server` (or equivalent client-side indicator) on a resumed handshake
- … `POST /v1/chat/completions` via the existing infra-tests run) validated
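The resumption and early-data checks in the test plan could be scripted along these lines. This is a sketch under assumptions: `classify_handshakes` is a hypothetical helper, and the exact lines printed by `openssl s_client` can vary between OpenSSL versions, so the patterns may need adjusting.

```shell
#!/bin/sh
# Hypothetical helper: summarise `openssl s_client` output read from stdin.
# Counts full handshakes, resumed handshakes, and accepted early data.
classify_handshakes() {
  awk '
    /^New, TLSv1\.3/           { fresh++ }
    /^Reused, TLSv1\.3/        { reused++ }
    /Early data was accepted/  { early++ }
    END { printf "new=%d reused=%d early_accepted=%d\n", fresh, reused, early }
  '
}

# Example invocations (need network access; host taken from the test plan):
#
#   # Resumption: expect reused > 0 after the first handshake.
#   openssl s_client -connect cloud-stg-api.near.ai:443 -tls1_3 -reconnect \
#     </dev/null 2>&1 | classify_handshakes
#
#   # 0-RTT: save a session, then replay a request as early data.
#   (sleep 1) | openssl s_client -connect cloud-stg-api.near.ai:443 -tls1_3 \
#     -sess_out /tmp/sess.pem >/dev/null 2>&1
#   printf 'GET /health HTTP/1.1\r\nHost: cloud-stg-api.near.ai\r\nConnection: close\r\n\r\n' \
#     > /tmp/req.txt
#   openssl s_client -connect cloud-stg-api.near.ai:443 -tls1_3 \
#     -sess_in /tmp/sess.pem -early_data /tmp/req.txt 2>&1 | classify_handshakes
```

The short `sleep` before closing stdin is there because TLS 1.3 session tickets arrive after the handshake completes, so the client needs to stay connected briefly for `-sess_out` to capture one.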