perf: enable TLS session tickets + 0-RTT in cvm-ingress#7

Merged
Evrard-Nil merged 1 commit into main from feat/tls-session-tickets-0rtt
May 18, 2026

@Evrard-Nil

Contributor

Why

Saves ~1 RTT (~100ms from typical client locations to cpu01/cpu02) on every TLS reconnect to cloud-api.near.ai and cloud-stg-api.near.ai. Every non-keep-alive client currently pays a full cold TLS 1.3 handshake. Session tickets let a returning client resume with a pre-shared key, which skips the certificate exchange and signature verification, and 0-RTT lets that client send its request as early data in the first flight instead of waiting a round trip for the handshake to complete.

Same rationale as the parallel PR against the chat-api / inference vllm-ingress: nearai/dstack-ingress-vpc#17. cloud-api uses a different ingress (sidecar pattern in this repo), so this parallel PR was needed for fleet consistency.

What changed

In nginx/tls.conf.template:

  • ssl_session_tickets on; — explicit (was relying on nginx default)
  • ssl_early_data on; — was off (nginx default)
  • ssl_session_cache shared:SSL:10m → shared:SSL:50m (~40k → ~200k sessions) to match dstack-ingress-vpc#17 for fleet consistency; cost is negligible
  • proxy_set_header Early-Data $ssl_early_data; — forwards the variable so the backend can reject 0-RTT on non-idempotent requests (RFC 8470)
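
Taken together, the four directives above would sit in the TLS server block roughly as in the sketch below. This is an illustration rather than the actual contents of nginx/tls.conf.template: the server_name, certificate paths, ssl_protocols line, and upstream address are hypothetical placeholders, and the real template may structure these differently.

```nginx
# Illustrative sketch only; placeholder values are marked as such.
server {
    listen 443 ssl;
    server_name cloud-api.example;                     # placeholder

    ssl_certificate     /etc/nginx/tls/fullchain.pem;  # placeholder path
    ssl_certificate_key /etc/nginx/tls/privkey.pem;    # placeholder path

    # Assumed: early data only applies when TLSv1.3 is enabled.
    ssl_protocols TLSv1.2 TLSv1.3;

    # Directives touched by this PR.
    ssl_session_tickets on;              # now explicit
    ssl_session_cache   shared:SSL:50m;  # was shared:SSL:10m
    ssl_early_data      on;              # was off (nginx default)

    location / {
        proxy_pass http://127.0.0.1:8080;              # placeholder upstream
        # $ssl_early_data is "1" while early data is in use and the
        # handshake has not completed, and empty otherwise.
        proxy_set_header Early-Data $ssl_early_data;
    }
}
```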

No Dockerfile / base image change. Scope intentionally narrow to avoid conflicts with a parallel HTTP/3 / QUIC PR that may need to bump the base image away from debian:bookworm-slim apt nginx (1.22) to get --with-http_v3_module.

0-RTT replay risk

ssl_early_data on means 0-RTT data is replayable by an attacker who captures it (TLS 1.3 stateless tickets). Standard mitigation per RFC 8470 is for the backend to check the Early-Data: 1 header and respond with 425 Too Early on non-idempotent methods (POST, etc.).
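
For illustration only, the same RFC 8470 rule can also be expressed at the proxy layer. The sketch below is not part of this PR, which leaves the decision to the backend; the variable name $reject_early_data and the list of methods treated as non-idempotent are assumptions.

```nginx
# http context: flag requests that arrive as early data with a method
# assumed to be non-idempotent.
map "$request_method:$ssl_early_data" $reject_early_data {
    default 0;
    "~^(POST|PUT|PATCH|DELETE):1$" 1;
}

# Inside the TLS server block: answer 425 (Too Early) so that the client
# retries once the handshake has completed.
if ($reject_early_data) {
    return 425;
}
```

Rejecting at the proxy would protect the backend before it learns to check the header, at the cost of hard-coding the method list in the ingress.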

Cloud-api does not reject on this header today. This PR forwards the header but does not change backend behaviour — flagging as a follow-up in nearai/cloud-api. In the meantime, the practical exposure is bounded: TLS 1.3 0-RTT replay windows are short (ticket lifetime + clock skew), and our typical client flow is idempotent (GET /v1/models, etc.); the actually-sensitive non-idempotent endpoints (completions, etc.) are still expensive enough that an attacker replaying them gains little they couldn't get by replaying the full request post-handshake.

Conservative reviewer alternative if the above is unacceptable: leave ssl_early_data off for now (keep session tickets only) and ship 0-RTT once cloud-api rejects on the header. Recommendation is to ship both and queue the cloud-api follow-up.

Deployment

Requires rebuilding the cvm-ingress Docker image and rolling all 20 cloud-api CVMs (10 prod on cpu01/cpu02 ports 9440-9449 + 10 staging on ports 9450-9459) via the standard cvm-ansible-playbooks flow. Not initiated here — flagged as deployment dependency for whoever merges.

Verification

  • bash -n entrypoint.sh clean
  • docker build . succeeds against the pinned debian:bookworm-slim base
  • nginx -t clean against the rendered tls.conf.template (with a self-signed cert) — i.e. TLS_ENABLED=true path
  • nginx -t clean against the rendered default.conf.template — i.e. TLS_ENABLED=false path (untouched, sanity check only)

Related

Add ssl_session_tickets and ssl_early_data to the TLS server block, and
forward the Early-Data header to the backend so cloud-api can choose to
reject 0-RTT on non-idempotent requests.

Also bump ssl_session_cache from 10m (~40k sessions) to 50m (~200k) to
match the size recommended for nearai/dstack-ingress-vpc#17 and provide
headroom for the public-facing cloud-api ingress.

Saves ~1 RTT (~100ms from typical client locations) on every TLS
reconnect to cloud-api.near.ai and cloud-stg-api.near.ai.
@Evrard-Nil Evrard-Nil merged commit 869dd94 into main May 18, 2026
1 check passed