Skip to content

rst_stream select branch#2

Open
molocule wants to merge 92 commits into
mainfrom
rst-stream-fix
Open

rst_stream select branch#2
molocule wants to merge 92 commits into
mainfrom
rst-stream-fix

Conversation

@molocule

@molocule molocule commented Jun 10, 2026

Copy link
Copy Markdown

johnhurt and others added 30 commits March 20, 2026 19:02
If an h2 upstream response finishes its content-length but doesn't
actually send END_STREAM, we may continue to wait on the h2 upstream
while an h1 downstream may have already sent another request on that
connection. This could create issues where, if the EOS never arrived,
the next request would be stalled from being processed up to the read
timeout of that h2 upstream.

Co-authored-by: Edward Wang <ewang@cloudflare.com>
Co-authored-by: Andrew Hauck <ahauck@cloudflare.com>
Adds ssl_export_keying_material function to both pingora-openssl and
pingora-boringssl ext modules. This wraps the underlying SSL library's
export_keying_material method for RFC 5705 compliance.

Includes-commit: e930a5c
Replicated-from: cloudflare#729
Includes-commit: 08133c6
Replicated-from: cloudflare#795

Co-authored-by: Fei Deng <fdeng@cloudflare.com>
h2c (HTTP/2 cleartext) preface detection should only run on cleartext
TCP connections. On TLS, ALPN negotiates the protocol during the
handshake.

When h2c is enabled and both TCP and TLS listeners share a service,
preface detection runs on TLS streams too. On TLS, try_peek returns
peeked=false, leaving h2c=true unconditionally. This forces all TLS
connections into the HTTP/2 branch, breaking HTTP/1.1 clients.

Fix: check get_ssl_digest().is_some() to detect TLS and skip h2c
detection, letting the existing ALPN check decide the protocol.
---
add tests for h2c + TLS interaction

Includes-commit: 08af462
Includes-commit: ce297ac
Replicated-from: cloudflare#826

Co-authored-by: Anthony Daniel Turcios <anthonydaniel@cloudflare.com>
I noticed this downstream:

crate-audit> Crate:     protobuf
crate-audit> Version:   2.28.0
crate-audit> Title:     Crash due to uncontrolled recursion in protobuf crate
crate-audit> Date:      2024-12-12
crate-audit> ID:        ~~~-0437
crate-audit> URL:       https://rustsec.org/advisories/~~~-0437
crate-audit> Solution:  Upgrade to >=3.7.2
crate-audit> Dependency tree:
crate-audit> protobuf 2.28.0
crate-audit> └── prometheus 0.13.4
crate-audit>     └── pingora-core 0.6.0

This is already fixed upstream in `prometheus`. We just need to bump the
version here to include the fix, no further actions need to be taken.

Includes-commit: a478394
Replicated-from: cloudflare#708
Enable persisting user-defined state across HTTP/1.x keepalive requests
on the same downstream connection.
This allows adjusting the blocking thread pool in the pingora server's
runtime using configuration.
This was removed as part of the bootstrap refactor. The client guard
needs to be retained and reinit post fork.
Adds an async filter (feature-gated) to adjust upstream modules prior to
those modules (currently just compression) running.
This prevents headers like 100-continue from ending the stream and
causing hangs while the downstream is waiting.
…pgrade

When bootstrap_as_a_service is enabled, listen_fds() was called to snapshot
the fd table before BootstrapService had run, always returning None. Services
would then bind fresh sockets instead of inheriting the old process's fds,
breaking graceful upgrades.

Fix this by eagerly allocating the ListenFds table in Bootstrap::new() so it
is non-optional and already distributed to all services before bootstrap runs.
When load_fds() later receives the inherited fds from the old process, it
populates the same shared table in place, making them visible to all services
without any re-distribution.
This update introduces the abort_on_close feature to control behavior when a client closes the connection after the request body. When enabled (default), it results in a ConnectionClosed error, allowing the proxy to abort immediately. When disabled, the proxy can continue processing the upstream response.

Includes-commit: a6420f8
Replicated-from: cloudflare#836
Add fields such that callers can distinguish a successful subrequest
from one that died silently or was cut short. The handle lets callers
await post-response cleanup (cache writes, logging) before issuing the
next subrequest.
…istener port collisions

test_conn_timeout / test_conn_timeout_with_offload: Replace 192.0.2.1
(TEST-~~~) with a bound-but-not-listening local socket via the new
timeout_socket() helper in utils::for_testing. Because listen() is
never called, the kernel silently drops SYN packets, guaranteeing a
real ConnectTimedout on Linux. The total_connection_timeout tests still
use 192.0.2.1 (SEMI_BLACKHOLE) since they test error classification
and accept ConnectNoRoute as an alternative.

test_tls_psk (s2n): PskTlsServer::start() spawned a background thread
with no readiness check. Use an mpsc channel to signal after
TcpListener::bind so tests only proceed once the port is ready. Also
make the accept loop resilient to handshake failures (continue instead
of panic) so a stale probe cannot take down the server.

test_1xx_caching: mock_1xx_server used fixed ports (6151/6152) and
sleep(100ms) for readiness. Refactored to spawn_mock_1xx_server which
binds to port 0 (OS-assigned) and signals readiness via a oneshot
channel after bind. Eliminates AddrInUse from TIME_WAIT and sleep
races.

test_listen_tcp / test_listen_tcp_ipv6_only: Hardcoded ports 7100-7102
collided across parallel CI test jobs. Switch to port 0 with the new
ListenerEndpoint::local_addr() / Listener::local_addr() methods to
discover the actual bound port.
ListenFds only guards an in-memory fd table and a blocking
send_to_sock call, neither of which benefit from an async mutex.
Switch to parking_lot::Mutex and move the fd-send path in main_loop
onto the blocking thread pool via spawn_blocking.

Because the parking_lot lock cannot be held across bind().await in
ListenerEndpointBuilder::listen(), introduce a global per-address
async lock map (flurry::HashMap<String, Arc<tokio::sync::Mutex<()>>>)
that serializes the check-bind-insert sequence for each address.
This prevents two concurrent callers from racing to bind the same
address while the ListenFds lock is released.
The MSRV (1.84.0) job fails because cargo test compiles
dev-dependencies. A transitive dev-dependency chain
(pingora-proxy -> tokio-tungstenite -> tungstenite -> sha1 ->
cpufeatures v0.3.0) pulls in a crate that uses edition 2024,
which Cargo 1.84.0 cannot parse.

Run cargo check --workspace for all toolchains and skip
cargo test on the MSRV.
Add BodyWriter task API (send_body_task, write_current_body_task,
send_finish_task, write_current_finish_task) and HeaderWriter for
cancel-safe writes that can be used in tokio::select! loops.
Using the proxy task API allows polling for the upstream rx task at the
same time, so that upstream cache writes can continue even while serving
downstream. proxy_h2 and h2 downstream (as well as custom) is a todo.
andrewhavck and others added 24 commits June 3, 2026 10:59
HttpHealthCheck::check() sets read_timeout on the session before reading
the response, but never applies write_timeout before the write sequence
(write_request_header, finish_request_body, finish_custom). When using a
custom protocol connector whose finish methods await internal handshake
signals, the absence of a write timeout allows those awaits to block
indefinitely. A single stuck backend can wedge the entire health check
batch, causing stale health state and misrouted traffic.
A single failing shard no longer aborts the whole loop. Missing shard
files on load are treated as empty rather than an error.
Add a PreTlsProcess trait and set_pre_tls_callback() method that allows
applications to process raw bytes before the TLS handshake occurs.

This is useful for protocols like HAProxy's PROXY protocol, which sends
client address information before TLS. The callback reads and consumes
protocol headers, then updates the socket digest with the real client
address.
---
Add shutdown_with_reason for H2 streams

Allow sending RST_STREAM with a custom reason code instead of hardcoded
INTERNAL_ERROR. This enables RFC 7540 §9.1.2 compliant 421 responses where
HTTP_1_1_REQUIRED can signal clients to retry over HTTP/1.1.

Fixes cloudflare#787
---
Merge pull request #1 from jaw-sh/h2-custom-reset-reason

Add shutdown_with_reason for H2 streams
---
Make Stream::rewind() public for protocol detection

PreTlsProcess implementations need to put data back onto the stream
when it doesn't match the expected protocol signature. This lets the
PROXY protocol handler rewind non-PROXY data so TLS proceeds normally.
---
move pre-TLS callback inside TLS branch

Co-authored-by: Josh <jcmoon@pm.me>
Includes-commit: 1bd3c47
Includes-commit: 33465f3
Includes-commit: 58e68a6
Includes-commit: 9aa73fb
Includes-commit: b3d1960
Replicated-from: cloudflare#799
---
Move docs from v0.3 to 0.7
---
Update version to 0.8

Includes-commit: 0a643a3
Includes-commit: 5f7838c
Includes-commit: 6ffee89
Replicated-from: cloudflare#828
Adds ssl_export_keying_material function to pingora-s2n ext module,
wrapping s2n-tls's built-in tls_exporter method for RFC 5705.

Includes-commit: 839e122
Replicated-from: cloudflare#744
Use tokio::io::AsyncWriteExt::write_all_buf directly at every call site,
including inside hand-written poll_* state machines by stack-pinning a
fresh future per poll. Since WriteAllBuf is stateless (all progress
lives in the caller-owned Buf), re-creating it on each poll is sound.

Delete the AsyncWriteVec trait, its blanket impl, the WriteVec /
WriteVecAll future structs, and the poll_write_all_buf /
poll_write_vec_all_buf free functions from stream.rs.

Behavioral note: tokio's write_all_buf will use writev() when the
underlying writer reports is_write_vectored() — true for RawStream — so
the call sites that previously used the explicitly-non-vectored
poll_write_all_buf (content-length body, until-close body, header
writes) now use vectored writes. The wire bytes are identical; only
the syscall path differs. Worth scrutinizing if we see new downstream
response write perf anomalies.

Add unit tests in body::test_poll_body_writer that drive
BodyWriter::poll_write_current_body_task and
BodyWriter::poll_write_current_finish_task directly via
futures::task::noop_waker with an in-memory MockWriter. No tokio
runtime is involved, so the tests are cheap and exercise the
cancel-safe restart paths explicitly.

Coverage:
- Content-length body, single happy write
- Content-length body, resume after Pending + short write
- Chunked body, vectored writer (3-chunk WriteBuf::Chained)
- Chunked body, short-write resume on first inner chunk
- Until-close body, single happy write
- Finish chunked, '0\r\n\r\n' terminator
…al to avoid stale entries in LRU, address other outstanding TODOs
Add opt-in HTTP/1.1 pipelining via HttpSession::set_pipelining_enabled().
When enabled, pipelined requests on a keep-alive connection are served
sequentially in request order per RFC 9112 §9.3.2, matching nginx
behavior on the same traffic shape. Default (off) is unchanged: the
second pipelined request is dropped or surfaced as a 400 by the
body-pump idle branch.

Non-adopters are untouched: ServerSession::finish() and the H1
HttpSession::reuse() keep their pre-pipelining Result<Option<Stream>>
signatures and discard any captured pipelined prefix. Adopters call
finish_reuse() / reuse_pipelined() to receive ReusedHttpConnection.

Covers both wire shapes: same-segment overread (both requests arrive
in one read) via reuse_pipelined(), and two-segment overread (request
N+1 arrives while request N's response is still being written) via
read_body_or_idle()'s idle branch stashing non-zero reads into the
body reader's overread surface. abort_on_close / half_closed are
untouched so FIN handling is unchanged.

HttpPersistentSettings carries pipelining_enabled + pipelined_prefix
across keep-alive reuses; read_request() consumes the prefix first.

Resolves cloudflare#377, cloudflare#673
---
Add pipelining proxy example

Minimal ProxyHttp that opts in via set_pipelining_enabled(true) in
early_request_filter. Matches the reproducer shape from cloudflare#377: pipelined
GETs on one connection now both return responses.
---
Make HTTP/1 reuse pipelining-aware by default

Collapse `Session::finish()` and `HttpSession::reuse()` into a single
pipelining-aware API so prefix bytes cannot be silently dropped. Rename
`ReusedHttpConnection` to `ReusableHttpStream` to match the reusable to
reused lifecycle, simplify the prefix `buf.reserve()` in `read_request()`,
and tighten the `read_body_or_idle()` comment so it no longer implies
`abort_on_close` can still fire after pipelined bytes are stashed.

Includes-commit: 1cfca92
Includes-commit: a1527cc
Includes-commit: ae26ae0
Replicated-from: cloudflare#876
Fixed formatting

Fixed clippy errors

Includes-commit: 72c03b3
Includes-commit: 92d8050
Includes-commit: b3de911
Replicated-from: cloudflare#817
@molocule molocule changed the title add rst_stream select branch Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.