rst_stream select branch#2
Open
molocule wants to merge 92 commits into
Open
Conversation
If an h2 upstream response finishes its content-length but doesn't actually send END_STREAM, we may continue to wait on the h2 upstream while an h1 downstream may have already sent another request on that connection. This could create issues where, if the EOS never arrived, the next request would be stalled from being processed up to the read timeout of that h2 upstream. Co-authored-by: Edward Wang <ewang@cloudflare.com>
Co-authored-by: Andrew Hauck <ahauck@cloudflare.com>
Adds ssl_export_keying_material function to both pingora-openssl and pingora-boringssl ext modules. This wraps the underlying SSL library's export_keying_material method for RFC 5705 compliance. Includes-commit: e930a5c Replicated-from: cloudflare#729
Includes-commit: 08133c6 Replicated-from: cloudflare#795 Co-authored-by: Fei Deng <fdeng@cloudflare.com>
h2c (HTTP/2 cleartext) preface detection should only run on cleartext TCP connections. On TLS, ALPN negotiates the protocol during the handshake. When h2c is enabled and both TCP and TLS listeners share a service, preface detection runs on TLS streams too. On TLS, try_peek returns peeked=false, leaving h2c=true unconditionally. This forces all TLS connections into the HTTP/2 branch, breaking HTTP/1.1 clients. Fix: check get_ssl_digest().is_some() to detect TLS and skip h2c detection, letting the existing ALPN check decide the protocol. --- add tests for h2c + TLS interaction Includes-commit: 08af462 Includes-commit: ce297ac Replicated-from: cloudflare#826 Co-authored-by: Anthony Daniel Turcios <anthonydaniel@cloudflare.com>
I noticed this downstream: crate-audit> Crate: protobuf crate-audit> Version: 2.28.0 crate-audit> Title: Crash due to uncontrolled recursion in protobuf crate crate-audit> Date: 2024-12-12 crate-audit> ID: ~~~-0437 crate-audit> URL: https://rustsec.org/advisories/~~~-0437 crate-audit> Solution: Upgrade to >=3.7.2 crate-audit> Dependency tree: crate-audit> protobuf 2.28.0 crate-audit> └── prometheus 0.13.4 crate-audit> └── pingora-core 0.6.0 This is already fixed upstream in `prometheus`. We just need to bump the version here to include the fix, no further actions need to be taken. Includes-commit: a478394 Replicated-from: cloudflare#708
Includes-commit: 15b962b Replicated-from: cloudflare#612
Enable persisting user-defined state across HTTP/1.x keepalive requests on the same downstream connection.
This allows adjusting the blocking thread pool in the pingora server's runtime using configuration.
This was removed as part of the bootstrap refactor. The client guard needs to be retained and reinit post fork.
Adds an async filter (feature-gated) to adjust upstream modules prior to those modules (currently just compression) running.
This prevents headers like 100-continue from ending the stream and causing hangs while the downstream is waiting.
…pgrade When bootstrap_as_a_service is enabled, listen_fds() was called to snapshot the fd table before BootstrapService had run, always returning None. Services would then bind fresh sockets instead of inheriting the old process's fds, breaking graceful upgrades. Fix this by eagerly allocating the ListenFds table in Bootstrap::new() so it is non-optional and already distributed to all services before bootstrap runs. When load_fds() later receives the inherited fds from the old process, it populates the same shared table in place, making them visible to all services without any re-distribution.
This update introduces the abort_on_close feature to control behavior when a client closes the connection after the request body. When enabled (default), it results in a ConnectionClosed error, allowing the proxy to abort immediately. When disabled, the proxy can continue processing the upstream response. Includes-commit: a6420f8 Replicated-from: cloudflare#836
Add fields such that callers can distinguish a successful subrequest from one that died silently or was cut short. The handle lets callers await post-response cleanup (cache writes, logging) before issuing the next subrequest.
…istener port collisions test_conn_timeout / test_conn_timeout_with_offload: Replace 192.0.2.1 (TEST-~~~) with a bound-but-not-listening local socket via the new timeout_socket() helper in utils::for_testing. Because listen() is never called, the kernel silently drops SYN packets, guaranteeing a real ConnectTimedout on Linux. The total_connection_timeout tests still use 192.0.2.1 (SEMI_BLACKHOLE) since they test error classification and accept ConnectNoRoute as an alternative. test_tls_psk (s2n): PskTlsServer::start() spawned a background thread with no readiness check. Use an mpsc channel to signal after TcpListener::bind so tests only proceed once the port is ready. Also make the accept loop resilient to handshake failures (continue instead of panic) so a stale probe cannot take down the server. test_1xx_caching: mock_1xx_server used fixed ports (6151/6152) and sleep(100ms) for readiness. Refactored to spawn_mock_1xx_server which binds to port 0 (OS-assigned) and signals readiness via a oneshot channel after bind. Eliminates AddrInUse from TIME_WAIT and sleep races. test_listen_tcp / test_listen_tcp_ipv6_only: Hardcoded ports 7100-7102 collided across parallel CI test jobs. Switch to port 0 with the new ListenerEndpoint::local_addr() / Listener::local_addr() methods to discover the actual bound port.
ListenFds only guards an in-memory fd table and a blocking send_to_sock call, neither of which benefit from an async mutex. Switch to parking_lot::Mutex and move the fd-send path in main_loop onto the blocking thread pool via spawn_blocking. Because the parking_lot lock cannot be held across bind().await in ListenerEndpointBuilder::listen(), introduce a global per-address async lock map (flurry::HashMap<String, Arc<tokio::sync::Mutex<()>>>) that serializes the check-bind-insert sequence for each address. This prevents two concurrent callers from racing to bind the same address while the ListenFds lock is released.
The MSRV (1.84.0) job fails because cargo test compiles dev-dependencies. A transitive dev-dependency chain (pingora-proxy -> tokio-tungstenite -> tungstenite -> sha1 -> cpufeatures v0.3.0) pulls in a crate that uses edition 2024, which Cargo 1.84.0 cannot parse. Run cargo check --workspace for all toolchains and skip cargo test on the MSRV.
…ng graceful upgrades
Add BodyWriter task API (send_body_task, write_current_body_task, send_finish_task, write_current_finish_task) and HeaderWriter for cancel-safe writes that can be used in tokio::select! loops.
Using the proxy task API allows polling for the upstream rx task at the same time, so that upstream cache writes can continue even while serving downstream. proxy_h2 and h2 downstream (as well as custom) is a todo.
HttpHealthCheck::check() sets read_timeout on the session before reading the response, but never applies write_timeout before the write sequence (write_request_header, finish_request_body, finish_custom). When using a custom protocol connector whose finish methods await internal handshake signals, the absence of a write timeout allows those awaits to block indefinitely. A single stuck backend can wedge the entire health check batch, causing stale health state and misrouted traffic.
A single failing shard no longer aborts the whole loop. Missing shard files on load are treated as empty rather than an error.
Includes-commit: 10041fa Replicated-from: cloudflare#845
Add a PreTlsProcess trait and set_pre_tls_callback() method that allows applications to process raw bytes before the TLS handshake occurs. This is useful for protocols like HAProxy's PROXY protocol, which sends client address information before TLS. The callback reads and consumes protocol headers, then updates the socket digest with the real client address. --- Add shutdown_with_reason for H2 streams Allow sending RST_STREAM with a custom reason code instead of hardcoded INTERNAL_ERROR. This enables RFC 7540 §9.1.2 compliant 421 responses where HTTP_1_1_REQUIRED can signal clients to retry over HTTP/1.1. Fixes cloudflare#787 --- Merge pull request #1 from jaw-sh/h2-custom-reset-reason Add shutdown_with_reason for H2 streams --- Make Stream::rewind() public for protocol detection PreTlsProcess implementations need to put data back onto the stream when it doesn't match the expected protocol signature. This lets the PROXY protocol handler rewind non-PROXY data so TLS proceeds normally. --- move pre-TLS callback inside TLS branch Co-authored-by: Josh <jcmoon@pm.me> Includes-commit: 1bd3c47 Includes-commit: 33465f3 Includes-commit: 58e68a6 Includes-commit: 9aa73fb Includes-commit: b3d1960 Replicated-from: cloudflare#799
--- Move docs from v0.3 to 0.7 --- Update version to 0.8 Includes-commit: 0a643a3 Includes-commit: 5f7838c Includes-commit: 6ffee89 Replicated-from: cloudflare#828
Adds ssl_export_keying_material function to pingora-s2n ext module, wrapping s2n-tls's built-in tls_exporter method for RFC 5705. Includes-commit: 839e122 Replicated-from: cloudflare#744
Use tokio::io::AsyncWriteExt::write_all_buf directly at every call site, including inside hand-written poll_* state machines by stack-pinning a fresh future per poll. Since WriteAllBuf is stateless (all progress lives in the caller-owned Buf), re-creating it on each poll is sound. Delete the AsyncWriteVec trait, its blanket impl, the WriteVec / WriteVecAll future structs, and the poll_write_all_buf / poll_write_vec_all_buf free functions from stream.rs. Behavioral note: tokio's write_all_buf will use writev() when the underlying writer reports is_write_vectored() — true for RawStream — so the call sites that previously used the explicitly-non-vectored poll_write_all_buf (content-length body, until-close body, header writes) now use vectored writes. The wire bytes are identical; only the syscall path differs. Worth scrutinizing if we see new downstream response write perf anomalies. Add unit tests in body::test_poll_body_writer that drive BodyWriter::poll_write_current_body_task and BodyWriter::poll_write_current_finish_task directly via futures::task::noop_waker with an in-memory MockWriter. No tokio runtime is involved, so the tests are cheap and exercise the cancel-safe restart paths explicitly. Coverage: - Content-length body, single happy write - Content-length body, resume after Pending + short write - Chunked body, vectored writer (3-chunk WriteBuf::Chained) - Chunked body, short-write resume on first inner chunk - Until-close body, single happy write - Finish chunked, '0\r\n\r\n' terminator
…al to avoid stale entries in LRU, address other outstanding TODOs
Add opt-in HTTP/1.1 pipelining via HttpSession::set_pipelining_enabled(). When enabled, pipelined requests on a keep-alive connection are served sequentially in request order per RFC 9112 §9.3.2, matching nginx behavior on the same traffic shape. Default (off) is unchanged: the second pipelined request is dropped or surfaced as a 400 by the body-pump idle branch. Non-adopters are untouched: ServerSession::finish() and the H1 HttpSession::reuse() keep their pre-pipelining Result<Option<Stream>> signatures and discard any captured pipelined prefix. Adopters call finish_reuse() / reuse_pipelined() to receive ReusedHttpConnection. Covers both wire shapes: same-segment overread (both requests arrive in one read) via reuse_pipelined(), and two-segment overread (request N+1 arrives while request N's response is still being written) via read_body_or_idle()'s idle branch stashing non-zero reads into the body reader's overread surface. abort_on_close / half_closed are untouched so FIN handling is unchanged. HttpPersistentSettings carries pipelining_enabled + pipelined_prefix across keep-alive reuses; read_request() consumes the prefix first. Resolves cloudflare#377, cloudflare#673 --- Add pipelining proxy example Minimal ProxyHttp that opts in via set_pipelining_enabled(true) in early_request_filter. Matches the reproducer shape from cloudflare#377: pipelined GETs on one connection now both return responses. --- Make HTTP/1 reuse pipelining-aware by default Collapse `Session::finish()` and `HttpSession::reuse()` into a single pipelining-aware API so prefix bytes cannot be silently dropped. Rename `ReusedHttpConnection` to `ReusableHttpStream` to match the reusable to reused lifecycle, simplify the prefix `buf.reserve()` in `read_request()`, and tighten the `read_body_or_idle()` comment so it no longer implies `abort_on_close` can still fire after pipelined bytes are stashed. Includes-commit: 1cfca92 Includes-commit: a1527cc Includes-commit: ae26ae0 Replicated-from: cloudflare#876
Fixed formatting Fixed clippy errors Includes-commit: 72c03b3 Includes-commit: 92d8050 Includes-commit: b3de911 Replicated-from: cloudflare#817
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
details here: https://www.notion.so/modal-com/H2-Investigation-3531e7f16949805daa0dcff33397fa40?source=copy_link#37a1e7f1694980068b3ad0bb99b8dbbe