Enable HTTP/2 support in kube-client by frobware · Pull Request #1823 · kube-rs/kube

frobware · 2025-09-11T06:54:26Z

This change enables HTTP/2 support in the client, allowing multiple requests to multiplex over a single connection. The implementation simply adds the http2 feature flag to hyper and its related dependencies, then enables HTTP/2 alongside HTTP/1 in the rustls connector configuration.

I noticed that I had 400+ open connections to the API server when streaming logs from all pods in my cluster, each requiring its own TCP handshake and TLS negotiation. With HTTP/2 enabled, I now only have a single connection handling all those streams, eliminating the overhead and avoiding potential resource exhaustion from hitting connection limits.

clux · 2025-09-11T10:07:00Z

wow, really?! i wasn't aware that the apiserver supported http2 for this, but sure enough, there are flags for it.

I do see it gives us some UpgradeConnection(ProtocolSwitch(500)) errors related to http upgrades for websocket related (exec/attach) stuff though 🤔

frobware · 2025-09-11T14:33:02Z

I do see it gives us some UpgradeConnection(ProtocolSwitch(500)) errors related to http upgrades for websocket related (exec/attach) stuff though 🤔

I'll take a look over the regressions, but it won't be immediately.

clux · 2025-09-11T16:09:15Z

Appreciated. No rush.

From a quick googling around, this comment in client-go possibly implies some mutual exclusivity in usage, and perhaps we need to set Request::version_mut to override back to 1.1 for websocket stuff in kube-core's Request constructors. I could be wrong though.

clux · 2026-01-21T15:45:26Z

hey there. do you think you'll have time to come back at this at some point? i can open an issue for someone else to tackle if not.

frobware · 2026-01-24T07:41:21Z

hey there. do you think you'll have time to come back at this at some point? i can open an issue for someone else to tackle if not.

I’d like to say “yes”, but realistically it may be 1 to 2 months before I can get back to this.

For context: this work matters to me because I’ve been building a tool called kat (tailing Kubernetes pod logs). Its default behaviour is to tail all containers across all pods and namespaces, and without connection reuse (HTTP/2) it’s easy to run into file-descriptor limits on some clusters I connect to. So I definitely care about getting HTTP/2 right here - I’m just not sure when I’ll be able to actively work on it again.

If someone else is able to pick this up in the meantime, that would be hugely appreciated. If not, I may be able to return to it towards late March / April.

clux · 2026-01-29T18:47:39Z

Thanks for the response. I have opened #1923 for now to let people experiment if the interest in there.

uchiha-vivek · 2026-01-31T22:33:20Z

@clux I will try to explore it after going through all the above comments.
Will brainstorm in #1923

frobware · 2026-04-10T13:31:46Z

I hope to start looking at this again in the next few weeks. 🤞

Add http2 feature to hyper, hyper-util, and hyper-rustls to enable HTTP/2 ALPN support in the kube-client. This complements the enable_http2() API call to provide full HTTP/2 multiplexing capability for Kubernetes API connections. Signed-off-by: Andrew McDermott <aim@frobware.com>

This change adds HTTP/2 ALPN support to the rustls HTTPS connector by calling enable_http2() in addition to enable_http1(). This enables HTTP/2 multiplexing for Kubernetes API connections, reducing the number of TCP connections from potentially hundreds to a single multiplexed connection per API server. Previously, kube-rs only supported HTTP/1.1 connections, resulting in a new TCP connection for each concurrent API request. With HTTP/2 enabled, multiple requests can be multiplexed over a single connection, improving performance and reducing resource usage. Signed-off-by: Andrew McDermott <aim@frobware.com>

frobware · 2026-04-30T21:35:19Z

Sorry for the silence here -- finally came back to it. The branch is rebased on main, and I've pushed a stab at the routing split that #1923 hinted at (commit c64e575).

The shape is two transports inside Client. The primary one is h2-capable and carries TokioTimer plus HTTP/2 keep-alive PINGs (interval 30s, while-idle on), which is the bit that should actually help the HAProxy / 35s idle-timeout case. The secondary one is HTTP/1.1-only and is wired into Client::connect(), so exec/attach/port-forward stop being attempted over h2 -- that's what was producing the UpgradeConnection(ProtocolSwitch(...)) failures the integration job was hitting.

One thing that bit me along the way: hyper-rustls' enable_http1()-only builder leaves alpn_protocols empty, which silently lets the server pick h2 anyway. The h1 sibling has to advertise http/1.1 explicitly. Rather than push for an upstream change in hyper-rustls (and to keep Config::tls_server_name honoured on the upgrade path), I ended up shipping a small H1OnlyHttpsConnector in kube-client/src/client/tls.rs that mirrors hyper-rustls' TCP-then-TLS dance over public types. tokio-rustls becomes a direct dep gated on rustls-tls (already in the dep tree transitively).

openssl ALPN parity I've left for a follow-up -- openssl was already h1-in-practice (no ALPN at all), so it's not touched here.

One breaking change worth calling out: Config gains a public disable_http2: bool runtime escape hatch (default false, so HTTP/2 stays on). It matches the existing disable_compression style, but it's an exhaustive-struct-literal break for downstream code that constructs Config by hand. I don't have any desire to break the API -- I just wanted to see if this approach could pass CI. If reviewers would rather, I'm happy to move the flag onto ClientBuilder instead -- that's genuinely non-breaking, it just doesn't thread through Config::infer cleanly.

Important caveat: this has had genuinely minimal exposure. It is largely a stab in the dark to see if I could get CI green. Locally I've run the integration suite, the upgrade-path examples (pod_exec, pod_attach, pod_portforward*, pod_shell*, log_stream), and a cargo hack feature powerset -- all clean. But I haven't even pointed my own kat log streamer at it (the thing that made me care about h2 in the first place), and nothing here exercises the long-running watch behind HAProxy from #1923.

The previous commit on this branch enabled HTTP/2 on the rustls connector by switching ALPN from `enable_http1()` alone to `enable_http1().enable_http2()`. REST traffic, watch streams and log streaming benefit from HTTP/2 multiplexing, but exec, attach and port-forward use HTTP/1.1 connection upgrades, which are unrepresentable on an HTTP/2 connection. After ALPN negotiates `h2`, the apiserver answered upgrade requests with 4xx/5xx and the integration suite started failing with `UpgradeConnection(ProtocolSwitch(...))`. Fix this by giving `Client` two transports with separate connection pools and ALPN policies: * a primary, h2-capable transport for normal traffic, configured with `TokioTimer` and HTTP/2 keep-alive PINGs (interval 30s, while-idle on) so watch streams survive idle-killing intermediaries such as HAProxy; * an HTTP/1.1-only transport used by `Client::connect()`, the only caller of `hyper::upgrade::on` in the workspace, hard-routed there via a new `upgrade_inner` field on `Client`. A subtle point governs the h1-only path on rustls. hyper-rustls' builder asserts that `alpn_protocols.is_empty()` when accepting a rustls config, and only `enable_http2()` populates the list. The `enable_http1()`-only builder path therefore leaves ALPN empty, which means no ALPN extension is sent on the wire and a modern apiserver may still negotiate HTTP/2 -- the very condition we are trying to avoid. The h1-only sibling must advertise `http/1.1` *explicitly*. The natural workaround -- `HttpsConnector::from((http, Arc::new(rustls_config)))` with `alpn_protocols = vec![b"http/1.1".to_vec()]` -- bypasses the builder's assertion but also drops `Config::tls_server_name`, because the `From` impl constructs the connector with the default server-name resolver and the resolver field is private. Rather than depend on an upstream hyper-rustls change, ship a small `H1OnlyHttpsConnector<H>` in `kube-client/src/client/tls.rs` that mirrors the public TCP-then-TLS dance from `hyper_rustls::HttpsConnector::call`, sets the explicit ALPN advertisement, and resolves SNI via `Config::tls_server_name` when set or from the URI host otherwise. This adds `tokio-rustls` as a direct dep gated on `rustls-tls` (already in the dep tree transitively via hyper-rustls). Other invariants worth preserving: * `Config::auth_layer()` and `Config::extra_headers_layer()` are computed once and *cloned* into both transport stacks. Calling `auth_layer()` twice would mint independent `RefreshableToken` state, so each transport would refresh tokens on its own and could diverge under exec-plugin or token-file auth. `AuthLayer` gains `#[derive(Clone)]` for this; the inner `Either` was already trivially cloneable. * `Config` gains a public `disable_http2: bool` field as a runtime escape hatch. When set, the primary transport falls back to the h1-only client; the two-client shape stays the same so the upgrade path is unaffected. HTTP/2 stays on by default; the field matches the existing `disable_compression` style. * `ClientBuilder` grows `with_upgrade_service`, paired with a new `Client::new_with_upgrade` constructor for custom-service users who supply their own service stack and need to opt in to the split. The single-service `Client::new` keeps its signature; it internally clones the buffered handle into `upgrade_inner` so back-compat is preserved. Verified against k3d v1.34.1 with the integration suite, the `pod_exec`, `pod_attach`, `pod_portforward*`, `pod_shell*` and `log_stream` examples, and a full 1200-combo `cargo hack` feature powerset. Signed-off-by: Andrew McDermott <aim@frobware.com>

codecov · 2026-05-01T09:30:19Z

Codecov Report

❌ Patch coverage is 75.15924% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.5%. Comparing base (819d08a) to head (59ea267).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
kube-client/src/client/builder.rs	77.8%	14 Missing ⚠️
kube-client/src/client/tls.rs	72.1%	12 Missing ⚠️
kube-client/src/client/config_ext.rs	57.2%	9 Missing ⚠️
kube-client/src/client/mod.rs	86.7%	4 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #1823     +/-   ##
=======================================
+ Coverage   76.5%   76.5%   +0.1%     
=======================================
  Files         89      89             
  Lines       8658    8772    +114     
=======================================
+ Hits        6616    6704     +88     
- Misses      2042    2068     +26

Files with missing lines	Coverage Δ
kube-client/src/client/middleware/mod.rs	`93.4% <ø> (ø)`
kube-client/src/config/mod.rs	`54.6% <ø> (ø)`
kube-client/src/client/mod.rs	`79.5% <86.7%> (+2.0%)`	⬆️
kube-client/src/client/config_ext.rs	`54.1% <57.2%> (+1.8%)`	⬆️
kube-client/src/client/tls.rs	`82.6% <72.1%> (-3.4%)`	⬇️
kube-client/src/client/builder.rs	`71.5% <77.8%> (+1.8%)`	⬆️

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

frobware · 2026-05-01T09:41:25Z

Pointed kat (built against this branch) at an OpenShift apiserver overnight, about 10 hours. kat is the log streamer this PR is partly aimed at. Around 890k log lines streamed in the window. Connection count was one at start of the run, five when I next looked this morning, and three a few minutes later -- not a continuous trace, but the working set is clearly small and multiplexed. One of the connections from the morning snapshot has stayed up across the whole period (same source port from start to finish). Pre-PR I'd have been opening a fresh TCP+TLS handshake per pod-container as they churned, which was the original itch.

Worth being precise about the topology. There are two HAProxies in the cluster but only one is in kat's data path: the external lbx-ocp-api LB does TCP pass-through with SNI-based routing across the three masters, and TLS/ALPN terminate on the apiserver itself. The relevant config:

defaults
  log global
  option dontlognull
  timeout connect 10s
  timeout client 1h
  timeout server 1h
  timeout tunnel 1h

frontend fe_kubeapi
  mode tcp
  bind :6443
  tcp-request inspect-delay 5s
  tcp-request content accept if { req.ssl_hello_type 1 }
  # SNI-based routing to per-cluster backends

So this run doesn't exercise an L7 HAProxy with an aggressive timeout client -- the in-cluster OpenShift router does have those (timeout client 30s, mode http) but it fronts application Routes, not the apiserver, so kat doesn't transit it. The lukasstockner / #1923 case still hasn't been exercised end-to-end.

It's a useful data point that the multiplexing savings hold for hours of real traffic, with no errors or stalls. For an extra poke at robustness I SIGSTOP'd the process for a while to let the kernel buffer up, then fg'd it; hyper drained the backlog and carried on without complaint.

A separate thing worth discussing: the keep-alive interval is currently hard-coded to 30s in build_main_service. Against a 30s timeout client the PING has no margin; against the 1h timeouts in my LB it's wasteful chatter. There's probably a case for making it a knob on Config (default 30s, None to disable), same shape as disable_http2 -- but happy to defer that conversation for the moment.

frobware · 2026-05-01T09:52:15Z

A bit of belt-and-braces while I'm here -- corroborating the apiserver's protocol behaviour with curl, independent of any kube-rs code, so the assumptions the PR rests on are visible. Two probes against the same external HAProxy LB: once with the per-cluster apiserver hostname (the path kat actually takes) and once with the LB's own hostname (to demonstrate the SNI policy in the rejection direction).

Probe 1: per-cluster apiserver via the LB

curl -sk <opts> -o /dev/null -w '%{http_version} %{http_code}' https://api.ocp421.int.frobware.com:6443/version

Curl options	What curl offers in ALPN	Expectation	Observed
`--http2`	`h2,http/1.1`	server picks `h2`	`http_version=2 http_code=200`
`--http1.1`	`http/1.1`	server picks `http/1.1`	`http_version=1.1 http_code=200`
`--http2-prior-knowledge`	`h2`	server picks `h2`	`http_version=2 http_code=200`

These three rows mirror the two ALPN advertisements this PR makes (h2,http/1.1 on the main transport, http/1.1 on the upgrade transport) plus the h2-only case for completeness. The apiserver picks h2 when offered, falls back to h1 cleanly when only h1 is offered. That is the whole mechanism the dual-transport split depends on.

Verbose for the --http2 row to show ALPN explicitly:

* ALPN: curl offers h2,http/1.1
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / X25519MLKEM768 / RSASSA-PSS
* ALPN: server accepted h2
* using HTTP/2
> GET /version HTTP/2
> Host: api.ocp421.int.frobware.com:6443
< HTTP/2 200
< audit-id: ...
< content-type: application/json

Probe 2: the LB hostname directly (negative test)

curl -sk --http2 https://lbx-ocp-api.int.frobware.com:6443/version

Curl options	Expectation	Observed
`--http2`	TLS handshake rejected; SNI not in routing map	`unexpected eof while reading` (TLS `decode error`)
`--http1.1`	same	`http_version=0 http_code=000`
`--http2-prior-knowledge`	same	`http_version=0 http_code=000`

Verbose for --http2:

*   Trying 192.168.7.43:6443...
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (OUT), TLS alert, decode error (562):
* TLS connect error: error:0A000126:SSL routines::unexpected eof while reading
* closing connection #0

The HAProxy SNI map only forwards traffic for known per-cluster SNIs (api.ocp420.*, api.ocp421.*, api.ocp422.*); anything else hits bk_reject_tcp. So the LB enforces SNI policy at TCP/TLS, before any HTTP-level negotiation has happened. Worth knowing: if a kube-rs caller against this kind of LB sets the wrong tls_server_name (or none, when one is required), the failure is at TLS handshake, not at the HTTP layer -- which is exactly where the in-tree H1OnlyHttpsConnector honours Config::tls_server_name.

Signed-off-by: Eirik A <sszynrae@gmail.com>

clux · 2026-05-05T13:42:01Z

Thanks a lot for this. I think it looks very reasonable at a high level view. Will have to dig in a bit more on some of these bits (as time allows).

One breaking change worth calling out: Config gains a public disable_http2: bool runtime escape hatch (default false, so HTTP/2 stays on). It matches the existing disable_compression style, but it's an exhaustive-struct-literal break for downstream code that constructs Config by hand. I don't have any desire to break the API -- I just wanted to see if this approach could pass CI. If reviewers would rather, I'm happy to move the flag onto ClientBuilder instead -- that's genuinely non-breaking, it just doesn't thread through Config::infer cleanly.

Not a problem. A #[non_exhaustive] attr was added to Config for 4.0 anyway in 819d08a. Which is helpful now because I expect we might not have enough time to test this properly for 4.0, but if this is the only breaking change, we can in theory add it in 4.1 if it's deemed safe.

Have resolved some tiny Cargo.toml conflicts on your branch as a result of 819d08a. (feel free to over smash over it, but this stuff is squash-merged at the end anyway)

clux

quick comment on timeouts

clux · 2026-05-05T12:03:47Z

+    // transports. Calling `auth_layer()` twice would mint independent
+    // `RefreshableToken` state per transport, so each path would refresh
+    // tokens on its own and they'd diverge under exec-plugin or token-file
+    // auth.


really appreciate these detailed comments here.

helps the complexity go down a little more easily.

clux · 2026-05-05T12:12:32Z

+    let mut connector = TimeoutConnector::new(connector);
+    connector.set_connect_timeout(config.connect_timeout);
+    connector.set_read_timeout(config.read_timeout);
+    connector.set_write_timeout(config.write_timeout);


The write timeout here is default 295s which is a number very specific to long watches. I am not sure it makes sense for websocket calls. But I don't have a good recommendation either yet.

frobware · 2026-05-05T13:44:23Z

I started on

commit d9f74781aec98413f32df48e6bfb1b701072e580 (HEAD -> enable-http2)
Author: Andrew McDermott <aim@frobware.com>
Date:   Thu Apr 30 23:28:07 2026 +0100

    Add openssl ALPN parity for the dual-transport split

    The previous commit gave the rustls path two transports with explicit
    ALPN advertisements but left openssl untouched: openssl users got the
    existing no-ALPN-default behaviour on both transports, which meant no
    HTTP/2 benefit on the main path and only accidental HTTP/1.1 on the
    upgrade path.

    Bring openssl to parity:

but haven't pushed the commit yet as a) it was getting late and b) ZERO testing.

frobware force-pushed the enable-http2 branch from 719aab8 to 0b5cdf2 Compare September 11, 2025 06:55

clux added the changelog-add changelog added category for prs label Sep 11, 2025

clux mentioned this pull request Jan 29, 2026

Try to enable HTTP/2 support in the client #1923

Open

clux marked this pull request as draft January 29, 2026 18:48

frobware added 2 commits April 30, 2026 19:37

frobware force-pushed the enable-http2 branch from 0b5cdf2 to 442672d Compare April 30, 2026 18:37

frobware force-pushed the enable-http2 branch from c64e575 to 4e33b1a Compare April 30, 2026 21:44

Merge branch 'main' into enable-http2

59ea267

Signed-off-by: Eirik A <sszynrae@gmail.com>

clux reviewed May 5, 2026

View reviewed changes

Uh oh!

Conversation

frobware commented Sep 11, 2025

Uh oh!

clux commented Sep 11, 2025

Uh oh!

frobware commented Sep 11, 2025

Uh oh!

clux commented Sep 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clux commented Jan 21, 2026

Uh oh!

frobware commented Jan 24, 2026

Uh oh!

clux commented Jan 29, 2026

Uh oh!

uchiha-vivek commented Jan 31, 2026

Uh oh!

frobware commented Apr 10, 2026

Uh oh!

frobware commented Apr 30, 2026

Uh oh!

codecov Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

frobware commented May 1, 2026

Uh oh!

frobware commented May 1, 2026

Probe 1: per-cluster apiserver via the LB

Probe 2: the LB hostname directly (negative test)

Uh oh!

clux commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clux left a comment

Choose a reason for hiding this comment

Uh oh!

clux May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

clux May 5, 2026

Choose a reason for hiding this comment

Uh oh!

frobware commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

clux commented Sep 11, 2025 •

edited

Loading

codecov Bot commented May 1, 2026 •

edited

Loading

clux commented May 5, 2026 •

edited

Loading

clux May 5, 2026 •

edited

Loading