Proposed v26.2.x: 68 RP commits on scylladb/master c302102b7 by travisdowns · Pull Request #277 · redpanda-data/seastar

travisdowns · 2026-05-07T02:48:45Z

This PR shows the proposed contents of the v26.2.x branch after rebase for review. It contains 73 redpanda-specific commits on top of scylladb/master at c302102b7. I don't expect anyone to review these line by line, as they were largely reviewed when originally checked in; instead, look at the overall approach and maybe do spot checks.

@dotnwat put you on here especially for the OpenSSL stuff.

Reference

Base (proposed-v26.2.x-merge-base): c302102b7c3ac10a02723167dcb155be908b135c — scylladb/master tip as of the rebase point. This is a frozen reference for the diff.
Head (travisdowns:proposed-v26.2.x): 0dd1a13c11d8573481180deab69eb7ca0b345b74 — the proposed v26.2.x with all RP commits replayed.

What's in here

73 commits, broken down as:

41 clean replays of v26.2.x-pre commits (patch identical modulo line-number shifts)
17 minor edits (small adaptations to upstream changes)
3 major edits (substantial rework)
9 ports of RP-specific TLS features rewritten against upstream's pluggable crypto provider architecture (folded into 4 commits)
2 ports of other RP-specific behavior on top of upstream
7 cherry-picks from a separate td-tls-single-provider branch that adds explicit TLS-backend selection (--tls-mode={gnutls,openssl,both}) and tightens up single-backend builds

Further commits from v26.2.x-pre were dropped or upstreamed during this work.

Reading the diff

The TLS surface in particular saw the biggest changes because upstream merged a pluggable crypto provider rewrite (Noah Watkins, scylladb#3360 series); our RP-specific TLS features (cert_info, reload_callback_with_creds, dn_format) were rewritten against that new architecture rather than carried forward as-is.

Additional details

Two comments on this PR have additional context:

Abridged rebase notes — the high-level story (re-rebase history, upstream PRs, build/test status) and a link to the full notes in a gist.
Per-commit accounting — full table of every v26.2.x-pre commit with its disposition in this PR (clean / edits / ported / upstreamed / dropped).

The Seastar HTTP server implementation supports multiple listeners. The handler logic may need to know which listener a connection came in on. Add a listener_idx field to `httpd::request` so that a handler can identify the listener. Signed-off-by: Michal Maslanka <michal@vectorized.io>

Since an exception carries some text for the response body text, the raising site might like to specify the content type if it's e.g. json. Signed-off-by: John Spray <jcs@vectorized.io>

This enables throwing a base_exception from a json request handler with a json payload inside it. Signed-off-by: John Spray <jcs@vectorized.io>

Signed-off-by: John Spray <jcs@vectorized.io>

Prior to this patch seastar only exposes one global metrics::impl::impl object which holds all metric related data for one application. This patch changes the implementation details such that multiple metrics::impl::impl objects can exist for any given application. Said objects are stored in a map on each shard and created dynamically whenever requested. A metrics::impl::impl is identified by an integer handle that acts as the key for the storage map. Implementation note: in order to avoid issues caused by the ordering of static thread_local objects I had to declare the storage in reactor.cc. (cherry picked from commit 585a8af)
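The handle-keyed storage described in this commit message could look roughly like the following. This is a minimal sketch under stated assumptions, not the fork's code: `sketch::impl`, `impl_storage()` and the default handle `0` are hypothetical, and only the name `get_local_impl` is borrowed from a later commit message in this series.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

namespace sketch {

// Stand-in for metrics::impl::impl: holds the metric data of one "namespace".
struct impl {
    int registered = 0;
};

// One map per thread (Seastar runs one thread per shard), keyed by handle.
// The commit declares its storage in reactor.cc to sidestep thread_local
// ordering issues; a function-local thread_local is used here for brevity.
inline std::unordered_map<int, std::shared_ptr<impl>>& impl_storage() {
    thread_local std::unordered_map<int, std::shared_ptr<impl>> storage;
    return storage;
}

// Return the impl registered under `handle`, creating it on first request.
inline std::shared_ptr<impl> get_local_impl(int handle = 0) {
    assert(handle >= 0);  // assumption of this sketch: handles are non-negative
    auto& slot = impl_storage()[handle];
    if (!slot) {
        slot = std::make_shared<impl>();
    }
    return slot;
}

} // namespace sketch
```

Two calls with the same handle return the same object, while different handles yield isolated metric stores, which is what lets separate endpoints serve separate "namespaces".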

This patch extends the metrics internal apis to use a specific metrics::impl::impl object identified by its integer handle. (cherry picked from commit 6ee4af7)

Add a public method to metric_groups_impl that exposes the handle of the internal implementation it is using. This is required in order for the metric_groups class to be able to reset itself to the configured implementation handle.

This patch extends the metrics user facing apis to use a specific metrics::impl::impl object identified by its integer handle. Note that this patch marks the constructor of 'metric_groups' explicit and updates two call sites where the constructor was used implicitly.

This patch removes two subsequent calls to `get_local_impl` and reuses the returned handle in that scope.

This patch extends the user facing prometheus apis allowing the user to specify the internal metrics implementation to be used through a handle. Additionally, 'add_prometheus_routes' now takes an argument that specifies the route on which to advertise the metrics. This enables different metrics "namespaces" to be served by different endpoints in isolation. (cherry picked from commit 6189522)

This patch extends the scollectd apis with the ability to select the internal metrics implementation to be used by providing a handle. (cherry picked from commit d4331d1)

This patch adds a 'get_skip_when_empty' getter to the 'registered_metric' class. It is used by follow-up patches in order to replicate metrics.

This patch adds private methods to the 'metrics::impl' class that deal with the creation of replicated metrics. They will be used to build the public api in future commits.

This patch adds private helpers to 'metrics::impl' that deal with the removal of replicated metric families from their destination implementation. These methods will be used in subsequent commits to manage the lifetime of replicated metrics.

This patch adds a public method to the 'metrics::impl' class: 'set_metric_families_to_replicate'. When this method is called the families that match any of the specifications will be replicated on the specified destinations.

This patch extends the metric registration and unregistration processes to make them aware of metric replication. In the case of metric registration, if the new metric belongs to a family that matches one of the replication specs, then a replicated metric is created accordingly. For unregistration of a metric, the replicated metric is unregistered too if one exists.

This patch exposes a method in the public interface of the metrics module ('replicate_metric_families'), which enables metric replication internally for the requested metric families.

Extends the metrics api to allow changing the aggregation labels of a metrics family. Otherwise one had to un-register every single metric instance in a metric family and then re-register with the changed aggregation labels. For metric families with thousands of instances (e.g.: histograms with lots of different labels) this is quite expensive. With this change we avoid the full reconstruction of the metrics family and all its metrics. Only the work associated with marking the metrics `dirty()` is needed then.

This commit extends the public interface of scheduling_group to expose usage statistics (e.g. runtime).

Redpanda uses this logger name for its http client and we choose to change the logger name in Seastar to avoid duplicate logger registration exceptions.

We somewhat inadvertently picked up the change to io_uring when rebasing our seastar fork, but for the coming release we'd like to keep using aio to reduce risk and give sufficient time to do performance tests on io_uring. This effectively rolls back the upstream commit eedca15. We simply put aio and epoll above io_uring in the available list; the default backend is the first one in that list. Issue: redpanda-data/redpanda#10105.
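The "first entry wins" rule can be illustrated with a small sketch. The helper names and the backend name strings below are assumptions made for illustration; the actual selection logic in Seastar's reactor is more involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical helper: list the usable backends in preference order.
// aio and epoll come first, so io_uring is never picked by default.
inline std::vector<std::string> available_backends(bool io_uring_supported) {
    std::vector<std::string> backends = {"linux-aio", "epoll"};
    if (io_uring_supported) {
        backends.push_back("io_uring");
    }
    return backends;
}

// The default backend is simply the first entry of the available list.
inline std::string default_backend(const std::vector<std::string>& available) {
    assert(!available.empty());
    return available.front();
}

} // namespace sketch
```

With this ordering, io_uring stays selectable explicitly but is never the default, regardless of kernel support.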

This change allows the timer code in the stall detectors to be used in a profiler implementation. The hope is to be able to reuse a lot of the stall_detector codebase in the profiler without complicating the existing stall_detector implementation. The new posix_timer constructor sets sigev_notify_thread_id via the macro that musl libc / FreeBSD / Linux define for that field, mirroring upstream commit bbe1af3 (which patched the cpu_stall_detector_posix_timer constructor but predated and so missed the new posix_timer site). See https://sourceware.org/bugzilla/show_bug.cgi?id=27417

Moves some functions and classes defined in the stall_detector test to a separate header file so they can be reused in other tests.

In libgcc there is a critical section where the stack is being modified so execution can return to an exception's landing pad. However, this partially modified state isn't valid or specified by the DWARF info in the eh_frames. So a segfault tends to occur when `backtrace()` tries to unwind through this partially modified stack and follows an invalid pointer in it.


Store original values from io-priorities yaml in the configuration of the io_group. The io_groups can be used to get the original values, which are needed to implement throttling in Redpanda.

EAGAIN is expected here when "Insufficient resources are available to queue any iocbs" (see io_submit(2)). Abort on any other error, as those indicate an internal error on our side. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit 9fff5f3)

The `aio_general_context` implicitly assumed that in a single tick we would never queue more than `--max-networking-io-control-blocks` events/iocbs. This ignores situations such as queuing multiple iocbs per socket per tick, having leftover iocbs in the queue from previous ticks via the new EAGAIN handling, or simply having many more sockets in use, which nothing else prevents. If this condition was hit (`last > end`), the reactor would assert and crash.

To avoid this situation, this patch introduces a backlog into which elements are enqueued when the original array is full and which can grow without bound. This mirrors how the `aio_storage_context` works, which uses the `io_sink` for the same purpose.

The split into two separate data structures (instead of just regrowing the array) is needed to avoid oversized allocations after startup. Further, the data structure from which the iocbs are passed into `io_submit` needs to be in contiguous memory (and also provide an API to use it, which most containers don't). `std::deque` is used for the backlog to avoid oversized allocations in the backlog itself. The existing array solution for `iocbs` is kept to fulfill the contiguous memory requirement.

Further, we slightly change how EAGAIN is handled. Instead of backshifting the array we keep the array as is and just track the `begin` of the array across `flush` calls. This is possible now that the backlog handling is in place. This introduces "batching" and prevents degenerate cases where only a single element is submitted, which would then result in repeated shifting of the whole array on each `flush` call. Because `std::deque` is a chunked data structure, erasing from the front of the backlog is relatively cheap and does not require shifting all the elements in the backlog. Hence, the per-iocb overhead is amortized constant.

Note that in general we try to submit as many iocbs as possible per `io_submit` call. Given the new behaviour of not backshifting the iocb array and immediately backfilling from the backlog, we might introduce `io_submit` calls that don't try to submit the maximum number of iocbs. However, we assume that if we ran into EAGAIN then either:

- We are still behind the next time around: it's unlikely we would succeed in submitting all the iocbs anyway.
- We have now caught up: we have introduced a single additional `io_submit` call which only submits `max_poll()/2` iocbs on average. The backlog will be drained at full `max_poll` per `io_submit`.

(cherry picked from commit d9175fc)
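The array-plus-backlog scheme can be sketched as follows. This is a simplified model and not the Seastar implementation: `iocb` is an `int` stand-in, the `submit` callable models `io_submit` (returning how many entries were accepted, with 0 standing in for EAGAIN), and refilling the array only once it has fully drained is an assumption of the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>

namespace sketch {

using iocb = int;  // stand-in for a real iocb pointer

template <std::size_t Capacity>
class general_context {
    std::array<iocb, Capacity> _iocbs{};  // contiguous; handed to the submit call
    std::size_t _begin = 0;               // first unsubmitted slot, kept across flushes
    std::size_t _end = 0;                 // one past the last queued slot
    std::deque<iocb> _backlog;            // chunked overflow, may grow without bound

public:
    using submit_fn = std::function<std::size_t(const iocb*, std::size_t)>;

    void queue(iocb io) {
        // Preserve FIFO order: once anything is backlogged, keep appending there.
        if (_backlog.empty() && _end < Capacity) {
            _iocbs[_end++] = io;
        } else {
            _backlog.push_back(io);
        }
    }

    // Submit as much as possible; returns the number of entries accepted.
    std::size_t flush(const submit_fn& submit) {
        std::size_t total = 0;
        while (_begin != _end) {
            const std::size_t n = submit(_iocbs.data() + _begin, _end - _begin);
            assert(n <= _end - _begin);
            if (n == 0) {
                return total;  // EAGAIN: leave [_begin, _end) in place, no backshift
            }
            _begin += n;
            total += n;
            if (_begin == _end) {
                // Array drained: reset it and backfill from the front of the backlog.
                _begin = _end = 0;
                while (!_backlog.empty() && _end < Capacity) {
                    _iocbs[_end++] = _backlog.front();
                    _backlog.pop_front();
                }
            }
        }
        return total;
    }

    std::size_t pending() const { return (_end - _begin) + _backlog.size(); }
};

} // namespace sketch
```

Queuing more entries than `Capacity` spills into the deque instead of tripping an assert, and a `flush` that hits the EAGAIN case leaves everything in place for the next tick.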

http::request::content is deprecated upstream, with the idea that you set the server into streaming mode and use the input_stream<> in the request directly. This is not a totally trivial change, so for now we want to just keep using content as this functionality is the same as always, so we remove the deprecation from our fork for now. See also CORE-15051.

…back mode When using scoped_system_alloc_fallback, large allocations are expected and intentional. Reduce the log level from warn to debug to avoid spamming logs in this case. Also change the message text to avoid triggering the BLL check and skip the backtrace since it's not useful for expected allocations.

Add struct cert_info (serial number + expiry) to the public API and implement get_cert_info() and get_trust_list_info() on certificate_credentials, backed by virtual methods on credentials_impl. Both GnuTLS and OpenSSL backends extract serial and expiry from loaded certificates and trust lists.

Port of the following commits from v26.2.x-pre onto the upstream crypto provider architecture:

b8438b3 net/tls: Introduce cert_info and accessors
e4c696a net/tls(ossl): Introduce cert_info and accessors
76b82e3 net/tls: Adjust type for cert_info.serial
06eaf07 net/tls: Replace cert_info::bytes with vector<byte>

Add a new reload_callback_with_creds callback type that receives the reloaded certificate_credentials and trust file blob, in addition to the changed files and exception_ptr. This allows callers to inspect the new credentials (e.g., via get_cert_info()) at reload time without having to rebuild them.

Add credentials_builder::get_trust_file_blob() to retrieve the loaded trust file contents. Add build_reloadable_{certificate,server}_credentials overloads accepting reload_callback_with_creds. Add test_reload_certificates_with_creds test case.

Port of the following commits from v26.2.x-pre onto the upstream crypto provider architecture:

f716e6a net/tls: Add reload_callback_with_creds
232567e tls: Include trust file contents with reload callback
3f86a53 tls_test: Add tests for new reload callback and cert_info accessors

Add enum class dn_format { legacy, rfc2253 } and an overload of get_dn_information() that accepts a format parameter. The OpenSSL backend switches X509_NAME_print_ex flags based on the format: XN_FLAG_RFC2253 for rfc2253, and the legacy seastar flags for legacy. GnuTLS ignores the parameter as it does not provide a mechanism to change the DN output format.

Port of the following commit from v26.2.x-pre onto the upstream crypto provider architecture:

291dc51 tls: Added support for fetching DN in RFC2253 format

OpenSSL's API contract is so loose and the impact so wide (e.g. low-priority HTTPS traffic could crash the whole process) that it makes sense to terminate here only in debug builds. Restores RP-specific behavior on top of upstream PR scylladb#3369 (which removed the assert entirely in favor of just logging). Uses plain assert() rather than SEASTAR_ASSERT to fire only in debug builds. Port of v26.2.x-pre commit 4152d2f onto upstream's verify_clean_error_queue function (different file: tls_openssl.cc vs the original ossl.cc).

dotnwat

nothing suspicious in TLS land. maybe @pgellert can also take a quick peek too.

pgellert

I took a look at the TLS changes, and they look good to me

travisdowns · 2026-05-08T21:23:57Z

@StephanDollberg wrote:

Did you try dropping this one. I feel like we shouldn't need it anymore after libfmt upgrade.

242f6d2

Same for:

2b4725b

Both have been dropped. The latter needs changes on RP side, which have been pushed as redpanda-data/redpanda#30395.

A debug-only variant of SEASTAR_ASSERT that compiles to nothing in non-debug build modes (Release, Dev, etc.) but still references its argument via (void)sizeof so unused-variable warnings stay quiet. For asserts that catch internal invariants too expensive to keep on in release builds. The author should ensure the assert condition is side-effect-free since it will not be evaluated in non-debug modes. This is an alternative to `assert` from `<cassert>` as that has less clear enablement semantics as end-users may adjust NDEBUG.
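The `(void)sizeof` trick can be shown with a stand-alone macro. This is a sketch of the technique, not the definition added to Seastar: `SKETCH_DEBUG_ASSERT`, `SKETCH_DEBUG_BUILD` and the helper functions are hypothetical names, and the side-effecting condition is used only to make the non-evaluation visible.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// SKETCH_DEBUG_BUILD is a hypothetical stand-in for whatever the build system
// defines in debug-style modes.
#if defined(SKETCH_DEBUG_BUILD)
#define SKETCH_DEBUG_ASSERT(cond)                                      \
    do {                                                               \
        if (!(cond)) {                                                 \
            std::fprintf(stderr, "debug assert failed: %s\n", #cond);  \
            std::abort();                                              \
        }                                                              \
    } while (0)
#else
// The operand of sizeof is never evaluated, so no code runs, but every name
// in the condition still counts as referenced (no unused-variable warnings).
#define SKETCH_DEBUG_ASSERT(cond) do { (void)sizeof(cond); } while (0)
#endif

namespace sketch {

// Pretend this check is too expensive for release builds; it counts its calls.
inline bool expensive_invariant(int& calls) {
    ++calls;
    return true;
}

// Returns how many times the asserted condition was actually evaluated.
inline int invariant_calls() {
    int calls = 0;
    SKETCH_DEBUG_ASSERT(expensive_invariant(calls));
    return calls;
}

} // namespace sketch
```

Built without `SKETCH_DEBUG_BUILD`, `invariant_calls()` returns 0 because the condition is never run, which is also why real conditions must be free of side effects.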

Replace OpenSSL's auto-detect-and-enable with explicit opt-in. The new configure.py flag --tls-mode={gnutls,openssl,both} (default gnutls) drives Seastar_GNUTLS and Seastar_OPENSSL together. Direct -DSeastar_GNUTLS / -DSeastar_OPENSSL cache overrides still work. This is less magic than auto-detect: users will want to pick which backend they are using rather than have cmake decide for them based on installed libraries, which is fragile in the face of external changes (e.g. installing some unrelated library that happens to pull in OpenSSL suddenly changes your Seastar build mode). When both backends are enabled, SEASTAR_TLS_DUAL_BACKEND is added as a PUBLIC compile definition so the public TLS header and downstream code can distinguish single- vs. dual-backend builds.

The seastar::tls::ERROR_* globals (e.g. ERROR_UNKNOWN_CIPHER_SUITE) were mutable ints, zero-initialized at static-init time and filled in at reactor startup by the active backend's init_error_codes() method. Any access before reactor init (static initializers, unit tests that don't spin up a reactor) silently read as 0, locking in the wrong value with no diagnostic. This bit Redpanda unit tests that compare against these constants without starting a reactor. In single-backend builds (SEASTAR_TLS_DUAL_BACKEND not defined), the active backend is fixed at compile time, so the values can be hard coded. Use a new SEASTAR_TLS_ERROR_QUALIFIERS macro that expands to 'extern' in dual-backend builds and 'extern const' in single-backend builds; define the globals as const with the backend's constants in tls_<backend>.cc. Dual-backend builds still go through the dynamic init_error_codes() path with no behavior change.
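The qualifier macro and the static-initialization hazard it removes can be modelled in one translation unit. This is an illustrative sketch: the `SKETCH_*` macros mirror the names in the commit message, while the namespace, the helper functions and the value `-21` are placeholders, not the backend's actual constants.

```cpp
#include <cassert>

// Mirrors SEASTAR_TLS_ERROR_QUALIFIERS: plain 'extern' when the backend is
// chosen at runtime, 'extern const' when it is fixed at compile time.
#if defined(SKETCH_TLS_DUAL_BACKEND)
#define SKETCH_TLS_ERROR_QUALIFIERS extern
#else
#define SKETCH_TLS_ERROR_QUALIFIERS extern const
#endif

namespace sketch {

// "Public header": one declaration shared by both build flavours.
SKETCH_TLS_ERROR_QUALIFIERS int ERROR_UNKNOWN_CIPHER_SUITE;

// A consumer that runs during static initialization, before any reactor exists.
inline int read_error_code() { return ERROR_UNKNOWN_CIPHER_SUITE; }
static const int captured_at_static_init = read_error_code();

// "tls_<backend>.cc": the definition.
#if defined(SKETCH_TLS_DUAL_BACKEND)
int ERROR_UNKNOWN_CIPHER_SUITE;  // zero until init_error_codes() runs
inline void init_error_codes() { ERROR_UNKNOWN_CIPHER_SUITE = -21; }
#else
// Constant-initialized, so it is already correct before any dynamic
// initializer (such as captured_at_static_init) can observe it.
const int ERROR_UNKNOWN_CIPHER_SUITE = -21;  // placeholder value
#endif

} // namespace sketch
```

In the single-backend flavour the early capture already sees the final value; compiling with `SKETCH_TLS_DUAL_BACKEND` defined reproduces the old behaviour, where the capture reads 0.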

The opening comment described GnuTLS as the only backend with OpenSSL replacement framed as hypothetical. Both backends are supported today, optionally at the same time with the active one selected at reactor startup via --crypto-provider. Also add a "When backend-dependent state is valid" section that documents the single-backend vs. dual-backend lifetime rules for error_category(), backend_name(), the ERROR_* globals, and any function that internally creates a TLS session, credentials, or DH params (all of which route through internal::crypto::provider() in dual-backend builds and require it to be installed by smp::configure() first). Trim the per-symbol blurbs on error_category(), backend_name(), and the ERROR_* block to point back at the shared section.

In single-backend builds (only one of GnuTLS / OpenSSL compiled in) there is no runtime choice to make: the active provider is fixed at compile time. Replace the unique_ptr-installed-from-smp::configure() scheme with a function-local static in provider(), constructed lazily on first call.

In dual-backend builds, the runtime-installed provider is now paired with an explicit reset: add internal::crypto::reset_provider() and call it from smp::cleanup() so a subsequent app::run() in the same process (which calls smp::configure() -> set_provider() again) starts from a clean slate. set_provider() previously silently overwrote any prior install, which obscured cross-app lifecycle bugs; the explicit set/reset cycle makes the invariant follow-up commits will assert ("set is called exactly once per cycle") observable.

As a result:

* internal::crypto::set_provider() and reset_provider() are not compiled at all in single-backend builds, and the corresponding call sites in smp::configure() / smp::cleanup() are conditional on SEASTAR_TLS_DUAL_BACKEND.
* provider() is valid at any time in single-backend builds, including from static initializers and before reactor startup, mirroring the static-init guarantee the ERROR_* globals just got.

The --crypto-provider CLI flag and the reactor_options::crypto_provider field stay unconditionally for compatibility; in single-backend builds the option only offers the compiled-in backend (its value is unused since there is nothing to install).
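The single-backend half of this change, a lazily constructed function-local static, might look like the sketch below. The class names and the `name()` method are hypothetical; the real provider interface is not shown in this PR text.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical provider interface, reduced to a single method.
struct provider_base {
    virtual ~provider_base() = default;
    virtual std::string name() const = 0;
};

struct gnutls_like_provider final : provider_base {
    std::string name() const override { return "gnutls"; }
};

// Single-backend build: no install step. The one compiled-in provider is a
// function-local static, constructed on first use.
inline provider_base& provider() {
    static gnutls_like_provider the_provider;
    return the_provider;
}

// Because construction is lazy, even a static initializer may call provider().
static const std::string name_at_static_init = provider().name();

} // namespace sketch
```

Nothing here depends on a reactor or on `smp::configure()` having run, which is the property the commit message describes for single-backend builds.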

Dual-backend builds rely on smp::configure() calling set_provider() exactly once before any provider() consumer runs. Catch violations explicitly:

* SEASTAR_ASSERT in set_provider() that the_provider is null, so a double install (which would silently drop the previous provider and re-run init_error_codes()) fires loudly in all builds.
* SEASTAR_DEBUG_ASSERT in provider() that the_provider is set, so too-early access is caught in debug/sanitize/fuzz builds without paying the branch cost in release.
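The set/reset cycle and its two invariants can be sketched as below. This is an approximation under stated assumptions: the provider types are hypothetical, an always-on `std::abort()` check stands in for SEASTAR_ASSERT, and plain `assert` stands in for SEASTAR_DEBUG_ASSERT.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>
#include <utility>

namespace sketch {

struct provider_base {
    virtual ~provider_base() = default;
    virtual std::string name() const = 0;
};
struct gnutls_like_provider final : provider_base {
    std::string name() const override { return "gnutls"; }
};
struct openssl_like_provider final : provider_base {
    std::string name() const override { return "openssl"; }
};

// Dual-backend build: the provider is installed at runtime.
inline std::unique_ptr<provider_base>& the_provider() {
    static std::unique_ptr<provider_base> p;
    return p;
}

// Called once per application cycle (modelled on smp::configure()).
inline void set_provider(std::unique_ptr<provider_base> p) {
    if (the_provider()) {
        std::abort();  // always-on check: a double install is a lifecycle bug
    }
    the_provider() = std::move(p);
}

// Called on teardown (modelled on smp::cleanup()) so a later cycle starts clean.
inline void reset_provider() { the_provider().reset(); }

inline provider_base& provider() {
    // Debug-only check: too-early access is caught without a release-mode branch.
    assert(the_provider() && "provider() used before set_provider()");
    return *the_provider();
}

} // namespace sketch
```

A second `set_provider()` without an intervening `reset_provider()` aborts in every build mode, while calling `provider()` too early is only diagnosed where `assert` is active.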

The configure.py default flipped from "auto-detect both backends" to "--tls-mode=gnutls" (single-backend GnuTLS), which means the existing matrix now exercises only the single-backend GnuTLS code path. Add two standalone jobs to cover the other configurations:

* build_with_dual_tls (--tls-mode=both): keeps the dual-backend init_error_codes() + set_provider() path covered.
* build_with_openssl_tls (--tls-mode=openssl): exercises the single-backend OpenSSL static-init path which would otherwise be uncovered.

clang++ / C++23 / release matches the other dedicated-feature jobs (DPDK, C++ modules) for consistency.

travisdowns · 2026-05-12T16:37:05Z

Updated 2026-05-12

Force-pushed proposed-v26.2.x from 097536fff to 0dd1a13c1 (66 → 73 commits).

What changed

Dropped two commits during code review as suggested by @StephanDollberg :

75bd9aa5b net: Modify call to ::format to ::format_to — was a minor-edits replay
ab209f675 treewide: temporarily comment out some deprecated annotations — was a clean replay

Cherry-picked 7 new commits from the td-tls-single-provider branch (PR forthcoming separately) that adds explicit TLS-backend selection so we no longer auto-detect and silently flip based on installed libraries. New configure flag: --tls-mode={gnutls,openssl,both}. This should fix TLS/connection unit test failures on RP side. This will also be sent upstream.

| Original SHA | Head SHA | Title |
|---|---|---|
| `659091df6` | `7f0f26ea5` | util/assert: add SEASTAR_DEBUG_ASSERT |
| `8f129b79d` | `db2b0ddbf` | build: add --tls-mode flag and SEASTAR_TLS_DUAL_BACKEND macro |
| `51dad994d` | `8a6c605bb` | net/tls: statically initialize ERROR_* globals in single-backend builds |
| `2d6e8cf48` | `9cc49347e` | net/tls: rewrite top-of-namespace header comment |
| `a4cecdf93` | `088888116` | core/crypto: rework provider lifecycle for single- and dual-backend builds |
| `c5b0e3194` | `5b27101f6` | core/crypto: assert provider install/access invariants |
| `93f165037` | `0dd1a13c1` | ci: add single-backend openssl and dual-backend TLS test jobs |

Build status

build/dev configured with --tls-mode openssl: 368/368 ✓
build-gnutls/dev configured with --tls-mode gnutls: 368/368 ✓

Updated artifacts

Cover letter, abridged notes (this comment thread), per-commit accounting comment, and the companion gist are all in sync with the new head.
All NEW SHAs in the per-commit table were rewritten by the post-drop rebase; audit confirms every NEW SHA is reachable from the new head.

Point at the rebased v26.2.x branch (redpanda-data/seastar#277). Replaces the prior v26.2.x-pre snapshot at a0b4f2a6. Picks up the TLS fixes — see redpanda-data/seastar#277 (comment).

travisdowns · 2026-05-28T14:08:46Z

closing as this was for review only: we have since completed the rebase and redpanda is running on the rebased version

mmaslankaprv and others added 30 commits May 5, 2026 16:58

http: enable specifying a content type on exceptions

c39663b

Since an exception carries some text for the response body text, the raising site might like to specify the content type if it's e.g. json. Signed-off-by: John Spray <jcs@vectorized.io>

http: don't jsonize exception if has content type

ced4b17

This enables throwing a base_exception from a json request handler with a json payload inside it. Signed-off-by: John Spray <jcs@vectorized.io>

http: use base_exception content type in non-json errors

9430122

Signed-off-by: John Spray <jcs@vectorized.io>

metrics: expose metric impl handle to internal api

0ee1486

This patch extends the metrics internal apis to use a specific metrics::impl::impl object identified by its integer handle. (cherry picked from commit 6ee4af7)

metrics: expose handle in metric_groups_impl

5d78395

Add a public method to metric_groups_impl that exposes the handle of the internal implementation it is using. This is required in order for the metric_groups class to be able to reset itself to the configured implementation handle.

metrics: Use handle for impl object

151823d

This patch removes two subsequent calls to `get_local_impl` and reuses the returned handle in that scope.

scollectd: select internal metrics implementation

3f737ea

This patch extends the scollectd apis with the ability to select the internal metrics implementation to be used by providing a handle. (cherry picked from commit d4331d1)

metrics: Expose 'skip_when_empty' for metrics

55dc12b

This patch adds a 'get_skip_when_empty' getter to the 'registered_metric' class. It is used by follow-up patches in order to replicate metrics.

metrics: add helpers for creation of replicas

2179447

This patch adds private methods to the 'metrics::impl' class that deal with the creation of replicated metrics. They will be used to build the public api in future commits.

metrics: add metric replication internal interface

69a4762

This patch adds a public method to the 'metrics::impl' class: 'set_metric_families_to_replicate'. When this method is called the families that match any of the specifications will be replicated on the specified destinations.

metrics: public family replication interface

f70c6f8

This patch exposes a method in the public interface of the metrics module ('replicate_metric_families'), which enables metric replication internally for the requested metric families.

tests: add metrics replication unit tests

6db1233

scheduling_group: expose usage statistics

a618fe8

This commit extends the public interface of scheduling_group to expose usage statistics (e.g. runtime).

http: rename http client logger

dea259b

Redpanda uses this logger name for its http client and we choose to change the logger name in Seastar to avoid duplicate logger registration exceptions.

tests/unit: refactor stall_detector test functions

e33f233

Moves some functions and classes defined in the stall_detector test to a separate header file so they can be reused in other tests.

core: cpu profiler implementation

aaf35ae

tests/cpu_profiler: add test to verify that on_signal doesn't allocate

c4e6f1b

tests/cpu_profiler: add tests that ensure correct behavior during exceptions

3b99c40

core/cpu_profiler: finer-grain stats for why a sample was dropped

e05aa93

Store original values from io-priorities yaml

ac338c7

in the configuration of the io_group. The io_groups can be used to get the original values which are needed to implement throttling in Redpanda.

bhalevy and others added 9 commits May 7, 2026 14:32

opt out of AI training per GitHub policy

bfb7a37

dotnwat reviewed May 7, 2026


pgellert self-requested a review May 8, 2026 07:12

pgellert reviewed May 8, 2026


travisdowns force-pushed the proposed-v26.2.x branch from 8d712a2 to 097536f Compare May 8, 2026 14:58

travisdowns mentioned this pull request May 8, 2026

Switch to rebased v26.2.x seastar branch redpanda-data/redpanda#30428

Merged

7 tasks

travisdowns added 7 commits May 12, 2026 12:27

travisdowns closed this May 28, 2026

travisdowns wants to merge 73 commits into redpanda-data:proposed-v26.2.x-merge-base from travisdowns:proposed-v26.2.x
