Proposed v26.2.x: 68 RP commits on scylladb/master c302102b7 #277

Closed
travisdowns wants to merge 73 commits into redpanda-data:proposed-v26.2.x-merge-base from travisdowns:proposed-v26.2.x

travisdowns commented May 7, 2026

Member

This PR shows the proposed contents of the v26.2.x branch after the rebase, for review. It contains 73 Redpanda-specific commits on top of scylladb/master at c302102b7. I don't expect anyone to review these line by line, as they were largely reviewed when originally checked in; the aim is rather to look at the overall approach and perhaps do spot checks.

@dotnwat, I put you on here especially for the OpenSSL stuff.

Reference

  • Base (proposed-v26.2.x-merge-base): c302102b7c3ac10a02723167dcb155be908b135c, the scylladb/master tip as of the rebase point. This is a frozen reference for the diff.
  • Head (travisdowns:proposed-v26.2.x): 0dd1a13c11d8573481180deab69eb7ca0b345b74 — the proposed v26.2.x with all RP commits replayed.

What's in here

73 commits, broken down as:

  • 41 clean replays of v26.2.x-pre commits (patch identical modulo line-number shifts)
  • 17 minor edits (small adaptations to upstream changes)
  • 3 major edits (substantial rework)
  • 9 ports of RP-specific TLS features rewritten against upstream's pluggable crypto provider architecture (folded into 4 commits)
  • 2 ports of other RP-specific behavior on top of upstream
  • 7 cherry-picks from a separate td-tls-single-provider branch that adds explicit TLS-backend selection (--tls-mode={gnutls,openssl,both}) and tightens up single-backend builds

Further commits from v26.2.x-pre were dropped or upstreamed during this work.

Reading the diff

The TLS surface in particular saw the biggest changes because upstream merged a pluggable crypto provider rewrite (Noah Watkins, scylladb#3360 series); our RP-specific TLS features (cert_info, reload_callback_with_creds, dn_format) were rewritten against that new architecture rather than carried forward as-is.

Additional details

Two comments on this PR have additional context:

  • Abridged rebase notes — the high-level story (re-rebase history, upstream PRs, build/test status) and a link to the full notes in a gist.
  • Per-commit accounting — full table of every v26.2.x-pre commit with its disposition in this PR (clean / edits / ported / upstreamed / dropped).

mmaslankaprv and others added 30 commits May 5, 2026 16:58
The Seastar HTTP server implementation supports multiple listeners. The
handler logic may need to know which listener a connection came from.
Add a listener_idx field to `httpd::request` so that a handler can
identify the listener.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
Since an exception carries some text for the response
body text, the raising site might like to specify
the content type if it's e.g. json.

Signed-off-by: John Spray <jcs@vectorized.io>
This enables throwing a base_exception from
a json request handler with a json payload
inside it.

Signed-off-by: John Spray <jcs@vectorized.io>
Signed-off-by: John Spray <jcs@vectorized.io>
Prior to this patch seastar only exposes one global metrics::impl::impl
object which holds all metric related data for one application.

This patch changes the implementation details such that multiple
metrics::impl::impl objects can exist for any given application.
Said objects are stored into a map on each shard and created
dynamically whenever requested. A metrics::impl::impl is identified
by an integer handle that acts as the key for the storage map.

Implementation note: in order to avoid issues caused by the ordering
of static thread_local objects I had to declare the storage in
reactor.cc.

(cherry picked from commit 585a8af)
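
As a rough illustration of the handle-keyed storage described in the commit above, the following C++ sketch keeps a per-thread map from integer handle to implementation object and creates entries on first request. `metrics_impl` and `impl_storage` are hypothetical names used only for illustration. `get_local_impl` is mentioned in a later commit message, but the signature shown here is an assumption. The sketch uses a function-local `thread_local`, whereas the commit places the storage in reactor.cc, so this is not the fork's actual code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for seastar::metrics::impl::impl.
struct metrics_impl {
    std::map<std::string, double> values;
};

// One storage map per thread (Seastar runs one thread per shard).
// A function-local thread_local sidesteps static-init ordering in this
// sketch; the real patch declares the storage elsewhere.
inline std::map<int, std::shared_ptr<metrics_impl>>& impl_storage() {
    thread_local std::map<int, std::shared_ptr<metrics_impl>> storage;
    return storage;
}

// Returns the implementation for `handle`, creating it on first request.
inline std::shared_ptr<metrics_impl> get_local_impl(int handle = 0) {
    auto& slot = impl_storage()[handle];
    if (!slot) {
        slot = std::make_shared<metrics_impl>();
    }
    return slot;
}
```

Two different handles yield two independent objects, while repeated lookups of the same handle return the same one, which is what lets separate metrics "namespaces" coexist on a shard.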
This patch extends the metrics internal apis to use a specific
metrics::impl::impl object identified by its integer handle.

(cherry picked from commit 6ee4af7)
Add a public method to metric_groups_impl that exposes the handle
of the internal implementation it is using. This is required in order
for the metric_groups class to be able to reset itself to the configured
implementation handle.
This patch extends the metrics user facing apis to use a specific
metrics::impl::impl object identified by its integer handle.

Note that this patch marks the constructor of 'metric_groups' explicit
and updates two call sites where the constructor was used
implicitly.
This patch removes two subsequent calls to `get_local_impl` and reuses
the returned handle in that scope.
This patch extends the user facing prometheus apis allowing the user to
specify the internal metrics implementation to be used through a handle.
Additionally, 'add_prometheus_routes' now takes an argument that
specifies the route on which to advertise the metrics. This enables
different metrics "namespaces" to be served by different endpoints in
isolation.

(cherry picked from commit 6189522)
This patch extends the scollectd apis with the ability to select the
internal metrics implementation to be used by providing a handle.

(cherry picked from commit d4331d1)
This patch adds a 'get_skip_when_empty' getter to the 'registered_metric'
class. It is used by follow-up patches in order to replicate metrics.
This patch adds private methods to the 'metrics::impl' class that deal
with the creation of replicated metrics. They will be used to build the
public api in future commits.
This patch adds private helpers to 'metrics::impl' that deal with the
removal of replicated metric families from their destination
implementation. These methods will be used in subsequent commits to
manage the lifetime of replicated metrics.
This patch adds a public method to the 'metrics::impl' class:
'set_metric_families_to_replicate'. When this method is called
the families that match any of the specifications will be replicated
on the specified destinations.
This patch extends the metric registration and unregistration processes
to make them aware of metric replication.

In the case of metric registration, if the new metric belongs to a
family that matches one of the replication specs, then a replicated
metric is created accordingly.

For unregistration of a metric, the replicated metric is unregistered
too if one exists.
This patch exposes a method in the public interface of the metrics
module ('replicate_metric_families'), which enables metric replication
internally for the requested metric families.
Extends the metrics api to allow changing the aggregation labels of a
metrics family.

Otherwise one had to un-register every single metric instance in a
metric family and then re-register with the changed aggregation labels.

For metric families with thousands of instances (e.g.: histograms with
lots of different labels) this is quite expensive.

With this change we avoid the full reconstruction of the metrics family
and all its metrics. Only the work associated with marking the metrics
`dirty()` is needed then.
This commit extends the public interface of scheduling_group to expose
usage statistics (e.g. runtime).
Redpanda uses this logger name for its http client and we choose to
change the logger name in Seastar to avoid duplicate logger registration
exceptions.
We sort of inadvertently picked up the change to io_uring when
rebasing our seastar fork, but for the coming release we'd like to
keep using aio to reduce risk and give sufficient time to do
performance tests on io_uring.

This effectively rolls back the upstream commit:

eedca15

We simply put aio and epoll above io_uring in the available list.
The default backend is the first one in that list.

Issue redpanda-data/redpanda#10105.
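
The "first entry wins" rule from the commit above can be sketched in a few lines of C++. This is only an illustration: `available_backends`, `default_backend`, and the boolean probes are hypothetical, and the backend name strings are assumed for the example rather than taken from this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the list of usable reactor backends in priority order.
// The ordering (aio, epoll, then io_uring) follows the commit message;
// the function names and boolean probes are hypothetical.
inline std::vector<std::string> available_backends(bool has_aio, bool has_io_uring) {
    std::vector<std::string> backends;
    if (has_aio) {
        backends.push_back("linux-aio");
    }
    backends.push_back("epoll");  // assumed always usable in this sketch
    if (has_io_uring) {
        backends.push_back("io_uring");
    }
    return backends;
}

// The default backend is simply the first entry of the list.
inline std::string default_backend(const std::vector<std::string>& backends) {
    return backends.front();
}
```

With this ordering io_uring stays selectable explicitly but is never picked as the default while aio or epoll is available.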
This change is to allow for the timer code in the stall detectors to be
used in a profiler implementation. The hope is to be able to reuse a lot
of the stall_detector codebase in the profiler without complicating the
existing stall_detector implementation.

The new posix_timer constructor sets sigev_notify_thread_id via the
macro that musl libc / FreeBSD / Linux define for that field, mirroring
upstream commit bbe1af3 (which patched the cpu_stall_detector_posix_timer
constructor but predated and so missed the new posix_timer site).

see https://sourceware.org/bugzilla/show_bug.cgi?id=27417
Moves some functions and classes defined in the stall_detector test to a
separate header file so they can be reused in other tests.
In libgcc there is a critical section where the stack is being modified so
execution can return to an exception's landing pad. However, this
partially modified state isn't valid or described by the DWARF info in
the eh_frames. So a segfault tends to occur when `backtrace()` tries to
unwind through this partially modified stack and follows an invalid
pointer in it.
in the configuration of the io_group. The io_groups can be used to get
the original values which are needed to implement throttling in
Redpanda.
bhalevy and others added 9 commits May 7, 2026 14:32
EAGAIN is expected here when "Insufficient resources are available to queue any iocbs" (see io_submit(2)).
Abort on any other error, as those indicate an internal error on our side.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 9fff5f3)
The `aio_general_context` had the implicit assumption that in a single
tick we would never queue more than `--max-networking-io-control-blocks`
events/iocbs. This, however, ignores situations such as queuing multiple
iocbs per socket per tick, having leftover iocbs in the queue from
previous ticks via the new EAGAIN handling, or simply having many more
sockets in use, which isn't prevented anywhere else.

If this condition was hit (`last > end`) the reactor would just assert
out and crash.

To avoid this situation this patch introduces a backlog into which
elements are being enqueued when the original array is full and which
can grow unbounded. This mirrors how the `aio_storage_context` works
which uses the `io_sink` for the same purpose.

To avoid oversized allocations after startup, the split into two separate
data structures is needed (instead of just regrowing the array). Further,
the data structure from which the iocbs are passed into `io_submit` needs to
be in contiguous memory (and also provide an API to use it, which most
containers don't).

`std::deque` is used in the backlog to avoid oversized allocations in
the backlog itself. The existing array solution for `iocbs` is kept to
fulfill the contiguous memory requirement.

Further we slightly change how EAGAIN is handled. Instead of
backshifting the array we keep the array as is and just track the
`begin` of the array across `flush` calls. This is possible now as the
backlog handling is in place.

This introduces "batching" and prevents degenerate cases where only a
single element is being submitted which would then result in repeated
shifting of the whole array on each `flush` call. Given we use a chunked
data structure like `std::deque` erasing from the front of the backlog
is relatively cheap and does not require shifting all the elements in
the backlog. Hence, the per-iocb overhead is amortized constant.

Note that in general we try to submit as many iocbs as possible per
`io_submit` call. Given the new behaviour of not backshifting the iocb array and
immediately backfilling from the backlog, we might introduce `io_submit`
calls that don't try to submit the maximum number of iocbs.

However we assume that if we ran into EAGAIN then either:

 - We are still behind the next time around: it's unlikely we would
   succeed in submitting all the iocbs anyway
 - We have now caught up: we have introduced a single additional
   `io_submit` call which only submits `max_poll()/2` iocbs on average.
   The backlog will be drained at full `max_poll` per `io_submit`.

(cherry picked from commit d9175fc)
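
The array-plus-backlog scheme described in the commit above can be modelled in a simplified C++ sketch. It assumes an "iocb" is just an `int` and that a caller-supplied `submit` callable stands in for `io_submit`, returning how many entries it accepted (fewer than offered models EAGAIN). `submit_queue` and all its members are hypothetical names; this is not the code in `aio_general_context`, and it issues at most one submit per `flush` for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>

// Simplified model of a fixed contiguous array backed by an unbounded backlog.
template <std::size_t N>
class submit_queue {
    std::array<int, N> _iocbs{};  // contiguous storage passed to submit
    std::size_t _begin = 0;       // first slot not yet accepted
    std::size_t _end = 0;         // one past the last filled slot
    std::deque<int> _backlog;     // overflow; grows in chunks, unbounded

public:
    void queue(int iocb) {
        // Keep FIFO order: once anything is backlogged, new work queues behind it.
        if (_backlog.empty() && _end < N) {
            _iocbs[_end++] = iocb;
        } else {
            _backlog.push_back(iocb);
        }
    }

    std::size_t pending() const {
        return (_end - _begin) + _backlog.size();
    }

    // `submit(ptr, n)` returns how many of the n entries were accepted.
    template <typename Submit>
    std::size_t flush(Submit&& submit) {
        std::size_t accepted = 0;
        if (_begin != _end) {
            accepted = submit(_iocbs.data() + _begin, _end - _begin);
            _begin += accepted;  // no back-shifting: just advance the start
        }
        if (_begin == _end) {
            // Array drained: rewind and backfill from the front of the backlog.
            _begin = _end = 0;
            while (_end < N && !_backlog.empty()) {
                _iocbs[_end++] = _backlog.front();
                _backlog.pop_front();
            }
        }
        return accepted;
    }
};
```

A partial submit only advances `_begin`, so no elements move, and the backlog is touched only once the array has drained. Popping from the front of a `std::deque` is cheap, which is the property the commit message relies on.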
http::request::content is deprecated upstream, with the idea that you
set the server into streaming mode and use the input_stream<> in
the request directly.

This is not a totally trivial change, so for now we want to just keep
using content as this functionality is the same as always, so we
remove the deprecation from our fork for now.

See also CORE-15051.
…back mode

When using scoped_system_alloc_fallback, large allocations are expected
and intentional. Reduce the log level from warn to debug to avoid
spamming logs in this case.

Also change the message text to avoid triggering the BLL check and skip
the backtrace since it's not useful for expected allocations.
Add struct cert_info (serial number + expiry) to the public API and
implement get_cert_info() and get_trust_list_info() on
certificate_credentials, backed by virtual methods on credentials_impl.

Both GnuTLS and OpenSSL backends extract serial and expiry from loaded
certificates and trust lists.

Port of the following commits from v26.2.x-pre onto the upstream
crypto provider architecture:
  b8438b3 net/tls: Introduce cert_info and accessors
  e4c696a net/tls(ossl): Introduce cert_info and accessors
  76b82e3 net/tls: Adjust type for cert_info.serial
  06eaf07 net/tls: Replace cert_info::bytes with vector<byte>
Add a new reload_callback_with_creds callback type that receives the
reloaded certificate_credentials and trust file blob, in addition to the
changed files and exception_ptr. This allows callers to inspect the new
credentials (e.g., via get_cert_info()) at reload time without having to
rebuild them.

Add credentials_builder::get_trust_file_blob() to retrieve the loaded
trust file contents. Add build_reloadable_{certificate,server}_credentials
overloads accepting reload_callback_with_creds.

Add test_reload_certificates_with_creds test case.

Port of the following commits from v26.2.x-pre onto the upstream
crypto provider architecture:
  f716e6a net/tls: Add reload_callback_with_creds
  232567e tls: Include trust file contents with reload callback
  3f86a53 tls_test: Add tests for new reload callback and cert_info accessors
Add enum class dn_format { legacy, rfc2253 } and an overload of
get_dn_information() that accepts a format parameter. The OpenSSL
backend switches X509_NAME_print_ex flags based on the format:
XN_FLAG_RFC2253 for rfc2253, and the legacy seastar flags for legacy.

GnuTLS ignores the parameter as it does not provide a mechanism to
change the DN output format.

Port of the following commit from v26.2.x-pre onto the upstream
crypto provider architecture:
  291dc51 tls: Added support for fetching DN in RFC2253 format
OpenSSL's API contract is so loose and the impact so wide (e.g.
low-priority HTTPS traffic could crash the whole process) that it
makes sense to terminate here only in debug builds.

Restores RP-specific behavior on top of upstream PR scylladb#3369 (which
removed the assert entirely in favor of just logging). Uses plain
assert() rather than SEASTAR_ASSERT to fire only in debug builds.

Port of v26.2.x-pre commit 4152d2f onto upstream's
verify_clean_error_queue function (different file: tls_openssl.cc
vs the original ossl.cc).

@dotnwat dotnwat left a comment

Member

Nothing suspicious in TLS land. Maybe @pgellert can also take a quick peek.

@pgellert pgellert self-requested a review May 8, 2026 07:12

@pgellert pgellert left a comment


I took a look at the TLS changes, and they look good to me

@travisdowns

Member Author

@StephanDollberg wrote:

Did you try dropping this one? I feel like we shouldn't need it anymore after the libfmt upgrade.

242f6d2

Same for:

2b4725b

Both have been dropped. The latter needs changes on RP side, which have been pushed as redpanda-data/redpanda#30395.

A debug-only variant of SEASTAR_ASSERT that compiles to nothing in
non-debug build modes (Release, Dev, etc.) but still references its
argument via (void)sizeof so unused-variable warnings stay quiet.

Intended for asserts that check internal invariants too expensive to keep
on in release builds. The author should ensure the assert condition is
side-effect-free since it will not be evaluated in non-debug modes.

This is an alternative to `assert` from `<cassert>`, whose enablement
semantics are less clear because end users may adjust NDEBUG.
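
The `(void)sizeof` trick described above can be sketched as follows. This is a minimal illustration, not the macro added by the commit: `SKETCH_DEBUG_ASSERT` and the `SKETCH_DEBUG_BUILD` switch are hypothetical names, and the real macro presumably keys off Seastar's own build-mode definitions.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical build-mode switch used only by this sketch.
#ifdef SKETCH_DEBUG_BUILD
#define SKETCH_DEBUG_ASSERT(cond)                                    \
    do {                                                             \
        if (!(cond)) {                                               \
            std::fprintf(stderr, "assertion failed: %s\n", #cond);   \
            std::abort();                                            \
        }                                                            \
    } while (0)
#else
// The operand of sizeof is never evaluated: `cond` is still type-checked
// and its variables count as used, but no code is generated for it.
#define SKETCH_DEBUG_ASSERT(cond) ((void)sizeof(cond))
#endif

// Helper that records whether the condition was actually evaluated.
inline int checked_calls = 0;
inline bool check_and_count() {
    ++checked_calls;
    return true;
}
```

In a build without `SKETCH_DEBUG_BUILD`, `SKETCH_DEBUG_ASSERT(check_and_count())` leaves `checked_calls` at zero, which is why the condition must be free of side effects.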
Replace OpenSSL's auto-detect-and-enable with explicit opt-in. The new
configure.py flag --tls-mode={gnutls,openssl,both} (default gnutls)
drives Seastar_GNUTLS and Seastar_OPENSSL together. Direct
-DSeastar_GNUTLS / -DSeastar_OPENSSL cache overrides still work.

This is less magic than auto-detect: users will want to pick which
backend they are using, rather than have cmake decide for them depending
on installed libraries, which is fragile in the face of external changes
(e.g. installing some unrelated library that happens to bring in OpenSSL,
which suddenly changes your Seastar build mode).

When both backends are enabled, SEASTAR_TLS_DUAL_BACKEND is added as a
PUBLIC compile definition so the public TLS header and downstream code
can distinguish single- vs. dual-backend builds.
The seastar::tls::ERROR_* globals (e.g. ERROR_UNKNOWN_CIPHER_SUITE) were
mutable ints, zero-initialized at static-init time and filled in at
reactor startup by the active backend's init_error_codes() method.
Any access before reactor init (static initializers, unit tests that
don't spin up a reactor) silently read as 0, locking in the wrong value
with no diagnostic. This bit Redpanda unit tests that compare against
these constants without starting a reactor.

In single-backend builds (SEASTAR_TLS_DUAL_BACKEND not defined), the
active backend is fixed at compile time, so the values can be hard
coded. Use a new SEASTAR_TLS_ERROR_QUALIFIERS macro that expands to
'extern' in dual-backend builds and 'extern const' in single-backend
builds; define the globals as const with the backend's constants in
tls_<backend>.cc. Dual-backend builds still go through the dynamic
init_error_codes() path with no behavior change.
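
The qualifier macro described above might look roughly like the sketch below. `SEASTAR_TLS_DUAL_BACKEND`, `SEASTAR_TLS_ERROR_QUALIFIERS`, `ERROR_UNKNOWN_CIPHER_SUITE`, and `init_error_codes` are names from the commit message. The `sketch_tls` namespace, the `int` parameter, the snapshot variable, and the value `-1` are placeholders assumed for illustration, not taken from either backend.

```cpp
#include <cassert>

#ifdef SEASTAR_TLS_DUAL_BACKEND
#define SEASTAR_TLS_ERROR_QUALIFIERS extern        // filled in at reactor startup
#else
#define SEASTAR_TLS_ERROR_QUALIFIERS extern const  // fixed at compile time
#endif

namespace sketch_tls {

// What the public header would declare.
SEASTAR_TLS_ERROR_QUALIFIERS int ERROR_UNKNOWN_CIPHER_SUITE;

// What tls_<backend>.cc would define.
#ifdef SEASTAR_TLS_DUAL_BACKEND
int ERROR_UNKNOWN_CIPHER_SUITE = 0;  // reads as 0 until init_error_codes runs
inline void init_error_codes(int backend_value) {
    ERROR_UNKNOWN_CIPHER_SUITE = backend_value;
}
#else
const int ERROR_UNKNOWN_CIPHER_SUITE = -1;  // placeholder, not a real backend code
#endif

// A static initializer that copies the value, like the unit tests mentioned above.
inline const int snapshot_at_static_init = ERROR_UNKNOWN_CIPHER_SUITE;

}  // namespace sketch_tls
```

In the single-backend branch the snapshot already holds the final value before `main` runs, whereas in the dual-backend branch it would capture the zero-initialized state.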
The opening comment described GnuTLS as the only backend, with OpenSSL
replacement framed as hypothetical. Both backends are supported today,
optionally at the same time, with the active one selected at reactor
startup via --crypto-provider.

Also add a "When backend-dependent state is valid" section that
documents the single-backend vs. dual-backend lifetime rules for
error_category(), backend_name(), the ERROR_* globals, and any function
that internally creates a TLS session, credentials, or DH params (all
of which route through internal::crypto::provider() in dual-backend
builds and require it to be installed by smp::configure() first). Trim
the per-symbol blurbs on error_category(), backend_name(), and the
ERROR_* block to point back at the shared section.
…uilds

In single-backend builds (only one of GnuTLS / OpenSSL compiled in)
there is no runtime choice to make: the active provider is fixed at
compile time. Replace the unique_ptr-installed-from-smp::configure()
scheme with a function-local static in provider(), constructed lazily
on first call.

In dual-backend builds, the runtime-installed provider is now paired
with an explicit reset: add internal::crypto::reset_provider() and
call it from smp::cleanup() so a subsequent app::run() in the same
process (which calls smp::configure() -> set_provider() again) starts
from a clean slate. set_provider() previously silently overwrote any
prior install, which obscured cross-app lifecycle bugs; the explicit
set/reset cycle makes the invariant follow-up commits will assert
("set is called exactly once per cycle") observable.

As a result:
* internal::crypto::set_provider() and reset_provider() are not
  compiled at all in single-backend builds, and the corresponding call
  sites in smp::configure() / smp::cleanup() are conditional on
  SEASTAR_TLS_DUAL_BACKEND.
* provider() is valid at any time in single-backend builds, including
  from static initializers and before reactor startup, mirroring the
  static-init guarantee the ERROR_* globals just got.

The --crypto-provider CLI flag and the reactor_options::crypto_provider
field stay unconditionally for compatibility; in single-backend builds
the option only offers the compiled-in backend (its value is unused
since there is nothing to install).
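
A minimal sketch of the two provider lifecycles described above, assuming a hypothetical `provider_base` interface and a hypothetical `gnutls_provider` as the compiled-in backend. `set_provider`, `reset_provider`, `provider`, and `the_provider` are names from the commit messages, but the signatures are assumptions. Plain `assert` stands in for `SEASTAR_ASSERT` / `SEASTAR_DEBUG_ASSERT`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace sketch_crypto {

// Hypothetical provider interface.
struct provider_base {
    virtual ~provider_base() = default;
    virtual std::string name() const = 0;
};

struct gnutls_provider final : provider_base {
    std::string name() const override { return "gnutls"; }
};

#ifdef SEASTAR_TLS_DUAL_BACKEND
// Dual-backend: installed at startup, cleared at cleanup.
inline std::unique_ptr<provider_base> the_provider;

inline void set_provider(std::unique_ptr<provider_base> p) {
    assert(!the_provider);  // a double install is a lifecycle bug
    the_provider = std::move(p);
}

inline void reset_provider() {
    the_provider.reset();
}

inline provider_base& provider() {
    assert(the_provider);  // too-early access
    return *the_provider;
}
#else
// Single-backend: nothing to install; constructed lazily on first call,
// so it is usable even from static initializers.
inline provider_base& provider() {
    static gnutls_provider instance;
    return instance;
}
#endif

}  // namespace sketch_crypto
```

The single-backend branch needs neither `set_provider()` nor `reset_provider()`, which matches the note that those functions are not compiled at all in such builds.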
Dual-backend builds rely on smp::configure() calling set_provider()
exactly once before any provider() consumer runs. Catch violations
explicitly:

* SEASTAR_ASSERT in set_provider() that the_provider is null, so a
  double install (which would silently drop the previous provider and
  re-run init_error_codes()) fires loudly in all builds.
* SEASTAR_DEBUG_ASSERT in provider() that the_provider is set, so
  too-early access is caught in debug/sanitize/fuzz builds without
  paying the branch cost in release.
The configure.py default flipped from "auto-detect both backends" to
"--tls-mode=gnutls" (single-backend GnuTLS), which means the existing
matrix now exercises only the single-backend GnuTLS code path. Add two
standalone jobs to cover the other configurations:

* build_with_dual_tls (--tls-mode=both): keeps the dual-backend
  init_error_codes() + set_provider() path covered.
* build_with_openssl_tls (--tls-mode=openssl): exercises the
  single-backend OpenSSL static-init path which would otherwise be
  uncovered.

clang++ / C++23 / release matches the other dedicated-feature jobs
(DPDK, C++ modules) for consistency.
@travisdowns

travisdowns commented May 12, 2026

Member Author

Updated 2026-05-12

Force-pushed proposed-v26.2.x from 097536fff to 0dd1a13c1 (66 → 73 commits).

What changed

Dropped two commits during code review as suggested by @StephanDollberg:

  • 75bd9aa5b net: Modify call to ::format to ::format_to — was a minor-edits replay
  • ab209f675 treewide: temporarily comment out some deprecated annotations — was a clean replay

Cherry-picked 7 new commits from the td-tls-single-provider branch (PR forthcoming separately) that add explicit TLS-backend selection, so we no longer auto-detect and silently flip based on installed libraries. New configure flag: --tls-mode={gnutls,openssl,both}. This should fix TLS/connection unit test failures on the RP side. This will also be sent upstream.

Original SHA  Head SHA   Title
659091df6     7f0f26ea5  util/assert: add SEASTAR_DEBUG_ASSERT
8f129b79d     db2b0ddbf  build: add --tls-mode flag and SEASTAR_TLS_DUAL_BACKEND macro
51dad994d     8a6c605bb  net/tls: statically initialize ERROR_* globals in single-backend builds
2d6e8cf48     9cc49347e  net/tls: rewrite top-of-namespace header comment
a4cecdf93     088888116  core/crypto: rework provider lifecycle for single- and dual-backend builds
c5b0e3194     5b27101f6  core/crypto: assert provider install/access invariants
93f165037     0dd1a13c1  ci: add single-backend openssl and dual-backend TLS test jobs

Build status

  • build/dev configured with --tls-mode openssl: 368/368 ✓
  • build-gnutls/dev configured with --tls-mode gnutls: 368/368 ✓

Updated artifacts

  • Cover letter, abridged notes (this comment thread), per-commit accounting comment, and the companion gist are all in sync with the new head.
  • All NEW SHAs in the per-commit table were rewritten by the post-drop rebase; audit confirms every NEW SHA is reachable from the new head.

travisdowns added a commit to travisdowns/redpanda that referenced this pull request May 12, 2026
Point at the rebased v26.2.x branch
(redpanda-data/seastar#277). Replaces the prior
v26.2.x-pre snapshot at a0b4f2a6. Picks up the TLS fixes — see
redpanda-data/seastar#277 (comment).
travisdowns added a commit to travisdowns/redpanda that referenced this pull request May 14, 2026
Point at the rebased v26.2.x branch
(redpanda-data/seastar#277). Replaces the prior
v26.2.x-pre snapshot at a0b4f2a6. Picks up the TLS fixes — see
redpanda-data/seastar#277 (comment).
wdberkeley pushed a commit to redpanda-data/redpanda that referenced this pull request May 20, 2026
Point at the rebased v26.2.x branch
(redpanda-data/seastar#277). Replaces the prior
v26.2.x-pre snapshot at a0b4f2a6. Picks up the TLS fixes — see
redpanda-data/seastar#277 (comment).
@travisdowns

Member Author

Closing, as this was for review only: we have since completed the rebase and Redpanda is running on the rebased version.
