tls: improved support for builds with a single backend (GnuTLS or OpenSSL) by travisdowns · Pull Request #3400 · scylladb/seastar

travisdowns · 2026-05-13T14:51:59Z

The thrust here is that the dual backend TLS mode has a few possible footguns, in particular that certain error "constants" and helper functions will return wrong values or trigger UB if they are called before the TLS backend is initalized (which necessarily happens after main(), in the smp::configure phase).

It is not easy to just solve these in an API-compatible way. So this series tries to improves this in the following ways:

When only one backend is configured, which is probably the most common today, we can simply avoid most of the footguns since the backend is known statically so we can do what we did before and everything can work even before main().
We add some asserts in the relevant places to catch misuse, e.g., trying to access the provider when none has been set.

Details:

Make seastar's TLS backend choice an explicit build-time setting and let
single-backend builds use the active backend without any reactor-time setup.

New configure.py --tls-mode={gnutls,openssl,both} (default gnutls),
backed by a new public SEASTAR_TLS_DUAL_BACKEND compile def when both
backends are enabled. OpenSSL is no longer auto-enabled — users opt in
via --tls-mode=openssl or --tls-mode=both.
In single-backend builds, the seastar::tls::ERROR_* globals are declared
const and statically initialized to the active backend's values, so
they're valid from static initializers and from unit tests that never
start a reactor. Today they read silently as 0 before reactor startup,
which has bitten downstream unit tests that compare against these
constants without spinning up a reactor.
In single-backend builds, internal::crypto::provider() returns a
function-local static singleton constructed lazily on first call — no
set_provider() is compiled or needed.
In dual-backend builds, the runtime-installed provider is now paired
with reset_provider() called from smp::cleanup(), so a subsequent
app::run() in the same process starts from a clean slate. The previous
silent-overwrite behavior obscured cross-app lifecycle bugs.
The lifecycle is now asserted: SEASTAR_ASSERT on double-install in
set_provider(), SEASTAR_DEBUG_ASSERT on access-before-install in
provider(). A new SEASTAR_DEBUG_ASSERT macro (no-op outside
Debug/Sanitize/Fuzz) supports the latter.
CI adds two dedicated jobs to keep the non-default modes covered:
build_with_dual_tls (--tls-mode=both) and build_with_openssl_tls
(--tls-mode=openssl). The existing matrix exercises the new
gnutls-only default.

A debug-only variant of SEASTAR_ASSERT that compiles to nothing in non-debug build modes (Release, Dev, etc.) but still references its argument via (void)sizeof so unused-variable warnings stay quiet. For asserts that catch internal invariants too expensive to keep on in release builds. The author should ensure the assert condition is side-effect-free since it will not be evaluated in non-debug modes. This is an alternative to `assert` from `<cassert>` as that has less clear enablement semantics as end-users may adjust NDEBUG.

Replace OpenSSL's auto-detect-and-enable with explicit opt-in. The new configure.py flag --tls-mode={gnutls,openssl,both} (default gnutls) drives Seastar_GNUTLS and Seastar_OPENSSL together. Direct -DSeastar_GNUTLS / -DSeastar_OPENSSL cache overrides still work. This is less magic than auto-detect: users will want to pick what backend they are using, rather than have cmake decide for them depending on installed libraries which is fragile in the face of external changes (e.g install some random library that happens to bring in to OpenSSL on openssl which suddenly changes your seastar build mode). When both backends are enabled, SEASTAR_TLS_DUAL_BACKEND is added as a PUBLIC compile definition so the public TLS header and downstream code can distinguish single- vs. dual-backend builds.

The seastar::tls::ERROR_* globals (e.g. ERROR_UNKNOWN_CIPHER_SUITE) were mutable ints, zero-initialized at static-init time and filled in at reactor startup by the active backend's init_error_codes() method. Any access before reactor init (static initializers, unit tests that don't spin up a reactor) silently read as 0, locking in the wrong value with no diagnostic. This bit Redpanda unit tests that compare against these constants without starting a reactor. In single-backend builds (SEASTAR_TLS_DUAL_BACKEND not defined), the active backend is fixed at compile time, so the values can be hard coded. Use a new SEASTAR_TLS_ERROR_QUALIFIERS macro that expands to 'extern' in dual-backend builds and 'extern const' in single-backend builds; define the globals as const with the backend's constants in tls_<backend>.cc. Dual-backend builds still go through the dynamic init_error_codes() path with no behavior change.

The opening comment described GnuTLS as the only backend with OpenSSL replacement framed as hypothetical. Both backends are supported today, optionally at the same time with the active one selected at reactor startup via --crypto-provider. Also add a "When backend-dependent state is valid" section that documents the single-backend vs. dual-backend lifetime rules for error_category(), backend_name(), the ERROR_* globals, and any function that internally creates a TLS session, credentials, or DH params (all of which route through internal::crypto::provider() in dual-backend builds and require it to be installed by smp::configure() first). Trim the per-symbol blurbs on error_category(), backend_name(), and the ERROR_* block to point back at the shared section.

…uilds In single-backend builds (only one of GnuTLS / OpenSSL compiled in) there is no runtime choice to make: the active provider is fixed at compile time. Replace the unique_ptr-installed-from-smp::configure() scheme with a function-local static in provider(), constructed lazily on first call. In dual-backend builds, the runtime-installed provider is now paired with an explicit reset: add internal::crypto::reset_provider() and call it from smp::cleanup() so a subsequent app::run() in the same process (which calls smp::configure() -> set_provider() again) starts from a clean slate. set_provider() previously silently overwrote any prior install, which obscured cross-app lifecycle bugs; the explicit set/reset cycle makes the invariant follow-up commits will assert ("set is called exactly once per cycle") observable. As a result: * internal::crypto::set_provider() and reset_provider() are not compiled at all in single-backend builds, and the corresponding call sites in smp::configure() / smp::cleanup() are conditional on SEASTAR_TLS_DUAL_BACKEND. * provider() is valid at any time in single-backend builds, including from static initializers and before reactor startup, mirroring the static-init guarantee the ERROR_* globals just got. The --crypto-provider CLI flag and the reactor_options::crypto_provider field stay unconditionally for compatibility; in single-backend builds the option only offers the compiled-in backend (its value is unused since there is nothing to install).

Dual-backend builds rely on smp::configure() calling set_provider() exactly once before any provider() consumer runs. Catch violations explicitly: * SEASTAR_ASSERT in set_provider() that the_provider is null, so a double install (which would silently drop the previous provider and re-run init_error_codes()) fires loudly in all builds. * SEASTAR_DEBUG_ASSERT in provider() that the_provider is set, so too-early access is caught in debug/sanitize/fuzz builds without paying the branch cost in release.

The configure.py default flipped from "auto-detect both backends" to "--tls-mode=gnutls" (single-backend GnuTLS), which means the existing matrix now exercises only the single-backend GnuTLS code path. Add two standalone jobs to cover the other configurations: * build_with_dual_tls (--tls-mode=both): keeps the dual-backend init_error_codes() + set_provider() path covered. * build_with_openssl_tls (--tls-mode=openssl): exercises the single-backend OpenSSL static-init path which would otherwise be uncovered. clang++ / C++23 / release matches the other dedicated-feature jobs (DPDK, C++ modules) for consistency.

travisdowns · 2026-05-13T17:13:30Z

@scylladb/seastar-maint please take a look, also @dotnwat (this is almost identical to the RP-side changes you reviewed already, I only added the clear_provider stuff which is necessary to support multiple apps)

travisdowns · 2026-05-18T21:13:08Z

@scylladb/seastar-maint ping

dotnwat

LGTM

SEASTAR_TLS_DUAL_BACKEND

If in the future we take the next step of dropping GnuTLS, then the off-ramp for this would be to hard code SEASTAR_TLS_DUAL_BACKEND as true since that seems like the more general case?

travisdowns · 2026-05-21T21:20:38Z

If in the future we take the next step of dropping GnuTLS, then the off-ramp for this would be to hard code SEASTAR_TLS_DUAL_BACKEND as true since that seems like the more general case?

Well we would have a choice.

We could effectively assume we only have the openssl backend and use the single-backend compile path only (removing the other path). We'd still have the abstraction of dynamic providers mostly in place but it wouldn't solve the global variable problem.

Or you can leave the "single provider" vs multi-provider split in place, like today, and build both ways in CI (as today), and then a sub-choice is what the default is. If there's only one supported backend it seems weird to default it to the mode with the footgun.

If we wanted to fix this "even better" I think we should expose these error codes behind an API call, not as globals, and then this API simply throws if there are > 1 providers and init hasn't been called. Then we can remove this DUAL_MODE stuff in the cmake. That requires deprecation and API levels etc.

avikivity · 2026-05-24T12:05:26Z

It is not easy to just solve these in an API-compatible way.

We can go through a deprecation or API_LEVEL cycle if need to avoid contortions.

avikivity · 2026-05-24T12:09:32Z

+#endif // !SEASTAR_TLS_DUAL_BACKEND
+


As an alternative to this pain, could we translate gnutls/openssl errors to our own errors?

We'd make ERROR_NO_CIPHER_SUITES some fixed enum value, and translate the error when we read it.

That's a good idea. I can see two overall ways:

"in place", i.e., actually change the existing globals. I guess it doesn't even need an API change: this is an ABI break but arguably not an API one (though that's only true if folks are only comparing to these enums, not expecting them to be GnuTLS/OpenSSL errors).

Introduce them in addition to the existing thing. Deprecate the existing thing.

I guess I would favor 1.

That's a good idea. I can see two overall ways:

"in place", i.e., actually change the existing globals. I guess it doesn't even need an API change: this is an ABI break but arguably not an API one (though that's only true if folks are only comparing to these enums, not expecting them to be GnuTLS/OpenSSL errors).

Introduce them in addition to the existing thing. Deprecate the existing thing.

I guess I would favor 1.

Me2.

Can actually combine 2 and 1: add an enum class for the errors, and constexpr inline int variables initialized from the enum. But it would be overkill, not recommending it.

travisdowns added 7 commits May 12, 2026 12:22

travisdowns mentioned this pull request May 13, 2026

special case single TLS provider a bit travisdowns/seastar#3

Closed

dotnwat approved these changes May 21, 2026

View reviewed changes

avikivity reviewed May 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tls: improved support for builds with a single backend (GnuTLS or OpenSSL)#3400

tls: improved support for builds with a single backend (GnuTLS or OpenSSL)#3400
travisdowns wants to merge 7 commits into
scylladb:masterfrom
travisdowns:td-tls-single-provider-scylladb

travisdowns commented May 13, 2026 •

edited

Loading

Uh oh!

travisdowns commented May 13, 2026

Uh oh!

travisdowns commented May 18, 2026

Uh oh!

dotnwat left a comment

Uh oh!

travisdowns commented May 21, 2026

Uh oh!

avikivity commented May 24, 2026

Uh oh!

avikivity May 24, 2026

Uh oh!

travisdowns May 24, 2026

Uh oh!

avikivity May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		#endif // !SEASTAR_TLS_DUAL_BACKEND

Conversation

travisdowns commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

travisdowns commented May 13, 2026

Uh oh!

travisdowns commented May 18, 2026

Uh oh!

dotnwat left a comment

Choose a reason for hiding this comment

Uh oh!

travisdowns commented May 21, 2026

Uh oh!

avikivity commented May 24, 2026

Uh oh!

avikivity May 24, 2026

Choose a reason for hiding this comment

Uh oh!

travisdowns May 24, 2026

Choose a reason for hiding this comment

Uh oh!

avikivity May 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

travisdowns commented May 13, 2026 •

edited

Loading