
fix: use CA certificate from mTLS secret for server verification #212

Merged: carlydf merged 1 commit into main from fix/mtls-root-ca-support on Feb 27, 2026


Shivs11 commented Feb 27, 2026

When connecting to a Temporal server via mTLS, the controller reads `tls.crt` and `tls.key` from the referenced Kubernetes secret but does not read `ca.crt`. This causes the controller to fall back to the system CA bundle for server certificate verification, which fails when the server's TLS certificate is signed by a private or internal CA (e.g. cert-manager in a self-hosted cluster).

This change reads `ca.crt` from the mTLS secret (when present) and uses it as the trusted root CA pool for server certificate verification. This is fully backward compatible. Secrets created by cert-manager automatically include `ca.crt`. Temporal Cloud users are unaffected since their server certs are signed by public CAs already in the system bundle.
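
The behavior described above can be sketched as follows. This is a minimal illustration, not the controller's actual code: the helper name `rootCAsFromSecret` and the `map[string][]byte` shape of the secret data are assumptions made for the example. It builds the pool with `x509.NewCertPool()`, which the follow-up commit messages on this page identify as the call the change used.

```go
package main

import (
	"crypto/x509"
	"errors"
)

// caKey is the secret key named in the description above.
const caKey = "ca.crt"

// rootCAsFromSecret is a hypothetical helper, not the controller's code.
// It returns a nil pool when the secret has no ca.crt, which leaves
// tls.Config.RootCAs unset so Go uses the host's root CA set.
func rootCAsFromSecret(data map[string][]byte) (*x509.CertPool, error) {
	caPEM, ok := data[caKey]
	if !ok || len(caPEM) == 0 {
		return nil, nil
	}
	// NewCertPool starts empty, so the resulting pool trusts only the
	// certificates in ca.crt and does not include the system roots.
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("ca.crt contains no valid PEM certificates")
	}
	return pool, nil
}
```

A caller would assign the returned pool to the `RootCAs` field of its `tls.Config`; a nil value keeps Go's default verification against the host roots.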

What was changed

Why?

Checklist

  1. Closes [Feature Request] Support configuring mTLS trust roots #158

  2. How was this tested:

  3. Any docs updates needed?


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shivs11 marked this pull request as ready for review February 27, 2026 02:34
Shivs11 requested review from a team and jlegrone as code owners February 27, 2026 02:34
carlydf merged commit 3d0ecf2 into main Feb 27, 2026
14 checks passed
carlydf deleted the fix/mtls-root-ca-support branch February 27, 2026 03:02
carlydf pushed a commit that referenced this pull request Mar 7, 2026
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
carlydf pushed a commit that referenced this pull request Mar 10, 2026
Shivs11 added a commit that referenced this pull request Mar 10, 2026
PR #212 introduced ca.crt support for server verification but used
x509.NewCertPool(), which replaces the system CA bundle entirely.
This breaks connections to Temporal Cloud (public CA) when the mTLS
secret contains a ca.crt from cert-manager (the client CA).

Use x509.SystemCertPool() instead so the custom CA is appended to
the system bundle, allowing both private and public CAs to be trusted.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shivs11 added a commit that referenced this pull request Mar 10, 2026
## Summary
- PR #212 introduced `ca.crt` support for server certificate
verification but built the trust pool with `x509.NewCertPool()`, which
starts **empty**, so the resulting pool replaced the system CA bundle
entirely
- This breaks connections to Temporal Cloud (public CA) when the mTLS
secret contains a `ca.crt` key from cert-manager (the CA that signed the
**client** cert, not the server cert)
- This fix uses `x509.SystemCertPool()` instead, so the custom CA is
**appended** to the system bundle rather than replacing it

## Why this broke
cert-manager always includes `ca.crt` in TLS secrets (the issuing CA).
When connecting to Temporal Cloud:
1. The controller sees `ca.crt` in the secret (the self-signed client
CA)
2. `NewCertPool()` creates an empty pool, and **only** that CA is added
to it
3. Temporal Cloud's server cert is signed by a public CA (e.g.,
DigiCert)
4. The public CA is no longer trusted → `x509: certificate signed by unknown authority`
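
The failure in the steps above can be reproduced in isolation. The sketch below is illustrative only and assumes nothing about the controller's code: it generates two throwaway self-signed CAs (the helper names are made up for the example), trusts only one of them via `x509.NewCertPool()`, and shows that verifying the other yields `x509.UnknownAuthorityError`, the error type behind the message quoted in step 4.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"math/big"
	"time"
)

// newSelfSignedCA creates a throwaway self-signed CA certificate for the demo.
func newSelfSignedCA(commonName string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: commonName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyWithOnly checks cert against a pool that contains only trusted,
// mirroring a pool built from x509.NewCertPool() plus a single ca.crt.
func verifyWithOnly(trusted, cert *x509.Certificate) error {
	pool := x509.NewCertPool()
	pool.AddCert(trusted)
	_, err := cert.Verify(x509.VerifyOptions{Roots: pool})
	return err
}

// isUnknownAuthority reports whether err is Go's
// "certificate signed by unknown authority" error.
func isUnknownAuthority(err error) bool {
	var ua x509.UnknownAuthorityError
	return errors.As(err, &ua)
}
```

Here the "client CA" stands in for the cert-manager issuer in `ca.crt`, and the second CA stands in for the public authority that signed the server certificate.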

## What this fixes
- `SystemCertPool()` loads the system CA bundle first; the custom CA is
then appended to that pool
- Both public CAs (Temporal Cloud) and private CAs (self-hosted) are
trusted simultaneously
- Falls back to `NewCertPool()` with a warning log if the system pool
can't be loaded
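
The three points above can be sketched roughly as below. This is a hedged illustration rather than the merged implementation: the function name `rootCAsWithSystemPool` and the use of the standard `log` package for the warning are assumptions.

```go
package main

import (
	"crypto/x509"
	"errors"
	"log"
)

// rootCAsWithSystemPool is a hypothetical helper illustrating the fix.
// It returns a nil pool when caPEM is empty so the caller keeps Go's
// default root CA set.
func rootCAsWithSystemPool(caPEM []byte) (*x509.CertPool, error) {
	if len(caPEM) == 0 {
		return nil, nil
	}
	// SystemCertPool returns a copy of the system pool, so appending to
	// it does not alter what other callers in the process see.
	pool, err := x509.SystemCertPool()
	if err != nil || pool == nil {
		log.Printf("warning: could not load system CA pool, trusting only ca.crt: %v", err)
		pool = x509.NewCertPool()
	}
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("ca.crt contains no valid PEM certificates")
	}
	return pool, nil
}
```

With this shape, a server certificate chains successfully if it was issued by either a system root or the CA in `ca.crt`.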

## Affected versions
- v1.2.1, v1.2.2, v1.2.3 — all contain the regression from PR #212
- Closes #223

## Test plan
- [ ] Deploy against Temporal Cloud with cert-manager mTLS secret (has
`ca.crt`) — verify connection succeeds
- [ ] Deploy against self-hosted Temporal with private CA — verify
connection succeeds
- [ ] Deploy with mTLS secret without `ca.crt` — verify fallback to
system bundle works

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Shivs11 added a commit that referenced this pull request Mar 19, 2026
Shivs11 added a commit that referenced this pull request Mar 20, 2026
carlydf added a commit that referenced this pull request Mar 23, 2026
## Summary

- Adds `clientpool_test.go` with 8 unit tests covering the auth code
paths that had no test coverage
- Two tests are explicit regression guards for the bugs fixed in #227
and #232
- Makes `dialFn` and `systemCertPoolFn` injectable on `ClientPool` (no
behavior change in production) to enable testing without network I/O or
OS trust store dependencies

## Regression tests

**`TestFetchMTLS_CACertAppendsToSystemPool`** — guards against the PR
#212 bug (fixed in #227): `fetchClientUsingMTLSSecret` used
`x509.NewCertPool()` (empty) instead of `x509.SystemCertPool()`,
silently dropping system root CAs and breaking Temporal Cloud
connections. The test injects a fake system pool and verifies both the
injected system CAs and the custom `ca.crt` are present in the returned
pool. This test fails if the fix is reverted.
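
One way such a regression guard could look is sketched below. It is not the code from `clientpool_test.go`: `poolBuilder`, `rootCAs`, and the certificate helpers are hypothetical stand-ins, and only the field name `systemCertPoolFn` is taken from the commit message. The check verifies both certificates against the returned pool instead of inspecting the pool's internals.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"math/big"
	"time"
)

// poolBuilder is a hypothetical stand-in for the real ClientPool.
type poolBuilder struct {
	systemCertPoolFn func() (*x509.CertPool, error)
}

// rootCAs appends caPEM to whatever pool systemCertPoolFn returns,
// falling back to an empty pool if the system pool is unavailable.
func (b poolBuilder) rootCAs(caPEM []byte) (*x509.CertPool, error) {
	pool, err := b.systemCertPoolFn()
	if err != nil || pool == nil {
		pool = x509.NewCertPool()
	}
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("ca.crt contains no valid PEM certificates")
	}
	return pool, nil
}

// mustSelfSignedCA returns a throwaway CA certificate and its PEM encoding.
func mustSelfSignedCA(commonName string) (*x509.Certificate, []byte) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: commonName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		panic(err)
	}
	return cert, pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
}

// fakeSystemPool returns a loader that yields a pool holding only certs,
// so the check does not depend on the OS trust store.
func fakeSystemPool(certs ...*x509.Certificate) func() (*x509.CertPool, error) {
	return func() (*x509.CertPool, error) {
		pool := x509.NewCertPool()
		for _, c := range certs {
			pool.AddCert(c)
		}
		return pool, nil
	}
}

// trusts reports whether cert verifies against pool.
func trusts(pool *x509.CertPool, cert *x509.Certificate) bool {
	_, err := cert.Verify(x509.VerifyOptions{Roots: pool})
	return err == nil
}
```

If `rootCAs` were changed to start from `x509.NewCertPool()` unconditionally, the injected system root would no longer verify and the guard would fail.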

**`TestDialAndUpsert_APIKeySkipsCheckHealth`** — guards against the PR
#203 bug (fixed in #232): `DialAndUpsertClient` called `CheckHealth`
unconditionally, which fails on Temporal Cloud with namespace-scoped API
keys. The test uses an injected mock client and asserts `CheckHealth` is
never called for `AuthModeAPIKey`. This test fails if the fix is
reverted.
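
The same idea can be sketched for the health-check guard. Everything in this block is illustrative: `authMode`, `healthChecker`, `countingClient`, and `verifyConnection` are invented for the example, and the `CheckHealth` signature is simplified to take no arguments, so it does not match the Temporal SDK client method.

```go
package main

import "errors"

// authMode and its values are illustrative; only the name AuthModeAPIKey
// appears in the commit message.
type authMode int

const (
	authModeMTLS authMode = iota
	authModeAPIKey
)

// errHealthDenied simulates a health check rejected for a namespace-scoped key.
var errHealthDenied = errors.New("health check not permitted for this key")

// healthChecker is a hypothetical narrow interface over the dialed client.
type healthChecker interface {
	CheckHealth() error
}

// countingClient is a mock that records how often CheckHealth is called.
type countingClient struct {
	calls int
	err   error
}

func (c *countingClient) CheckHealth() error {
	c.calls++
	return c.err
}

// verifyConnection skips the health check for API-key auth and runs it
// for every other auth mode.
func verifyConnection(mode authMode, c healthChecker) error {
	if mode == authModeAPIKey {
		return nil
	}
	return c.CheckHealth()
}
```

A guard in this style asserts that the mock's call count stays at zero for the API-key mode, which is the property the regression test is described as checking.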

## Test plan

- [x] `go test ./internal/controller/clientpool/... -v` — all 8 tests
pass
- [x] `go build ./...` — no compilation errors
- [x] Manually revert the PR #227 fix →
`TestFetchMTLS_CACertAppendsToSystemPool` fails
- [x] Manually revert the PR #232 fix →
`TestDialAndUpsert_APIKeySkipsCheckHealth` fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
shashwatsuri pushed a commit to shashwatsuri/temporal-worker-controller that referenced this pull request Apr 28, 2026
…poralio#212)

shashwatsuri pushed a commit to shashwatsuri/temporal-worker-controller that referenced this pull request Apr 28, 2026
…mporalio#227)

shashwatsuri pushed a commit to shashwatsuri/temporal-worker-controller that referenced this pull request Apr 28, 2026

Development

Successfully merging this pull request may close these issues.

[Feature Request] Support configuring mTLS trust roots

2 participants