Flush SDK client cache on PermissionDenied/Unauthenticated errors#300
Merged
When the Temporal server returns an access-denied error (PermissionDenied or Unauthenticated), the stale cached client is now evicted from the pool so the next reconcile re-reads credentials from the K8s secret and re-dials. Closes #295 Closes #113 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
veeral-patel
approved these changes
Apr 24, 2026
3 tasks
carlydf
added a commit
that referenced
this pull request
Apr 25, 2026
## Summary

Follow-up to #300. Solves #295 more gracefully.

- Replaces the static `secret.Data` capture in the `NewAPIKeyDynamicCredentials` closure with a live K8s secret read via a new `fetchAPIKeyFromSecret` helper
- A rotated API key now takes effect on the next outgoing Temporal RPC; no permission-denied cycle is needed to evict and re-dial the cached client
- The k8s client is controller-runtime's cache-backed client, so reads hit the local informer cache (cheap in-memory lookup, not a raw API server call)
- `fetchAPIKeyFromSecret` is extracted as a testable method; a new test verifies the live-read and rotation behavior directly

## Test plan

- [ ] `go test ./internal/controller/clientpool/...`: new `TestFetchAPIKey_CredentialClosureReadsLiveSecret` passes (verifies initial read and post-rotation read return correct values)
- [ ] `go test ./internal/controller/...`: existing tests still pass
- [ ] `go build ./...`: compiles clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
shashwatsuri
pushed a commit
to shashwatsuri/temporal-worker-controller
that referenced
this pull request
Apr 28, 2026
…mporalio#300)

## Summary

- Adds `ClientPool.EvictClient(key)` to close and remove a cached SDK client by key
- Detects `PermissionDenied` and `Unauthenticated` errors from Temporal SDK calls in the reconcile loop
- Evicts the stale client from the pool at both error sites (`GetWorkerDeploymentState` and `executePlan`), so the next reconcile re-reads credentials from the K8s secret and re-dials

Without this, a rotated API key or revoked mTLS cert causes a permanent stuck-retry loop that only recovers with a controller restart.

Closes temporalio#295
Closes temporalio#113

## Test plan

- [ ] `go test ./internal/controller/clientpool/...`: new `TestEvictClient_RemovesAndClosesClient` and `TestEvictClient_NoopWhenKeyAbsent` pass
- [ ] `go test ./internal/controller/...`: existing controller tests still pass
- [ ] `go build ./...`: compiles clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
shashwatsuri
pushed a commit
to shashwatsuri/temporal-worker-controller
that referenced
this pull request
Apr 28, 2026
## Summary

Follow-up to temporalio#300. Solves temporalio#295 more gracefully.

- Replaces the static `secret.Data` capture in the `NewAPIKeyDynamicCredentials` closure with a live K8s secret read via a new `fetchAPIKeyFromSecret` helper
- A rotated API key now takes effect on the next outgoing Temporal RPC; no permission-denied cycle is needed to evict and re-dial the cached client
- The k8s client is controller-runtime's cache-backed client, so reads hit the local informer cache (cheap in-memory lookup, not a raw API server call)
- `fetchAPIKeyFromSecret` is extracted as a testable method; a new test verifies the live-read and rotation behavior directly

## Test plan

- [ ] `go test ./internal/controller/clientpool/...`: new `TestFetchAPIKey_CredentialClosureReadsLiveSecret` passes (verifies initial read and post-rotation read return correct values)
- [ ] `go test ./internal/controller/...`: existing tests still pass
- [ ] `go build ./...`: compiles clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary

- Adds `ClientPool.EvictClient(key)` to close and remove a cached SDK client by key
- Detects `PermissionDenied` and `Unauthenticated` errors from Temporal SDK calls in the reconcile loop
- Evicts the stale client from the pool at both error sites (`GetWorkerDeploymentState` and `executePlan`), so the next reconcile re-reads credentials from the K8s secret and re-dials

Without this, a rotated API key or revoked mTLS cert causes a permanent stuck-retry loop that only recovers with a controller restart.

Closes #295
Closes #113

Test plan

- [ ] `go test ./internal/controller/clientpool/...`: new `TestEvictClient_RemovesAndClosesClient` and `TestEvictClient_NoopWhenKeyAbsent` pass
- [ ] `go test ./internal/controller/...`: existing controller tests still pass
- [ ] `go build ./...`: compiles clean

🤖 Generated with Claude Code