Skip to content

Include cluster UID in CONTROLLER_IDENTITY to prevent cross-cluster conflicts#309

Merged
carlydf merged 19 commits intomainfrom
dedupe-controller-id2
Apr 29, 2026
Merged

Include cluster UID in CONTROLLER_IDENTITY to prevent cross-cluster conflicts#309
carlydf merged 19 commits intomainfrom
dedupe-controller-id2

Conversation

@carlydf
Copy link
Copy Markdown
Collaborator

@carlydf carlydf commented Apr 28, 2026

Completes the two-release migration for including the cluster namespace UID in the controller identity string.

Previously, the new {identity}/{namespaceUID} format was prepared behind a "next release" comment while the bare {identity} string remained active. This PR flips that: getControllerIdentity() now returns the full {identity}/{suffix} format, and getDeprecatedControllerIdentity() handles reclamation of Worker Deployments previously claimed under the old format.

Changes:

  • getControllerIdentity() now returns {CONTROLLER_IDENTITY}/{CONTROLLER_IDENTITY_SUFFIX}; returns empty string if either is unset
  • getDeprecatedControllerIdentity() (renamed from getControllerIdentityWithNamespaceUID) is used only for backward-compatible reclamation
  • Startup guard in main() fails fast if CONTROLLER_IDENTITY is unset
  • Fallback guard in Reconcile() for library users who bypass main() requires that getControllerIdentity() != ""
  • Safety check in claimManagerIdentity() refuses to call SetManagerIdentity with an empty string to prevent accidental field clearing
  • Renamed ToBeDeprecatedDefaultControllerIdentityDeprecatedDefaultControllerIdentity
  • Updated tests to set both IdentityEnvKey and IdentitySuffixEnvKey

carlydf and others added 17 commits April 27, 2026 13:46
Controllers that previously set ManagerIdentity as "release/namespace"
(pre-cluster-UID) or "temporal-worker-controller" (pre-Helm default)
would be permanently blocked from managing existing Worker Deployments
after upgrading, since shouldClaimManagerIdentity only triggered on
empty identity.

Now reclaims in both migration cases:
- exact match on the old hardcoded default
- new identity is a longer slash-prefixed version of the existing one

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… distinct test identity

- defaults.ControllerIdentity is now DeprecatedDefaultControllerIdentity to make
  clear it exists only for upgrade migration detection, not as a live default
- integration tests use a dedicated testControllerIdentity constant so they
  exercise the normal identity path rather than accidentally triggering the
  deprecated-identity migration branch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SetManagerIdentity with an empty identity clears the ManagerIdentity
field on the Worker Deployment, leaving it ownerless. Add an explicit
check and return an error rather than making the call. The startup
check in main() is the primary enforcement; this is a targeted guard
for the one call site where an empty value is actively dangerous.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
carlydf added a commit that referenced this pull request Apr 28, 2026
…s-cluster conflicts (#308)

<!--- Note to EXTERNAL Contributors -->
<!-- Thanks for opening a PR! 
If it is a significant code change, please **make sure there is an open
issue** for this.
We work best with you when we have accepted the idea first before you
code. -->

<!--- For ALL Contributors 👇 -->

## What was changed
In `shouldClaimManagerIdentity`, detect future format of manager
identity

## Why?
To facilitate clean reclaim after rollback from the next release, which
will include
#309

## Checklist
<!--- add/delete as needed --->

1. Closes <!-- add issue number here -->

2. How was this tested:
Envtest won't test our get ns permissions, so just trusting there (we
will test it in CI env because it runs on controller startup)

3. Any docs updates needed?
<!--- update README if applicable
      or point out where to update docs.temporal.io -->

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Base automatically changed from prepare-for-id-change to main April 28, 2026 20:32
@carlydf carlydf marked this pull request as ready for review April 29, 2026 19:00
@carlydf carlydf requested review from a team and jlegrone as code owners April 29, 2026 19:00
Copy link
Copy Markdown
Contributor

@jaypipes jaypipes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thank you @carlydf!

Comment thread internal/controller/execplan.go
@carlydf carlydf merged commit d56306a into main Apr 29, 2026
21 of 23 checks passed
@carlydf carlydf deleted the dedupe-controller-id2 branch April 29, 2026 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants