Commit b8d9428
authored
fix: clean up Temporal server-side versioning data on TWD deletion (#240)
- Add a finalizer to `TemporalWorkerDeployment` to run Temporal
server-side cleanup before K8s deletion
- Add a finalizer to `TemporalConnection` to prevent it from being
deleted while any TWD still references it
- On TWD deletion, set current version to unversioned, clear ramping
version, and delete registered versions
## Problem
When a `TemporalWorkerDeployment` CRD is deleted (e.g., switching back
to plain Deployments), the Temporal server retains the build ID routing
configuration. The matching service continues routing new tasks to the
deleted build ID's physical queue, while unversioned workers poll a
different physical queue. Tasks sit in `Scheduled` state indefinitely
with no errors.
A secondary race condition exists: Helm deletes both the
`TemporalConnection` and `TWD` in the same upgrade. Without the
connection, the controller cannot talk to Temporal to clean up. This is
solved by adding a finalizer to the `TemporalConnection` that blocks its
deletion until all referencing TWDs are gone.
## Changes
**`internal/controller/worker_controller.go`:**
**TWD finalizer (`temporal.io/worker-deployment-cleanup`):**
- Added to all TWD resources during normal reconciliation
- On deletion, triggers `handleDeletion()` which:
1. Sets the current version to unversioned (`BuildID: ""`) -- the
critical step that unblocks task dispatch
2. Clears any ramping version
3. Deletes all registered versions with `SkipDrainage: true`
4. Attempts to delete the deployment record itself
5. Removes the connection finalizer if no other TWDs reference it
6. Removes its own finalizer, allowing K8s to complete deletion
**TemporalConnection finalizer (`temporal.io/connection-in-use`):**
- Added to the `TemporalConnection` during normal TWD reconciliation via
`ensureConnectionFinalizer()`
- Prevents the connection from being deleted while any TWD still
references it
- Removed by `removeConnectionFinalizerIfUnused()` during TWD deletion,
after checking no other TWDs in the same namespace reference the
connection
- Guarantees the connection is always available during TWD cleanup -- no
race condition with Helm deleting both resources simultaneously
**RBAC updates:**
- Added `update;patch` verbs for `temporalconnections` (was
`get;list;watch`)
- Added `update` verb for `temporalconnections/finalizers`
## Deletion flow
```
Helm upgrade (TWD disabled)
|
v
Helm deletes TWD CRD + TemporalConnection CRD simultaneously
|
+--> TemporalConnection: has finalizer, K8s sets DeletionTimestamp but blocks deletion
|
+--> TWD: has finalizer, K8s sets DeletionTimestamp, triggers Reconcile
|
v
handleDeletion() runs:
1. Fetches TemporalConnection (guaranteed to exist via finalizer)
2. Connects to Temporal server
3. Sets current version to unversioned
4. Deletes versions
5. Removes connection finalizer (no other TWDs reference it)
6. Removes TWD finalizer
|
v
TWD deleted by K8s
|
v
TemporalConnection: no more finalizers, deleted by K8s
```
Issue #55
Closes #166
---------
Signed-off-by: Anuj Agrawal <anujagrawal380@gmail.com>1 parent 35371a9 commit b8d9428
5 files changed
Lines changed: 625 additions & 16 deletions
File tree
- helm
- temporal-worker-controller-crds/templates
- temporal-worker-controller/templates
- internal
- controller
- tests/internal
Lines changed: 16 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3753 | 3753 | | |
3754 | 3754 | | |
3755 | 3755 | | |
| 3756 | + | |
| 3757 | + | |
| 3758 | + | |
| 3759 | + | |
3756 | 3760 | | |
3757 | 3761 | | |
3758 | 3762 | | |
| |||
3947 | 3951 | | |
3948 | 3952 | | |
3949 | 3953 | | |
| 3954 | + | |
| 3955 | + | |
| 3956 | + | |
| 3957 | + | |
| 3958 | + | |
| 3959 | + | |
| 3960 | + | |
| 3961 | + | |
| 3962 | + | |
| 3963 | + | |
| 3964 | + | |
| 3965 | + | |
3950 | 3966 | | |
3951 | 3967 | | |
3952 | 3968 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
| 110 | + | |
110 | 111 | | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
117 | | - | |
118 | | - | |
119 | | - | |
120 | | - | |
121 | 112 | | |
122 | 113 | | |
123 | 114 | | |
| |||
126 | 117 | | |
127 | 118 | | |
128 | 119 | | |
| 120 | + | |
129 | 121 | | |
130 | 122 | | |
131 | 123 | | |
132 | 124 | | |
133 | 125 | | |
134 | 126 | | |
135 | | - | |
136 | | - | |
| 127 | + | |
137 | 128 | | |
| 129 | + | |
| 130 | + | |
138 | 131 | | |
| 132 | + | |
139 | 133 | | |
140 | 134 | | |
| 135 | + | |
141 | 136 | | |
142 | 137 | | |
143 | 138 | | |
144 | | - | |
| 139 | + | |
| 140 | + | |
145 | 141 | | |
146 | 142 | | |
147 | | - | |
148 | 143 | | |
149 | 144 | | |
150 | | - | |
151 | 145 | | |
152 | 146 | | |
153 | 147 | | |
| |||
0 commit comments