|
2 | 2 |
|
3 | 3 | ## Problem |
4 | 4 |
|
5 | | -If the worker controller is managing a Worker Deployment (i.e. updating its routing config), but a user makes a manual |
6 | | -change via the CLI, SDK, or gRPC API instead of via the `TemporalWorkerDeployment` CRD interface, the controller should |
7 | | -not clobber the user's change. |
| 5 | +If a worker controller is managing a Worker Deployment (i.e. the controller is updating the RoutingConfig of the Worker
| 6 | +Deployment), but the user changes something via the CLI (e.g. rolls back to the previous current version, or stops the
| 7 | +new target version from ramping because an issue was detected), the controller should not clobber what the human did.
8 | 8 |
|
9 | | -Once the user has finished their manual intervention, they need a way to hand ownership back to the controller. |
| 9 | +At some point, after this human has handled their urgent rollback, they will want to let the controller know that it is
| 10 | +authorized to resume making changes to the RoutingConfig of the Worker Deployment.
10 | 11 |
|
11 | 12 | ## Solution |
12 | 13 |
|
13 | | -The controller uses the Temporal server's `ManagerIdentity` field on Worker Deployments to coordinate exclusive |
14 | | -ownership of routing changes. |
| 14 | +_Once it is available in OSS v1.29, the controller will be able to coordinate with other users via the `ManagerIdentity` |
| 15 | +field of a Worker Deployment. This runbook will be updated when that is available and implemented by the controller._ |
15 | 16 |
|
16 | | -When `ManagerIdentity` is set on a Worker Deployment, only clients whose identity matches `ManagerIdentity` can make |
17 | | -routing changes (set current version, set ramping version). The controller's identity is visible in the |
18 | | -`managerIdentity` field of the `TemporalWorkerDeployment` status. |
| 17 | +In the meantime, the controller will watch the `LastModifierIdentity` field of a Worker Deployment to detect whether |
| 18 | +another user has made a change. If another user has changed the Worker Deployment, the controller stops making
| 19 | +changes so that the human's change is not clobbered.
19 | 20 |
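The detection rule above, combined with the `temporal.io/ignore-last-modifier` metadata key that this runbook uses to hand control back, can be sketched as a small shell function. This is only an illustration of the decision, not the controller's actual code: the function name `controller_may_modify` and its three arguments are hypothetical, and the real controller may apply further checks.

```bash
# Hypothetical helper: decide whether the controller may change routing.
# Arguments (illustrative names, not the controller's real API):
#   $1  LastModifierIdentity reported for the Worker Deployment
#   $2  the controller's own identity
#   $3  value of the temporal.io/ignore-last-modifier metadata key ("" if unset)
controller_may_modify() {
  last_modifier="$1"
  controller_identity="$2"
  ignore_flag="$3"

  # The controller itself made the last change: it keeps managing routing.
  if [ "$last_modifier" = "$controller_identity" ]; then
    echo "yes"
    return 0
  fi

  # Someone else made the last change. Resume only if the user has handed
  # control back by setting the metadata value to the string "true".
  if [ "$ignore_flag" = "true" ]; then
    echo "yes"
    return 0
  fi

  echo "no"
}
```

For example, `controller_may_modify "alice" "controller" ""` prints `no`, while the same call with `"true"` as the third argument prints `yes`.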
|
20 | | -### How the controller claims ownership |
| 21 | +Once you are done making your own changes to the Worker Deployment's current and ramping versions, and you are ready |
| 22 | +for the Worker Controller to take over, you can update the metadata to indicate that. |
21 | 23 |
|
22 | | -The first time the controller plans a routing change for a Worker Deployment (i.e. when `ManagerIdentity` is empty), |
23 | | -it calls `SetManagerIdentity` to claim ownership before applying the change. Subsequent routing changes succeed because |
24 | | -the controller's identity already matches `ManagerIdentity`. |
| 24 | +There is no Temporal server support for Worker Deployment-level metadata, so you'll have to set this value on
| 25 | +the Current Version of your Worker Deployment. |
25 | 26 |
|
26 | | -### Taking manual control |
27 | | - |
28 | | -To take manual control away from the controller, set `ManagerIdentity` to your own identity: |
| 27 | +Note: The controller decodes this metadata value as a string. Be sure to set the value to the string "true" (not the boolean true). |
29 | 28 |
|
30 | 29 | ```bash |
31 | | -temporal worker deployment manager-identity set \ |
| 30 | +temporal worker deployment update-metadata-version \ |
32 | 31 | --deployment-name $MY_DEPLOYMENT \ |
33 | | - --self |
| 32 | + --build-id $CURRENT_VERSION_BUILD_ID \ |
| 33 | + --metadata 'temporal.io/ignore-last-modifier="true"' |
34 | 34 | ``` |
35 | | - |
36 | | -The `--self` flag sets `ManagerIdentity` to the identity of the caller (auto-generated by the CLI if not explicitly |
37 | | -provided via `--identity`; similarly, the SDK uses its own auto-generated or configured identity). After this, the |
38 | | -controller's routing change attempts will fail and it will retry on a backoff until ownership is returned. |
39 | | - |
40 | | -You can then make routing changes freely (the server enforces `ManagerIdentity` for all clients, not just the |
41 | | -controller). |
42 | | - |
43 | | -### Returning ownership to the controller |
44 | | - |
45 | | -When you are done with your manual changes and want the controller to resume, clear `ManagerIdentity`: |
46 | | - |
| 35 | +Alternatively, if your CLI supports JSON input: |
47 | 36 | ```bash |
48 | | -temporal worker deployment manager-identity unset \ |
49 | | - --deployment-name $MY_DEPLOYMENT |
| 37 | +temporal worker deployment update-metadata-version \ |
| 38 | + --deployment-name $MY_DEPLOYMENT \ |
| 39 | + --build-id $CURRENT_VERSION_BUILD_ID \ |
| 40 | + --metadata-json '{"temporal.io/ignore-last-modifier":"true"}' |
| 41 | +``` |
| 42 | +In the rare case that you have a nil Current Version when you are passing back ownership, set the metadata on your Ramping Version instead:
| 43 | +```bash |
| 44 | +temporal worker deployment update-metadata-version \ |
| 45 | + --deployment-name $MY_DEPLOYMENT \ |
| 46 | + --build-id $RAMPING_VERSION_BUILD_ID \ |
| 47 | + --metadata 'temporal.io/ignore-last-modifier="true"' |
| 48 | +``` |
| 49 | +Or with JSON: |
| 50 | +```bash |
| 51 | +temporal worker deployment update-metadata-version \ |
| 52 | + --deployment-name $MY_DEPLOYMENT \ |
| 53 | + --build-id $RAMPING_VERSION_BUILD_ID \ |
| 54 | + --metadata-json '{"temporal.io/ignore-last-modifier":"true"}' |
50 | 55 | ``` |
51 | 56 |
|
52 | | -On the next reconcile, the controller will detect that `ManagerIdentity` is empty, claim it for itself, and resume |
53 | | -managing routing changes. |
| 57 | +In the even rarer case that you have nil Current Version and nil Ramping Version, you'll need to use the CLI or SDK to |
| 58 | +set a Current or Ramping Version and then do as instructed above. |
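The order of preference described in this section (Current Version first, then Ramping Version, otherwise nothing to set the metadata on) can be captured in a small helper. This is a minimal sketch: the function name `pick_metadata_build_id` is hypothetical, and it assumes an empty string stands for a nil version.

```bash
# Hypothetical helper: choose the build ID that should carry the
# temporal.io/ignore-last-modifier metadata.
#   $1  build ID of the Current Version ("" if nil)
#   $2  build ID of the Ramping Version ("" if nil)
pick_metadata_build_id() {
  current="$1"
  ramping="$2"

  if [ -n "$current" ]; then
    # Normal case: set the metadata on the Current Version.
    echo "$current"
  elif [ -n "$ramping" ]; then
    # Rare case: no Current Version, fall back to the Ramping Version.
    echo "$ramping"
  else
    # Rarer case: neither exists, so a version must be set first.
    echo "no Current or Ramping Version; set one first" >&2
    return 1
  fi
}

# Usage (not executed here): pass the result as --build-id to the
# update-metadata-version commands shown in this runbook, e.g.
#   BUILD_ID="$(pick_metadata_build_id "$CURRENT_VERSION_BUILD_ID" "$RAMPING_VERSION_BUILD_ID")"
```

Returning a non-zero status when both versions are nil lets a calling script stop before it runs the metadata update against an empty build ID.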