v1.6.0

@carlydf released this 29 Apr 23:18
· 7 commits to main since this release
02e0483

This release corresponds to Helm chart version 0.25.0. For details on versioning and how chart/app versions relate, see docs/release.md.

What's Changed

Upgrade Note

One-time pod rollout after upgrade. The build ID hash algorithm now serializes the pod spec with json.Marshal instead of spew, so the hash ignores zero-value fields introduced by future Kubernetes API versions (#290). On the first reconcile after the controller is upgraded, each TemporalWorkerDeployment is assigned a new build ID and undergoes a normal safe rollout. No manual intervention is required.
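To illustrate why JSON serialization keeps the hash stable, here is a minimal sketch. It is not the controller's actual hashing code: the `podSpecV1` and `podSpecV2` types, the `hashJSON` helper, and the choice of SHA-256 with an 8-character prefix are all assumptions made for the example. It relies on fields being tagged `omitempty`, as optional fields in Kubernetes API types typically are.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// podSpecV1 is a hypothetical stand-in for a pod spec as an older API version knows it.
type podSpecV1 struct {
	Image    string `json:"image,omitempty"`
	Replicas int    `json:"replicas,omitempty"`
}

// podSpecV2 adds one optional field, as a newer API version might.
type podSpecV2 struct {
	Image     string `json:"image,omitempty"`
	Replicas  int    `json:"replicas,omitempty"`
	NewOption string `json:"newOption,omitempty"`
}

// hashJSON returns a short hex digest of the JSON encoding of v.
// SHA-256 and the 8-character prefix are illustrative choices only.
func hashJSON(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])[:8], nil
}

func main() {
	oldHash, _ := hashJSON(podSpecV1{Image: "worker:1.0", Replicas: 3})
	newHash, _ := hashJSON(podSpecV2{Image: "worker:1.0", Replicas: 3})
	// The unset NewOption field is omitted from the JSON, so both digests match.
	fmt.Println(oldHash == newHash) // true
}
```

A dump-style serializer such as spew prints every struct field, including zero values, so the same comparison would yield different hashes once a new field appears in the type.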


Bug Fixes

  • Rate-limit back-off for DescribeWorkerDeployment (#291): When the Temporal server returns ResourceExhausted (namespace read RPS limit hit), the reconciler now backs off for 30 seconds instead of tight-looping. The condition is surfaced as ConditionProgressing=False with reason TemporalStateFetchFailed and a Rate limited message.

  • Credential rotation: API key now read live on every RPC call (#301): The API key credential closure now reads the value from the K8s Secret on every outgoing Temporal RPC, so a rotated key takes effect immediately without requiring a controller restart or a permission-denied error cycle.

  • Credential rotation: SDK client evicted on auth errors (#300): PermissionDenied and Unauthenticated errors from Temporal SDK calls now evict the cached client from the pool, so the next reconcile re-reads credentials and re-dials. Previously, a rotated API key or revoked mTLS cert caused a permanent stuck-retry loop.

  • Events RBAC fix (#292): The events RBAC marker used the wrong API group (events.k8s.io instead of the core "" group), causing Server rejected event (will not retry!) log errors in cluster-wide deployments. This is now fixed, and Helm ClusterRole generation is automated from Go markers to prevent future drift.
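The two credential-rotation fixes above (#301, #300) can be sketched with the standard library alone. Everything named here is hypothetical: `secretStore` stands in for the K8s Secret, `clientPool` for the cached SDK client pool, and the sentinel errors for the corresponding gRPC status codes. The controller's real types and signatures are not shown in these notes.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical sentinels standing in for gRPC status codes.
var (
	errPermissionDenied  = errors.New("permission denied")
	errUnauthenticated   = errors.New("unauthenticated")
	errResourceExhausted = errors.New("resource exhausted") // not an auth error
)

// secretStore is an in-memory stand-in for a Kubernetes Secret.
type secretStore struct {
	mu  sync.RWMutex
	key string
}

func (s *secretStore) Get() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.key
}

func (s *secretStore) Set(k string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.key = k
}

// liveCredentials returns a closure that re-reads the key on every call,
// instead of capturing the value once when the client is created.
func liveCredentials(s *secretStore) func() string {
	return func() string { return s.Get() }
}

// client is a placeholder for a cached SDK client.
type client struct{ id int }

// clientPool caches one client per connection name.
type clientPool struct {
	mu      sync.Mutex
	clients map[string]*client
	dials   int
}

func newClientPool() *clientPool {
	return &clientPool{clients: make(map[string]*client)}
}

// Get returns the cached client or "dials" a new one.
func (p *clientPool) Get(name string) *client {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.clients[name]; ok {
		return c
	}
	p.dials++
	c := &client{id: p.dials}
	p.clients[name] = c
	return c
}

// HandleError evicts the cached client on auth failures and reports whether it did.
func (p *clientPool) HandleError(name string, err error) bool {
	if !errors.Is(err, errPermissionDenied) && !errors.Is(err, errUnauthenticated) {
		return false
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.clients, name)
	return true
}

func main() {
	store := &secretStore{key: "old-key"}
	creds := liveCredentials(store)
	store.Set("new-key")
	fmt.Println(creds()) // prints "new-key"

	pool := newClientPool()
	first := pool.Get("conn-a")
	pool.HandleError("conn-a", fmt.Errorf("describe failed: %w", errUnauthenticated))
	fmt.Println(pool.Get("conn-a") != first) // true: a fresh client was dialed
}
```

In this sketch, a non-auth error such as `errResourceExhausted` leaves the cached client in place, which matches the separate back-off handling described for #291.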


New Features

  • Server-side versioning cleanup on TWD deletion (#240): When a TemporalWorkerDeployment is deleted (e.g., switching back to plain Deployments), the controller now resets Temporal's routing state before completing deletion. Without this, tasks could become permanently stuck in Scheduled state. A finalizer on TemporalConnection also prevents a race condition where Helm deletes both resources simultaneously and the controller loses its connection before cleanup completes.

  • CRD-level spec validation via CEL rules (#293): Key TemporalWorkerDeployment spec constraints are now enforced by the API server at apply time via x-kubernetes-validations, regardless of whether the webhook is enabled. Validated rules include: name ≤ 63 chars, progressive strategy requires steps, max 20 steps, pauseDuration ≥ 30s per step, and gate.inputFrom requires exactly one source. Two constraints that cannot be expressed in CEL (strictly increasing rampPercentage, mutually exclusive gate.input/gate.inputFrom) fall back to reconciler-level validation with a Warning event and InvalidSpec condition.

  • Accept Opaque secrets for mTLS auth (#276): The controller now accepts both kubernetes.io/tls and Opaque secret types for MutualTLSSecretRef. This unblocks setups that bundle tls.crt, tls.key, and ca.crt into a single Opaque secret (e.g., cert-manager outputs with a custom CA).
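The two constraints that #293 leaves to reconciler-level validation can be sketched as plain Go checks. The `rolloutStep` and `gate` types below are simplified, hypothetical stand-ins for the CRD's Go types, and the function names and error messages are assumptions for illustration, not the controller's own.

```go
package main

import (
	"errors"
	"fmt"
)

// rolloutStep is a hypothetical, simplified progressive rollout step.
type rolloutStep struct {
	RampPercentage int
}

// gate is a hypothetical, simplified gate config; a nil pointer means "unset".
type gate struct {
	Input     *string
	InputFrom *string
}

// validateSteps checks that rampPercentage strictly increases from step to step.
func validateSteps(steps []rolloutStep) error {
	for i := 1; i < len(steps); i++ {
		prev, cur := steps[i-1].RampPercentage, steps[i].RampPercentage
		if cur <= prev {
			return fmt.Errorf("steps[%d].rampPercentage (%d) must be greater than steps[%d].rampPercentage (%d)",
				i, cur, i-1, prev)
		}
	}
	return nil
}

// validateGate rejects specs that set both gate.input and gate.inputFrom.
func validateGate(g gate) error {
	if g.Input != nil && g.InputFrom != nil {
		return errors.New("gate.input and gate.inputFrom are mutually exclusive")
	}
	return nil
}

func main() {
	steps := []rolloutStep{{RampPercentage: 10}, {RampPercentage: 50}, {RampPercentage: 50}}
	// Reports an error because the third step repeats 50.
	fmt.Println(validateSteps(steps))

	inline, ref := "{}", "my-configmap"
	// Reports an error because both gate fields are set.
	fmt.Println(validateGate(gate{Input: &inline, InputFrom: &ref}))
}
```

In the controller, a failure of either check is surfaced as a Warning event and an InvalidSpec condition rather than a rejected apply, since these rules are not enforced by the API server.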


Deprecations

  • authProxy.enabled Helm value deprecated (#304): The authProxy.enabled option is deprecated. Use metrics.disableAuth instead. The metrics port now binds to 127.0.0.1 only when the auth proxy is explicitly enabled.

Infrastructure

  • Preparation for cluster-scoped controller identity (#308): The manager identity claim logic now recognizes the upcoming cluster-UID-prefixed identity format, enabling clean reclaim after rollback from v1.7.0 (which will include full cluster UID support).

  • Removed go.work and binaries from source control (#305): go.work, go.work.sum, and checked-in binary files removed; go vet moved to the linters workflow.


Dependency Updates

  • github.com/aws/aws-sdk-go-v2: eventstream 1.7.4→1.7.8, lambda 1.88.0→1.88.5
  • go.opentelemetry.io/otel/sdk: 1.40.0→1.43.0
  • github.com/go-jose/go-jose/v4: 4.1.3→4.1.4
  • github.com/jackc/pgx/v5: 5.7.2→5.9.2

New Contributors

Full Changelog: v1.5.2...v1.6.0