
Fix events RBAC and automate Helm ClusterRole generation from markers#292

Merged
carlydf merged 3 commits into main from generate-permissions on Apr 24, 2026
Conversation


@carlydf carlydf commented Apr 23, 2026

Problem

Two related issues:

  1. Events permission used the wrong API group.
    The +kubebuilder:rbac marker for events specified groups=events.k8s.io (the newer structured events API), but controller-runtime records v1.Event objects against the core ("") API group. As a result, the controller logged Server rejected event (will not retry!) errors when deployed in a different namespace from the TemporalWorkerDeployment CRs it manages, which is the common pattern for cluster-wide deployments.

  2. The Helm ClusterRole was hand-maintained and had drifted from the Go markers.
    The +kubebuilder:rbac markers in worker_controller.go were incomplete: workerresourcetemplates, workerresourcetemplates/status, and subjectaccessreviews were all present in the Helm chart but had no corresponding markers. This made it easy for RBAC rules to fall out of sync — which is exactly how the events bug happened in the first place.

Changes

Fix the events API group (worker_controller.go, config/rbac/role.yaml): correct groups=events.k8s.io → groups=core in the marker; the generated manifest now uses apiGroups: [""].

Add missing markers (worker_controller.go): add markers for workerresourcetemplates (get/list/watch/patch/update), workerresourcetemplates/status (get/patch/update), and authorization.k8s.io/subjectaccessreviews (create) to match what was already deployed by the Helm chart.
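For concreteness, the corrected and added markers in worker_controller.go would look roughly like the following. This is a sketch inferred from the PR description, not copied from the diff; in particular, the temporal.io API group name and the events verb list are assumptions:

```go
// Events are recorded by controller-runtime against the core ("") group,
// so the marker must name core rather than events.k8s.io:
// +kubebuilder:rbac:groups=core,resources=events,verbs=create;patch

// Markers added to match what the Helm chart already deployed
// (group name temporal.io is assumed here):
// +kubebuilder:rbac:groups=temporal.io,resources=workerresourcetemplates,verbs=get;list;watch;patch;update
// +kubebuilder:rbac:groups=temporal.io,resources=workerresourcetemplates/status,verbs=get;patch;update
// +kubebuilder:rbac:groups=authorization.k8s.io,resources=subjectaccessreviews,verbs=create
```

Running make manifests regenerates config/rbac/role.yaml from these markers via controller-gen.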

Automate Helm ClusterRole sync (hack/sync-rbac-rules.py, Makefile, helm/.../rbac.yaml): make manifests now runs a script that reads the controller-gen-generated config/rbac/role.yaml and replaces the # GENERATED RULES BEGIN / # GENERATED RULES END section in the Helm template. The Helm-templated dynamic rules (the allowedResources range) are left untouched. Going forward, adding a +kubebuilder:rbac marker and running make manifests is all that's needed to update the deployed ClusterRole.
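The splice step of such a script can be sketched in a few lines of Python. The marker strings are taken from the PR description; the function name and everything else are an assumed minimal implementation, not the actual hack/sync-rbac-rules.py:

```python
# Minimal sketch of the marker-splice step: replace the block between
# "# GENERATED RULES BEGIN" and "# GENERATED RULES END" in the Helm
# template with freshly generated rules, leaving everything outside the
# markers (e.g. the Helm-templated allowedResources range) untouched.
import re

BEGIN = "# GENERATED RULES BEGIN"
END = "# GENERATED RULES END"

def splice_generated_rules(helm_template: str, generated_rules: str) -> str:
    # DOTALL lets ".*?" span the multi-line generated section.
    pattern = re.compile(re.escape(BEGIN) + r"\n.*?" + re.escape(END), re.DOTALL)
    # A lambda replacement avoids backslash-escape surprises in the rules text.
    return pattern.sub(lambda _: f"{BEGIN}\n{generated_rules}{END}", helm_template, count=1)
```

The real script additionally has to read the rules out of the controller-gen output in config/rbac/role.yaml and re-indent them to match the chart; this sketch only shows the in-place replacement that keeps manual edits out of the generated section.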

Closes #277

@carlydf carlydf requested review from a team and jlegrone as code owners April 23, 2026 05:20

@jaypipes jaypipes left a comment


lgtm.

Might I suggest adding a check to the .github/workflows/helm-validate.yml workflow file that double-checks nobody manually updated the Helm RBAC stuff?

jobs:
  helm-check-rbac:
    name: Check Helm RBAC wasn't manually updated
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make manifests && git diff --exit-code helm/temporal-worker-controller/templates/rbac.yaml


@jaypipes jaypipes left a comment


excellent work on this, thank you @carlydf! :)

@carlydf carlydf merged commit dab1e34 into main Apr 24, 2026
18 checks passed
@carlydf carlydf deleted the generate-permissions branch April 24, 2026 17:35
shashwatsuri pushed a commit to shashwatsuri/temporal-worker-controller that referenced this pull request Apr 28, 2026


Successfully merging this pull request may close these issues:

[Bug] ClusterRole missing events patch/create permissions for cross-namespace TemporalWorkerDeployment resources (v1.4.0)
