Commit 2c748c3
fix: resolve RBAC namespace mismatch for RHOAI deployments (opendatahub-io#625)
## Description
**Summary:**
Related to: https://redhat.atlassian.net/browse/RHOAIENG-55555
When the RHOAI operator deploys maas-controller to the
`redhat-ods-applications` namespace, the ClusterRoleBinding was hardcoded
to bind the ServiceAccount in the `opendatahub` namespace, causing RBAC
permission errors and a CrashLoopBackOff on startup.
**Root cause:**
- ClusterRoleBinding hardcoded: namespace: opendatahub
- RHOAI deploys to: redhat-ods-applications
- ServiceAccount mismatch → Forbidden errors
**Changes:**
1. Parameterized RBAC binding namespaces in ODH overlay kustomization
   - ClusterRoleBinding now uses app-namespace parameter
   - RoleBinding now uses app-namespace parameter
   - Works for both opendatahub and redhat-ods-applications
2. Improved namespace creation logic in controller
   - Check namespace existence before attempting creation
   - Handle Forbidden errors without retry (operator may pre-create)
   - Clearer error messages for troubleshooting
Fixes maas-controller CrashLoopBackOff in RHOAI 3.4ea2+ deployments.
Tested on RHOAI 3.3.0 with DSC ModelsAsServiceReady: True.
## How Has This Been Tested?
### Test Environment
- **Platform**: OpenShift 4.x (AWS)
- **Cluster**: `api.ci-ln-3pwgqm2-76ef8.aws-4.ci.openshift.org`
- **Operator**: RHOAI v3.3.0 (rhods-operator.3.3.0)
- **Policy Engine**: RHCL v1.3.1 (Red Hat Connectivity Link)
- **Deployment Mode**: Operator (RHOAI)
- **Test Date**: 2026-03-26
### Test Results
#### ✅ 1. Controller Pod Status
```bash
$ oc get pods -n redhat-ods-applications -l app=maas-controller
NAME                               READY   STATUS    RESTARTS   AGE
maas-controller-68574bd4fc-wnb8n   1/1     Running   0          100s
```
**Result**: Pod running successfully (no CrashLoopBackOff)
#### ✅ 2. Namespace Creation
```bash
$ oc get namespace models-as-a-service
NAME                  STATUS   AGE
models-as-a-service   Active   92s
```
**Result**: Namespace auto-created by controller
#### ✅ 3. RBAC Bindings Verification
```bash
$ oc get clusterrolebinding maas-controller-rolebinding -o yaml | grep -A 5 "subjects:"
subjects:
- kind: ServiceAccount
  name: maas-controller
  namespace: redhat-ods-applications
```
**Result**: ClusterRoleBinding correctly references
`redhat-ods-applications` namespace
```bash
$ oc get rolebinding -n redhat-ods-applications maas-controller-leader-election-rolebinding -o yaml | grep -A 5 "subjects:"
subjects:
- kind: ServiceAccount
  name: maas-controller
  namespace: redhat-ods-applications
```
**Result**: RoleBinding correctly references `redhat-ods-applications`
namespace
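The two `grep` checks above can also be scripted. The following is a minimal sketch, assuming the binding is fetched as JSON (for example with `-o json` rather than `-o yaml`). The helper name `service_account_namespaces` and the inline sample document are illustrative only and are not part of this PR.

```python
import json


def service_account_namespaces(binding, sa_name):
    """Return the namespaces of all ServiceAccount subjects named sa_name."""
    return [
        subject.get("namespace", "")
        for subject in binding.get("subjects") or []
        if subject.get("kind") == "ServiceAccount"
        and subject.get("name") == sa_name
    ]


# Sample mirroring the ClusterRoleBinding output shown above (assumed shape).
binding = json.loads("""
{
  "kind": "ClusterRoleBinding",
  "metadata": {"name": "maas-controller-rolebinding"},
  "subjects": [
    {"kind": "ServiceAccount",
     "name": "maas-controller",
     "namespace": "redhat-ods-applications"}
  ]
}
""")

namespaces = service_account_namespaces(binding, "maas-controller")
print(namespaces)
```

A binding that still pointed at `opendatahub` on an RHOAI cluster would show up as a mismatch between this list and the namespace the controller pod actually runs in.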
#### ✅ 4. RBAC Permissions Validation
```bash
$ oc auth can-i get namespaces --as=system:serviceaccount:redhat-ods-applications:maas-controller
yes
$ oc auth can-i list namespaces --as=system:serviceaccount:redhat-ods-applications:maas-controller
yes
$ oc auth can-i create namespaces --as=system:serviceaccount:redhat-ods-applications:maas-controller
yes
```
**Result**: All required namespace permissions granted
#### ✅ 5. Controller Logs Verification
```bash
$ oc logs -n redhat-ods-applications deployment/maas-controller --tail=10
```
**Key log entries**:
```json
{"level":"info","msg":"subscription namespace not found, attempting to create it","namespace":"models-as-a-service"}
{"level":"info","msg":"subscription namespace ready","namespace":"models-as-a-service"}
{"level":"info","msg":"watching namespace for MaaS AuthPolicy and MaaSSubscription","namespace":"models-as-a-service"}
{"level":"info","msg":"starting manager"}
{"level":"info","msg":"Starting Controller","controller":"maasmodelref"}
{"level":"info","msg":"Starting Controller","controller":"maassubscription"}
{"level":"info","msg":"Starting Controller","controller":"maasauthpolicy"}
```
**Result**:
- Namespace creation logic executed successfully
- All controllers started without errors
- No Forbidden errors in logs
#### ✅ 6. DataScienceCluster Status
```bash
$ oc get datasciencecluster default-dsc -o jsonpath='{.status.conditions[?(@.type=="ModelsAsServiceReady")]}'
```
**Output**:
```json
{
  "lastTransitionTime": "2026-03-26T17:08:48Z",
  "status": "True",
  "type": "ModelsAsServiceReady"
}
```
**Result**: ModelsAsServiceReady condition = True
```bash
$ oc get datasciencecluster default-dsc -o jsonpath='{.status.conditions[?(@.type=="Ready")]}'
```
**Output**:
```json
{
  "lastTransitionTime": "2026-03-26T17:08:48Z",
  "status": "True",
  "type": "Ready"
}
```
**Result**: Overall DSC Ready condition = True
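For repeated runs, the two condition checks can be reduced to one small function. This is a sketch only: `condition_is_true` is a hypothetical helper, and the input list simply reuses the values from the outputs above.

```python
def condition_is_true(conditions, cond_type):
    """Return True only if a condition of cond_type exists with status "True"."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            # Kubernetes reports condition status as the strings
            # "True", "False", or "Unknown", not as booleans.
            return cond.get("status") == "True"
    return False  # a missing condition is treated as not ready


# Values copied from the two jsonpath outputs above.
conditions = [
    {"lastTransitionTime": "2026-03-26T17:08:48Z", "status": "True",
     "type": "ModelsAsServiceReady"},
    {"lastTransitionTime": "2026-03-26T17:08:48Z", "status": "True",
     "type": "Ready"},
]

for cond_type in ("ModelsAsServiceReady", "Ready"):
    print(cond_type, condition_is_true(conditions, cond_type))
```

Treating an absent condition as "not ready" is a design choice of this sketch; it avoids reporting success when the operator has not yet written the condition.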
#### ✅ 7. Component Deployment Verification
```bash
$ oc get deployment -n redhat-ods-applications
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
maas-api          1/1     1            1           5m
maas-controller   1/1     1            1           5m
postgres          1/1     1            1           5m
```
**Result**: All MaaS components deployed and ready
#### ✅ 8. ClusterRole Permissions Inspection
```bash
$ oc get clusterrole maas-controller-role -o yaml
```
**Namespace permissions**:
```yaml
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - create
  - get
  - list
  - watch
```
**Result**: All required verbs present for namespace operations
### Regression Testing
#### ✅ Standalone Deployment (opendatahub namespace)
The fix maintains backward compatibility with standalone deployments
using the `opendatahub` namespace:
**Kustomize validation**:
```bash
$ cd deployment/overlays/odh
$ cat params.env
app-namespace=opendatahub
...
$ kustomize build . | grep -A 5 "kind: ClusterRoleBinding"
kind: ClusterRoleBinding
metadata:
  name: maas-controller-rolebinding
subjects:
- kind: ServiceAccount
  name: maas-controller
  namespace: opendatahub ✅
```
**Result**: Standalone deployments unaffected
### Code Quality Checks
#### ✅ Error Handling
The improved namespace creation logic includes:
- Pre-check: Verify namespace existence before attempting creation
- Permanent error detection: Forbidden errors are not retried
- Clear error messages: `"service account lacks permission to create namespace %q — either pre-create the namespace or grant 'create' on namespaces"`
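To illustrate what "permanent error detection" means here, the sketch below retries transient failures but stops on the first Forbidden response. It is not the controller's Go code, which is not shown in this description. `ForbiddenError`, `TransientError`, and `create_with_retry` are hypothetical names, and the error text only paraphrases the message quoted above. In Go, checks of this kind are commonly written with helpers such as `IsForbidden` from `k8s.io/apimachinery/pkg/api/errors`; whether this PR uses them is an assumption.

```python
import time


class ForbiddenError(Exception):
    """Stand-in for an HTTP 403 from the API server (hypothetical type)."""


class TransientError(Exception):
    """Stand-in for a retryable failure such as a timeout (hypothetical type)."""


def create_with_retry(create, namespace, attempts=3, delay=0.0):
    """Call create(namespace), retrying transient failures only.

    A ForbiddenError is treated as permanent: retrying cannot fix missing
    RBAC, so it is surfaced immediately with an actionable message.
    """
    last_err = None
    for _ in range(attempts):
        try:
            create(namespace)
            return
        except ForbiddenError as err:
            raise PermissionError(
                f"service account lacks permission to create namespace "
                f"{namespace!r}; pre-create the namespace or grant "
                f"'create' on namespaces"
            ) from err
        except TransientError as err:
            last_err = err
            time.sleep(delay)  # fixed delay keeps the sketch simple
    raise RuntimeError(f"giving up after {attempts} attempts") from last_err
```

Failing fast on Forbidden is what turns a silent retry loop into a single, readable startup error.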
#### ✅ Graceful Degradation
- If the namespace exists (pre-created by the operator): the controller proceeds without a creation attempt
- If the namespace doesn't exist and the SA has permissions: the controller creates it
- If the namespace doesn't exist and the SA lacks permissions: the controller fails with a clear, actionable error
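The three outcomes can be traced with an in-memory stand-in for the API server. This is a sketch under stated assumptions: `FakeCluster`, `NotFound`, `Forbidden`, and `ensure_namespace` are hypothetical names, and the real controller logic is Go code that is not reproduced here. How the controller handles a Forbidden response on the existence check itself is not described, so the sketch does not model it.

```python
class NotFound(Exception):
    """Stand-in for a 404 on the namespace lookup (hypothetical type)."""


class Forbidden(Exception):
    """Stand-in for a 403 on namespace creation (hypothetical type)."""


class FakeCluster:
    """In-memory stand-in for the API server, for illustration only."""

    def __init__(self, namespaces=(), can_create=True):
        self.namespaces = set(namespaces)
        self.can_create = can_create
        self.calls = []  # records each verb so the call count is visible

    def get_namespace(self, name):
        self.calls.append("get")
        if name not in self.namespaces:
            raise NotFound(name)

    def create_namespace(self, name):
        self.calls.append("create")
        if not self.can_create:
            raise Forbidden(name)
        self.namespaces.add(name)


def ensure_namespace(cluster, name):
    """Return "exists" or "created"; raise PermissionError if creation is denied."""
    try:
        cluster.get_namespace(name)  # the extra existence check
        return "exists"
    except NotFound:
        pass
    try:
        cluster.create_namespace(name)
    except Forbidden as err:
        raise PermissionError(
            f"cannot create namespace {name!r}: pre-create it or grant 'create'"
        ) from err
    return "created"


pre_created = FakeCluster(namespaces={"models-as-a-service"}, can_create=False)
print(ensure_namespace(pre_created, "models-as-a-service"), pre_created.calls)
```

In the pre-created case only the lookup is issued, so a ServiceAccount without `create` on namespaces still starts cleanly.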
### Performance Impact
- **Startup time**: No noticeable impact
- **Resource usage**: No change
- **Network calls**: +1 GET call to check namespace existence (before
create attempt)
### Summary
| Test Case | Expected | Actual | Status |
|-----------|----------|--------|--------|
| Controller pod status | Running 1/1 | Running 1/1 | ✅ PASS |
| Namespace auto-creation | Created | Created | ✅ PASS |
| ClusterRoleBinding namespace | redhat-ods-applications | redhat-ods-applications | ✅ PASS |
| RoleBinding namespace | redhat-ods-applications | redhat-ods-applications | ✅ PASS |
| RBAC permissions | get, list, create namespaces | get, list, create namespaces | ✅ PASS |
| Controller logs | No Forbidden errors | No Forbidden errors | ✅ PASS |
| DSC ModelsAsServiceReady | True | True | ✅ PASS |
| DSC Overall Ready | True | True | ✅ PASS |
| Backward compatibility | opendatahub works | opendatahub works | ✅ PASS |
**Overall Result**: ✅ **ALL TESTS PASSED**
### Deployment Timeline
```
17:05:00 - Deployment started (RHOAI operator installation)
17:07:00 - RHCL operator ready
17:08:00 - RHOAI operator ready
17:08:48 - DSC applied, MaaS controller starting
17:09:05 - maas-controller pod running
17:09:05 - models-as-a-service namespace created
17:09:05 - All reconcilers started
17:10:00 - Full deployment validated
Total deployment time: ~5 minutes
```
## Merge criteria:
- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [x] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that
the changes work
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
  * Improved namespace existence checking with enhanced error handling for permission failures.
  * Refined namespace creation retry logic to properly distinguish between recoverable and non-recoverable errors.
* **Configuration**
  * Extended namespace configuration in deployment overlays to ensure proper namespace settings for role bindings.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: Claude Sonnet 4.5 <[email protected]>
1 parent 169213a, commit 2c748c3
File tree: 2 files changed (+43, -16)
- deployment/overlays/odh
- maas-controller/cmd/manager