feat: add MaaS (Models as a Service) Helm chart and integration #61
tsisodia10 wants to merge 3 commits into opendatahub-io:main from
Conversation
Add MaaS Helm chart and integrate it into the rhaii-on-xks deployment stack alongside RHCL for authenticated, rate-limited model serving on managed Kubernetes (xKS).

- charts/maas/: Helm chart with maas-controller, maas-api, PostgreSQL, Gateway, HTTPRoute, AuthPolicy and RateLimitPolicy templates
- CRDs: MaaSSubscription, ExternalModel, MaaSModelRef, MaaSAuthPolicy
- Makefile: deploy-maas/undeploy-maas targets, prerequisite checks, orphaned cert-manager detection, MaaS status output
- helmfile.yaml.gotmpl: MaaS helmfile inclusion, auto-merge of gatewayNamespaces between RHCL and MaaS
- values.yaml: maas config block (enabled, namespace, gatewayNamespace)
- scripts/cleanup.sh: RHCL/Kuadrant and MaaS resource cleanup

Depends on: KServe, RHCL (Kuadrant), Istio, cert-manager
Made-with: Cursor
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: tsisodia10. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
📝 Walkthrough

Adds a new Models-as-a-Service (MaaS) Helm chart with CRDs (ExternalModel, MaaSAuthPolicy, MaaSModelRef, MaaSSubscription), controller and API deployments, Gateway and HTTPRoute templates, Kuadrant auth/rate-limit policies, optional PostgreSQL, Azure AD config, RBAC, NetworkPolicy, and supporting templates/helpers. Integrates MaaS into the top-level helmfile and Makefile (MAAS flag, deploy-maas/undeploy-maas, status checks, lint target), updates cert-manager cleanup logic, extends scripts/cleanup.sh to remove Kuadrant/RHCL and MaaS artifacts, and adds values.yaml entries for gateway namespace aggregation.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Security & Configuration Issues
Only actionable issues listed above.

🚥 Pre-merge checks | ✅ Passed checks (2 passed)
Actionable comments posted: 17
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
scripts/cleanup.sh (2)
63-68: ⚠️ Potential issue | 🟡 Minor — Unquoted variables in kubectl patch commands: potential word splitting.

Variables `$rev` and `$ist` should be quoted to prevent word splitting if resource names contain spaces or special characters.

Proposed fix
```diff
 kubectl get istiorevision -A -o name 2>/dev/null | while read rev; do
-  kubectl patch $rev -p '{"metadata":{"finalizers":[]}}' --type=merge 2>/dev/null || true
+  kubectl patch "$rev" -p '{"metadata":{"finalizers":[]}}' --type=merge 2>/dev/null || true
 done
 kubectl get istio -A -o name 2>/dev/null | while read ist; do
-  kubectl patch $ist -p '{"metadata":{"finalizers":[]}}' --type=merge 2>/dev/null || true
+  kubectl patch "$ist" -p '{"metadata":{"finalizers":[]}}' --type=merge 2>/dev/null || true
 done
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/cleanup.sh` around lines 63 - 68, The kubectl patch loops use unquoted shell variables ($rev and $ist) which can cause word splitting for resource names with spaces/special chars; update the two kubectl patch invocations that reference $rev and $ist to quote the variables (use "$rev" and "$ist") so each resource name is passed as a single argument to kubectl, keeping the surrounding command structure (the while read loops and the 2>/dev/null || true behavior) intact.
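The hazard is easy to reproduce without a cluster; a minimal sketch (the resource name containing a space is made up for illustration):

```shell
#!/bin/sh
# An unquoted expansion of a name containing whitespace is split into
# multiple arguments before the command ever sees it.
count_args() { echo "$#"; }

name="pod/my app"
unquoted=$(count_args $name)    # word-split into 2 arguments
quoted=$(count_args "$name")    # preserved as 1 argument
echo "unquoted=$unquoted quoted=$quoted"
# -> unquoted=2 quoted=1
```

With such a name, `kubectl patch $rev` would receive two arguments and find neither resource; quoting passes the name through intact.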
1-4: ⚠️ Potential issue | 🟠 Major — Missing `set -euo pipefail` (CWE-78 shell script safety violation).

Per coding guidelines, shell scripts must use `set -euo pipefail` at script start for proper error handling and to prevent undefined variable usage.

Proposed fix
```diff
 #!/bin/bash
+set -euo pipefail
 # Cleanup script for rhaii-on-xks
 # Runs helmfile destroy and cleans up presync/template resources
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/cleanup.sh` around lines 1 - 4, Add strict error handling to the script by inserting "set -euo pipefail" immediately after the existing shebang (#!/bin/bash) in cleanup.sh; this ensures the script exits on errors, treats unset variables as errors, and fails on pipeline errors—after adding it, scan the script for any unquoted variable expansions or intentional unset-variable usages and fix them (declare default values or quote expansions) so the new strict mode doesn't break legitimate behavior.
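For readers unfamiliar with the three flags, a small sketch of what each one changes, run in child shells so the demo itself keeps going (variable names are illustrative):

```shell
#!/bin/bash
# -e: the child shell aborts at `false`, so "unreachable" never prints.
e_out=$(bash -ec 'false; echo unreachable' || true)

# -u: expanding an unset variable becomes a hard error instead of "".
u_out=$(bash -uc 'echo "value=${never_set_demo}"' 2>/dev/null || true)

# pipefail: without it, only the LAST command's status counts.
bash -c 'false | true'                  && masked=yes  || masked=no
bash -c 'set -o pipefail; false | true' && masked2=yes || masked2=no

echo "e='$e_out' u='$u_out' masked=$masked pipefail_masked=$masked2"
# -> e='' u='' masked=yes pipefail_masked=no
```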
🟡 Minor comments (6)
charts/maas/templates/namespace.yaml-6-6 (1)
6-6: ⚠️ Potential issue | 🟡 Minor — Validate `maasApi.subscriptionNamespace` at template render time.

An empty/missing value here yields invalid namespace manifests and brittle installs. Enforce it with `required` for deterministic failures.

Remediation
```diff
-  name: {{ .Values.maasApi.subscriptionNamespace }}
+  name: {{ required "maasApi.subscriptionNamespace is required" .Values.maasApi.subscriptionNamespace }}
```

As per coding guidelines, "REVIEW PRIORITIES: ... 3. Bug-prone patterns and error handling gaps".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/namespace.yaml` at line 6, The template uses {{ .Values.maasApi.subscriptionNamespace }} directly which allows empty/missing values and produces invalid Namespace manifests; update the Helm template in namespace.yaml to validate and fail fast by replacing the bare reference with the required function (e.g. required "error message" .Values.maasApi.subscriptionNamespace) so template rendering errors clearly when maasApi.subscriptionNamespace is not provided or empty; ensure the error message identifies maasApi.subscriptionNamespace so operators know which value to set.

charts/maas/templates/pull-secret.yaml-15-16 (1)
15-16: ⚠️ Potential issue | 🟡 Minor — Potential double base64 encoding of dockerConfigJson.

The template applies `b64enc` to `.Values.pullSecret.dockerConfigJson`. If the value is already base64-encoded (common when sourced from existing pull secrets or `~/.docker/config.json`), this results in double encoding and a non-functional secret. Kubernetes expects `.dockerconfigjson` to be base64-encoded JSON. If your input is raw JSON, this is correct. If your input is already base64 (from `kubectl get secret ... -o jsonpath`), this will break.

Proposed fix: Add conditional or document expected format
Option 1: Document clearly in values.yaml that raw JSON is expected:
Option 1: Document clearly in values.yaml that raw JSON is expected:

```yaml
pullSecret:
  # dockerConfigJson must be RAW JSON (not base64), e.g.:
  # {"auths":{"registry.redhat.io":{"auth":"..."}}}
  dockerConfigJson: ""
```

Option 2: Use a helper that detects and handles both formats (more complex).
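The failure mode can be shown without Helm or a cluster; a minimal sketch with a made-up registry entry (GNU coreutils `base64` flags assumed):

```shell
#!/bin/bash
# Encoding already-encoded input breaks the .dockerconfigjson round-trip.
raw='{"auths":{"registry.example.com":{"auth":"dXNlcjpwYXNz"}}}'

once=$(printf '%s' "$raw" | base64 | tr -d '\n')    # what the Secret needs
twice=$(printf '%s' "$once" | base64 | tr -d '\n')  # double-encoded: broken

# Decoding the correct value yields the JSON again; decoding the broken
# value yields base64 text, which kubelet cannot parse as docker config.
printf '%s' "$once" | base64 -d
echo
printf '%s' "$twice" | base64 -d
echo
```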
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/pull-secret.yaml` around lines 15 - 16, The template currently always applies b64enc to .Values.pullSecret.dockerConfigJson which will double-encode inputs that are already base64; update the chart so the template and values agree: either (A) document in values.yaml that pullSecret.dockerConfigJson must be RAW JSON (and keep the b64enc in the template), or (B) implement a conditional in the pull-secret.yaml template that detects if .Values.pullSecret.dockerConfigJson is already base64 (e.g. via a simple regex helper) and only calls b64enc when it is not; reference .dockerconfigjson in the template and .Values.pullSecret.dockerConfigJson and the b64enc invocation when making the change.

charts/maas/templates/postgresql/secret.yaml-25-26 (1)
25-26: ⚠️ Potential issue | 🟡 Minor — Connection URL duplicates hardcoded credentials.

The `DB_CONNECTION_URL` embeds the same hardcoded credentials. If the credentials are parameterized above, this template must also reference the same values to stay in sync.

Proposed fix
```diff
 stringData:
-  DB_CONNECTION_URL: "postgresql://maas:maas-dev-password@postgres.{{ include "maas.namespace" . }}.svc.cluster.local:5432/{{ .Values.postgresql.database }}?sslmode=disable"
+  DB_CONNECTION_URL: "postgresql://{{ .Values.postgresql.auth.username | default "maas" }}:{{ .Values.postgresql.auth.password | default (randAlphaNum 24) }}@postgres.{{ include "maas.namespace" . }}.svc.cluster.local:5432/{{ .Values.postgresql.database }}?sslmode=disable"
```

Note: Using `randAlphaNum` twice would generate different values. Consider using a shared helper or `lookup` to retrieve the generated password from the first secret.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/postgresql/secret.yaml` around lines 25 - 26, The DB_CONNECTION_URL in secret.yaml currently hardcodes credentials instead of using the parameterized values, causing duplication and drift; update DB_CONNECTION_URL to reference the same username/password source used elsewhere (e.g., the postgresql secret keys or .Values.postgresql.username/.password) rather than embedding "maas" and "maas-dev-password", or retrieve the generated password via a Helm lookup to the first secret; alternatively extract the password generation into a shared helper so both the secret data and DB_CONNECTION_URL use the same generated value (avoid calling randAlphaNum twice).

charts/maas/templates/postgresql/deployment.yaml-55-68 (1)
55-68: ⚠️ Potential issue | 🟡 Minor — Hardcoded username in health probes.
Probes use `-U maas` but `POSTGRESQL_USER` is sourced from secret. If the secret's user value changes, probes will fail.

```diff
 readinessProbe:
   exec:
     command:
       - /bin/sh
       - -c
-      - pg_isready -U maas -d {{ .Values.postgresql.database }}
+      - pg_isready -U $POSTGRESQL_USER -d $POSTGRESQL_DATABASE
```

Same for livenessProbe.
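Since the probe already runs through `/bin/sh -c`, the env-var form costs nothing at probe time; a runnable sketch, stubbing `pg_isready` with `echo` and using made-up values:

```shell
#!/bin/sh
# The probe command string is expanded by the child shell at probe
# time, so it tracks whatever the secret injected into the env.
export POSTGRESQL_USER=appuser POSTGRESQL_DATABASE=appdb

# Stand-in for the real probe; a container would run pg_isready itself.
probe='echo "pg_isready -U ${POSTGRESQL_USER} -d ${POSTGRESQL_DATABASE}"'
sh -c "$probe"
# -> pg_isready -U appuser -d appdb
```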
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/postgresql/deployment.yaml` around lines 55 - 68, Health probes hardcode "-U maas" which will break if POSTGRESQL_USER from the secret differs; update the readinessProbe and livenessProbe exec command lines (readinessProbe, livenessProbe) to use the runtime environment variable instead of a literal user by substituting the container env var (e.g. use ${POSTGRESQL_USER} or sh -c "pg_isready -U ${POSTGRESQL_USER} -d ${POSTGRESQL_DATABASE}") so the probes read the secret-provided POSTGRESQL_USER/POSTGRESQL_DATABASE values at runtime.

Makefile-198-206 (1)
198-206: ⚠️ Potential issue | 🟡 Minor — Hardcoded gateway name `maas-default-gateway` doesn't respect customization.

Line 204 hardcodes the gateway name, but `charts/maas/values.yaml:49` shows `gateway.name` is configurable. If users customize this value, the status check will fail to find the gateway.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 198 - 206, The Makefile health/status target hardcodes the gateway name "maas-default-gateway" which breaks when users set charts/maas/values.yaml gateway.name to a custom value; update the kubectl check in the target that calls kubectl get gateway maas-default-gateway -n istio-system to instead read the configured gateway name (e.g., via a variable like MAAS_GATEWAY or by parsing charts/maas/values.yaml or Helm release values) and use that variable in the kubectl command so the check honors the configured gateway.name.

charts/maas/helmfile.yaml.gotmpl-93-101 (1)
93-101: ⚠️ Potential issue | 🟡 Minor — SA patch retry loop doesn't fail when SA never appears.
If the service account isn't created within 25 seconds (5 retries × 5s), the loop exits silently without patching. The gateway pod restart on line 103 will proceed, but the new pod may fail to pull images from registry.redhat.io. This creates a delayed runtime failure with no clear indication of the root cause.
Proposed fix - fail after retries exhausted
```diff
 for i in 1 2 3 4 5; do
   if kubectl get sa "$GATEWAY_SA" -n "$GATEWAY_NS" >/dev/null 2>&1; then
     kubectl patch sa "$GATEWAY_SA" -n "$GATEWAY_NS" \
       -p '{"imagePullSecrets": [{"name": "redhat-pull-secret"}]}' 2>/dev/null && \
       echo "  Gateway SA patched" && break
   fi
   echo "  Waiting for gateway SA... ($i/5)"
   sleep 5
+  if [ "$i" -eq 5 ]; then
+    echo "WARNING: Gateway SA '$GATEWAY_SA' not found after 25s. Pull secret not configured."
+  fi
 done
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/helmfile.yaml.gotmpl` around lines 93 - 101, The retry loop that checks for the service account (variables GATEWAY_SA and GATEWAY_NS) and runs kubectl patch may exit silently after 5 retries without applying the imagePullSecrets, so change the logic in that loop to detect failure: after the for-loop, check whether the patch succeeded (e.g., test kubectl get sa "$GATEWAY_SA" -n "$GATEWAY_NS" and verify imagePullSecrets) and if not, emit a clear error via echo/process exit with non-zero status so the deployment aborts before the subsequent gateway pod restart; update the block around the kubectl patch invocation to set a success flag or use the patch command's exit code and fail fast when exhausted.
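One way to structure this so exhaustion is an explicit failure rather than a silent fall-through; a hedged sketch where `"$@"` stands in for the `kubectl get sa` probe so it runs without a cluster, and the function name is illustrative:

```shell
#!/bin/bash
# Retry a probe, then surface failure with a non-zero status the
# caller can act on (abort, warn, skip the pod restart, etc.).
retry_or_fail() {
  attempts=$1; shift
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0
    fi
    echo "  waiting... ($i/$attempts)" >&2
    # would be `sleep 5` against a real cluster
  done
  echo "ERROR: gave up after $attempts attempts" >&2
  return 1
}

retry_or_fail 3 true  && echo "patched"
retry_or_fail 3 false || echo "failed fast"
```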
🧹 Nitpick comments (9)
charts/maas/templates/policies/azure-ad-admin-only.yaml (1)
1-1: Missing validation for required Azure AD values.

When `azureAD.enabled` is true, the template should fail early if `tenantId` or `clientId` are empty, rather than producing an invalid issuerUrl or audience. Consider using Helm's `required` function.

Proposed fix
```diff
-{{- if and .Values.azureAD.enabled .Values.azureAD.adminGroup }}
+{{- if and .Values.azureAD.enabled .Values.azureAD.adminGroup }}
+{{- $tenantId := required "azureAD.tenantId is required when azureAD.enabled is true" .Values.azureAD.tenantId }}
+{{- $clientId := required "azureAD.clientId is required when azureAD.enabled is true" .Values.azureAD.clientId }}
 apiVersion: kuadrant.io/v1
 kind: AuthPolicy
```

Then use `$tenantId` and `$clientId` in the template.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/policies/azure-ad-admin-only.yaml` at line 1, Add validation so the template fails early when .Values.azureAD.enabled is true and required fields are missing: use Helm's required function to assert .Values.azureAD.tenantId and .Values.azureAD.clientId (e.g., assign to local variables like $tenantId and $clientId via required) before constructing issuerUrl or audience, and then use those $tenantId/$clientId variables in the issuerUrl and audience logic to prevent emitting invalid values when tenantId or clientId are empty.

charts/maas/templates/policies/default-deny.yaml (1)
17-22: Path prefix matching may be overly permissive.

The `startsWith("/v1/models")` predicate will match any path beginning with `/v1/models`, including unintended paths like `/v1/models-malicious` or `/v1/modelsABC`. If the intent is to protect only the `/v1/models` endpoint and its sub-paths, consider whether stricter matching is needed.

Additionally, using `limit: 0` as a deny mechanism works but relies on Kuadrant's rate limiting enforcement. Ensure the corresponding `AuthPolicy` in `default-auth.yaml` provides authentication for allowed paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/policies/default-deny.yaml` around lines 17 - 22, The deny-all-by-default predicate uses startsWith("/v1/models") which unintentionally matches paths like /v1/models-malicious; update the predicate in the deny-all-by-default policy to more strictly match only the /v1/models endpoint and its subpaths (e.g., use startsWith("/v1/models/") for subpaths and a separate equality check for exactly "/v1/models", or use a regex anchor like ^/v1/models($|/) depending on policy syntax) and verify the corresponding AuthPolicy in default-auth.yaml still grants access for the intended allowed paths.

charts/maas/crds/maas.opendatahub.io_maassubscriptions.yaml (1)
109-131: Minor schema inconsistency: `owner.users` vs `owner.groups` structure differs.

`users` is a simple string array while `groups` is an array of objects with a `name` field. This asymmetry may complicate client code. Consider aligning both as object arrays for consistency and future extensibility (e.g., adding `clusterScoped` to users).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/crds/maas.opendatahub.io_maassubscriptions.yaml` around lines 109 - 131, Align the owner.users schema to match owner.groups by changing users from an array of strings to an array of objects (e.g., UserReference) with a name property, mirroring the GroupReference structure; update the CRD definition for properties.owner.users to use items.type: object with a required - name and a name property of type string, and add any future fields such as clusterScoped to the new UserReference shape so users and groups share a consistent structure (references: owner.users, owner.groups, GroupReference, name, clusterScoped).

charts/maas/templates/policies/azure-ad-per-user-ratelimit.yaml (1)
19-20: Quote the `window` duration value.

The `window` value from `values.yaml` is `1h` (unquoted YAML string). If Kuadrant expects a string duration, this should work, but defensive quoting ensures consistent behavior regardless of how the value is overridden:

```diff
 rates:
   - limit: {{ .Values.azureAD.perUserRateLimit.limit }}
-    window: {{ .Values.azureAD.perUserRateLimit.window }}
+    window: {{ .Values.azureAD.perUserRateLimit.window | quote }}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/policies/azure-ad-per-user-ratelimit.yaml` around lines 19 - 20, The template is inserting the unquoted duration for .Values.azureAD.perUserRateLimit.window which can be misinterpreted if overridden; update the azure-ad-per-user-ratelimit.yaml template to render the window as a quoted string (use the template value .Values.azureAD.perUserRateLimit.window wrapped in quotes) so the generated YAML always provides a string duration to Kuadrant while leaving limit ( .Values.azureAD.perUserRateLimit.limit ) unchanged.

charts/maas/crds/maas.opendatahub.io_maasauthpolicies.yaml (1)
131-138: Consider adding enum constraint for `accepted` and `enforced` status fields.

These fields describe condition states but lack validation. Without constraints, any string value is accepted, potentially causing confusion or parsing issues for consumers expecting `True`, `False`, or `Unknown`.

```yaml
accepted:
  description: Accepted reports whether the AuthPolicy has been accepted
  enum:
    - "True"
    - "False"
    - Unknown
  type: string
enforced:
  description: Enforced reports whether the AuthPolicy is enforced
  enum:
    - "True"
    - "False"
    - Unknown
  type: string
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/crds/maas.opendatahub.io_maasauthpolicies.yaml` around lines 131 - 138, The CRD's status fields accepted and enforced currently allow any string; update the maas.opendatahub.io_maasauthpolicies.yaml definition for the accepted and enforced properties to add an enum constraint with the allowed values "True", "False", and "Unknown" while keeping type: string and preserving the existing descriptions so consumers can validate and parse these condition states reliably.

Makefile (3)
153-158: Duplicate CRD deletion between `undeploy-maas` and `cleanup.sh`.

Line 115 deletes MaaS CRDs explicitly, then Line 158 invokes `cleanup.sh` which also deletes CRDs matching `maas.opendatahub.io` (see `scripts/cleanup.sh:144-147`). While `--ignore-not-found` prevents errors, this redundancy creates maintenance confusion about ownership of teardown responsibilities.

Consider either:
- Removing CRD deletion from `undeploy-maas` and letting `cleanup.sh` handle it, or
- Adding a flag to `cleanup.sh` to skip MaaS CRDs when called from `undeploy`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 153 - 158, The Makefile currently duplicates CRD deletion: the `undeploy-maas` target removes MaaS CRDs while the `undeploy` target calls ./scripts/cleanup.sh which also deletes `maas.opendatahub.io` CRDs; pick one approach and remove redundancy by either removing the explicit CRD deletion from the `undeploy-maas` target (leave full teardown to `cleanup.sh`) or modify `scripts/cleanup.sh` to accept a flag (e.g., --skip-maas) and update the `undeploy` target to call `./scripts/cleanup.sh --skip-maas -y`; update references to `undeploy-maas` and `cleanup.sh` accordingly and ensure `--ignore-not-found` behavior is preserved.
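The second option is only a few lines of argument parsing; a sketch, where the `--skip-maas` flag name and the `cleanup` wrapper function are illustrative:

```shell
#!/bin/bash
# Opt-out flag so CRD teardown keeps a single owner: callers that
# already removed MaaS resources pass --skip-maas.
cleanup() {
  skip_maas=false
  for arg in "$@"; do
    case "$arg" in
      --skip-maas) skip_maas=true ;;
    esac
  done
  if [ "$skip_maas" = true ]; then
    echo "skipping MaaS CRDs"
  else
    echo "deleting maas.opendatahub.io CRDs"
    # kubectl get crd -o name | grep 'maas\.opendatahub\.io' | xargs -r kubectl delete
  fi
}

cleanup -y               # -> deleting maas.opendatahub.io CRDs
cleanup -y --skip-maas   # -> skipping MaaS CRDs
```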
100-110: Prerequisite CRD checks duplicated in helmfile presync hook.

Lines 103-108 check for KServe, Kuadrant, and Gateway API CRDs. The same checks exist in `charts/maas/helmfile.yaml.gotmpl` lines 37-52. This duplication means changes must be synchronized in two places.

Not blocking, but consider consolidating prerequisite validation in one location.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 100 - 110, The prerequisite CRD checks are duplicated between the Makefile target deploy-maas and the helmfile presync hook in charts/maas/helmfile.yaml.gotmpl; refactor by moving the CRD validation into a single reusable place (either a new make target like validate-prereqs or a small shell script) and have both deploy-maas and the helmfile presync hook invoke that single validator; update references in deploy-maas (target name deploy-maas) and remove the duplicate checks from charts/maas/helmfile.yaml.gotmpl so maintenance is centralized.
64-76: Unquoted shell variables risk word-splitting/globbing.

Several shell variables lack proper quoting. While current usage may be safe, this violates defensive shell scripting practices and can break with unexpected input.
Proposed fix
```diff
 deploy-cert-manager: check-kubeconfig clear-cache
 	@# Detect orphaned cert-manager pods not managed by Helm
-	@if kubectl get pods -n cert-manager --no-headers 2>/dev/null | grep -q .; then \
-		if ! helm list -n cert-manager-operator 2>/dev/null | grep -q cert-manager; then \
+	@if kubectl get pods -n cert-manager --no-headers 2>/dev/null | grep -q '.'; then \
+		if ! helm list -n cert-manager-operator 2>/dev/null | grep -q 'cert-manager'; then \
 			echo "WARNING: cert-manager pods exist but no Helm release found."; \
 			echo "         This is likely an orphaned install. Cleaning up..."; \
 			kubectl delete deployment --all -n cert-manager --ignore-not-found 2>/dev/null || true; \
```

As per coding guidelines: "Quote shell variables in targets".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 64 - 76, In the deploy-cert-manager Makefile target, quote all shell variable and command substitutions to prevent word-splitting/globbing: locate the kubectl/helm/grep/xargs pipeline (the lines calling helm list -n cert-manager-operator, kubectl get pods -n cert-manager, kubectl get clusterrole,clusterrolebinding -o name | grep -i cert-manager | xargs -r kubectl delete, and kubectl delete sa --all -n cert-manager) and change any unquoted expansions or command substitutions to quoted forms (e.g. "$VAR" or "$(cmd)"), quote literal patterns passed to grep if they come from variables, and ensure xargs invocation is robust (use -- where appropriate) so all variable content is safely handled.

charts/maas/helmfile.yaml.gotmpl (1)
61-64: No validation that `./crds/` directory exists before applying.

Line 63 applies CRDs from `./crds/` but doesn't verify the directory exists. If the chart is extracted incorrectly or the directory is missing, the error message won't clearly indicate the problem.

```diff
 # Apply MaaS CRDs via server-side apply (avoids size limits)
 echo "Applying MaaS CRDs..."
+if [ ! -d "./crds/" ] || [ -z "$(ls -A ./crds/)" ]; then
+  echo "ERROR: ./crds/ directory missing or empty"
+  exit 1
+fi
 kubectl apply -f ./crds/ --server-side
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/helmfile.yaml.gotmpl` around lines 61 - 64, Add a check that the ./crds/ directory exists before running the kubectl apply; specifically, before the kubectl apply -f ./crds/ --server-side line in the helmfile template, test for the directory (e.g., using a shell test like [ -d ./crds ] or [[ -d ./crds ]]) and if it does not exist print a clear error referencing "crds" and exit non‑zero so the failure message shows the missing directory rather than a generic kubectl error.
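The guard is cheap to exercise locally; a sketch using a temp directory in place of the chart's `./crds/` (the `check_crds_dir` helper name is made up):

```shell
#!/bin/sh
# Fail with a pointed message before kubectl ever runs.
check_crds_dir() {
  if [ ! -d "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]; then
    echo "ERROR: $1 missing or empty" >&2
    return 1
  fi
}

tmp=$(mktemp -d)
check_crds_dir "$tmp/crds" || echo "guard tripped"   # dir absent
mkdir -p "$tmp/crds" && : > "$tmp/crds/crd.yaml"
check_crds_dir "$tmp/crds" && echo "ok to apply"
# then: kubectl apply -f "$tmp/crds" --server-side
```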
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/maas/crds/maas.opendatahub.io_externalmodels.yaml`:
- Around line 58-70: The ClusterRole used by the maas-controller lacks
permissions to read Secrets referenced by ExternalModel.spec.credentialRef.name,
causing reconciliation Forbidden errors; update the maas-controller ClusterRole
(the ClusterRole resource in the maas-controller template) to include
core/secrets with verbs ["get","list","watch"] (and optionally "create","update"
only if controller writes secrets) so the controller can read the CredentialRef
Secret during reconciliation for ExternalModel. Ensure the new rule is added
alongside existing rules in the ClusterRole manifest so RBAC covers
credentialRef access.
In `@charts/maas/crds/maas.opendatahub.io_maasauthpolicies.yaml`:
- Around line 114-116: The CEL validation rule under x-kubernetes-validations
(the rule string currently using size(self.groups) > 0 || size(self.users) > 0)
will runtime-fail when optional fields are omitted; update that rule to guard
each optional field with has() before calling size()/length (e.g., check
has(self.groups) before accessing self.groups.size() and likewise for
self.users) so the expression never calls size() on null and short-circuits
correctly.
In `@charts/maas/helmfile.yaml.gotmpl`:
- Around line 76-84: The post-sync script currently masks failures by appending
"|| echo '... not ready'" to the kubectl rollout status calls for deployments
postgres, maas-controller, and maas-api (while set -e is enabled), so rollout
failures do not cause the script to exit; update the kubectl rollout status
invocations in the postsync block to fail fast for critical components—either
remove the "|| echo ..." fallbacks so the command fails and set -e will
terminate the script, or explicitly capture each command's exit code and call
exit 1 (or aggregate errors and exit non‑zero) when rollout for any of
deployment/postgres, deployment/maas-controller, or deployment/maas-api fails.
- Around line 85-103: Replace hardcoded gateway and service-account names with
templated values: define a GATEWAY_NAME using the chart value
(.Values.gateway.name) with the same default currently assumed
("maas-default-gateway") and use that variable wherever "maas-default-gateway"
is referenced (kubectl get gateway and label selector), and derive GATEWAY_SA
from the gateway name (e.g. "${GATEWAY_NAME}-istio" or use the helper that
builds the SA name) instead of the hardcoded "maas-default-gateway-istio"; keep
existing GATEWAY_NS usage and the patch/delete logic but reference GATEWAY_NAME
and GATEWAY_SA variables so the postsync hook works with customized
.Values.gateway.name.
In `@charts/maas/templates/httproute.yaml`:
- Around line 34-35: The chart hardcodes the X-MaaS-Group header which bypasses
authorization; remove the static header entry (X-MaaS-Group: '["maas-admins"]')
and implement conditional logic in the httproute template so that when
azureAD.enabled=true the header is not set to a fixed value but instead is
sourced from the incoming JWT/group claims (or omitted so maas-api reads the
JWT) and when Azure AD is disabled the header can be set from the configured
adminGroup value in values.yaml; update the template around the X-MaaS-Group
header and any related blocks so maas-api cannot be granted admin rights by a
hardcoded header.
- Around line 27-33: The template currently emits a static header
X-MaaS-Username ("azure-ad-user" or "anonymous") which hides the real user; when
.Values.azureAD.enabled is true, replace the static value with the authenticated
principal claim (e.g., preferred_username or sub) by configuring the HTTPRoute
to set X-MaaS-Username from the auth identity (use Kuadrant/ingress-gateway
identity features or a request header filter that maps
request.auth.principal.claims["preferred_username"] ||
request.auth.principal.claims["sub"] into the header); keep the fallback to
"anonymous" when no identity is present and apply the same change for the later
block that sets X-MaaS-Username (the other occurrence).
In `@charts/maas/templates/maas-api/clusterrole.yaml`:
- Around line 28-30: The ClusterRole currently grants cluster-wide secrets CRUD
which is overly broad; remove the secrets rule from the ClusterRole in
charts/maas/templates/maas-api/clusterrole.yaml and instead create a namespaced
Role named maas-api-secrets and a RoleBinding that binds that Role to the
maas-api ServiceAccount in the chart namespace (use the include "maas.namespace"
template for namespace resolution); ensure the new Role contains the secrets
verbs (create, delete, get, list, patch, update, watch) and the RoleBinding
references kind: Role, name: maas-api-secrets and the subject ServiceAccount
maas-api.
In `@charts/maas/templates/maas-api/service.yaml`:
- Around line 11-13: The Service selector in service.yaml uses only
app.kubernetes.io/name which is too broad and allows label spoofing; update the
selector on the Service resource to include additional deterministic labels (for
example add app.kubernetes.io/instance or app.kubernetes.io/managed-by and
app.kubernetes.io/component/release labels) so it exactly matches the Pods
created by the maas-api Deployment/StatefulSet (match the labels used in the
maas-api Pod template), e.g. include the same app.kubernetes.io/instance/
component keys used by your chart (replace with {{ .Release.Name }} or the
component value used in the Deployment) so no other pods in the namespace can be
selected.
In `@charts/maas/templates/networkpolicy.yaml`:
- Around line 9-31: The NetworkPolicy for the maas-api pod (podSelector
matchLabels app.kubernetes.io/name: maas-api) currently allows ingress only from
Authorino pods in kuadrant-system and kuadrant-operators, which blocks traffic
from the Istio gateway; update the ingress stanza in the NetworkPolicy to also
allow traffic from the istio-system namespace by adding a namespaceSelector
matchLabels entry for kubernetes.io/metadata.name: istio-system (in the same
ingress.from list which contains the existing kuadrant entries) so that Envoy in
istio-system can reach maas-api:8080.
In `@charts/maas/templates/policies/azure-ad-admin-only.yaml`:
- Around line 22-28: The authorization rule using admin-group-check with
patternMatching that selects auth.identity.groups and value {{
.Values.azureAD.adminGroup | quote }} assumes the JWT contains a groups claim
which Azure AD does not emit by default and will omit when a user has >200
groups; update the chart/template to document this prerequisite and recommended
fixes: add README/values documentation that operators must set
groupMembershipClaims in the app registration (or use app roles), and update
policy guidance to either (a) prefer app roles (roles claim) or (b) implement
Graph API fallback when hasgroups/_claim_sources overage is present; reference
admin-group-check, selector auth.identity.groups, and .Values.azureAD.adminGroup
in the documentation so maintainers know exactly which policy and value are
affected.
In `@charts/maas/templates/policies/default-auth.yaml`:
- Around line 18-27: When azureAD.enabled is true the template must validate
.Values.azureAD.tenantId and .Values.azureAD.clientId before emitting the
issuerUrl and audience to avoid malformed URLs/empty audiences; update the
default-auth.yaml helm template to use Helm's required function (or equivalent
conditional checks) inside the azure-ad block so that required("message",
.Values.azureAD.tenantId) and required("message", .Values.azureAD.clientId) are
called (or the block is skipped/fails fast) and only then construct issuerUrl
"https://login.microsoftonline.com/{{ .Values.azureAD.tenantId }}/v2.0" and the
audiences list with {{ .Values.azureAD.clientId }}.
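A sketch of the fail-fast block using Helm's `required` function; the indentation and the surrounding AuthPolicy structure are assumptions, not copied from the chart:

```yaml
{{- if .Values.azureAD.enabled }}
# Fail template rendering early instead of emitting a malformed policy.
{{- $tenantId := required "azureAD.tenantId must be set when azureAD.enabled=true" .Values.azureAD.tenantId }}
{{- $clientId := required "azureAD.clientId must be set when azureAD.enabled=true" .Values.azureAD.clientId }}
    jwt:
      issuerUrl: "https://login.microsoftonline.com/{{ $tenantId }}/v2.0"
    audiences:
      - {{ $clientId | quote }}
{{- end }}
```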
In `@charts/maas/templates/policies/maas-api-auth.yaml`:
- Around line 1-20: The rendered AuthPolicy (metadata.name: maas-api-allow)
currently enables anonymous access via spec.rules.authentication.anonymous and
an empty authorization block; change it to fail-closed when
.Values.azureAD.enabled is false by either not emitting an allow-anonymous
policy or by emitting a denying policy for maas-api-route (e.g., remove
spec.rules.authentication.anonymous and set authorization to explicitly deny or
require authentication), and add an opt-in values flag (e.g.,
.Values.api.allowAnonymous) if you need to permit anonymous behavior; update the
template around the AuthPolicy logic and the spec.rules sections (metadata.name
maas-api-allow, spec.targetRef/name maas-api-route) accordingly so anonymous
access is not granted by default.
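One possible fail-closed shape, sketched with the hypothetical `.Values.api.allowAnonymous` flag the comment proposes (the Kuadrant `apiVersion` varies by release, and the deny-all Rego rule is the pattern Kuadrant documents for explicit denial):

```yaml
{{- if and (not .Values.azureAD.enabled) (not .Values.api.allowAnonymous) }}
apiVersion: kuadrant.io/v1  # may be v1beta2/v1beta3 on older Kuadrant releases
kind: AuthPolicy
metadata:
  name: maas-api-deny
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: maas-api-route
  rules:
    authorization:
      deny-all:
        opa:
          rego: "allow = false"
{{- end }}
```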
In `@charts/maas/templates/postgresql/deployment.yaml`:
- Around line 27-70: The container lacks a securityContext which allows root,
writable rootfs and all capabilities; add a container-level securityContext
under the postgresql container (name: postgresql) that sets runAsUser: 26 and
runAsNonRoot: true, disallows privilege escalation (allowPrivilegeEscalation:
false), drops all capabilities (capabilities.drop: ["ALL"]), keeps
readOnlyRootFilesystem false so /var/lib/pgsql/data remains writable, and add a
seccompProfile (type: RuntimeDefault) and an appropriate runAsGroup if needed;
apply these settings next to the existing
ports/env/volumeMounts/readinessProbe/livenessProbe entries to harden the
container without breaking PostgreSQL writes to the pgdata mount.
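The suggested hardening as a sketch; UID/GID 26 is the conventional `postgres` user in Red Hat PostgreSQL images and should be confirmed against the image actually in use:

```yaml
containers:
  - name: postgresql
    securityContext:
      runAsUser: 26
      runAsGroup: 26
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: false   # /var/lib/pgsql/data must remain writable
      capabilities:
        drop: ["ALL"]
      seccompProfile:
        type: RuntimeDefault
```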
In `@charts/maas/templates/postgresql/secret.yaml`:
- Around line 11-14: The secret currently hardcodes POSTGRESQL_USER and
POSTGRESQL_PASSWORD; update the template to read values from chart values (e.g.,
postgresql.auth.username, postgresql.auth.password,
postgresql.auth.existingSecret) and fall back to generating a random password
when postgresql.auth.password is empty, and support mounting values from an
existing secret when postgresql.auth.existingSecret is set; ensure
POSTGRESQL_DATABASE continues to use .Values.postgresql.database and that the
template uses those variables (POSTGRESQL_USER, POSTGRESQL_PASSWORD,
POSTGRESQL_DATABASE) instead of hardcoded strings so production can reference an
external secret and dev can get a generated password.
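A sketch of the parameterized Secret, assuming new `postgresql.auth.*` values are introduced as described:

```yaml
{{- /* Skip rendering entirely when an external secret is supplied;
       the Deployment should then reference postgresql.auth.existingSecret. */}}
{{- if not .Values.postgresql.auth.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
  name: postgresql-auth
stringData:
  POSTGRESQL_USER: {{ .Values.postgresql.auth.username | default "maas" | quote }}
  POSTGRESQL_PASSWORD: {{ .Values.postgresql.auth.password | default (randAlphaNum 24) | quote }}
  POSTGRESQL_DATABASE: {{ .Values.postgresql.database | quote }}
{{- end }}
```

Note that `randAlphaNum` yields a new value on every render, so stable upgrades need either an explicit password or a `lookup`-based reuse of the already-created Secret.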
In `@charts/maas/values.yaml`:
- Around line 14-17: Replace all occurrences of unpinned image tags by setting
explicit version tags or digests for the three image values instead of `latest`:
update maasApi.image.tag, maasController.image.tag, and postgresql.image.tag
(and any other image.tag fields around lines referenced) to a fixed version
string or immutable digest (e.g., vX.Y.Z or sha256:...) so deployments are
reproducible and rollbacks reliable; ensure pullPolicy is adjusted if using
digests (pullPolicy: IfNotPresent or Always as appropriate).
- Around line 60-74: The current postgresql block uses ephemeral storage
(postgresql.storage: emptyDir) which causes data loss; update the values schema
for the postgresql section to support a storage.type option (allowed values like
emptyDir or persistentVolumeClaim) and add a nested pvc object (fields:
storageClassName and size) so consumers can choose persistent storage; modify
the postgresql handling to read storage.type and, when persistentVolumeClaim is
selected, populate the chart's PVC template with pvc.storageClassName and
pvc.size, leaving emptyDir as the default dev option and document the new
postgresql.storage.* fields.
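The values schema could take roughly this shape (field names are suggestions, matching the comment's proposal):

```yaml
postgresql:
  storage:
    type: emptyDir            # dev default; or: persistentVolumeClaim
    pvc:
      storageClassName: ""    # empty string uses the cluster default class
      size: 10Gi
```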
In `@scripts/cleanup.sh`:
- Around line 107-112: The cleanup script hardcodes the gateway namespace
(istio-system) when deleting maas resources; modify scripts/cleanup.sh to accept
a --gateway-namespace argument (e.g., GATEWAY_NAMESPACE) or auto-discover the
namespace by querying for the maas gateway (use kubectl get gateway
--all-namespaces | grep maas-default-gateway to extract the namespace) and then
use that variable in the kubectl delete commands that reference the gateway and
authpolicy/tokenratelimitpolicy (the lines invoking kubectl delete gateway
maas-default-gateway -n istio-system and kubectl delete
authpolicy,tokenratelimitpolicy -n istio-system). Ensure the new logic falls
back to istio-system if no namespace is provided or discovered.
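A sketch of that resolution order in shell; variable names mirror the comment, and the delete commands are left as comments since they need a live cluster:

```shell
#!/bin/bash
# Resolve the gateway namespace: an explicit GATEWAY_NAMESPACE wins,
# then auto-discovery of maas-default-gateway, then the istio-system fallback.
discovered=$(kubectl get gateway --all-namespaces 2>/dev/null \
  | awk '/maas-default-gateway/ {print $1; exit}' || true)
GATEWAY_NAMESPACE="${GATEWAY_NAMESPACE:-${discovered:-istio-system}}"
echo "deleting MaaS gateway resources in namespace: ${GATEWAY_NAMESPACE}"
# kubectl delete gateway maas-default-gateway -n "$GATEWAY_NAMESPACE" --ignore-not-found
# kubectl delete authpolicy,tokenratelimitpolicy --all -n "$GATEWAY_NAMESPACE" --ignore-not-found
```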
---
Outside diff comments:
In `@scripts/cleanup.sh`:
- Around line 63-68: The kubectl patch loops use unquoted shell variables ($rev
and $ist) which can cause word splitting for resource names with spaces/special
chars; update the two kubectl patch invocations that reference $rev and $ist to
quote the variables (use "$rev" and "$ist") so each resource name is passed as a
single argument to kubectl, keeping the surrounding command structure (the while
read loops and the 2>/dev/null || true behavior) intact.
- Around line 1-4: Add strict error handling to the script by inserting "set
-euo pipefail" immediately after the existing shebang (#!/bin/bash) in
cleanup.sh; this ensures the script exits on errors, treats unset variables as
errors, and fails on pipeline errors—after adding it, scan the script for any
unquoted variable expansions or intentional unset-variable usages and fix them
(declare default values or quote expansions) so the new strict mode doesn't
break legitimate behavior.
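A minimal header sketch showing strict mode plus the kind of default-setting that keeps `set -u` from breaking optional variables (the variable names are illustrative, not from the script):

```shell
#!/bin/bash
# Strict mode: abort on command failure (-e), error on unset
# variables (-u), and surface failures from any pipeline stage (pipefail).
set -euo pipefail

# With -u active, optional variables need explicit defaults before first use:
: "${DRY_RUN:=false}"
: "${MAAS_NAMESPACE:=opendatahub}"

echo "dry-run=${DRY_RUN} namespace=${MAAS_NAMESPACE}"
```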
---
Minor comments:
In `@charts/maas/helmfile.yaml.gotmpl`:
- Around line 93-101: The retry loop that checks for the service account
(variables GATEWAY_SA and GATEWAY_NS) and runs kubectl patch may exit silently
after 5 retries without applying the imagePullSecrets, so change the logic in
that loop to detect failure: after the for-loop, check whether the patch
succeeded (e.g., test kubectl get sa "$GATEWAY_SA" -n "$GATEWAY_NS" and verify
imagePullSecrets) and if not, emit a clear error via echo/process exit with
non-zero status so the deployment aborts before the subsequent gateway pod
restart; update the block around the kubectl patch invocation to set a success
flag or use the patch command's exit code and fail fast when exhausted.
In `@charts/maas/templates/namespace.yaml`:
- Line 6: The template uses {{ .Values.maasApi.subscriptionNamespace }} directly
which allows empty/missing values and produces invalid Namespace manifests;
update the Helm template in namespace.yaml to validate and fail fast by
replacing the bare reference with the required function (e.g. required "error
message" .Values.maasApi.subscriptionNamespace) so template rendering errors
clearly when maasApi.subscriptionNamespace is not provided or empty; ensure the
error message identifies maasApi.subscriptionNamespace so operators know which
value to set.
In `@charts/maas/templates/postgresql/deployment.yaml`:
- Around line 55-68: Health probes hardcode "-U maas" which will break if
POSTGRESQL_USER from the secret differs; update the readinessProbe and
livenessProbe exec command lines (readinessProbe, livenessProbe) to use the
runtime environment variable instead of a literal user by substituting the
container env var (e.g. use ${POSTGRESQL_USER} or sh -c "pg_isready -U
${POSTGRESQL_USER} -d ${POSTGRESQL_DATABASE}") so the probes read the
secret-provided POSTGRESQL_USER/POSTGRESQL_DATABASE values at runtime.
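The probes could be rewritten roughly as follows, wrapping `pg_isready` in a shell so the container env vars expand at runtime instead of hardcoding `-U maas`:

```yaml
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - pg_isready -U "$POSTGRESQL_USER" -d "$POSTGRESQL_DATABASE"
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - pg_isready -U "$POSTGRESQL_USER" -d "$POSTGRESQL_DATABASE"
```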
In `@charts/maas/templates/postgresql/secret.yaml`:
- Around line 25-26: The DB_CONNECTION_URL in secret.yaml currently hardcodes
credentials instead of using the parameterized values, causing duplication and
drift; update DB_CONNECTION_URL to reference the same username/password source
used elsewhere (e.g., the postgresql secret keys or
.Values.postgresql.username/.password) rather than embedding "maas" and
"maas-dev-password", or retrieve the generated password via a Helm lookup to the
first secret; alternatively extract the password generation into a shared helper
so both the secret data and DB_CONNECTION_URL use the same generated value
(avoid calling randAlphaNum twice).
In `@charts/maas/templates/pull-secret.yaml`:
- Around line 15-16: The template currently always applies b64enc to
.Values.pullSecret.dockerConfigJson which will double-encode inputs that are
already base64; update the chart so the template and values agree: either (A)
document in values.yaml that pullSecret.dockerConfigJson must be RAW JSON (and
keep the b64enc in the template), or (B) implement a conditional in the
pull-secret.yaml template that detects if .Values.pullSecret.dockerConfigJson is
already base64 (e.g. via a simple regex helper) and only calls b64enc when it is
not; reference .dockerconfigjson in the template and
.Values.pullSecret.dockerConfigJson and the b64enc invocation when making the
change.
In `@Makefile`:
- Around line 198-206: The Makefile health/status target hardcodes the gateway
name "maas-default-gateway" which breaks when users set charts/maas/values.yaml
gateway.name to a custom value; update the kubectl check in the target that
calls kubectl get gateway maas-default-gateway -n istio-system to instead read
the configured gateway name (e.g., via a variable like MAAS_GATEWAY or by
parsing charts/maas/values.yaml or Helm release values) and use that variable in
the kubectl command so the check honors the configured gateway.name.
---
Nitpick comments:
In `@charts/maas/crds/maas.opendatahub.io_maasauthpolicies.yaml`:
- Around line 131-138: The CRD's status fields accepted and enforced currently
allow any string; update the maas.opendatahub.io_maasauthpolicies.yaml
definition for the accepted and enforced properties to add an enum constraint
with the allowed values "True", "False", and "Unknown" while keeping type:
string and preserving the existing descriptions so consumers can validate and
parse these condition states reliably.
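The constrained properties would look like this (the existing descriptions are kept in the CRD, omitted here for brevity):

```yaml
accepted:
  type: string
  enum: ["True", "False", "Unknown"]
enforced:
  type: string
  enum: ["True", "False", "Unknown"]
```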
In `@charts/maas/crds/maas.opendatahub.io_maassubscriptions.yaml`:
- Around line 109-131: Align the owner.users schema to match owner.groups by
changing users from an array of strings to an array of objects (e.g.,
UserReference) with a name property, mirroring the GroupReference structure;
update the CRD definition for properties.owner.users to use items.type: object
with a required - name and a name property of type string, and add any future
fields such as clusterScoped to the new UserReference shape so users and groups
share a consistent structure (references: owner.users, owner.groups,
GroupReference, name, clusterScoped).
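A sketch of the aligned schema; `clusterScoped` is included only as the kind of future field the comment anticipates:

```yaml
owner:
  properties:
    users:
      type: array
      items:
        type: object   # UserReference, mirroring GroupReference
        required:
          - name
        properties:
          name:
            type: string
          clusterScoped:   # speculative future field
            type: boolean
```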
In `@charts/maas/helmfile.yaml.gotmpl`:
- Around line 61-64: Add a check that the ./crds/ directory exists before
running the kubectl apply; specifically, before the kubectl apply -f ./crds/
--server-side line in the helmfile template, test for the directory (e.g., using
a shell test like [ -d ./crds ] or [[ -d ./crds ]]) and if it does not exist
print a clear error referencing "crds" and exit non‑zero so the failure message
shows the missing directory rather than a generic kubectl error.
In `@charts/maas/templates/policies/azure-ad-admin-only.yaml`:
- Line 1: Add validation so the template fails early when
.Values.azureAD.enabled is true and required fields are missing: use Helm's
required function to assert .Values.azureAD.tenantId and
.Values.azureAD.clientId (e.g., assign to local variables like $tenantId and
$clientId via required) before constructing issuerUrl or audience, and then use
those $tenantId/$clientId variables in the issuerUrl and audience logic to
prevent emitting invalid values when tenantId or clientId are empty.
In `@charts/maas/templates/policies/azure-ad-per-user-ratelimit.yaml`:
- Around line 19-20: The template is inserting the unquoted duration for
.Values.azureAD.perUserRateLimit.window which can be misinterpreted if
overridden; update the azure-ad-per-user-ratelimit.yaml template to render the
window as a quoted string (use the template value
.Values.azureAD.perUserRateLimit.window wrapped in quotes) so the generated YAML
always provides a string duration to Kuadrant while leaving limit (
.Values.azureAD.perUserRateLimit.limit ) unchanged.
In `@charts/maas/templates/policies/default-deny.yaml`:
- Around line 17-22: The deny-all-by-default predicate uses
startsWith("/v1/models") which unintentionally matches paths like
/v1/models-malicious; update the predicate in the deny-all-by-default policy to
more strictly match only the /v1/models endpoint and its subpaths (e.g., use
startsWith("/v1/models/") for subpaths and a separate equality check for exactly
"/v1/models", or use a regex anchor like ^/v1/models($|/) depending on policy
syntax) and verify the corresponding AuthPolicy in default-auth.yaml still
grants access for the intended allowed paths.
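Assuming the policy's predicate field accepts CEL over `request.path` (as Kuadrant `when` conditions do), the stricter form would be:

```yaml
# Matches exactly /v1/models plus its subpaths; /v1/models-malicious no longer matches.
predicate: request.path == "/v1/models" || request.path.startsWith("/v1/models/")
```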
In `@Makefile`:
- Around line 153-158: The Makefile currently duplicates CRD deletion: the
`undeploy-maas` target removes MaaS CRDs while the `undeploy` target calls
./scripts/cleanup.sh which also deletes `maas.opendatahub.io` CRDs; pick one
approach and remove redundancy by either removing the explicit CRD deletion from
the `undeploy-maas` target (leave full teardown to `cleanup.sh`) or modify
`scripts/cleanup.sh` to accept a flag (e.g., --skip-maas) and update the
`undeploy` target to call `./scripts/cleanup.sh --skip-maas -y`; update
references to `undeploy-maas` and `cleanup.sh` accordingly and ensure
`--ignore-not-found` behavior is preserved.
- Around line 100-110: The prerequisite CRD checks are duplicated between the
Makefile target deploy-maas and the helmfile presync hook in
charts/maas/helmfile.yaml.gotmpl; refactor by moving the CRD validation into a
single reusable place (either a new make target like validate-prereqs or a small
shell script) and have both deploy-maas and the helmfile presync hook invoke
that single validator; update references in deploy-maas (target name
deploy-maas) and remove the duplicate checks from
charts/maas/helmfile.yaml.gotmpl so maintenance is centralized.
- Around line 64-76: In the deploy-cert-manager Makefile target, quote all shell
variable and command substitutions to prevent word-splitting/globbing: locate
the kubectl/helm/grep/xargs pipeline (the lines calling helm list -n
cert-manager-operator, kubectl get pods -n cert-manager, kubectl get
clusterrole,clusterrolebinding -o name | grep -i cert-manager | xargs -r kubectl
delete, and kubectl delete sa --all -n cert-manager) and change any unquoted
expansions or command substitutions to quoted forms (e.g. "$VAR" or "$(cmd)"),
quote literal patterns passed to grep if they come from variables, and ensure
xargs invocation is robust (use -- where appropriate) so all variable content is
safely handled.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 6bc93fa8-ea1e-4485-96e4-64485e264fa0
📒 Files selected for processing (35)
- Makefile
- charts/maas/Chart.yaml
- charts/maas/crds/maas.opendatahub.io_externalmodels.yaml
- charts/maas/crds/maas.opendatahub.io_maasauthpolicies.yaml
- charts/maas/crds/maas.opendatahub.io_maasmodelrefs.yaml
- charts/maas/crds/maas.opendatahub.io_maassubscriptions.yaml
- charts/maas/helmfile.yaml.gotmpl
- charts/maas/templates/_helpers.tpl
- charts/maas/templates/gateway.yaml
- charts/maas/templates/httproute.yaml
- charts/maas/templates/maas-api/clusterrole.yaml
- charts/maas/templates/maas-api/deployment.yaml
- charts/maas/templates/maas-api/service.yaml
- charts/maas/templates/maas-api/serviceaccount.yaml
- charts/maas/templates/maas-controller/clusterrole.yaml
- charts/maas/templates/maas-controller/clusterrolebinding.yaml
- charts/maas/templates/maas-controller/deployment.yaml
- charts/maas/templates/maas-controller/leader-election-role.yaml
- charts/maas/templates/maas-controller/serviceaccount.yaml
- charts/maas/templates/namespace.yaml
- charts/maas/templates/networkpolicy.yaml
- charts/maas/templates/policies/azure-ad-admin-only.yaml
- charts/maas/templates/policies/azure-ad-per-user-ratelimit.yaml
- charts/maas/templates/policies/default-auth.yaml
- charts/maas/templates/policies/default-deny.yaml
- charts/maas/templates/policies/maas-api-auth.yaml
- charts/maas/templates/postgresql/deployment.yaml
- charts/maas/templates/postgresql/secret.yaml
- charts/maas/templates/pull-secret.yaml
- charts/maas/templates/tier-mapping.yaml
- charts/maas/values.yaml
- charts/rhcl/helmfile.yaml.gotmpl
- helmfile.yaml.gotmpl
- scripts/cleanup.sh
- values.yaml
- Add secrets RBAC to maas-controller ClusterRole (credentialRef access)
- Use templated gateway name instead of hardcoded values in postsync
- Track and report deployment errors in postsync hook
- Quote shell variables in cleanup.sh finalizer loops
- Add make lint target (helm lint, yamllint, shellcheck)

Made-with: Cursor
Actionable comments posted: 5
♻️ Duplicate comments (2)
charts/maas/helmfile.yaml.gotmpl (2)
87-88: ⚠️ Potential issue | 🟠 Major: The integrated stack still can't honor a custom gateway name.
`GATEWAY_NAME` reads `.Values.gateway.name`, but the parent `helmfile.yaml.gotmpl:76-87` only forwards `namespace` and `gatewayNamespace`. In stack deployments this still falls back to `maas-default-gateway`, so the service-account patch and pod restart target the wrong gateway when a custom name is configured. Pass the gateway name into this sub-helmfile explicitly.
As per coding guidelines (REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/helmfile.yaml.gotmpl` around lines 87 - 88, The sub-helmfile uses GATEWAY_NAME="{{ .Values.gateway.name | default "maas-default-gateway" }}" but the parent helmfile does not forward the gateway name, causing the default to be used; update the parent helmfile template that currently forwards namespace and gatewayNamespace to also pass gateway.name (e.g., add a gatewayName or similar key) into the sub-helmfile values so .Values.gateway.name is populated and GATEWAY_NAME reflects a custom gateway; ensure the symbol names referenced are .Values.gateway.name in the sub-helmfile and the parent template entry that emits gatewayNamespace so you add the corresponding gateway name field there.
105-107: ⚠️ Potential issue | 🟠 Major: Return non-zero when rollout validation finds failures.
`ERRORS` is accumulated, but the hook still exits 0 here. `helmfile apply` will report success even when postgres, maas-controller, or maas-api never become Ready, which breaks automation and follow-up targets.
Minimal fix:

```diff
 if [ "$ERRORS" -gt 0 ]; then
-  echo "WARNING: $ERRORS component(s) not ready. Check pod status."
+  echo "ERROR: $ERRORS component(s) not ready. Check pod status."
+  exit 1
 fi
```

As per coding guidelines.
**: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/helmfile.yaml.gotmpl` around lines 105 - 107, The rollout-check hook currently prints a warning when the ERRORS counter is >0 but still exits with status 0; modify the if block that checks ERRORS (the rollout validation near the end of the helmfile hook) to return a non-zero exit code when failures are detected (e.g., call exit 1 or exit $ERRORS) so that helmfile apply fails on readiness errors and downstream automation sees the failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/maas/helmfile.yaml.gotmpl`:
- Around line 54-59: The current kubectl check using "kubectl get kuadrant -n
kuadrant-system" is a no-op because it exits 0 when the CRD exists but no
instances are present; change the check so it verifies that at least one
Kuadrant resource exists and fail closed if none are found. Replace the existing
branch that prints "Kuadrant instance found in kuadrant-system" / "WARNING: No
Kuadrant instance found..." with a test that inspects the command output (for
example using "kubectl get kuadrant -n kuadrant-system -o name" or a jsonpath
query) and treat an empty output as absence — only print the success message
when a non-empty resource list is returned, otherwise print the warning and exit
non-zero so deployment cannot proceed without an actual Kuadrant instance.
In `@charts/maas/templates/maas-controller/clusterrole.yaml`:
- Around line 9-11: The ClusterRole in
charts/maas/templates/maas-controller/clusterrole.yaml currently allows verbs
["get","list","watch"] on resource "secrets", which enables cluster-wide secret
enumeration; change this to only "get" for "secrets" (remove "list" and "watch")
if the controller only needs credentialRef lookups, or replace the ClusterRole
with a namespaced Role and bind it only in MaaS-managed namespaces if
watches/listening across namespaces are truly required; update the
ClusterRole/Role definition and any RoleBinding/ClusterRoleBinding references
accordingly.
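A sketch of the namespaced replacement for the cluster-wide grant; the namespace values are placeholders, and the controller ServiceAccount location is an assumption:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: maas-controller-secrets
  namespace: maas-workloads        # placeholder: one Role per MaaS-managed namespace
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: maas-controller-secrets
  namespace: maas-workloads        # placeholder
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: maas-controller-secrets
subjects:
  - kind: ServiceAccount
    name: maas-controller
    namespace: opendatahub         # assumption: namespace of the controller SA
```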
In `@Makefile`:
- Around line 199-201: The Makefile invokes kubectl with an unquoted
KSERVE_NAMESPACE which allows shell injection; update the two kubectl commands
that use -n $(KSERVE_NAMESPACE) so the namespace variable is passed as a single
quoted argument (e.g. wrap the expansion in quotes like "$(KSERVE_NAMESPACE)" or
'${KSERVE_NAMESPACE}') in the kubectl get pods -n ... calls to prevent the shell
from interpreting injected characters.
- Around line 59-63: The Makefile currently allows deploy-all to include
deploy-maas even when RHCL is false (RHCL_TARGET/MAAS_TARGET logic), so add a
fast-fail guard that checks the invalid combination (MAAS=true and RHCL!=true)
before any deployment steps; implement this by creating a small phony target
(e.g., fail-if-maas-without-rhcl) that inspects the RHCL/MAAS variables and
exits non‑zero with a clear message when MAAS is true but RHCL is not, and then
invoke that guard as the first prerequisite of deploy-all (before
deploy-cert-manager, deploy-istio, deploy-lws, etc.) so deploy-all bails out
immediately instead of queuing deploy-maas.
- Around line 67-76: The Makefile currently deletes resources in the
cert-manager namespace when it finds pods but no Helm release in
cert-manager-operator; change this to first require an explicit opt-in variable
(e.g. CLEAN_ORPHANED_CERT_MANAGER="true") before executing the destructive
block, and replace name/grep-based deletes with label/annotation checks — use
kubectl with a selector (e.g.
--selector=app.kubernetes.io/managed-by=<operator-label> or check for a specific
ownership annotation) when calling kubectl delete deployment, kubectl delete sa,
and when selecting clusterrole/clusterrolebinding (avoid grep -i cert-manager),
only proceed with cluster-scoped deletes if the resources carry the expected
ownership label/annotation; keep the existing fallback no-op behavior when the
opt-in flag is not set.
---
Duplicate comments:
In `@charts/maas/helmfile.yaml.gotmpl`:
- Around line 87-88: The sub-helmfile uses GATEWAY_NAME="{{ .Values.gateway.name
| default "maas-default-gateway" }}" but the parent helmfile does not forward
the gateway name, causing the default to be used; update the parent helmfile
template that currently forwards namespace and gatewayNamespace to also pass
gateway.name (e.g., add a gatewayName or similar key) into the sub-helmfile
values so .Values.gateway.name is populated and GATEWAY_NAME reflects a custom
gateway; ensure the symbol names referenced are .Values.gateway.name in the
sub-helmfile and the parent template entry that emits gatewayNamespace so you
add the corresponding gateway name field there.
- Around line 105-107: The rollout-check hook currently prints a warning when
the ERRORS counter is >0 but still exits with status 0; modify the if block that
checks ERRORS (the rollout validation near the end of the helmfile hook) to
return a non-zero exit code when failures are detected (e.g., call exit 1 or
exit $ERRORS) so that helmfile apply fails on readiness errors and downstream
automation sees the failure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: f03653cd-9b3a-4ff5-9ae2-c3045cd730b5
📒 Files selected for processing (4)
- Makefile
- charts/maas/helmfile.yaml.gotmpl
- charts/maas/templates/maas-controller/clusterrole.yaml
- scripts/cleanup.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/cleanup.sh
- Restrict controller secrets RBAC to get-only (least privilege)
- Guard MAAS=true without RHCL=true in Makefile (fail early)
- Make cert-manager orphan cleanup skippable (CERT_MANAGER_FORCE_CLEANUP)
- Quote KSERVE_NAMESPACE in Makefile shell commands
- Fix Kuadrant presence check to verify instance count, not just CRD
- Exit non-zero from postsync when components fail rollout

Made-with: Cursor
Actionable comments posted: 3
♻️ Duplicate comments (4)
charts/maas/templates/maas-controller/clusterrole.yaml (1)
9-11: ⚠️ Potential issue | 🟠 Major: Scope Secret reads to opted-in namespaces.
Major. This ClusterRole is bound cluster-wide to maas-controller, so a compromised controller pod can still read predictable secret names in any namespace (CWE-250, CWE-284). Drop `secrets` from the ClusterRole and re-grant `get` with namespaced `Role`/`RoleBinding` objects only where MaaS credentials are actually stored.
Minimal hardening:

```diff
-- apiGroups: [""]
-  resources: ["secrets"]
-  verbs: ["get"]
+# Remove Secret access from the ClusterRole.
+# Re-add `get` through namespaced Role/RoleBinding objects in approved namespaces.
```

As per coding guidelines (REVIEW PRIORITIES: 1. Security vulnerabilities: provide severity, exploit scenario, and remediation code).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/maas/templates/maas-controller/clusterrole.yaml` around lines 9 - 11, The ClusterRole for maas-controller currently grants cluster-wide read access to "secrets" (verbs: ["get"]) — remove "secrets" from the ClusterRole definition so it no longer grants any secret access, and instead create namespaced Role and RoleBinding objects in only the namespaces that actually store MaaS credentials granting verbs: ["get"] on resources: ["secrets"]; ensure the new Role is named (e.g., maas-controller-secrets) and the RoleBinding binds the maas-controller ServiceAccount to that Role in each opted-in namespace so secret access is limited and no ClusterRole contains "secrets".
Makefile (3)
69-80: ⚠️ Potential issue | 🟠 Major: The cert-manager cleanup path is still opt-out and still deletes by `--all`.
With `$${CERT_MANAGER_FORCE_CLEANUP:-true}`, any cluster that has cert-manager installed by OLM, GitOps, or Helm in another namespace can still hit the `deployment --all`/`sa --all` deletes here. Default this flag to `false` and only remove resources that carry this stack's ownership labels or annotations.
At minimum, make cleanup explicit:

```diff
-	if [ "$${CERT_MANAGER_FORCE_CLEANUP:-true}" = "true" ]; then \
+	if [ "$${CERT_MANAGER_FORCE_CLEANUP:-false}" = "true" ]; then \
```

As per coding guidelines (REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 69 - 80, Change the cert-manager cleanup to be opt-in and label-scoped: set the CERT_MANAGER_FORCE_CLEANUP default to "false" (replace $${CERT_MANAGER_FORCE_CLEANUP:-true} with $${CERT_MANAGER_FORCE_CLEANUP:-false}) and stop using "kubectl delete ... --all"; instead delete only resources matching this stack's ownership labels/annotations by using kubectl delete with a selector (e.g., kubectl delete deployment -n cert-manager -l <ownership-label>=<value> and kubectl delete sa -n cert-manager -l <ownership-label>=<value>), preserving the existing guard that skips cleanup when CERT_MANAGER_FORCE_CLEANUP is false.
205-212: ⚠️ Potential issue | 🟡 Minor: `status` hardcodes the MaaS location.
MaaS has its own `namespace`, `gatewayNamespace`, and `gateway.name` values, but this target always checks `$(KSERVE_NAMESPACE)`, `istio-system`, and `maas-default-gateway`. On clusters where KServe lives in `redhat-ods-applications` or those values are overridden, `make status` will report a healthy MaaS install as missing.
Use MaaS-specific variables here:

```diff
+MAAS_NAMESPACE ?= opendatahub
+MAAS_GATEWAY_NAMESPACE ?= istio-system
+MAAS_GATEWAY_NAME ?= maas-default-gateway
 ...
-	@kubectl get pods -n "$(KSERVE_NAMESPACE)" -l control-plane=maas-controller 2>/dev/null || echo "  Not deployed (optional component)"
-	@kubectl get pods -n "$(KSERVE_NAMESPACE)" -l app.kubernetes.io/name=maas-api 2>/dev/null || echo "  "
+	@kubectl get pods -n "$(MAAS_NAMESPACE)" -l control-plane=maas-controller 2>/dev/null || echo "  Not deployed (optional component)"
+	@kubectl get pods -n "$(MAAS_NAMESPACE)" -l app.kubernetes.io/name=maas-api 2>/dev/null || echo "  "
 ...
-	kubectl get gateway maas-default-gateway -n istio-system 2>/dev/null || echo "  No gateway"; \
+	kubectl get gateway "$(MAAS_GATEWAY_NAME)" -n "$(MAAS_GATEWAY_NAMESPACE)" 2>/dev/null || echo "  No gateway"; \
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 205 - 212, The status target in the Makefile hardcodes MaaS locations; update the kubectl commands in the status target to use MaaS-specific variables instead of KSERVE_NAMESPACE/istio-system/maas-default-gateway: replace references to "$(KSERVE_NAMESPACE)" with the MaaS namespace variable (e.g., "$(MAAS_NAMESPACE)" or the repo's maas namespace var), replace the hardcoded "istio-system" with a gateway namespace variable (e.g., "$(MAAS_GATEWAY_NAMESPACE)"), and replace "maas-default-gateway" with a gateway name variable (e.g., "$(MAAS_GATEWAY_NAME)"); ensure the CRD check and fallback echo logic remain but reference these variables so overridden/non-default MaaS installs report correctly.
62-65: ⚠️ Potential issue | 🟠 Major: Move the MAAS/RHCL guard out of top-level `$(error ...)` evaluation.
GNU Make parses the entire Makefile before target selection. The `$(if $(and ...), $(error ...))` at lines 62-63 fires during parse, breaking unrelated commands like `make help MAAS=true` or `make status MAAS=true`. Move this check into a prerequisite recipe on `deploy-all` instead.
Target-scoped guard:

```diff
-# MaaS requires RHCL — fail early if MAAS=true without RHCL=true
-$(if $(and $(filter true,$(MAAS)),$(filter-out true,$(RHCL))),$(error MAAS=true requires RHCL=true. Run: make deploy-all RHCL=true MAAS=true))
+fail-if-maas-without-rhcl:
+	@if [ "$(MAAS)" = "true" ] && [ "$(RHCL)" != "true" ]; then \
+		echo "ERROR: MAAS=true requires RHCL=true. Run: make deploy-all RHCL=true MAAS=true"; \
+		exit 1; \
+	fi

-deploy-all: check-kubeconfig deploy-cert-manager deploy-istio deploy-lws $(RHCL_TARGET) deploy-kserve $(MAAS_TARGET)
+deploy-all: fail-if-maas-without-rhcl check-kubeconfig deploy-cert-manager deploy-istio deploy-lws $(RHCL_TARGET) deploy-kserve $(MAAS_TARGET)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 62 - 65, The top-level parse-time guard using $(if $(and $(filter true,$(MAAS)),$(filter-out true,$(RHCL))),$(error ...)) causes errors during Makefile parsing; move this check into a recipe or a target-scoped prerequisite for the deploy-all target so it runs only when deploy-all is invoked. Specifically, remove the global $(if ... $(error ...)) expression and instead add a short phony prerequisite or a target-scoped check for the deploy-all target (referencing deploy-all) that inspects MAAS and RHCL and calls $(error ...) at runtime if MAAS=true and RHCL!=true; keep existing prerequisites (check-kubeconfig deploy-cert-manager deploy-istio deploy-lws $(RHCL_TARGET) deploy-kserve $(MAAS_TARGET)) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/maas/helmfile.yaml.gotmpl`:
- Around line 91-105: The gateway check currently only echoes missing resources
but doesn't mark the hook as failed; update the block that checks kubectl get
gateway "$GATEWAY_NAME" and the ServiceAccount patch loop for GATEWAY_SA so that
on a missing gateway or when the patch loop exhausts (no successful kubectl
patch after 5 attempts) you increment the same ERRORS counter used for rollout
failures (or set a failure flag) and log an error message, ensuring postsync
exits non‑zero when ERRORS > 0; implement the increment and error logging at the
same points where rollout failures are counted so the hook fails if ingress
never becomes ready.
- Around line 98-99: The hook hardcodes the secret name ("redhat-pull-secret")
when patching the gateway ServiceAccount; instead render the chart value so
overrides work: locate the kubectl patch command in
charts/maas/helmfile.yaml.gotmpl (the line with kubectl patch sa "$GATEWAY_SA"
-n "$GATEWAY_NS" -p ...), and replace the hardcoded "redhat-pull-secret" with
the chart pullSecret value (use the template variable .Values.pullSecret.name,
e.g. render it into the JSON payload via the gotmpl expression and proper
quoting/escaping) so the patch attaches the configured pull secret from
values.yaml/templates/pull-secret.yaml.
In `@Makefile`:
- Around line 107-116: The deploy-maas Makefile target only checks Kuadrant CRDs
but not the required live kuadrant instance, causing Helmfile
(charts/maas/helmfile.yaml.gotmpl) to fail later; update the deploy-maas recipe
to also verify the kuadrant resource exists in the kuadrant-system namespace
(e.g., by running a kubectl get kuadrant <name> or kubectl get kuadrant -n
kuadrant-system and failing with a clear message), so the preflight matches the
Helmfile hook and aborts early if the kuadrant instance is missing.
---
Duplicate comments:
In `@charts/maas/templates/maas-controller/clusterrole.yaml`:
- Around line 9-11: The ClusterRole for maas-controller currently grants
cluster-wide read access to "secrets" (verbs: ["get"]) — remove "secrets" from
the ClusterRole definition so it no longer grants any secret access, and instead
create namespaced Role and RoleBinding objects in only the namespaces that
actually store MaaS credentials granting verbs: ["get"] on resources:
["secrets"]; ensure the new Role is named (e.g., maas-controller-secrets) and
the RoleBinding binds the maas-controller ServiceAccount to that Role in each
opted-in namespace so secret access is limited and no ClusterRole contains
"secrets".
In `@Makefile`:
- Around line 69-80: Change the cert-manager cleanup to be opt-in and
label-scoped: set the CERT_MANAGER_FORCE_CLEANUP default to "false" (replace
$${CERT_MANAGER_FORCE_CLEANUP:-true} with $${CERT_MANAGER_FORCE_CLEANUP:-false})
and stop using "kubectl delete ... --all"; instead delete only resources
matching this stack's ownership labels/annotations by using kubectl delete with
a selector (e.g., kubectl delete deployment -n cert-manager -l
<ownership-label>=<value> and kubectl delete sa -n cert-manager -l
<ownership-label>=<value>), preserving the existing guard that skips cleanup
when CERT_MANAGER_FORCE_CLEANUP is false.
- Around line 205-212: The status target in the Makefile hardcodes MaaS
locations; update the kubectl commands in the status target to use MaaS-specific
variables instead of KSERVE_NAMESPACE/istio-system/maas-default-gateway: replace
references to "$(KSERVE_NAMESPACE)" with the MaaS namespace variable (e.g.,
"$(MAAS_NAMESPACE)" or the repo's maas namespace var), replace the hardcoded
"istio-system" with a gateway namespace variable (e.g.,
"$(MAAS_GATEWAY_NAMESPACE)"), and replace "maas-default-gateway" with a gateway
name variable (e.g., "$(MAAS_GATEWAY_NAME)"); ensure the CRD check and fallback
echo logic remain but reference these variables so overridden/non-default MaaS
installs report correctly.
- Around line 62-65: The top-level parse-time guard using $(if $(and $(filter
true,$(MAAS)),$(filter-out true,$(RHCL))),$(error ...)) causes errors during
Makefile parsing; move this check into a recipe or a target-scoped prerequisite
for the deploy-all target so it runs only when deploy-all is invoked.
Specifically, remove the global $(if ... $(error ...)) expression and instead
add a short phony prerequisite or a target-scoped check for the deploy-all
target (referencing deploy-all) that inspects MAAS and RHCL and calls $(error
...) at runtime if MAAS=true and RHCL!=true; keep existing prerequisites
(check-kubeconfig deploy-cert-manager deploy-istio deploy-lws $(RHCL_TARGET)
deploy-kserve $(MAAS_TARGET)) unchanged.
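The opt-in cleanup default for cert-manager (the "Around line 69-80" item above) hinges on shell parameter expansion: ${CERT_MANAGER_FORCE_CLEANUP:-false} evaluates to false whenever the variable is unset. A minimal sketch of the branch logic; the ownership label in the comment is a placeholder, not the repo's real label:

```shell
# With the variable unset, the ${VAR:-false} default skips cleanup entirely;
# only an explicit CERT_MANAGER_FORCE_CLEANUP=true reaches the delete branch.
unset CERT_MANAGER_FORCE_CLEANUP
if [ "${CERT_MANAGER_FORCE_CLEANUP:-false}" = "true" ]; then
  # The real recipe would delete label-scoped resources here, e.g.:
  #   kubectl delete deployment -n cert-manager -l <ownership-label>=<value>
  cleanup_action="delete-labeled-resources"
else
  cleanup_action="skip"
fi
echo "$cleanup_action"
```

Flipping the default from true to false is what makes the destructive path opt-in rather than opt-out.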
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: e7504aae-3fd6-4ab1-a664-3f919e40b8a1
📒 Files selected for processing (3)
Makefile, charts/maas/helmfile.yaml.gotmpl, charts/maas/templates/maas-controller/clusterrole.yaml
echo "Checking MaaS gateway..."
kubectl get gateway "$GATEWAY_NAME" -n "$GATEWAY_NS" 2>/dev/null || echo "Gateway not found in $GATEWAY_NS"

GATEWAY_SA="${GATEWAY_NAME}-istio"
echo "Patching gateway service account with pull secret..."
for i in 1 2 3 4 5; do
  if kubectl get sa "$GATEWAY_SA" -n "$GATEWAY_NS" >/dev/null 2>&1; then
    kubectl patch sa "$GATEWAY_SA" -n "$GATEWAY_NS" \
      -p '{"imagePullSecrets": [{"name": "redhat-pull-secret"}]}' 2>/dev/null && \
      echo " Gateway SA patched" && break
  fi
  echo " Waiting for gateway SA... ($i/5)"
  sleep 5
done
kubectl delete pod -n "$GATEWAY_NS" -l "istio.io/gateway-name=$GATEWAY_NAME" --ignore-not-found 2>/dev/null || true
Make gateway validation fail the hook.
This block only logs gateway and ServiceAccount failures, so postsync can still exit 0 while the ingress path never becomes usable. Count a missing gateway and an exhausted patch loop in ERRORS the same way you already do for rollout failures.
Fail when ingress never becomes ready
- kubectl get gateway "$GATEWAY_NAME" -n "$GATEWAY_NS" 2>/dev/null || echo "Gateway not found in $GATEWAY_NS"
+ kubectl get gateway "$GATEWAY_NAME" -n "$GATEWAY_NS" >/dev/null 2>&1 || { echo "ERROR: Gateway not found in $GATEWAY_NS"; ERRORS=$((ERRORS+1)); }
+ PATCHED=0
GATEWAY_SA="${GATEWAY_NAME}-istio"
echo "Patching gateway service account with pull secret..."
for i in 1 2 3 4 5; do
if kubectl get sa "$GATEWAY_SA" -n "$GATEWAY_NS" >/dev/null 2>&1; then
kubectl patch sa "$GATEWAY_SA" -n "$GATEWAY_NS" \
-p '{"imagePullSecrets": [{"name": "redhat-pull-secret"}]}' 2>/dev/null && \
- echo " Gateway SA patched" && break
+ echo " Gateway SA patched" && PATCHED=1 && break
fi
echo " Waiting for gateway SA... ($i/5)"
sleep 5
done
+ [ "$PATCHED" -eq 1 ] || { echo "ERROR: Gateway SA not patched"; ERRORS=$((ERRORS+1)); }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@charts/maas/helmfile.yaml.gotmpl` around lines 91 - 105, The gateway check
currently only echoes missing resources but doesn't mark the hook as failed;
update the block that checks kubectl get gateway "$GATEWAY_NAME" and the
ServiceAccount patch loop for GATEWAY_SA so that on a missing gateway or when
the patch loop exhausts (no successful kubectl patch after 5 attempts) you
increment the same ERRORS counter used for rollout failures (or set a failure
flag) and log an error message, ensuring postsync exits non‑zero when ERRORS >
0; implement the increment and error logging at the same points where rollout
failures are counted so the hook fails if ingress never becomes ready.
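The ERRORS-counter pattern this prompt describes can be exercised outside the hook. In the sketch below, kubectl is stubbed to always fail so both failure paths fire; the stub, resource names, and loop length are illustrative only:

```shell
ERRORS=0
kubectl() { return 1; }   # stub: every lookup/patch fails, as when ingress never comes up

# A missing gateway increments ERRORS instead of only echoing.
kubectl get gateway "maas-default-gateway" -n "istio-system" >/dev/null 2>&1 || {
  echo "ERROR: Gateway not found"
  ERRORS=$((ERRORS + 1))
}

# An exhausted patch loop increments ERRORS as well.
PATCHED=0
for i in 1 2 3; do
  if kubectl patch sa "maas-default-gateway-istio" >/dev/null 2>&1; then
    PATCHED=1
    break
  fi
done
[ "$PATCHED" -eq 1 ] || {
  echo "ERROR: Gateway SA not patched"
  ERRORS=$((ERRORS + 1))
}

echo "errors=$ERRORS"
# The real postsync hook would finish with: [ "$ERRORS" -eq 0 ] || exit 1
```

Routing every failure mode through one counter means the hook's exit status reflects the first problem anywhere in the sequence, not just rollout failures.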
kubectl patch sa "$GATEWAY_SA" -n "$GATEWAY_NS" \
  -p '{"imagePullSecrets": [{"name": "redhat-pull-secret"}]}' 2>/dev/null && \
Render the pull-secret name instead of hardcoding it.
charts/maas/values.yaml and charts/maas/templates/pull-secret.yaml make pullSecret.name configurable, but this hook always patches redhat-pull-secret. Any override will attach a nonexistent secret to the gateway ServiceAccount and break image pulls.
Use the chart value here too
+ PULL_SECRET_NAME="{{ .Values.pullSecret.name | default "redhat-pull-secret" }}"
for i in 1 2 3 4 5; do
if kubectl get sa "$GATEWAY_SA" -n "$GATEWAY_NS" >/dev/null 2>&1; then
kubectl patch sa "$GATEWAY_SA" -n "$GATEWAY_NS" \
- -p '{"imagePullSecrets": [{"name": "redhat-pull-secret"}]}' 2>/dev/null && \
+ -p "{\"imagePullSecrets\": [{\"name\": \"${PULL_SECRET_NAME}\"}]}" 2>/dev/null && \
echo " Gateway SA patched" && break

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@charts/maas/helmfile.yaml.gotmpl` around lines 98 - 99, The hook hardcodes
the secret name ("redhat-pull-secret") when patching the gateway ServiceAccount;
instead render the chart value so overrides work: locate the kubectl patch
command in charts/maas/helmfile.yaml.gotmpl (the line with kubectl patch sa
"$GATEWAY_SA" -n "$GATEWAY_NS" -p ...), and replace the hardcoded
"redhat-pull-secret" with the chart pullSecret value (use the template variable
.Values.pullSecret.name, e.g. render it into the JSON payload via the gotmpl
expression and proper quoting/escaping) so the patch attaches the configured
pull secret from values.yaml/templates/pull-secret.yaml.
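The quoting change is the subtle part of this fix: the payload must move from single quotes to double quotes so the variable interpolates while the JSON stays valid. A standalone sketch, with PULL_SECRET_NAME standing in for the rendered .Values.pullSecret.name:

```shell
PULL_SECRET_NAME="my-custom-pull-secret"   # stand-in for the rendered chart value

# Double quotes outside, escaped double quotes inside: ${PULL_SECRET_NAME}
# expands, and the result is still well-formed JSON for kubectl patch -p.
payload="{\"imagePullSecrets\": [{\"name\": \"${PULL_SECRET_NAME}\"}]}"
echo "$payload"
```

Had the payload stayed inside single quotes, the shell would pass the literal text ${PULL_SECRET_NAME} through to the API server instead of the configured secret name.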
deploy-maas: check-kubeconfig clear-cache
	@echo "=== Deploying MaaS (Models as a Service) ==="
	@echo "Prerequisites: KServe, RHCL, Istio, and cert-manager must be deployed first"
	@kubectl get crd llminferenceservices.serving.kserve.io >/dev/null 2>&1 || \
		(echo "ERROR: KServe not found. Run 'make deploy-kserve' first." && exit 1)
	@kubectl get crd authpolicies.kuadrant.io >/dev/null 2>&1 || \
		(echo "ERROR: Kuadrant/RHCL not found. Run 'make deploy-rhcl' first." && exit 1)
	@kubectl get crd gateways.gateway.networking.k8s.io >/dev/null 2>&1 || \
		(echo "ERROR: Gateway API CRDs not found. Run 'make deploy-istio' first." && exit 1)
	helmfile apply --selector name=maas --state-values-set maas.enabled=true
Mirror the Kuadrant-instance prerequisite in deploy-maas.
This target only checks the CRDs, but charts/maas/helmfile.yaml.gotmpl now also requires a live kuadrant resource in kuadrant-system. Without the same check here, make deploy-maas passes its own prerequisite gate and then fails later inside Helmfile.
Keep the Makefile preflight aligned with the Helmfile hook
  @kubectl get crd gateways.gateway.networking.k8s.io >/dev/null 2>&1 || \
    (echo "ERROR: Gateway API CRDs not found. Run 'make deploy-istio' first." && exit 1)
+ @kubectl get kuadrant -n kuadrant-system -o name 2>/dev/null | grep -q . || \
+   (echo "ERROR: No Kuadrant instance found in kuadrant-system. Run 'make deploy-rhcl' first." && exit 1)
  helmfile apply --selector name=maas --state-values-set maas.enabled=true

🤖 Prompt for AI Agents
helmfile apply --selector name=maas --state-values-set maas.enabled=true🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Makefile` around lines 107 - 116, The deploy-maas Makefile target only checks
Kuadrant CRDs but not the required live kuadrant instance, causing Helmfile
(charts/maas/helmfile.yaml.gotmpl) to fail later; update the deploy-maas recipe
to also verify the kuadrant resource exists in the kuadrant-system namespace
(e.g., by running a kubectl get kuadrant <name> or kubectl get kuadrant -n
kuadrant-system and failing with a clear message), so the preflight matches the
Helmfile hook and aborts early if the kuadrant instance is missing.
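The suggested preflight reduces to a "does the list command emit anything" check: an empty result means no live instance, even when the CRD itself is installed. A sketch with kubectl stubbed to return nothing, so the error branch is taken; the stub and message wording are illustrative:

```shell
kubectl() { return 1; }   # stub: simulate a cluster with no live Kuadrant instance

# grep -q . succeeds only if the list command printed at least one resource name.
if kubectl get kuadrant -n kuadrant-system -o name 2>/dev/null | grep -q .; then
  preflight="found"
else
  preflight="ERROR: No Kuadrant instance found in kuadrant-system"
fi
echo "$preflight"
```

Checking the instance and not just the CRD is what keeps the Makefile gate aligned with the Helmfile hook, so the failure surfaces before Helmfile starts applying releases.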
Summary
Add MaaS (Models as a Service) Helm chart and integrate it into the rhaii-on-xks deployment stack alongside RHCL for authenticated, rate-limited model serving on managed Kubernetes (xKS).
What's included
- charts/maas/ — Full Helm chart for the MaaS stack:
  - maas-controller — Kubernetes operator that watches MaaSSubscription and LLMInferenceService CRs
  - maas-api — REST API server for API key management, subscriptions, and model discovery
  - PostgreSQL — Persistent storage for API keys and subscriptions
  - Gateway + HTTPRoute — Istio Gateway API routing for model inference endpoints
  - AuthPolicy + RateLimitPolicy — Kuadrant policies for API key auth and rate limiting
- Makefile — deploy-maas/undeploy-maas targets with prerequisite validation (KServe, RHCL, Gateway API), orphaned cert-manager detection, MaaS status output
- helmfile.yaml.gotmpl — MaaS helmfile inclusion with automatic gatewayNamespaces merging between RHCL and MaaS
- values.yaml — maas config block (enabled, namespace, gatewayNamespace) + gatewayNamespaces for RHCL
- scripts/cleanup.sh — RHCL/Kuadrant and MaaS resource cleanup (CRs, CRDs, RBAC, namespaces)

Deployment
Validated on
Test plan
- make deploy-all RHCL=true MAAS=true succeeds on a clean cluster
- make deploy-maas validates prerequisites (fails gracefully if KServe/RHCL missing)
- API health endpoint responds (/maas-api/health)
- make undeploy-maas cleans up all MaaS resources
- make status shows MaaS component status

Summary by CodeRabbit
New Features
Bug Fixes / Cleanup
Chores