feat: improve deploy and test script robustness #647
somya-bhatnagar wants to merge 10 commits into opendatahub-io:main from
Conversation
Add configurable timeout constants, consistent error handling, and idempotency improvements across deployment and test scripts to make deployments more predictable and debuggable.

- Add environment-variable-overridable timeout constants to deployment-helpers.sh
- Replace all hardcoded timeouts with configurable constants
- Improve error messages to include actual timeout values
- Add idempotency checks for gateway and namespace creation
- Add `set -uo pipefail` to validate-deployment.sh for stricter error handling
- Document timeout configuration in script headers

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
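The environment-overridable constant pattern the commit describes can be sketched in a few lines. The constant names and defaults below are hypothetical, shown only to illustrate the mechanism; the real set lives in deployment-helpers.sh:

```shell
# Environment wins if set, the default applies otherwise, and the value
# is then frozen with readonly so later code cannot change it.
ROLLOUT_TIMEOUT="${ROLLOUT_TIMEOUT:-300}"
LLMIS_TIMEOUT="${LLMIS_TIMEOUT:-600}"
readonly ROLLOUT_TIMEOUT LLMIS_TIMEOUT

# Error messages include the effective timeout so failures are debuggable.
fail_with_timeout() {
  echo "ERROR: $1 did not become ready within ${2}s" >&2
  return 1
}

echo "rollout timeout: ${ROLLOUT_TIMEOUT}s"
```

Callers override a timeout per run without editing the script, e.g. `ROLLOUT_TIMEOUT=900 ./deploy.sh`.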
Resolve conflicts in test/e2e/scripts/prow_run_smoke_test.sh:

- Keep timeout configuration documentation from feature branch
- Add MODEL_NAMESPACE documentation from main
- Use improved MaaSModelRefs wait logic from main (race condition fix)
- Keep configurable MAASMODELREF_TIMEOUT from feature branch
- Remove controller bounce workaround (no longer needed per main)
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: somya-bhatnagar

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

Centralized and exposed many deployment timeouts as environment-overridable readonly variables and replaced hard-coded wait/rollout durations across scripts and tests. Made namespace/Gateway/GatewayClass/ingress creation idempotent with existence checks and added explicit error handling for wait timeouts (functions now return non-zero on timeout and log timeout values). Added stricter shell flags to validate-deployment.sh. Updated Tekton CEL skip patterns. CRD/API change: removed `tokenRateLimitRef`.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
scripts/validate-deployment.sh (2)
205-227: ⚠️ Potential issue | 🔴 Critical
Unset variable access in helper functions with `set -u`.

`print_fail` and `print_warning` access `$2`, `$3`, `$4` without defaults. With `set -u`, callers must always provide all arguments or the script errors.

Fix using default empty values:

```diff
 print_fail() {
   echo -e "${RED}❌ FAIL: $1${NC}"
-  if [ -n "$2" ]; then
-    echo -e "${RED}  Reason: $2${NC}"
+  if [ -n "${2:-}" ]; then
+    echo -e "${RED}  Reason: ${2:-}${NC}"
   fi
-  if [ -n "$3" ]; then
-    echo -e "${YELLOW}  Suggestion: $3${NC}"
+  if [ -n "${3:-}" ]; then
+    echo -e "${YELLOW}  Suggestion: ${3:-}${NC}"
   fi
-  if [ -n "$4" ]; then
-    echo -e "${YELLOW}  Suggestion: $4${NC}"
+  if [ -n "${4:-}" ]; then
+    echo -e "${YELLOW}  Suggestion: ${4:-}${NC}"
   fi
   ((FAILED++))
 }
```

Apply a similar pattern to `print_warning`.

🤖 Prompt for AI Agents
print_warning.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/validate-deployment.sh` around lines 205 - 227, print_fail and print_warning read optional positional params ($2,$3,$4) which will cause errors under set -u; update both functions to use default-empty parameter expansion (e.g. use ${2:-} / ${3:-} / ${4:-}) or assign local variables with defaults at the top of each function before testing them, then use those locals in the echo checks so missing args do not trigger unbound variable errors.
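Both defaulting fixes are easy to exercise end to end. A minimal standalone sketch (a hypothetical script body, not the actual validate-deployment.sh):

```shell
#!/usr/bin/env bash
set -uo pipefail

# "${1:-}" expands to empty when no argument was given, so the comparison
# is safe under set -u instead of aborting with "unbound variable".
show_usage() {
  [ "${1:-}" = "--help" ] || [ "${1:-}" = "-h" ]
}

# The same default-empty expansion makes trailing arguments optional.
print_fail() {
  echo "FAIL: $1"
  if [ -n "${2:-}" ]; then
    echo "  Reason: $2"
  fi
}

if show_usage "${1:-}"; then
  echo "usage: validate-deployment.sh [--help]"
fi
print_fail "check failed"             # no second argument, no unbound error
print_fail "check failed" "timeout"
```

Running this with no arguments no longer trips `set -u`, while `--help` and the optional `Reason` argument still behave as before.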
30-31: ⚠️ Potential issue | 🔴 Critical
Unset variable access with `set -u` enabled.

Line 31 accesses `$1` unconditionally. With `set -u` (line 6), this errors when no arguments are provided:

    ./validate-deployment.sh: line 31: $1: unbound variable

Fix:

```diff
-if [ "$1" = "--help" ] || [ "$1" = "-h" ]; then
+if [ "${1:-}" = "--help" ] || [ "${1:-}" = "-h" ]; then
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/validate-deployment.sh` around lines 30 - 31, The script validate-deployment.sh enables set -u but then reads $1 directly in the help check (if [ "$1" = "--help" ] || [ "$1" = "-h" ];), causing an unbound variable error when no arguments are passed; modify the help check to safely handle an empty $1 by either first testing argument count (e.g., if [ "$#" -gt 0 ] && ( ... )) or by using a safe expansion like "${1:-}" in the comparisons (if [ "${1:-}" = "--help" ] || [ "${1:-}" = "-h" ];), keeping the rest of the help logic unchanged.

scripts/deployment-helpers.sh (1)
458-458: ⚠️ Potential issue | 🟠 Major
CWE-78: Command injection via `eval`.

`eval "$check_command"` executes arbitrary shell code. If any caller passes user-controlled or externally-derived input, this is exploitable. Current callers appear to use hardcoded commands, but this is fragile.

Consider replacing with a safer pattern using `bash -c`, or restructuring to accept a command array:

```diff
-  if eval "$check_command"; then
+  if bash -c "$check_command"; then
```

Or better, refactor callers to pass command arrays and use `"$@"` execution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/deployment-helpers.sh` at line 458, Replace the use of eval on the variable check_command (the conditional using if eval "$check_command"; then) to eliminate command injection risk: change callers to pass a command as an array (e.g., check_command_array) and execute it with the safe array expansion ("${check_command_array[@]}") or, if that's not possible, run the string via bash -c with explicit arguments (bash -c -- "$check_command") so no untrusted input is evaluated by eval; update the calling sites to construct and pass an array when possible and adjust the if condition to use the safe execution form.
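The array-based alternative can be sketched as a self-contained function. This is a hypothetical rewrite, assuming callers can pass the check as separate arguments rather than one string:

```shell
# The check arrives as argv ("$@") after the fixed parameters, so it is
# executed directly and never re-parsed by the shell: no eval involved.
wait_for_custom_check() {
  local description=$1 timeout=$2 interval=$3
  shift 3
  local elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if "$@"; then
      echo "$description - Ready"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "ERROR: timed out after ${timeout}s waiting for: $description" >&2
  return 1
}

# Usage: the command and its arguments are passed as-is, so strings
# containing shell metacharacters cannot inject extra commands.
wait_for_custom_check "temp dir present" 5 1 test -d /tmp
```

Because the check is plain argv, a value like `"; rm -rf /"` in an argument is just data, which is the whole point of dropping `eval`.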
🧹 Nitpick comments (1)
test/e2e/scripts/prow_run_smoke_test.sh (1)
51-60: Timeout documentation is partial.

Header lists 6 timeout variables but deployment-helpers.sh defines 16. Consider either documenting all or just referencing the helper file without duplicating the subset.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/scripts/prow_run_smoke_test.sh` around lines 51 - 60, The comment block currently lists only a subset of timeout variables (e.g., CUSTOM_RESOURCE_TIMEOUT, LLMIS_TIMEOUT, MAASMODELREF_TIMEOUT, AUTHPOLICY_TIMEOUT, AUTHORINO_TIMEOUT, ROLLOUT_TIMEOUT) while deployment-helpers.sh actually defines many more; update the header in prow_run_smoke_test.sh to either enumerate all timeout variables defined in deployment-helpers.sh or replace the partial list with a single clear reference line that points users to deployment-helpers.sh for the authoritative list (mentioning deployment-helpers.sh by name so readers know where to look).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/deployment-helpers.sh`:
- Line 967: The local assignment to csv_name masks the function return and
triggers SC2155; change the line so you declare csv_name first and assign
afterward (i.e., use a separate local declaration for csv_name and then set
csv_name=$(find_csv_with_min_version "$operator_prefix" "$min_version"
"$namespace")), referring to the symbol find_csv_with_min_version and variable
csv_name to locate the code to update.
- Around line 1155-1158: The use of "local state=$(kubectl get catalogsource
...)" masks the kubectl exit code; declare the variable first and assign on a
separate line so the command's exit status isn't lost. Change the block around
the catalogsource check to first "local state" and then run "state=$(kubectl get
catalogsource \"$name\" -n \"$namespace\" -o
jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null)"; if you
need the kubectl exit code for branching capture "$?" immediately after the
command. This targets the "state" variable and the "kubectl get catalogsource"
invocation in the script.
---
Outside diff comments:
In `@scripts/deployment-helpers.sh`:
- Line 458: Replace the use of eval on the variable check_command (the
conditional using if eval "$check_command"; then) to eliminate command injection
risk: change callers to pass a command as an array (e.g., check_command_array)
and execute it with the safe array expansion ("${check_command_array[@]}") or,
if that's not possible, run the string via bash -c with explicit arguments (bash
-c -- "$check_command") so no untrusted input is evaluated by eval; update the
calling sites to construct and pass an array when possible and adjust the if
condition to use the safe execution form.
In `@scripts/validate-deployment.sh`:
- Around line 205-227: print_fail and print_warning read optional positional
params ($2,$3,$4) which will cause errors under set -u; update both functions to
use default-empty parameter expansion (e.g. use ${2:-} / ${3:-} / ${4:-}) or
assign local variables with defaults at the top of each function before testing
them, then use those locals in the echo checks so missing args do not trigger
unbound variable errors.
- Around line 30-31: The script validate-deployment.sh enables set -u but then
reads $1 directly in the help check (if [ "$1" = "--help" ] || [ "$1" = "-h"
];), causing an unbound variable error when no arguments are passed; modify the
help check to safely handle an empty $1 by either first testing argument count
(e.g., if [ "$#" -gt 0 ] && ( ... )) or by using a safe expansion like "${1:-}"
in the comparisons (if [ "${1:-}" = "--help" ] || [ "${1:-}" = "-h" ];), keeping
the rest of the help logic unchanged.
---
Nitpick comments:
In `@test/e2e/scripts/prow_run_smoke_test.sh`:
- Around line 51-60: The comment block currently lists only a subset of timeout
variables (e.g., CUSTOM_RESOURCE_TIMEOUT, LLMIS_TIMEOUT, MAASMODELREF_TIMEOUT,
AUTHPOLICY_TIMEOUT, AUTHORINO_TIMEOUT, ROLLOUT_TIMEOUT) while
deployment-helpers.sh actually defines many more; update the header in
prow_run_smoke_test.sh to either enumerate all timeout variables defined in
deployment-helpers.sh or replace the partial list with a single clear reference
line that points users to deployment-helpers.sh for the authoritative list
(mentioning deployment-helpers.sh by name so readers know where to look).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 0718846f-a8e4-44bd-9d72-b06aad384896
📒 Files selected for processing (4)
- scripts/deploy.sh
- scripts/deployment-helpers.sh
- scripts/validate-deployment.sh
- test/e2e/scripts/prow_run_smoke_test.sh
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/deployment-helpers.sh (1)
458-458: ⚠️ Potential issue | 🟠 Major
CWE-78: `eval` usage violates security guideline.

Coding guidelines explicitly prohibit `eval` and dynamic command execution. Line 458 uses `eval "$check_command"`, where `$check_command` contains variable-interpolated kubectl commands from callers. This pattern is fragile and violates the restriction.

Refactor to accept a function name and invoke it directly:

```diff
 wait_for_custom_check() {
   local description=${1?description is required}; shift
-  local check_command=${1?check command is required}; shift
+  local check_func=${1?check function is required}; shift
   local timeout=${1:-$CUSTOM_CHECK_TIMEOUT}; shift || true
   local interval=${1:-5}; shift || true

   log_info "Waiting for: $description (timeout: ${timeout}s)"
   local elapsed=0
   while [ $elapsed -lt $timeout ]; do
-    if eval "$check_command"; then
+    if "$check_func"; then
       log_info "$description - Ready"
       return 0
     fi
```

Callers define a function and pass its name:

```sh
_check_kuadrant_ready() {
  kubectl get kuadrant kuadrant -n "$namespace" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null | grep -q True
}
wait_for_custom_check "Kuadrant ready in $namespace" _check_kuadrant_ready "$kuadrant_initial_timeout" 5
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/deployment-helpers.sh` at line 458, The use of eval on check_command is a security risk; change the API so wait_for_custom_check accepts a check function name (not a shell string) and invoke that function directly instead of eval: update callers to define a function like _check_kuadrant_ready and pass its name, update wait_for_custom_check (replace check_command and the eval call) to take a check_fn parameter and call it (e.g., "$check_fn" or call via bash indirection) and test its exit status, and ensure arguments like namespace/timeouts are forwarded and all variables are properly quoted; remove all eval usages and update any call sites that build command strings.
♻️ Duplicate comments (1)
scripts/deployment-helpers.sh (1)
1156-1159: ⚠️ Potential issue | 🟡 Minor
SC2155: Declare and assign separately to avoid masking return values.

Same issue as the prior fix at line 967. The `local` declaration masks the `kubectl` exit code.

Fix:

```diff
+    local state
-    local state=$(kubectl get catalogsource "$name" -n "$namespace" \
+    state=$(kubectl get catalogsource "$name" -n "$namespace" \
         -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/deployment-helpers.sh` around lines 1156 - 1159, The local assignment for state masks the kubectl exit code; change the two-part declaration so you first run kubectl to capture its output into a temp variable and check its exit status, then do a separate local assignment like using kubectl get catalogsource "$name" -n "$namespace" -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null into a non-local var, check the command succeeded, and only then set local state="$that_var" (referencing the variable name state and the kubectl get catalogsource invocation) so the kubectl exit code is preserved and handled correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@scripts/deployment-helpers.sh`:
- Line 458: The use of eval on check_command is a security risk; change the API
so wait_for_custom_check accepts a check function name (not a shell string) and
invoke that function directly instead of eval: update callers to define a
function like _check_kuadrant_ready and pass its name, update
wait_for_custom_check (replace check_command and the eval call) to take a
check_fn parameter and call it (e.g., "$check_fn" or call via bash indirection)
and test its exit status, and ensure arguments like namespace/timeouts are
forwarded and all variables are properly quoted; remove all eval usages and
update any call sites that build command strings.
---
Duplicate comments:
In `@scripts/deployment-helpers.sh`:
- Around line 1156-1159: The local assignment for state masks the kubectl exit
code; change the two-part declaration so you first run kubectl to capture its
output into a temp variable and check its exit status, then do a separate local
assignment like using kubectl get catalogsource "$name" -n "$namespace" -o
jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null into a
non-local var, check the command succeeded, and only then set local
state="$that_var" (referencing the variable name state and the kubectl get
catalogsource invocation) so the kubectl exit code is preserved and handled
correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 3ae2acfc-d9d9-4869-a359-f644f0d3cbac
📒 Files selected for processing (1)
scripts/deployment-helpers.sh
…hub-io#559)

## Summary
- Remove incorrect `3xx` from the list of response codes that grant access in the model-listing-flow documentation
- The source code (`discovery.go:256-300`) confirms that 3xx codes fall into the `default` switch case which returns `authRetry` (not `authGranted`). Only `2xx` and `405` result in `authGranted`

## Summary by CodeRabbit
## Release Notes
* **Bug Fixes**
  * Updated model listing behavior to refine access validation criteria. The model listing endpoint now filters models more precisely based on HTTP response codes from endpoint probes.
…ub-io#642)

Extend the CEL expression in Tekton PipelineRun definitions to also skip Konflux builds when only `.gitignore`, `OWNERS`, `PROJECT`, `LICENSE`, `.github/`, or `.tekton/` files are changed, matching the existing skip_if_only_changed regex used by the Prow ci-operator jobs. An example of the existing Prow ci-operator job skip condition:

```yaml
skip_if_only_changed: ^docs/|\.md$|^(?:.*/)?(?:\.gitignore|OWNERS|PROJECT|LICENSE|CONTRIBUTING\.md)$|^\.github/|^\.tekton/
```

- https://github.com/openshift/release/blob/01591970ea5008bda7b05a49ce8a0905207182ee/ci-operator/jobs/opendatahub-io/models-as-a-service/opendatahub-io-models-as-a-service-main-presubmits.yaml#L21

## Summary by CodeRabbit
## Release Notes
* **Chores**
  * Updated pull request pipeline triggering conditions to exclude additional configuration files and CI/CD directories from initiating automated checks.
…pendatahub-io#644)

## Summary
Related to https://redhat.atlassian.net/browse/RHOAIENG-55976

This PR makes `tokenRateLimits` required on each `MaaSSubscription` model reference and removes the unused `tokenRateLimitRef` field from the CRD.

## Changes

### API Changes
- ✅ Removed `tokenRateLimitRef` field from `ModelSubscriptionRef`
- ✅ Made `tokenRateLimits` required with `MinItems=1` validation
- ✅ Regenerated CRD manifests and deepcopy code

### Tests
- ✅ Fixed 4 unit tests (added missing field indexes)
- ✅ All tests passing
- ✅ Live cluster validation complete

### Documentation
- ✅ Updated CRD reference docs
- ✅ Updated architecture overview
- ✅ Created migration guide: `docs/content/migration/tokenratelimits-required-3.4.md`
- ✅ Added v3.4.0 release notes with breaking change notice

## Live Cluster Testing Results
(https://console-openshift-console.apps.ci-ln-hdzds3t-76ef8.aws-4.ci.openshift.org/)

Deployed custom controller and verified:
- ✅ Subscriptions **without** `tokenRateLimits` → **REJECTED** with "Required value" error
- ✅ Subscriptions **with** `tokenRateLimits` → **ACCEPTED**
- ✅ Subscriptions with old `tokenRateLimitRef` → **REJECTED** with "unknown field" error
- ✅ Controller reconciliation working correctly
- ✅ No RBAC errors

## Migration Impact

**Before (3.3 and earlier):**
```yaml
modelRefs:
  - name: my-model
    namespace: llm
    tokenRateLimits:        # Optional
      - limit: 1000
        window: 1m
    tokenRateLimitRef: "some-ref"  # Unused by controller
```

**After (3.4+):**
```yaml
modelRefs:
  - name: my-model
    namespace: llm
    tokenRateLimits:        # REQUIRED - at least one
      - limit: 1000
        window: 1m
    # tokenRateLimitRef removed
```

## Migration Guide
See detailed migration instructions in `docs/content/migration/tokenratelimits-required-3.4.md`. Users must add inline `tokenRateLimits` to all `MaaSSubscription` resources before upgrading to 3.4.
## Acceptance Criteria
All criteria from the user story met:
- [x] `tokenRateLimitRef` not present in API
- [x] Each `modelRefs` entry must include non-empty `tokenRateLimits`
- [x] Validation rejects non-compliant resources with clear messages
- [x] Tests cover required limits, no `tokenRateLimitRef` references
- [x] Generated CRD YAML matches types
- [x] Documentation updated
- [x] Migration guide provided
- [x] Breaking change documented in release notes

## Files Changed
- `maas-controller/api/maas/v1alpha1/maassubscription_types.go` - API types
- `deployment/base/maas-controller/crd/bases/maas.opendatahub.io_maassubscriptions.yaml` - CRD manifest
- `maas-controller/api/maas/v1alpha1/zz_generated.deepcopy.go` - Generated code
- `maas-controller/pkg/controller/maas/maassubscription_controller_test.go` - Test fixes
- `docs/content/configuration-and-management/maas-controller-overview.md` - Architecture docs
- `docs/content/reference/crds/maas-subscription.md` - CRD reference
- `docs/content/migration/tokenratelimits-required-3.4.md` - Migration guide (new)
- `docs/content/release-notes/index.md` - Release notes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary by CodeRabbit
## Release Notes
* **Breaking Changes**
  * Rate limit configuration for each model subscription is now mandatory and must be specified inline rather than referenced externally
  * At least one rate limit entry is required per model
  * Rate limit values must be greater than zero
* **Documentation**
  * Updated configuration and CRD documentation to reflect new rate limit requirements

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Separate local declaration from command substitution to preserve kubectl's exit code. While the exit code isn't currently used in this error path, following this best practice prevents potential issues and improves code maintainability.

Before:

    local state=$(kubectl get catalogsource ...)

After:

    local state
    state=$(kubectl get catalogsource ...)

This ensures $? reflects kubectl's exit status, not the `local` builtin.
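The masking is easy to demonstrate in isolation with a toy function, using `false` to stand in for a failing kubectl call:

```shell
demo() {
  # Combined form: `local` itself succeeds, so $? is 0 even though the
  # command substitution failed.
  local masked=$(false)
  local rc_masked=$?

  # Separated form: a plain assignment's exit status is the substituted
  # command's status, so $? is 1.
  local unmasked
  unmasked=$(false)
  local rc_preserved=$?

  echo "$rc_masked $rc_preserved"
}
demo   # prints "0 1"
```

The same reasoning is what ShellCheck's SC2155 warns about: only the separated form lets the caller branch on the command's real exit code.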
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
maas-controller/api/maas/v1alpha1/maassubscription_types.go (1)
78-83: ⚠️ Potential issue | 🟠 Major
Close the remaining `window` validation gap before regenerating the CRD.

Line 82 still accepts `0s`/`0m`/`0h`/`0d`, and `Window` has no `MaxLength`. That leaves this user-facing field able to carry invalid or oversized values even after tightening `limit`. As per coding guidelines, API TYPES SECURITY: Kubebuilder validation markers on all user-facing fields; MaxLength validation on string fields.

Diff:

```diff
 type TokenRateLimit struct {
   // Limit is the maximum number of tokens allowed
   // +kubebuilder:validation:Minimum=1
   Limit int64 `json:"limit"`

   // Window is the time window (e.g., "1m", "1h", "24h")
-  // +kubebuilder:validation:Pattern=`^(\d+)(s|m|h|d)$`
+  // +kubebuilder:validation:MaxLength=20
+  // +kubebuilder:validation:Pattern=`^([1-9]\d*)(s|m|h|d)$`
   Window string `json:"window"`
 }
```
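The tightened pattern can be sanity-checked outside the cluster. A shell sketch using POSIX ERE, with `[0-9]` standing in for `\d`:

```shell
# Mirrors the proposed kubebuilder pattern ^([1-9]\d*)(s|m|h|d)$ so grep
# can evaluate it locally: a positive integer followed by one unit letter.
valid_window() {
  printf '%s' "$1" | grep -Eq '^([1-9][0-9]*)(s|m|h|d)$'
}

for w in 1m 24h 90s 7d; do
  valid_window "$w" && echo "accept: $w"
done
for w in 0s 0m 5x 10; do
  valid_window "$w" || echo "reject: $w"
done
```

Zero-valued windows and unit-less numbers are rejected, which is exactly the gap the review flags in the original `^(\d+)(s|m|h|d)$` pattern.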
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@maas-controller/api/maas/v1alpha1/maassubscription_types.go` around lines 78 - 83, The Window field currently allows zero (e.g., "0s") and has no MaxLength; update the kubebuilder markers on the Window field to enforce a positive integer and a bounded length: replace the existing Pattern with one that disallows zero such as ^([1-9]\d*)(s|m|h|d)$ and add a MaxLength marker (e.g., +kubebuilder:validation:MaxLength=6) above the Window string definition, then regenerate the CRD so the new validation is applied; target the Window field in maassubscription_types.go to make this change.

docs/content/reference/crds/maas-subscription.md (1)
27-35: ⚠️ Potential issue | 🟡 Minor
Document the new `limit >= 1` constraint here too.

Line 27 updates `tokenRateLimits`, but Line 34 still reads as if any `int64` is valid. The CRD now rejects `tokenRateLimits[].limit: 0`, so this table will send users into avoidable validation errors.

Diff:

```diff
-| limit | int64 | Yes | Maximum number of tokens allowed |
+| limit | int64 | Yes | Maximum number of tokens allowed. Minimum: `1` |
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/content/reference/crds/maas-subscription.md` around lines 27 - 35, Update the TokenRateLimit documentation to state the minimum value constraint: change the description for the limit field in the TokenRateLimit table (referenced as TokenRateLimit and tokenRateLimits) to indicate the integer must be >= 1 (e.g., "int64 (minimum 1)" or "Maximum number of tokens allowed (must be >= 1)") so it matches the CRD validation that rejects tokenRateLimits[].limit == 0.
🧹 Nitpick comments (1)
maas-controller/pkg/controller/maas/maassubscription_controller_test.go (1)
400: Use the shared index-key constant instead of repeating the raw field name.

These fake-client registrations have to stay byte-for-byte aligned with the reconciler's `MatchingFields` key. Adding more `"spec.modelRef"` literals here creates extra drift points; use `modelRefIndexKey` so a future rename stays atomic.

Diff:

```diff
-  WithIndex(&maasv1alpha1.MaaSSubscription{}, "spec.modelRef", subscriptionModelRefIndexer).
+  WithIndex(&maasv1alpha1.MaaSSubscription{}, modelRefIndexKey, subscriptionModelRefIndexer).
```

Also applies to: 496, 722, 816
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@maas-controller/pkg/controller/maas/maassubscription_controller_test.go` at line 400, Replace the repeated raw index key string "spec.modelRef" used in the fake-client WithIndex registrations with the shared constant modelRefIndexKey so the tests stay byte-for-byte aligned with the reconciler's MatchingFields; update each call that currently uses "spec.modelRef" (e.g., the WithIndex call that uses subscriptionModelRefIndexer for maasv1alpha1.MaaSSubscription) to pass modelRefIndexKey instead of the literal, and apply the same change at the other occurrences mentioned (lines around the second, third and fourth hits).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.tekton/odh-maas-controller-pull-request.yaml:
- Around line 12-14: The PR skip pattern currently excludes changes under
".tekton/" (the string literal '.tekton/' in the conditional), so pipeline runs
are suppressed for pipeline config changes; remove '.tekton/' from the OR-list
of matches in the conditional expression (the clause containing
x.matches('^\\.tekton/')) in the PR pipeline YAML files (both the pull-request
YAML shown and the analogous .tekton/odh-maas-api-pull-request.yaml) so that
modifications to .tekton/ will no longer be skipped and the validation pipeline
will run.
In `@docs/content/release-notes/index.md`:
- Around line 7-9: The release notes mention "See the overall migration plan"
but leave a placeholder; update the text under the Version 3.4.0 entry to point
to a real migration document or anchor (e.g., replace the phrase "See the
overall migration plan for detailed upgrade instructions from previous
versions." with a direct link or relative path to the migration guide or an
in-page anchor like "#migration-plan"); ensure the link target exists (create a
migration guide page or add an anchor named "migration-plan" in the docs) so
readers of the release-notes index can follow concrete upgrade instructions.
---
Outside diff comments:
In `@docs/content/reference/crds/maas-subscription.md`:
- Around line 27-35: Update the TokenRateLimit documentation to state the
minimum value constraint: change the description for the limit field in the
TokenRateLimit table (referenced as TokenRateLimit and tokenRateLimits) to
indicate the integer must be >= 1 (e.g., "int64 (minimum 1)" or "Maximum number
of tokens allowed (must be >= 1)") so it matches the CRD validation that rejects
tokenRateLimits[].limit == 0.
In `@maas-controller/api/maas/v1alpha1/maassubscription_types.go`:
- Around line 78-83: The Window field currently allows zero (e.g., "0s") and has
no MaxLength; update the kubebuilder markers on the Window field to enforce a
positive integer and a bounded length: replace the existing Pattern with one
that disallows zero such as ^([1-9]\d*)(s|m|h|d)$ and add a MaxLength marker
(e.g., +kubebuilder:validation:MaxLength=6) above the Window string definition,
then regenerate the CRD so the new validation is applied; target the Window
field in maassubscription_types.go to make this change.
---
Nitpick comments:
In `@maas-controller/pkg/controller/maas/maassubscription_controller_test.go`:
- Line 400: Replace the repeated raw index key string "spec.modelRef" used in
the fake-client WithIndex registrations with the shared constant
modelRefIndexKey so the tests stay byte-for-byte aligned with the reconciler's
MatchingFields; update each call that currently uses "spec.modelRef" (e.g., the
WithIndex call that uses subscriptionModelRefIndexer for
maasv1alpha1.MaaSSubscription) to pass modelRefIndexKey instead of the literal,
and apply the same change at the other occurrences mentioned (lines around the
second, third and fourth hits).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: eaa44915-16ce-4c3d-b0e0-b737ebd849bc
📒 Files selected for processing (11)
- .tekton/odh-maas-api-pull-request.yaml
- .tekton/odh-maas-controller-pull-request.yaml
- deployment/base/maas-controller/crd/bases/maas.opendatahub.io_maassubscriptions.yaml
- docs/content/configuration-and-management/maas-controller-overview.md
- docs/content/configuration-and-management/model-listing-flow.md
- docs/content/reference/crds/maas-subscription.md
- docs/content/release-notes/index.md
- maas-controller/api/maas/v1alpha1/maassubscription_types.go
- maas-controller/api/maas/v1alpha1/zz_generated.deepcopy.go
- maas-controller/pkg/controller/maas/maassubscription_controller_test.go
- scripts/deployment-helpers.sh
💤 Files with no reviewable changes (1)
- maas-controller/api/maas/v1alpha1/zz_generated.deepcopy.go
✅ Files skipped from review due to trivial changes (2)
- docs/content/configuration-and-management/maas-controller-overview.md
- docs/content/configuration-and-management/model-listing-flow.md
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/deployment-helpers.sh
Version 3.4.0 introduces new CRDs and API resources that are not compatible with previous versions. All MaaS custom resources (`MaaSModelRef`, `MaaSAuthPolicy`, `MaaSSubscription`) are new in this release.

**Migration:** See the overall migration plan for detailed upgrade instructions from previous versions.
Replace the migration placeholder with a real link or path.
Line 9 tells readers to see an “overall migration plan” but doesn’t point to any document or anchor. For a breaking-change entry, that leaves upgrade guidance as a dead end.
🧰 Tools
🪛 LanguageTool
[style] ~7-~7: Consider using “incompatible” to avoid wordiness.
Context: ...ces new CRDs and API resources that are not compatible with previous versions. All MaaS custom...
(NOT_ABLE_PREMIUM)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/content/release-notes/index.md` around lines 7 - 9, The release notes
mention "See the overall migration plan" but leave a placeholder; update the
text under the Version 3.4.0 entry to point to a real migration document or
anchor (e.g., replace the phrase "See the overall migration plan for detailed
upgrade instructions from previous versions." with a direct link or relative
path to the migration guide or an in-page anchor like "#migration-plan"); ensure
the link target exists (create a migration guide page or add an anchor named
"migration-plan" in the docs) so readers of the release-notes index can follow
concrete upgrade instructions.
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
cat docs/content/release-notes/index.md
```
Length of output: 437
@somya-bhatnagar, the current branch still has the placeholder text with no actual link — the file reads:
**Migration:** See the overall migration plan for detailed upgrade instructions from previous versions.
The fix doesn't appear to have been committed yet. Could you double-check? The line should point to a real document or anchor, for example:
```markdown
**Migration:** See the [migration guide](../migration/v3.4.0.md) for detailed upgrade instructions from previous versions.
```

(adjusting the path to wherever the migration document actually lives in your docs tree). If the migration guide doesn't exist yet, you could either create a stub page or add an inline anchor and note that the full guide is forthcoming — just so readers aren't left at a dead end on a breaking-change release.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.tekton/odh-maas-controller-pull-request.yaml:
- Around line 11-14: In the pipelinesascode on-cel expression (the annotation
key pipelinesascode.tekton.dev/on-cel-expression) remove the extra skip pattern
"|| x.matches('^\\.tekton/')" so the CEL boolean expression matches the
controller behavior; locate the expression containing files.all.all(...) and
delete the x.matches('^\\.tekton/') clause (including the preceding ||) so
.tekton/ changes are not excluded from validation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 5ec8ad56-5f96-4093-98ac-51e447cb5ade
📒 Files selected for processing (1)
.tekton/odh-maas-controller-pull-request.yaml
/retest
…ub-io#642) Extend the CEL expression in Tekton PipelineRun definitions to also skip Konflux builds when only `.gitignore`, `OWNERS`, `PROJECT`, `LICENSE`, `.github/`, or `.tekton/` files are changed — matching the existing `skip_if_only_changed` regex used by the Prow ci-operator jobs.

An example of the existing Prow ci-operator job skip condition:

```yaml
skip_if_only_changed: ^docs/|\.md$|^(?:.*/)?(?:\.gitignore|OWNERS|PROJECT|LICENSE|CONTRIBUTING\.md)$|^\.github/|^\.tekton/
```

- https://github.com/openshift/release/blob/01591970ea5008bda7b05a49ce8a0905207182ee/ci-operator/jobs/opendatahub-io/models-as-a-service/opendatahub-io-models-as-a-service-main-presubmits.yaml#L21

**Chores**
* Updated pull request pipeline triggering conditions to exclude additional configuration files and CI/CD directories from initiating automated checks.
Remove the extra skip pattern || x.matches('^\.tekton/') from the
CEL expression so that changes to .tekton/ pipeline files are validated
by the Konflux builds, matching the expected controller behavior.
PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@somya-bhatnagar: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard.
Closing the ticket; will create a new one.
Summary
Improve robustness of deploy and test scripts with consistent error handling, configurable timeouts, and idempotent operations to make failures easier to diagnose and re-runs predictable.
Story
Acceptance Criteria
Changes
Error Handling
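The wait helpers now fail loudly instead of silently continuing past a timeout. A minimal sketch of the pattern, assuming `kubectl` is on the PATH; the function name `wait_for_rollout` and the 300-second default are illustrative, not the exact code in `scripts/deployment-helpers.sh`:

```shell
# Illustrative helper: returns non-zero on timeout and logs the
# configured timeout value so failures are easy to diagnose.
ROLLOUT_TIMEOUT="${ROLLOUT_TIMEOUT:-300}"

wait_for_rollout() {
  local deployment="$1" namespace="$2"
  if ! kubectl rollout status "deployment/${deployment}" \
      -n "${namespace}" --timeout="${ROLLOUT_TIMEOUT}s"; then
    echo "ERROR: deployment/${deployment} in ${namespace} not ready within ${ROLLOUT_TIMEOUT}s" >&2
    return 1
  fi
  echo "deployment/${deployment} is ready"
}
```

Because the helper returns non-zero, callers running under `set -e` (or checking `$?`) stop at the first failed wait instead of cascading into later steps with a broken deployment.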
Timeout Configuration
Added 16 configurable timeout environment variables (defined in `scripts/deployment-helpers.sh`):
Core Timeouts:
Infrastructure Timeouts:
Additional: `RESOURCE_TIMEOUT`, `POD_TIMEOUT`, `WEBHOOK_TIMEOUT`, `CUSTOM_CHECK_TIMEOUT`, `KUBECONFIG_WAIT_TIMEOUT`, `CATALOGSOURCE_TIMEOUT`
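All of these follow the same environment-overridable default pattern in `scripts/deployment-helpers.sh`. A sketch with assumed default values (the real defaults live in the script):

```shell
# Each timeout takes its value from the environment when set,
# otherwise falls back to the script's default, then is frozen
# so later accidental reassignment becomes a hard error.
readonly ROLLOUT_TIMEOUT="${ROLLOUT_TIMEOUT:-300}"
readonly CUSTOM_RESOURCE_TIMEOUT="${CUSTOM_RESOURCE_TIMEOUT:-600}"

echo "rollout=${ROLLOUT_TIMEOUT}s custom-resource=${CUSTOM_RESOURCE_TIMEOUT}s"
```

The `${VAR:-default}` expansion is what makes the one-off overrides in the usage example below possible without editing the script.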
Usage Example:
```bash
# Deploy with extended timeouts for a slow cluster
CUSTOM_RESOURCE_TIMEOUT=900 LLMIS_TIMEOUT=600 ./scripts/deploy.sh --operator-type odh

# Run E2E tests with custom timeouts
ROLLOUT_TIMEOUT=60 AUTHORINO_TIMEOUT=60 ./test/e2e/scripts/prow_run_smoke_test.sh
```
Idempotent Operations
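The creation paths now check for an existing resource before creating it, so re-running the deploy script is safe. A minimal sketch of the namespace case (the Gateway and GatewayClass checks follow the same shape); `ensure_namespace` is an illustrative name, not necessarily the helper's real name:

```shell
# Create the namespace only if it does not already exist.
ensure_namespace() {
  local ns="$1"
  if kubectl get namespace "${ns}" >/dev/null 2>&1; then
    echo "namespace/${ns} already exists, skipping"
  else
    kubectl create namespace "${ns}"
  fi
}
```

An alternative with the same effect is `kubectl create namespace "${ns}" --dry-run=client -o yaml | kubectl apply -f -`, but the explicit existence check makes the skip visible in the logs.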
Files Modified
Testing
Validation Performed
Related Issues
Addresses requirements for robust CI/CD execution with clear diagnostics and configurable timeouts for different cluster speeds.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests