chore: sync rhoai-3.4 with main#392
Merged
chaitanya1731 merged 18 commits intored-hat-data-services:rhoai-3.4from Apr 22, 2026
Merged
chore: sync rhoai-3.4 with main#392chaitanya1731 merged 18 commits intored-hat-data-services:rhoai-3.4from
chaitanya1731 merged 18 commits intored-hat-data-services:rhoai-3.4from
Conversation
opendatahub-io#765) …wrapper ## Description KServe's LLMInferenceServiceConfig template injects `command: ["/bin/bash", "-c"]` for containers without an explicit command, causing v0.8.2 simulator models to crash with "invalid option" errors. This adds the explicit command back to all v0.8.2 simulator model YAMLs and updates trlp-test to v0.8.2 with consistent args. ## How Has This Been Tested? Currently running smoke test, but models no longer in CrashLoopBackOff. ``` $ oc get pods -n llm NAME READY STATUS RESTARTS AGE e2e-distinct-2-simulated-kserve-7f849f6b56-kpwp9 1/1 Running 0 21s e2e-distinct-simulated-kserve-7bb4cdb4d7-frnz5 1/1 Running 0 87s e2e-trlp-test-simulated-kserve-84db68679b-t98f7 1/1 Running 0 64s e2e-unconfigured-facebook-opt-125m-simulated-kserve-75cdcctjp2d 1/1 Running 0 66s facebook-opt-125m-simulated-kserve-8f8dc67b7-4x7g9 1/1 Running 0 57s premium-simulated-simulated-premium-kserve-6b97b89985-ln8r2 1/1 Running 0 70s ``` ## Merge criteria: <!--- This PR will be merged by any repository approver when it meets all the points in the checklist --> <!--- Go over all the following points, and put an `x` in all the boxes that apply. --> - [ ] The commits are squashed in a cohesive manner and have meaningful messages. - [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [ ] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **Documentation** * Updated sample model configurations to explicitly specify container execution commands for improved clarity and consistency across all sample deployments. * **Tests** * Upgraded test simulator fixture to version 0.8.2 with enhanced configuration options. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
the CRD initially was very restrictive, with enum specifying the providers. the PR make the field less strict, to avoid some issues around it in payload processing. generally speaking, this CRD would need to update, and current change is a temp solution. ## Description <!--- Describe your changes in detail --> ## How Has This Been Tested? <!--- Please describe in detail how you tested your changes. --> <!--- Include details of your testing environment, and the tests you ran to --> <!--- see how your change affects other areas of the code, etc. --> ## Merge criteria: <!--- This PR will be merged by any repository approver when it meets all the points in the checklist --> <!--- Go over all the following points, and put an `x` in all the boxes that apply. --> - [ ] The commits are squashed in a cohesive manner and have meaningful messages. - [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [ ] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Extended provider support to accept custom provider values beyond the previously hardcoded set, enabling greater flexibility in model configuration. * **Bug Fixes** * Refined resource access permissions to improve security posture by limiting scope to essential resources. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Nir Rozenbaum <nrozenba@redhat.com>
This is a minor change, relevant for people working with Git worktrees. The feature requires a working directory, and `.worktrees` is a popular name for it (in the project root directory). ## How Has This Been Tested? Git ignores the directory after updating `.gitignore` properly ## Merge criteria: - [x] The commits are squashed in a cohesive manner and have meaningful messages. - [x] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [x] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Updated `.gitignore` to exclude Git worktree directories from version control. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Automated promotion of **1 commit(s)** from `main` to `stable`. ``` 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (opendatahub-io#765) ``` --------- Signed-off-by: Nir Rozenbaum <nrozenba@redhat.com> Co-authored-by: Jim Rhyness <jrhyness@redhat.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Nir Rozenbaum <nrozenba@redhat.com> Co-authored-by: Egor Lunin <egor@centerya.com>
…tion (opendatahub-io#752) <!--- Provide a general summary of your changes in the Title above --> ## Description Prevent invalid token rate limits from poisoning the aggregated TokenRateLimitPolicy and blocking deletion of subscriptions that share a model. Changes: - Add validateTokenRateLimit() that rejects unreasonable values before they reach Kuadrant (limit > 1B tokens, window > 365 days, bad format) - Skip subscriptions with invalid limits during TRLP aggregation instead of including them and causing Kuadrant validation failures - Log clear error messages identifying the offending subscription - Add Maximum=1000000000 CRD validation on TokenRateLimit.Limit to reject extreme values at admission time Root cause: When one subscription had invalid limits (e.g. limit: 9007199254740991, window: 9007199254740991d), the controller built an aggregated TRLP that Kuadrant rejected. On delete, the controller tried to rebuild the TRLP (still including the bad sub if others shared the model), which failed again, preventing finalizer removal and leaving subscriptions stuck in Terminating. <!--- Describe your changes in detail --> ## How Has This Been Tested? <!--- Please describe in detail how you tested your changes. --> <!--- Include details of your testing environment, and the tests you ran to --> <!--- see how your change affects other areas of the code, etc. --> ## Merge criteria: <!--- This PR will be merged by any repository approver when it meets all the points in the checklist --> <!--- Go over all the following points, and put an `x` in all the boxes that apply. --> - [ ] The commits are squashed in a cohesive manner and have meaningful messages. - [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [ ] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Enforced token rate limit validation (allowed range 1–1,000,000,000) and stricter window format/duration checks. * Subscriptions with invalid limits are skipped during aggregation; if all candidates are invalid, existing aggregated rate limits are removed. * **Documentation** * Clarified CRD schema description to state the maximum allowed token limit. * **Tests** * Updated test input to reflect the adjusted maximum window value. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Wen Liang <liangwen12year@gmail.com>
…datahub-io#703) ## Description Enable users to discover available models using OIDC tokens without first minting an API key. OIDC tokens behave identically to K8s tokens: return all accessible models from all subscriptions based on group membership. For https://redhat.atlassian.net/browse/RHOAIENG-51554 Changes: - Controller reads OIDC config from ModelsAsService CR (spec.externalOIDC) - Dynamically generates oidc-identities authentication in model AuthPolicies - Updated CEL expressions to extract username/groups from both OIDC and K8s tokens - Updated OPA rules to handle OIDC token structure (auth.identity.sub, preferred_username) - Added unit tests for OIDC config fetching and CEL expression validation - Added E2E tests for OIDC token authentication and negative test cases OIDC authentication uses JWT signature validation (no per-request callout). Authorino validates tokens using public keys from the OIDC provider's issuerUrl. ## How Has This Been Tested? - unit test - new e2e tests ## Merge criteria: - [ ] The commits are squashed in a cohesive manner and have meaningful messages. - [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [ ] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * OIDC bearer-token support for API requests; controller discovers cluster OIDC issuer and optional client ID to enable JWT-based auth for model listing. Username/groups are derived from API keys, OIDC claims, or Kubernetes identity for subscription selection and cache keys. * **Chores** * RBAC updated and controller now watches cluster service configuration to refresh auth policies. * **Tests** * Added unit and e2e tests for OIDC config handling, claim composition, CEL expressions, and token-based access. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
… digest to 1e1c895 (opendatahub-io#360) Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com> Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
…r digest to 175bafd (opendatahub-io#361) Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com> Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
…#772) <!--- Provide a general summary of your changes in the Title above --> ## Description The top-levels values of Total Requests, Total Errors, Success Rate, and Active Users, are now calculated also based on the selected model(s). ## How Has This Been Tested? Tested manually ## Merge criteria: <!--- This PR will be merged by any repository approver when it meets all the points in the checklist --> <!--- Go over all the following points, and put an `x` in all the boxes that apply. --> - [x] The commits are squashed in a cohesive manner and have meaningful messages. - [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [x] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Improved Usage dashboard to be model-aware and aggregated per user/subscription/namespace, enhancing accuracy for Active Users, Success Rate, Token Consumption, Total Errors, and Total Requests. * Updated gating and fallback logic for authorization and rate-limit signals to produce more reliable time-series and per-user token consumption reporting. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Arik Hadas <ahadas@redhat.com>
… digest to d637b9d (opendatahub-io#371) Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com> Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
<!--- Provide a general summary of your changes in the Title above --> ## Description Add / Update documentation for Maas Tenant based architecture change. [RHOAIENG-58939](https://redhat.atlassian.net/browse/RHOAIENG-58939) ## How Has This Been Tested? N/A - docs only changes ## Merge criteria: <!--- This PR will be merged by any repository approver when it meets all the points in the checklist --> <!--- Go over all the following points, and put an `x` in all the boxes that apply. --> - [x] The commits are squashed in a cohesive manner and have meaningful messages. - [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). - [ ] The developer has manually tested the changes and verified that the changes work <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Updated installation and setup to use the Tenant CR as the primary platform configuration and documented controller self-bootstrapping of a default-tenant * Clarified architecture: Tenant reconciler responsibilities, resource ownership/cleanup, and reconciler split * Updated quickstart and verification steps, namespace/resource locations, and command examples * Added Tenant CR details to release notes and configuration guidance <!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--- Provide a general summary of your changes in the Title above -->
## Description
Restrict the maas-controller-role ClusterRole so that get on secrets is
scoped to only `maas-db-config` via resourceNames, rather than granting
get on all secrets cluster-wide.
Why list/watch remains unrestricted: The controller-runtime informer
started by `Watches(&corev1.Secret{})` requires list and watch.
Kubernetes RBAC does not support resourceNames on collection verbs — the
API server ignores it. The Watch already has a client-side predicate
(`secretNamedMaaSDB`) that filters to only `maas-db-config`.
Addresses: lphiri's feedback on PR opendatahub-io#735 — "The DB secret has a specific
name we know ahead of time. I suggest creating a role with a
resourceName constraint."
https://redhat.atlassian.net/browse/RHOAIENG-58934
How Has This Been Tested?
```
make -C maas-controller manifests — generated YAML matches updated markers
make -C maas-controller test — all unit tests pass (no logic changes)
make -C maas-controller verify-codegen — confirms markers and YAML are in sync
```
To verify on a cluster after deploy:
```
# Should return "yes"
kubectl auth can-i get secrets/maas-db-config \
--as=system:serviceaccount:opendatahub:maas-controller -n opendatahub
# Should return "no"
kubectl auth can-i get secrets/some-other-secret \
--as=system:serviceaccount:opendatahub:maas-controller -n opendatahub
```
## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->
- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [x] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that
the changes work
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Security Updates**
* Restricted secret access: get permission is now limited to the
specific maas-db-config secret.
* List and watch permissions for secrets remain available, preserving
discovery and monitoring capabilities while reducing exposure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…ndatahub-io#787) ## Summary Add the `tenants.maas.opendatahub.io` CRD to e2e must-gather artifact collection and the auth debug report. ## Description The Tenant CRD was recently introduced but was missing from the e2e debugging and artifact-collection utilities in `auth_utils.sh`. - Add `tenants.maas.opendatahub.io` to the `MAAS_CRDS` array so `collect_maas_crs()` dumps Tenant CR YAML to `tenants.yaml`. - Add `kubectl get tenants` to `collect_cluster_state()` alongside other MaaS CRs. - Add Tenant listing and status/condition detail to `run_auth_debug_report()` under the MaaS CRs section. - Update header comment to document the new `tenants.yaml` artifact. ## How it was tested - Verified script syntax with `bash -n`. - Confirmed the Tenant CRD name matches `deployment/base/maas-controller/crd/bases/maas.opendatahub.io_tenants.yaml`. - Confirmed namespace usage aligns with `TenantReconciler.TenantNamespace` (sourced from `--maas-subscription-namespace`). Made with [Cursor](https://cursor.com) Signed-off-by: Chaitanya Kulkarni <ckulkarn@redhat.com> Signed-off-by: Chaitanya Kulkarni <chkulkar@redhat.com>
Automated promotion of **9 commit(s)** from `main` to `stable`. ``` fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (opendatahub-io#787) 1b8f212 chore: restrict rbac for db secret (opendatahub-io#779) e746008 docs: add/update documentation for Maas Tenant (opendatahub-io#773) 147eaa2 fix: per-model(s) top-level values in usage dashboard (opendatahub-io#772) b327b34 feat: add OIDC token support for model discovery via /v1/models (opendatahub-io#703) dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (opendatahub-io#752) fae753e chore: add .worktrees/ to .gitignore (opendatahub-io#774) c01dc5b fix: minor updates for external model (opendatahub-io#771) 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (opendatahub-io#765) ```
Automated promotion of **11 commit(s)** from `stable` to `rhoai`. ``` fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (opendatahub-io#787) 1b8f212 chore: restrict rbac for db secret (opendatahub-io#779) e746008 docs: add/update documentation for Maas Tenant (opendatahub-io#773) 147eaa2 fix: per-model(s) top-level values in usage dashboard (opendatahub-io#772) b327b34 feat: add OIDC token support for model discovery via /v1/models (opendatahub-io#703) dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (opendatahub-io#752) 89fba29 chore: promote main to stable (opendatahub-io#770) fae753e chore: add .worktrees/ to .gitignore (opendatahub-io#774) c01dc5b fix: minor updates for external model (opendatahub-io#771) 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (opendatahub-io#765) ```
…r digest to 7d4e475 (opendatahub-io#384) Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com> Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
jland-redhat
approved these changes
Apr 22, 2026
703a624
into
red-hat-data-services:rhoai-3.4
10 checks passed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Sync
rhoai-3.4branch with the latest changes frommainto keep the release branch up to date.Description
This PR merges
mainintorhoai-3.4, bringing in 17 commits. Merge completed cleanly with no conflicts.33 files changed across controller code, CRDs, RBAC, dashboards, docs, e2e tests, and Dockerfiles.
Commits included
How It Was Tested