Skip to content

chore: promote main to stable#788

Merged
chaitanya1731 merged 9 commits intostablefrom
main
Apr 22, 2026
Merged

chore: promote main to stable#788
chaitanya1731 merged 9 commits intostablefrom
main

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

Automated promotion of 9 commit(s) from main to stable.

fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (#787)
1b8f212 chore: restrict rbac for db secret (#779)
e746008 docs: add/update documentation for Maas Tenant (#773)
147eaa2 fix: per-model(s) top-level values in usage dashboard (#772)
b327b34 feat: add OIDC token support for model discovery via /v1/models (#703)
dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (#752)
fae753e chore: add .worktrees/ to .gitignore (#774)
c01dc5b fix: minor updates for external model (#771)
65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (#765)

jrhyness and others added 9 commits April 17, 2026 21:30
#765)

…wrapper

## Description

KServe's LLMInferenceServiceConfig template injects `command:
["/bin/bash", "-c"]` for containers without an explicit command, causing
v0.8.2 simulator models to crash with "invalid option" errors. This adds
the explicit command back to all v0.8.2 simulator model YAMLs and
updates trlp-test to v0.8.2 with consistent args.


## How Has This Been Tested?
Currently running smoke test, but models no longer in CrashLoopBackOff. 
```
$ oc get pods -n llm
NAME                                                              READY   STATUS    RESTARTS   AGE
e2e-distinct-2-simulated-kserve-7f849f6b56-kpwp9                  1/1     Running   0          21s
e2e-distinct-simulated-kserve-7bb4cdb4d7-frnz5                    1/1     Running   0          87s
e2e-trlp-test-simulated-kserve-84db68679b-t98f7                   1/1     Running   0          64s
e2e-unconfigured-facebook-opt-125m-simulated-kserve-75cdcctjp2d   1/1     Running   0          66s
facebook-opt-125m-simulated-kserve-8f8dc67b7-4x7g9                1/1     Running   0          57s
premium-simulated-simulated-premium-kserve-6b97b89985-ln8r2       1/1     Running   0          70s
```


## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [ ] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Documentation**
* Updated sample model configurations to explicitly specify container
execution commands for improved clarity and consistency across all
sample deployments.

* **Tests**
* Upgraded test simulator fixture to version 0.8.2 with enhanced
configuration options.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Sonnet 4.5 <[email protected]>
the CRD initially was very restrictive, with enum specifying the
providers.
the PR make the field less strict, to avoid some issues around it in
payload processing.
generally speaking, this CRD would need to update, and current change is
a temp solution.

## Description
<!--- Describe your changes in detail -->

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran
to -->
<!--- see how your change affects other areas of the code, etc. -->

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [ ] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Extended provider support to accept custom provider values beyond the
previously hardcoded set, enabling greater flexibility in model
configuration.

* **Bug Fixes**
* Refined resource access permissions to improve security posture by
limiting scope to essential resources.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Nir Rozenbaum <[email protected]>
This is a minor change, relevant for people working with Git worktrees.
The feature requires a working directory, and `.worktrees` is a popular
name for it (in the project root directory).

## How Has This Been Tested?
Git ignores the directory after updating `.gitignore` properly

## Merge criteria:
- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [x] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
* Updated `.gitignore` to exclude Git worktree directories from version
control.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
…tion (#752)

<!--- Provide a general summary of your changes in the Title above -->

## Description
Prevent invalid token rate limits from poisoning the aggregated
TokenRateLimitPolicy and blocking deletion of subscriptions that share a
model.

Changes:
- Add validateTokenRateLimit() that rejects unreasonable values before
they reach Kuadrant (limit > 1B tokens, window > 365 days, bad format)
- Skip subscriptions with invalid limits during TRLP aggregation instead
of including them and causing Kuadrant validation failures
- Log clear error messages identifying the offending subscription
- Add Maximum=1000000000 CRD validation on TokenRateLimit.Limit to
reject extreme values at admission time

Root cause: When one subscription had invalid limits (e.g. limit:
9007199254740991, window: 9007199254740991d), the controller built an
aggregated TRLP that Kuadrant rejected. On delete, the controller tried
to rebuild the TRLP (still including the bad sub if others shared the
model), which failed again, preventing finalizer removal and leaving
subscriptions stuck in Terminating.


<!--- Describe your changes in detail -->

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran
to -->
<!--- see how your change affects other areas of the code, etc. -->

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [ ] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Enforced token rate limit validation (allowed range 1–1,000,000,000)
and stricter window format/duration checks.
* Subscriptions with invalid limits are skipped during aggregation; if
all candidates are invalid, existing aggregated rate limits are removed.
* **Documentation**
* Clarified CRD schema description to state the maximum allowed token
limit.
* **Tests**
  * Updated test input to reflect the adjusted maximum window value.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Wen Liang <[email protected]>
## Description

Enable users to discover available models using OIDC tokens without
first minting an API key. OIDC tokens behave identically to K8s tokens:
return all accessible models from all subscriptions based on group
membership.

For  https://redhat.atlassian.net/browse/RHOAIENG-51554



Changes:
- Controller reads OIDC config from ModelsAsService CR
(spec.externalOIDC)
- Dynamically generates oidc-identities authentication in model
AuthPolicies
- Updated CEL expressions to extract username/groups from both OIDC and
K8s tokens
- Updated OPA rules to handle OIDC token structure (auth.identity.sub,
preferred_username)
- Added unit tests for OIDC config fetching and CEL expression
validation
- Added E2E tests for OIDC token authentication and negative test cases

OIDC authentication uses JWT signature validation (no per-request
callout). Authorino validates tokens using public keys from the OIDC
provider's issuerUrl.



## How Has This Been Tested?
- unit test
- new e2e tests 


## Merge criteria:

- [ ] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* OIDC bearer-token support for API requests; controller discovers
cluster OIDC issuer and optional client ID to enable JWT-based auth for
model listing. Username/groups are derived from API keys, OIDC claims,
or Kubernetes identity for subscription selection and cache keys.

* **Chores**
* RBAC updated and controller now watches cluster service configuration
to refresh auth policies.

* **Tests**
* Added unit and e2e tests for OIDC config handling, claim composition,
CEL expressions, and token-based access.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Sonnet 4.5 <[email protected]>
<!--- Provide a general summary of your changes in the Title above -->

## Description
The top-levels values of Total Requests, Total Errors, Success Rate, and
Active Users, are now calculated also based on the selected model(s).

## How Has This Been Tested?
Tested manually

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Improved Usage dashboard to be model-aware and aggregated per
user/subscription/namespace, enhancing accuracy for Active Users,
Success Rate, Token Consumption, Total Errors, and Total Requests.
* Updated gating and fallback logic for authorization and rate-limit
signals to produce more reliable time-series and per-user token
consumption reporting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Arik Hadas <[email protected]>
<!--- Provide a general summary of your changes in the Title above -->

## Description
Add / Update documentation for Maas Tenant based architecture change.

[RHOAIENG-58939](https://redhat.atlassian.net/browse/RHOAIENG-58939)

## How Has This Been Tested?
N/A - docs only changes

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Updated installation and setup to use the Tenant CR as the primary
platform configuration and documented controller self-bootstrapping of a
default-tenant
* Clarified architecture: Tenant reconciler responsibilities, resource
ownership/cleanup, and reconciler split
* Updated quickstart and verification steps, namespace/resource
locations, and command examples
  * Added Tenant CR details to release notes and configuration guidance
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--- Provide a general summary of your changes in the Title above -->

## Description
Restrict the maas-controller-role ClusterRole so that get on secrets is
scoped to only `maas-db-config` via resourceNames, rather than granting
get on all secrets cluster-wide.

Why list/watch remains unrestricted: The controller-runtime informer
started by `Watches(&corev1.Secret{})` requires list and watch.
Kubernetes RBAC does not support resourceNames on collection verbs — the
API server ignores it. The Watch already has a client-side predicate
(`secretNamedMaaSDB`) that filters to only `maas-db-config`.

Addresses: lphiri's feedback on PR #735 — "The DB secret has a specific
name we know ahead of time. I suggest creating a role with a
resourceName constraint."

https://redhat.atlassian.net/browse/RHOAIENG-58934

How Has This Been Tested?
```
make -C maas-controller manifests — generated YAML matches updated markers
make -C maas-controller test — all unit tests pass (no logic changes)
make -C maas-controller verify-codegen — confirms markers and YAML are in sync
```
To verify on a cluster after deploy:
```
# Should return "yes"
kubectl auth can-i get secrets/maas-db-config \
  --as=system:serviceaccount:opendatahub:maas-controller -n opendatahub
# Should return "no"
kubectl auth can-i get secrets/some-other-secret \
  --as=system:serviceaccount:opendatahub:maas-controller -n opendatahub
```

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [x] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Security Updates**
* Restricted secret access: get permission is now limited to the
specific maas-db-config secret.
* List and watch permissions for secrets remain available, preserving
discovery and monitoring capabilities while reducing exposure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Add the `tenants.maas.opendatahub.io` CRD to e2e must-gather artifact
collection and the auth debug report.

## Description
The Tenant CRD was recently introduced but was missing from the e2e
debugging and artifact-collection utilities in `auth_utils.sh`.

- Add `tenants.maas.opendatahub.io` to the `MAAS_CRDS` array so
`collect_maas_crs()` dumps Tenant CR YAML to `tenants.yaml`.
- Add `kubectl get tenants` to `collect_cluster_state()` alongside other
MaaS CRs.
- Add Tenant listing and status/condition detail to
`run_auth_debug_report()` under the MaaS CRs section.
- Update header comment to document the new `tenants.yaml` artifact.

## How it was tested
- Verified script syntax with `bash -n`.
- Confirmed the Tenant CRD name matches
`deployment/base/maas-controller/crd/bases/maas.opendatahub.io_tenants.yaml`.
- Confirmed namespace usage aligns with
`TenantReconciler.TenantNamespace` (sourced from
`--maas-subscription-namespace`).

Made with [Cursor](https://cursor.com)

Signed-off-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Chaitanya Kulkarni <[email protected]>
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 22, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: github-actions[bot]
Once this PR has been reviewed and has the lgtm label, please assign yossiovadia for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 22, 2026

Hi @github-actions[bot]. Thanks for your PR.

I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@chaitanya1731 chaitanya1731 merged commit fb75981 into stable Apr 22, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants