Skip to content

chore: promote main to stable#637

Open
github-actions[bot] wants to merge 19 commits intostablefrom
main
Open

chore: promote main to stable#637
github-actions[bot] wants to merge 19 commits intostablefrom
main

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot commented Mar 29, 2026

Automated promotion of 13 commit(s) from main to stable.

6b2d7e4 ci: extend Konflux docs-only skip to match Prow skip rules (#642)
291d215 docs: fix access probe response codes in model-listing-flow (#559)
4f06bcf feat: enable configurable Authorino caching for AuthPolicy evaluators (#621)
0b67ec8 feat: add E2E tests and documentation for ephemeral API key cleanup (#639)
babc125 ci: skip Konflux builds and e2e for docs-only PRs (#634)
de748ab fix: fixing bug with resources deletion (authPolicy and TRLP) after removing model ref (#611)
d160b6c feat: integrating OIDC Authentication with maas API AuthPolicy (#598)
ceb7b04 feat: add ephemeral API key cleanup endpoint and CronJob (#581)
5bf50ff chore: remove tier-based access system from maas (#548)
054be5e feat: add external model reconciler for Istio-based egress routing (#582)
a96c94a docs: remove stale Ephemeral Tokens section from maas-api README (#555)
9ae6624 chore: add elunin to OWNERS reviewers (#638)
661ac33 fix: add httproutes/finalizers RBAC for TRLP owner references (#635)

The problem: SetControllerReference sets blockOwnerDeletion: true by
default. Kubernetes' OwnerReferencesPermissionEnforcement admission
controller then requires the service account to have update permission
on httproutes/finalizers. But we did not update the ClusterRole RBAC to
add that permission. Before this commit, SetControllerReference was
never called, so the permission was never needed.

Fixing by adding httproutes/finalizers with the update verb to the
maas-controller ClusterRole RBAC

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
  * Updated RBAC to allow HTTPRoute finalizer updates.
* Added test-script RBAC setup (ClusterRole + RoleBinding) to ensure
admin checks succeed during token setup.

* **Tests**
* Improved end-to-end debug output to show policies and subscription
statuses.
* Added synchronization to wait for TokenRateLimitPolicy enforcement
before issuing requests.
* Replaced generic reconcile barriers with targeted readiness waits for
auth policies and subscriptions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Dmytro Zaharnytskyi <zdmytro@redhat.com>
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 29, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: github-actions[bot]
Once this PR has been reviewed and has the lgtm label, please assign dmytro-zaharnytskyi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 29, 2026

Hi @github-actions[bot]. Thanks for your PR.

I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

EgorLu and others added 12 commits March 29, 2026 13:28
## Summary
- Add `elunin` to the reviewers list in OWNERS

## Test plan
- [x] Verify only the reviewers section is modified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
  * Updated repository review configuration to add a new reviewer.

**Note:** This is an administrative change with no impact on user-facing
features or functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary

- Remove stale Ephemeral Tokens section documenting non-existent `POST
/v1/tokens` endpoint
- Update deploy-dev description to remove SA token provider reference

The `/v1/tokens` route is not registered in `maas-api/cmd/main.go`. The
same README already notes "ServiceAccount-based tokens have been
removed."

---
*Created by document-review workflow `/create-prs` phase.*

Signed-off-by: Mynhardt Burger <mynhardt@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
)

Adds a reconciler that watches `MaaSModelRef` CRs with
`kind=ExternalModel` and creates the Istio resources needed to route
traffic from the MaaS gateway to external inf providers.

An initial PR was opened here for more details and an initial review:
opendatahub-io/ai-gateway-payload-processing#17

I made some modifications to try and align with #571. Once it merges,
I'll change or open a new PR to have `specFromAnnotations()` to read
provider and endpoint from `spec.modelRef` fields instead of
annotations.

**Quick Summary** 

When a MaaSModelRef with kind=ExternalModel is created, the reconciler
creates 4 resources in the gateway namespace: ExternalName Service (DNS
bridge), ServiceEntry (mesh registration), DestinationRule (TLS
origination), and HTTPRoute (routing + Host header). Resources live
cross-namespace from the CR, so cleanup uses a finalizer with a stored
resource key annotation rather than OwnerReferences.

Configuration is read from annotations until the CRD patch merges. The
stored key (maas.opendatahub.io/last-resource-key) ensures cleanup works
even if annotations are changed or removed before deletion. On update,
old resources are cleaned up before new ones are created.

More details in the
[here](opendatahub-io/ai-gateway-payload-processing#17)
that @jland-redhat reviewed 🙏

Ty!


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added an External Model Reconciler to create and manage
cross-namespace routing to external AI providers (ExternalName service,
mesh registry entry, optional TLS rule, and HTTPRoute) with
finalizer-driven cleanup.

* **Documentation**
* Added guide with annotation-driven configuration, provider examples,
routing behavior, and cleanup semantics.

* **Tests**
* Added unit tests for resource construction, naming/sanitization, and
protocol/TLS behavior.

* **Chores**
* Updated module dependencies and extended controller RBAC to allow
managing external networking resources.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>
JIRA - https://redhat.atlassian.net/browse/RHOAIENG-53702

## Summary

Remove the tier-based access system from maas-billing, replaced by the
subscription model.

## Changes (summarized by AI)

- **Go code:** Delete entire `maas-api/internal/tier/` package (handler,
mapper, types, tests), remove `POST /v1/tiers/lookup` route from
`main.go`, remove `TierMappingConfigMap` constant, and clean up
`ClusterConfig` by removing `ConfigMapLister`, dead
`NamespaceLister`/`ServiceAccountLister`, and their core informer
factories.
- **Deployment:** Delete `tier-to-group-mapping` ConfigMap and its
kustomization. Remove dead RBAC rules (`configmaps`, `namespaces`,
`serviceaccounts`, `serviceaccounts/token`, `tokenreviews`) that were
only used by the tier/token system.
- **OpenAPI spec:** Remove `/v1/tiers/lookup` endpoint,
`TierLookupRequest`/`TierLookupResponse`/`TierErrorResponse` schemas,
and `tiers` tag.
- **Scripts & docs:** Update `deploy.sh` help text, fix `README.md` and
`tls-backend/README.md` tier references. Delete 4 obsolete tier docs,
rewrite `token-management.md`, `group-membership-known-issues.md`, and
`architecture.md` for the subscription model. Migration script and
migration docs are **kept** for clusters upgrading 3.3 to 3.4.
- **Test fixtures:** Delete `configmaps.go`, remove tier helpers from
`server_setup.go` and `fake_listers.go`, update `StubTokenProviderAPIs`
callers.

I verified it on a live cluster: custom image built, deployed, tier
endpoint returns 404, inference via API keys and per-route subscription
policies works end-to-end.

PR is NOT ready for review yet


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Architecture Changes**
* Moved to CRD-driven configuration and per-route policy enforcement for
authentication and subscriptions.

* **Removed Features**
* Removed the /v1/tiers/lookup endpoint and the tier-to-group
mapping/mapper and related test fixtures.

* **Security / RBAC**
* Adjusted RBAC: removed some token/configmap rules and added read
access for routing and observability resources.

* **Documentation**
* Updated architecture, management, token, TLS, examples, and READMEs to
reflect CRD-based flows and new internal validation endpoints.

* **Tests / Chores**
* Removed tier-related tests/helpers; added a deployment pre-step and
updated deploy help text.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Dmytro Zaharnytskyi <zdmytro@redhat.com>
<!--- Provide a general summary of your changes in the Title above -->

## Description
Add POST /internal/v1/api-keys/cleanup endpoint to delete expired
ephemeral keys from PostgreSQL, with a K8s CronJob running every 15
minutes. The DELETE query leverages the existing partial index
idx_api_keys_ephemeral_expired for efficient lookups.
<!--- Describe your changes in detail -->

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran
to -->
<!--- see how your change affects other areas of the code, etc. -->

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [ ] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Automated cleanup of expired ephemeral API keys running every 15
minutes.

* **Improvements**
* Ephemeral API keys now require an explicit expiration date to prevent
invalid configurations.

* **Chores**
  * Enhanced infrastructure security controls for cleanup operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Wen Liang <liangwen12year@gmail.com>
JIRA - https://redhat.atlassian.net/browse/RHOAIENG-47977

Adding optional external OIDC authentication to the maas-api AuthPolicy
so maas can accept Keycloak-issued JWTs in addition to the existing
flow. + sample of Keycloak installation.

Not ready for review (awaits changes to how we modify AuthPolicy, coming
change with a new CRD introduction)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* External OIDC support with optional Keycloak deploy/cleanup and deploy
CLI options; mint API keys using OIDC tokens and conditional runtime
auth-policy adjustments.

* **Documentation**
* Full Keycloak setup guide, test-realm examples, updated
token-management and deployment docs with OIDC/rollout guidance and
scripts usage.

* **Tests**
* New end-to-end tests for external OIDC: token minting, API-key
creation, model listing, and inference with minted keys.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Dmytro Zaharnytskyi <zdmytro@redhat.com>
Co-authored-by: Jim Rhyness <jrhyness@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
…emoving model ref (#611)

JIRA - https://redhat.atlassian.net/browse/RHOAIENG-55153

Both reconcilers only iterated over current spec.modelRefs. When a model
was removed from the spec, its generated AuthPolicy /
TokenRateLimitPolicy was never cleaned up - leaving stale authpolicy
and/or trlp on the cluster.

The fix adds a cleanup step after the main reconcile loop. It lists all
managed generated policies, checks their annotation to see if this CR
contributed, and deletes any whose model is no longer in spec.modelRefs.
The existing watch mechanism then triggers remaining CRs to rebuild the
aggregated policy. The same cleanup runs in the finalizer deletion path
as well.

Four unit tests cover single-policy removal and multi-policy aggregation
for both controllers. All existing tests pass. The fix was verified on a
live OC cluster with three scenarios: single AuthPolicy removal, single
TRLP removal, and two-policy aggregation rebuild. All passed.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Automatic cleanup of orphaned authentication policies and
token-rate-limit policies when model references are removed.

* **Changes**
* Authentication now uses API-key validation exclusively; Kubernetes
token path removed.
* Authorization rules converted to pattern-matching over API-key
validation.
* Response headers limited to API-key validation fields; subscription
header injection removed.

* **Tests**
* Added reconciliation tests covering removal and aggregation behavior
for auth and TRLP resources.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Dmytro Zaharnytskyi <zdmytro@redhat.com>
## Summary
- Add CEL path filter to `.tekton/` PipelineRun definitions so that PRs
touching only `docs/**` or `**/*.md` files skip the Konflux multi-arch
container builds and downstream e2e integration tests
- Document the full CI landscape (Konflux pipelines, integration tests,
GitHub Actions) in CONTRIBUTING.md

## Motivation
Previously, a docs-only PR triggered 8 container image builds (2
components x 4 architectures) and an e2e test suite that provisions an
ephemeral OpenShift cluster on AWS. None of these are meaningful for
documentation changes.

## Changes
- `.tekton/odh-maas-api-pull-request.yaml` — added `!files.all.all(x,
x.matches('^docs/') || x.matches('\.md$'))` to the CEL expression
- `.tekton/odh-maas-controller-pull-request.yaml` — same CEL filter
- `CONTRIBUTING.md` — added Konflux/Tekton pipelines section,
integration tests documentation, GitHub Actions summary table, and
docs-only skip explanation

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Adjusted PR pipeline triggers to skip Konflux/Tekton runs for
docs-only changes, reducing unnecessary CI executions.

* **Documentation**
* Revised contributing guide to describe the two-system CI model (GitHub
Actions and Konflux/Tekton), responsibilities, multi-architecture image
builds, PR vs main trigger behavior, and Konflux-driven integration
smoke tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…639)

<!--- Provide a general summary of your changes in the Title above -->

## Description

Add TestEphemeralKeyCleanup E2E test class covering:
- CronJob existence and configuration validation (schedule, security)
- NetworkPolicy existence and restriction validation
- Ephemeral key creation and search visibility (includeEphemeral filter)
- Cleanup trigger via oc exec preserves active ephemeral keys

Update documentation:
- token-management.md: ephemeral keys section, cleanup mechanics, grace
period, security model, troubleshooting commands
- maas-api-overview.md: internal endpoints table listing cleanup,
validate, and subscription select routes
<!--- Describe your changes in detail -->

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran
to -->
<!--- see how your change affects other areas of the code, etc. -->

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [ ] The commits are squashed in a cohesive manner and have meaningful
messages.
- [ ] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Added comprehensive documentation for ephemeral API keys with built-in
automatic cleanup capabilities
* Documented new subscription endpoints and internal cluster-only API
endpoints with security and operational details
* Included configuration guidance, operational behavior documentation,
and troubleshooting instructions for key management

* **Tests**
* Added end-to-end tests validating ephemeral key creation, search,
filtering, and automatic cleanup behavior

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Wen Liang <liangwen12year@gmail.com>
…#621)

## Description

Add Kuadrant AuthPolicy caching for apiKeyValidation metadata and all
OPA authorization rules (auth-valid, subscription-valid,
require-group-membership). Make cache TTLs configurable via controller
environment variables.

Changes:
- Add --metadata-cache-ttl and --authz-cache-ttl flags to
maas-controller
- Add cache blocks to apiKeyValidation metadata (HTTP call to
/api-keys/validate)
- Update subscription-info metadata cache to use configurable TTL
- Add cache blocks to auth-valid, subscription-valid, and
require-group-membership OPA authorization evaluators
- Add METADATA_CACHE_TTL and AUTHZ_CACHE_TTL environment variables to
deployment
- Add params.env configuration for ODH overlay deployments
- Add comprehensive tests for cache configuration and key isolation
- Add security tests verifying cache keys prevent cross-principal
leakage
- Document cache configuration and security properties

Cache Key Design:
- apiKeyValidation: keyed by API key value
- subscription-info: keyed by username|groups|subscription|model
- auth-valid: keyed by auth-type|identity|model
- subscription-valid: same as subscription-info (cache coherence)
- require-group-membership: keyed by username|groups|model

All cache keys include sufficient dimensions to prevent cross-user,
cross-API-key, and cross-model cache sharing.

Default TTL: 60 seconds for both metadata and authorization caches
Configurable via deployment/overlays/odh/params.env


## How Has This Been Tested?

Tests Added
   
1. TestMaaSAuthPolicyReconciler_CachingConfiguration (4 test cases)
  - Purpose: Verifies cache blocks are generated with configurable TTLs
- What it tests:
    - Default TTLs (60s for both metadata and authz)        
- Custom metadata TTL (300s)
- Custom authz TTL (30s)
- Both custom TTLs together
- Coverage:
- All 5 evaluators have cache blocks present
- TTL values match configuration
- Cache key selectors are non-empty
- Tests both metadata evaluators (apiKeyValidation, subscription-info)
- Tests all 3 authorization evaluators (auth-valid, subscription-valid,
require-group-membership)
2. TestMaaSAuthPolicyReconciler_CacheKeyIsolation (5 subtests)
- Purpose: Security-critical test verifying cache keys prevent
cross-principal leakage
- What it tests:
- apiKeyValidation: Includes API key from request headers
- subscription-info: Includes username, groups, and model namespace/name
- auth-valid: Distinguishes API keys vs K8s tokens, includes identity
and model
- subscription-valid: Matches subscription-info key exactly (cache
coherence)
- require-group-membership: Includes username, groups, and model
- Coverage:
- Verifies required components are present in cache keys (username,
groups, model, etc.)
- Confirms groups are joined for stable string representation
- Validates cache coherence between metadata and authorization
3. TestMaaSAuthPolicyReconciler_CacheKeyModelIsolation (4 subtests)
- Purpose: Verifies different models generate different cache keys
- What it tests:
- Creates two models (llm-1, llm-2) and reconciles both
- For each of 4 evaluators (subscription-info, auth-valid,
subscription-valid, require-group-membership):
- Cache keys differ between models
- Each model's cache key contains its own model name
- Coverage:
- Prevents cross-model cache sharing
- Confirms model namespace/name is part of cache key
Test Results
✅ All existing tests continue to pass (48 tests total)
✅ New cache tests: 13 subtests across 3 test functions
✅ 100% pass rate


## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [x] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Configurable cache TTLs for metadata and authorization via
METADATA_CACHE_TTL and AUTHZ_CACHE_TTL (defaults: 60s); authorization
TTL is capped to metadata TTL and logs a startup warning if exceeded.

* **Documentation**
* Added an Authorino caching guide with deployment examples, cache-key
composition, tuning/monitoring, and operational/security notes; README
updated with the new env vars.

* **Behavior Change**
* API key validation now returns the internal key ID as the user
identifier.

* **Tests**
* Added reconciliation and cache-key isolation tests covering TTL
behavior, capping, and selector composition.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
## Summary
- Remove incorrect `3xx` from the list of response codes that grant
access in the model-listing-flow documentation
- The source code (`discovery.go:256-300`) confirms that 3xx codes fall
into the `default` switch case which returns `authRetry` (not
`authGranted`). Only `2xx` and `405` result in `authGranted`

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Bug Fixes**
* Updated model listing behavior to refine access validation criteria.
The model listing endpoint now filters models more precisely based on
HTTP response codes from endpoint probes.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Extend the CEL expression in Tekton PipelineRun definitions to also skip
Konflux builds when only `.gitignore`, `OWNERS`, `PROJECT`, `LICENSE`,
.`github/`, or `.tekton/` files are changed — matching the existing
skip_if_only_changed regex used by the Prow ci-operator jobs.


An example of the existing prow ci-operator job skip condition:
```yaml
skip_if_only_changed: ^docs/|\.md$|^(?:.*/)?(?:\.gitignore|OWNERS|PROJECT|LICENSE|CONTRIBUTING\.md)$|^\.github/|^\.tekton/
```
-
https://github.com/openshift/release/blob/01591970ea5008bda7b05a49ce8a0905207182ee/ci-operator/jobs/opendatahub-io/models-as-a-service/opendatahub-io-models-as-a-service-main-presubmits.yaml#L21

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Chores**
* Updated pull request pipeline triggering conditions to exclude
additional configuration files and CI/CD directories from initiating
automated checks.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@chaitanya1731
Copy link
Copy Markdown
Collaborator

/retest

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 31, 2026

@github-actions[bot]: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/pr-image-mirror-maas-api 6b2d7e4 link false /test pr-image-mirror-maas-api
ci/prow/pr-image-mirror-maas-controller 6b2d7e4 link false /test pr-image-mirror-maas-controller
ci/prow/e2e-test-maas 6b2d7e4 link false /test e2e-test-maas

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

somya-bhatnagar and others added 6 commits March 31, 2026 12:12
)

## Summary

Related to - https://redhat.atlassian.net/browse/RHOAIENG-55976

This PR makes `tokenRateLimits` required on each `MaaSSubscription`
model reference and removes the unused `tokenRateLimitRef` field from
the CRD.

## Changes

### API Changes
- ✅ Removed `tokenRateLimitRef` field from `ModelSubscriptionRef`
- ✅ Made `tokenRateLimits` required with `MinItems=1` validation
- ✅ Regenerated CRD manifests and deepcopy code

### Tests
- ✅ Fixed 4 unit tests (added missing field indexes)
- ✅ All tests passing
- ✅ Live cluster validation complete

### Documentation
- ✅ Updated CRD reference docs
- ✅ Updated architecture overview
- ✅ Created migration guide:
`docs/content/migration/tokenratelimits-required-3.4.md`
- ✅ Added v3.4.0 release notes with breaking change notice

## Live Cluster Testing Results
(https://console-openshift-console.apps.ci-ln-hdzds3t-76ef8.aws-4.ci.openshift.org/)

Deployed custom controller and verified:
- ✅ Subscriptions **without** `tokenRateLimits` → **REJECTED** with
"Required value" error
- ✅ Subscriptions **with** `tokenRateLimits` → **ACCEPTED**
- ✅ Subscriptions with old `tokenRateLimitRef` → **REJECTED** with
"unknown field" error
- ✅ Controller reconciliation working correctly
- ✅ No RBAC errors

## Migration Impact

**Before (3.3 and earlier):**
```yaml
modelRefs:
  - name: my-model
    namespace: llm
    tokenRateLimits:  # Optional
      - limit: 1000
        window: 1m
    tokenRateLimitRef: "some-ref"  # Unused by controller
```

**After (3.4+):**
```yaml
modelRefs:
  - name: my-model
    namespace: llm
    tokenRateLimits:  # REQUIRED - at least one
      - limit: 1000
        window: 1m
    # tokenRateLimitRef removed
```

## Migration Guide

See detailed migration instructions in
`docs/content/migration/tokenratelimits-required-3.4.md`

Users must add inline `tokenRateLimits` to all `MaaSSubscription`
resources before upgrading to 3.4.

## Acceptance Criteria

All criteria from the user story met:

- [x] `tokenRateLimitRef` not present in API
- [x] Each `modelRefs` entry must include non-empty `tokenRateLimits`
- [x] Validation rejects non-compliant resources with clear messages
- [x] Tests cover required limits, no `tokenRateLimitRef` references
- [x] Generated CRD YAML matches types
- [x] Documentation updated
- [x] Migration guide provided
- [x] Breaking change documented in release notes

## Files Changed

- `maas-controller/api/maas/v1alpha1/maassubscription_types.go` - API
types
-
`deployment/base/maas-controller/crd/bases/maas.opendatahub.io_maassubscriptions.yaml`
- CRD manifest
- `maas-controller/api/maas/v1alpha1/zz_generated.deepcopy.go` -
Generated code
-
`maas-controller/pkg/controller/maas/maassubscription_controller_test.go`
- Test fixes
-
`docs/content/configuration-and-management/maas-controller-overview.md`
- Architecture docs
- `docs/content/reference/crds/maas-subscription.md` - CRD reference
- `docs/content/migration/tokenratelimits-required-3.4.md` - Migration
guide (new)
- `docs/content/release-notes/index.md` - Release notes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **Breaking Changes**
* Rate limit configuration for each model subscription is now mandatory
and must be specified inline rather than referenced externally
  * At least one rate limit entry is required per model
  * Rate limit values must be greater than zero

* **Documentation**
* Updated configuration and CRD documentation to reflect new rate limit
requirements
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
# Upgrade to Kuadrant 1.4.2
https://redhat.atlassian.net/browse/RHOAIENG-55749
https://redhat.atlassian.net/browse/RHOAIENG-54157
## Summary
Upgrades deployment to use Kuadrant 1.4.2 (from 1.3.1) to enable
authorization header stripping for security and migrates AuthPolicy
configurations to the new schema format required
  by Authorino v0.23.1.                                     
## Changes
   
### 1. Kuadrant Upgrade
- Update `scripts/deploy.sh` to install Kuadrant v1.4.2 catalog (was
v1.3.1)
- Brings Authorino v0.23.1 (was v0.22.0)
- **Required for authorization header stripping capability**
### 2. Authorization Header Stripping for Internal Models
**Security enhancement to prevent credential exfiltration**
Added authorization header stripping in
`maas-controller/pkg/controller/maas/maasauthpolicy_controller.go` for
internal KServe models.
**Problem:** User credentials (OpenShift tokens or API keys) were being
forwarded to model backends, creating a credential exfiltration risk
where malicious or compromised models
  could capture and misuse tokens.                          
**Solution:** Authorino now strips the Authorization header before
forwarding requests to internal model backends:
   
```go
  rule["response"] = map[string]any{                        
"success": map[string]any{
"headers": map[string]any{
// Strip Authorization header to prevent token exfiltration
"Authorization": map[string]any{
"plain": map[string]any{
"value": "",
},
"key": "authorization",
"metrics": false,
"priority": int64(0),
},
// ... other headers (username, groups, subscription)
Impact:
- User tokens and API keys are validated by Authorino but NOT forwarded
to model services
- Applies to both inference requests and model discovery probing
- Only affects internal KServe models (ExternalModel resources with
credentialRef are unaffected)
Version requirement verified: Testing confirmed that Kuadrant 1.4.1 /
Authorino v0.24.0 does not properly strip headers (appends empty value
but leaves original token present).
Kuadrant 1.4.2 / Authorino v0.23.1 correctly replaces the authorization
header.
3. AuthPolicy Schema Migration
                                                            
Migrate deployment/base/maas-api/policies/auth-policy.yaml from CEL
predicate format to selector/operator format:
                                                            
Before (v1.3.1 format):
  when:                                                     
- predicate: request.headers.authorization.startsWith("Bearer sk-oai-")
After (v1.4.2 format):
when:
- selector: request.headers.authorization
operator: matches
value: "^Bearer sk-oai-.*"
4. Fallback Header Handling
- Removed negative lookahead regex from fallback headers
(X-MaaS-Username-OC, X-MaaS-Group-OC)
- Authorino v0.23.1's matches operator doesn't support ^Bearer
(?!sk-oai-).* pattern
- Now rely on priority system instead:
- API key headers: priority: 0 (higher priority, evaluated first)
- Fallback headers: priority: 1 (lower priority, only used when API key
when clause doesn't match)
5. Subscription Header Fix
Fixed handling of multiple X-Maas-Subscription header values in
maas-api/internal/handlers/models.go:
                                                            
Problem: When clients send x-maas-subscription header (even empty
string), Authorino appends its injected value, resulting in:
  X-Maas-Subscription: ["", "simulator-subscription"]       
Gin's GetHeader() returns only the first value (empty string), causing
403 errors.
Solution: Iterate header values in reverse order and take the last
non-empty value:
headerValues := c.Request.Header.Values("X-Maas-Subscription")
for i := len(headerValues) - 1; i >= 0; i-- {
trimmed := strings.TrimSpace(headerValues[i])
if trimmed != "" {
requestedSubscription = trimmed
break
}
}
This ensures we use Authorino's validated subscription when available,
while still supporting clients that don't send the header.
6. Documentation Updates
Updated version requirements in README.md and
docs/content/install/prerequisites.md:
  - MaaS v0.2.0+ requires Kuadrant 1.4.2+ (ODH) or RHCL 1.3+ (RHOAI)
- Added detailed explanation of authorization header stripping security
requirement
                                                            
Testing
                                                            
All E2E tests passing (74 passed):
                                                            
Fixed by subscription header change:
- ✅ test_empty_subscription_header_value - correctly auto-selects
subscription when empty header sent
- ✅ test_api_key_ignores_subscription_header - correctly uses API key's
bound subscription, ignoring client header
                                                            
Verified working:
  - ✅ API key authentication with subscription binding     
- ✅ OpenShift token authentication
  - ✅ Subscription-scoped model filtering                  
- ✅ Rate limiting with TokenRateLimitPolicy
- ✅ Namespace-scoped resource access
- ✅ Cross-namespace model references
Authorization header stripping verified:
- ✅ Kuadrant 1.4.2 / Authorino v0.23.1: Header correctly stripped
(backend receives empty string)
- ❌ Kuadrant 1.4.1 / Authorino v0.24.0: Header NOT properly stripped
(original token still present)
Compatibility
                                                            
The new AuthPolicy schema format is backward compatible with Kuadrant
1.3.1. This was verified by:
  1. Deploying main branch code (predicate format) with Kuadrant 1.4.2 ✅
2. Deploying new schema format with Kuadrant 1.3.1 ✅
                                                            
This allows safe deployment of auth-policy changes before operator
upgrade.
Migration Notes
For clusters upgrading from Kuadrant 1.3.x to 1.4.x:
   
1. Apply updated AuthPolicy first (backward compatible)
  2. Upgrade Kuadrant operator to 1.4.2                     
3. Monitor Authorino reconciliation - should be seamless
4. Validate auth flows - API key and OpenShift token authentication
5. Verify header stripping - User tokens should not reach model backends
No service disruption expected during upgrade.
Related Issues
                                                            
- [RHOAIENG-55749](https://redhat.atlassian.net/browse/RHOAIENG-55749):
Validate Kuadrant/RHCL Compatibility and Upgrade
- [RHOAIENG-54157](https://redhat.atlassian.net/browse/RHOAIENG-54157):
Strip Authorization Header to Prevent Credential Exfiltration
Merge criteria:
- The commits are squashed in a cohesive manner and have meaningful
messages.
- Testing instructions have been added in the PR body (for PRs involving
changes that are not immediately obvious).
- The developer has manually tested the changes and verified that the
changes work
Summary by CodeRabbit
- Security
- Added authorization header stripping for internal models to prevent
credential exfiltration to model backends
- Bug Fixes
- Improved API key validation logic for more robust authentication
handling
- Enhanced subscription header processing to correctly handle multiple
header instances
- Chores
- Upgraded Kuadrant policy-engine dependency from v1.3.1 to v1.4.2
- Documentation
- Updated version requirements to specify Kuadrant 1.4.2+ / RHCL 1.3+
for MaaS v0.2.0+
- Added security context explanation for authorization header stripping
requirement

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Strengthened API-key checks using regex-based matching and adjusted
header precedence so API-key-derived values override fallback headers.
* Fixed subscription header handling to pick the last non-empty value
when multiple headers are present and tightened non-empty checks.
* Ensure upstream Authorization is stripped from responses to prevent
credential forwarding.

* **Documentation**
* Updated prerequisites and migration notes; clarified minimum versions.

* **Chores**
  * Bumped Kuadrant policy-engine reference to v1.4.2.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
## Description
Remove identity headers from model inference requests and simplify
subscription-info handling in controller-generated AuthPolicies
(defense-in-depth security improvements).
Changes:
- Remove X-MaaS-Username, X-MaaS-Group, and X-MaaS-Key-Id headers from
requests forwarded to model workloads
- Simplify AuthPolicy by passing complete subscription-info object
instead of extracting individual fields
- Fix CEL syntax error in subscription_info has() check that was
blocking AuthPolicy enforcement
- Identity data remains available to TokenRateLimitPolicy and telemetry
via filters.identity mechanism
**Security Motivation:**
Model workloads (vLLM, Llama.cpp, external APIs) do not require identity
headers. Removing them prevents accidental disclosure in:
  - Model runtime logs                                      
- Upstream debug dumps
- Misconfigured proxies or sidecars
- External provider logs (for ExternalModel backends)
**Scope:**
Only affects controller-generated AuthPolicies for model inference
routes. The static maas-api AuthPolicy is unchanged (maas-api's
ExtractUserInfo middleware still requires headers).
                                                            
**Subscription Info Changes:**
                                                            
Before (multiple extracted fields):
  ```yaml                                                   
filters:
identity:
- organizationId: auth.metadata["subscription-info"].organizationId
- costCenter: auth.metadata["subscription-info"].costCenter
- subscription_labels: auth.metadata["subscription-info"].labels
After (single object):
filters:
identity:
      - subscription_info: auth.metadata["subscription-info"]
Consumers update expressions from auth.identity.organizationId to
auth.identity.subscription_info.organizationId.
How Has This Been Tested?
Unit Tests:
                                                            
1. TestMaaSAuthPolicyReconciler_IdentityHeadersNotForwarded
- Verifies headers (X-MaaS-Username, X-MaaS-Group, X-MaaS-Key-Id) are
NOT in response.success.headers
- Confirms identity data IS available via filters.identity for
TRLP/telemetry
- Ensures defense-in-depth without breaking rate limiting or metrics
2. TestMaaSAuthPolicyReconciler_SubscriptionInfoObject
- Verifies subscription_info field returns complete object (not
extracted fields)
- Confirms organizationId, costCenter, and labels are accessible via
nested paths
- Tests has() check uses correct CEL syntax (subscription-info.name)
Integration Testing:
- ✅ All existing controller tests pass (no regressions)
  - ✅ CEL syntax fix unblocks AuthPolicy enforcement (Enforced=True)
- ✅ TelemetryPolicy updated to use nested subscription_info fields
Manual Testing:
Smoke tests were blocked by CEL syntax error; fixed in this PR. CI
pipeline will validate full e2e flows.
                                                            
Merge criteria:
                                                            
- The commits are squashed in a cohesive manner and have meaningful
messages.
- Testing instructions have been added in the PR body (for PRs involving
changes that are not immediately obvious).
- The developer has manually tested the changes and verified that the
changes work
  

For: 
https://redhat.atlassian.net/browse/RHOAIENG-55482
https://redhat.atlassian.net/browse/RHOAIENG-55480


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Identity headers are no longer forwarded to upstream model workloads;
gateway-only consumption enforced.
* Telemetry label sources corrected to consistently use subscription
info fields for metrics.

* **Documentation**
* Clarified identity propagation and defense-in-depth: what is exposed
to gateway telemetry and rate limiting, and what is not forwarded to
upstream workloads.

* **Tests**
* Added reconciliation test verifying no identity headers upstream and
required subscription telemetry.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
## Summary
- Add missing `TLS_SELF_SIGNED` and `TLS_MIN_VERSION` environment
variables to the TLS configuration docs
- Add default values column to the env var reference table
- Document precedence behavior (cert/key files take priority over
self-signed)

Resolves:
[RHOAIENG-55147](https://redhat.atlassian.net/browse/RHOAIENG-55147)

## Code references

Each documented claim maps to the implementation as follows:

| Documentation claim | Source |
|---|---|
| `TLS_CERT` env var, default `""` |
[`tls.go:70`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L70)
|
| `TLS_KEY` env var, default `""` |
[`tls.go:71`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L71)
|
| `TLS_SELF_SIGNED` env var, default `false` |
[`tls.go:68`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L68)
|
| `TLS_MIN_VERSION` env var, accepts `1.2` or `1.3`, default `1.2` |
[`tls.go:73`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L73)
(default),
[`tls.go:96`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L96)
(env read),
[`tls.go:32-42`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L32-L42)
(valid values) |
| Self-signed generates cert at startup (in-memory) |
[`server.go:44-48`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/cmd/server.go#L44-L48)
→
[`cert.go:29-68`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/cert/cert.go#L29-L68)
|
| Cert/key files take precedence over self-signed |
[`tls.go:92-94`](https://github.com/opendatahub-io/models-as-a-service/blob/main/maas-api/internal/config/tls.go#L92-L94)
|

## Test plan
- [x] Verify docs render correctly with `mkdocs serve`
- [x] Confirm env var descriptions match behavior in
`maas-api/internal/config/tls.go`

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Updated TLS configuration documentation with newly available
environment variables for self-signed certificate generation and minimum
TLS version settings.
* Clarified HTTPS activation behavior and configuration precedence rules
when multiple TLS options are specified.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Basically this pr just adds the ability to update with a new release
without updating the `latest` alias. Meaning we can cut docs for 3.4 but
still have 3.3 be our default for now.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a prerelease option to control release vs. prerelease status and
adjust docs deployment behavior.
* Added a manual workflow to promote an existing release tag and update
the docs "latest" alias.

* **Chores**
* Release workflow updated to respect prerelease semantics when
publishing releases and docs.
* Pinned documentation syntax-highlighting dependency to a safe version
range.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…rom maas-api (#656)

## Summary
Add a dedicated GitHub Actions CI workflow for maas-controller and
remove obsolete image build/push steps from the maas-api CI workflow.
Ref - https://redhat.atlassian.net/browse/RHOAIENG-52494
## Description
- Add `.github/workflows/maas-controller-ci.yml` with lint and test jobs
triggered on PRs to `maas-controller/**`.
- Go version and golangci-lint version are resolved dynamically from
`go.mod` and `tools.mk`.
- Coverage artifacts (`coverage.out`, `coverage.html`) are uploaded on
every run.
- Remove the `Build image` step and `push`-on-merge trigger from
`maas-api-ci.yml` — container builds are now handled by Konflux.
- Rename the maas-api `build` job to `test` to reflect its actual
purpose.
- Update maas-controller Makefile test target to produce coverage output
(`-race -coverprofile`) consistent with maas-api.
## How it was tested
- Ran `make lint` and `make test` locally for both maas-controller and
maas-api — all tests pass.
- Verified `coverage.out` and `coverage.html` are generated by the
updated maas-controller test target.

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets
all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->

- [x] The commits are squashed in a cohesive manner and have meaningful
messages.
- [x] Testing instructions have been added in the PR body (for PRs
involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that
the changes work


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **CI/CD**
* CI workflows updated: API workflow now runs tests only on pull
requests; added a controller workflow with linting, testing, and
coverage artifact uploads. Removed image build step and pinned action
versions for reproducible runs.
* **Tests**
* Test helpers made safer to avoid panics; test target now produces
coverage output and HTML.
* **Bug Fixes**
  * Improved error wrapping/handling in parsing paths.
* **Refactor**
* Consistent data-structure usage and minor formatting improvements for
maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Chaitanya Kulkarni <ckulkarn@redhat.com>
Signed-off-by: Chaitanya Kulkarni <chkulkar@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants