Commit fb75981

chore: promote main to stable (#788)
Automated promotion of **9 commit(s)** from `main` to `stable`.

```
fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (#787)
1b8f212 chore: restrict rbac for db secret (#779)
e746008 docs: add/update documentation for Maas Tenant (#773)
147eaa2 fix: per-model(s) top-level values in usage dashboard (#772)
b327b34 feat: add OIDC token support for model discovery via /v1/models (#703)
dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (#752)
fae753e chore: add .worktrees/ to .gitignore (#774)
c01dc5b fix: minor updates for external model (#771)
65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (#765)
```

2 parents 89fba29 + fb2ea25 commit fb75981

22 files changed

Lines changed: 935 additions & 71 deletions

.github/hack/install-odh.sh

Lines changed: 3 additions & 1 deletion
@@ -208,7 +208,9 @@ EOF
   fi
 fi
 
-# 7. Apply DataScienceCluster (modelsAsService Unmanaged - MaaS deployed separately)
+# 7. Apply DataScienceCluster (KServe + ModelsAsService Managed)
+# The manifest filename retains "unmanaged" for backward compat; contents include
+# modelsAsService.managementState: Managed so the operator deploys maas-controller.
 echo "7. Applying DataScienceCluster..."
 if kubectl get datasciencecluster -A --no-headers 2>/dev/null | grep -q .; then
   echo " DataScienceCluster already exists, skipping"

deployment/base/maas-controller/crd/bases/maas.opendatahub.io_maassubscriptions.yaml

Lines changed: 4 additions & 1 deletion
@@ -83,8 +83,11 @@ spec:
 description: TokenRateLimit defines a token rate limit
 properties:
   limit:
-    description: Limit is the maximum number of tokens allowed
+    description: |-
+      Limit is the maximum number of tokens allowed within the window.
+      Must be between 1 and 1,000,000,000 (1 billion).
     format: int64
+    maximum: 1000000000
     minimum: 1
     type: integer
   window:
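
The schema change above bounds `limit` to the inclusive range 1 to 1,000,000,000. As a minimal sketch of the same check outside the API server, the range test and the "skip invalid subs" behaviour named in the commit message might look like the following. The helper names `is_valid_token_limit` and `filter_valid_limits`, and the mapping of subscription names to limits, are hypothetical and not taken from the controller code.

```python
MIN_TOKEN_LIMIT = 1
MAX_TOKEN_LIMIT = 1_000_000_000  # mirrors `maximum: 1000000000` in the CRD schema


def is_valid_token_limit(limit) -> bool:
    """Return True when `limit` is an integer inside the inclusive [1, 1e9] range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        return False
    return MIN_TOKEN_LIMIT <= limit <= MAX_TOKEN_LIMIT


def filter_valid_limits(subscriptions: dict) -> dict:
    """Keep only subscriptions whose limits all pass the range check.

    `subscriptions` maps a (hypothetical) subscription name to a list of limits.
    """
    return {
        name: limits
        for name, limits in subscriptions.items()
        if all(is_valid_token_limit(value) for value in limits)
    }


if __name__ == "__main__":
    sample = {"gold": [1000, 50000], "broken": [0], "huge": [1_000_000_001]}
    print(filter_valid_limits(sample))
```

Both boundary values (1 and 1,000,000,000) are accepted because the schema's `minimum` and `maximum` are inclusive.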

deployment/base/maas-controller/rbac/clusterrole.yaml

Lines changed: 15 additions & 1 deletion
@@ -21,7 +21,6 @@ rules:
   resources:
   - endpoints
   - pods
-  - secrets
   verbs:
   - get
   - list
@@ -35,6 +34,21 @@ rules:
   - get
   - list
   - watch
+- apiGroups:
+  - ""
+  resources:
+  - secrets
+  verbs:
+  - list
+  - watch
+- apiGroups:
+  - ""
+  resourceNames:
+  - maas-db-config
+  resources:
+  - secrets
+  verbs:
+  - get
 - apiGroups:
   - ""
   resources:
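
The two added rules split secret access: `list` and `watch` apply to any secret, while `get` is limited to the secret named `maas-db-config`. The sketch below is a simplified model of how those two rules combine, assuming a rule with no `resourceNames` matches every name. It covers only the two secret rules from this hunk, ignores `apiGroups` and wildcards, and the `Rule` class and `is_allowed` function are hypothetical illustrations rather than a Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    resources: tuple
    verbs: tuple
    resource_names: tuple = ()  # empty tuple: the rule is not restricted by name


# The two secret rules added in this commit, transcribed from the diff.
SECRET_RULES = (
    Rule(resources=("secrets",), verbs=("list", "watch")),
    Rule(resources=("secrets",), verbs=("get",), resource_names=("maas-db-config",)),
)


def is_allowed(verb: str, resource: str, name: str = "") -> bool:
    """Return True if any rule grants `verb` on `resource` (and `name`, if restricted)."""
    for rule in SECRET_RULES:
        if resource not in rule.resources or verb not in rule.verbs:
            continue
        if not rule.resource_names or name in rule.resource_names:
            return True
    return False


if __name__ == "__main__":
    print(is_allowed("get", "secrets", "maas-db-config"))
    print(is_allowed("get", "secrets", "some-other-secret"))
    print(is_allowed("list", "secrets"))
```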

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

Lines changed: 98 additions & 5 deletions
@@ -81,7 +81,29 @@ spec:
                   datasource:
                     kind: PrometheusDatasource
                     name: kuadrant-prometheus-datasource
-                  query: 'count(count by (user) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) > 0)) or vector(0)'
+                  query: |-
+                    count(
+                      count by (user) (
+                        (
+                          (
+                            sum by (user, subscription, limitador_namespace) (
+                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                            )
+                            +
+                            sum by (user, subscription, limitador_namespace) (
+                              increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                            )
+                          )
+                          or
+                          sum by (user, subscription, limitador_namespace) (
+                            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                          )
+                        )
+                        * on(user, subscription, limitador_namespace)
+                        (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                        > 0
+                      )
+                    ) or vector(0)
                   seriesNameFormat: Users
     successRate:
       kind: Panel
kind: Panel
@@ -105,7 +127,41 @@ spec:
                   datasource:
                     kind: PrometheusDatasource
                     name: kuadrant-prometheus-datasource
-                  query: '((sum(increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))) / ((sum(increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) + (sum(increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0))) > 0)) or vector(1)'
+                  query: |-
+                    (
+                      (
+                        sum(
+                          sum by (user, subscription, limitador_namespace) (
+                            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                          )
+                          * on(user, subscription, limitador_namespace)
+                          (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                        )
+                      )
+                      /
+                      (
+                        (
+                          sum(
+                            sum by (user, subscription, limitador_namespace) (
+                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                            )
+                            * on(user, subscription, limitador_namespace)
+                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                          )
+                          +
+                          (
+                            sum(
+                              sum by (user, subscription, limitador_namespace) (
+                                increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                              )
+                              * on(user, subscription, limitador_namespace)
+                              (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                            )
+                            or vector(0)
+                          )
+                        ) > 0
+                      )
+                    ) or vector(1)
                   seriesNameFormat: Success Rate
     tokenConsumptionByUser:
       kind: Panel
@@ -180,7 +236,15 @@ spec:
                   query: |-
                     round(
                       sum by (user, subscription, model) (
-                        sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+                        (
+                          (
+                            sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+                            +
+                            sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+                          )
+                          or
+                          sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+                        )
                         * on(user, subscription, limitador_namespace) group_left(model)
                         (0 * max by (user, subscription, limitador_namespace, model) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
                       )
@@ -228,7 +292,15 @@ spec:
                   datasource:
                     kind: PrometheusDatasource
                     name: kuadrant-prometheus-datasource
-                  query: 'sum(increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0)'
+                  query: |-
+                    sum(
+                      sum by (user, subscription, limitador_namespace) (
+                        increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                      )
+                      * on(user, subscription, limitador_namespace)
+                      (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                    )
+                    or vector(0)
                   seriesNameFormat: Errors
     totalRequests:
       kind: Panel
@@ -253,7 +325,28 @@ spec:
                   datasource:
                     kind: PrometheusDatasource
                     name: kuadrant-prometheus-datasource
-                  query: '(sum(increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0)) + (sum(increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0))'
+                  query: |-
+                    (
+                      sum(
+                        sum by (user, subscription, limitador_namespace) (
+                          increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                        )
+                        * on(user, subscription, limitador_namespace)
+                        (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                      )
+                      or vector(0)
+                    )
+                    +
+                    (
+                      sum(
+                        sum by (user, subscription, limitador_namespace) (
+                          increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                        )
+                        * on(user, subscription, limitador_namespace)
+                        (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                      )
+                      or vector(0)
+                    )
                   seriesNameFormat: Requests
     totalTokens:
       kind: Panel
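
The rewritten queries repeat one pattern: a per-`(user, subscription, limitador_namespace)` series is multiplied, via `* on(...)`, by `(0 * max by (...) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)`. The right-hand side evaluates to 1 for every label set that has `authorized_hits` samples for the selected models, and PromQL binary operators drop left-hand series with no match on the right, so the multiplication appears to act as a model filter that leaves the values unchanged. The sketch below imitates that one-to-one matching with plain dictionaries. The function name and sample data are hypothetical, and this is not how Prometheus evaluates queries internally.

```python
def filter_by_model(calls: dict, hits: dict) -> dict:
    """Imitate `calls * on(labels) (0 * hits + 1)` for one-to-one vector matching.

    calls: {(user, subscription, namespace): call count}
    hits:  {(user, subscription, namespace): hit value for the selected models}
    """
    result = {}
    for labels, value in calls.items():
        if labels not in hits:
            # No matching series on the right-hand side: the sample is dropped.
            continue
        # 0 * hits + 1 is always 1 for finite hits, so the value passes through.
        result[labels] = value * (0 * hits[labels] + 1)
    return result


if __name__ == "__main__":
    calls = {
        ("alice", "gold", "kuadrant"): 5.0,
        ("bob", "free", "kuadrant"): 3.0,
    }
    # Only alice has hits recorded for the (hypothetical) selected model.
    hits = {("alice", "gold", "kuadrant"): 120.0}
    print(filter_by_model(calls, hits))
```

With the sample data, only the `alice` series survives, and its value stays at 5.0.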

docs/content/advanced-administration/observability.md

Lines changed: 10 additions & 9 deletions
@@ -144,18 +144,19 @@ The observability stack consists of:
 
 There are two ways to enable deployment-based observability:
 
-1. **Operator-managed** (recommended): Enable via ModelsAsService CR
+1. **Operator-managed** (recommended): Enable via Tenant CR
 2. **Kustomize-based**: Deploy manifests directly
 
 ### Option 1: Operator-Managed Telemetry
 
-When using the ODH/RHOAI operator, telemetry can be enabled via the ModelsAsService CR:
+When using the ODH/RHOAI operator, telemetry can be enabled via the Tenant CR (self-bootstrapped by `maas-controller` in the `models-as-a-service` namespace):
 
 ```yaml
-apiVersion: components.platform.opendatahub.io/v1alpha1
-kind: ModelsAsService
+apiVersion: maas.opendatahub.io/v1alpha1
+kind: Tenant
 metadata:
-  name: default-modelsasservice
+  name: default-tenant
+  namespace: models-as-a-service
 spec:
   telemetry:
     enabled: true # Enable TelemetryPolicy and Istio Telemetry
@@ -169,25 +170,25 @@ spec:
 Or patch an existing CR:
 
 ```bash
-kubectl patch modelsasservice default-modelsasservice --type=merge \
+kubectl patch tenant default-tenant -n models-as-a-service --type=merge \
   -p '{"spec":{"telemetry":{"enabled":true}}}'
 ```
 
-**What the operator creates when `telemetry.enabled: true`:**
+**What the Tenant reconciler creates when `telemetry.enabled: true`:**
 
 | Resource | Namespace | Purpose |
 |----------|-----------|---------|
 | TelemetryPolicy (`maas-telemetry`) | Gateway namespace | Adds `user`, `subscription`, `model` labels to Limitador usage metrics |
 | Istio Telemetry (`latency-per-subscription`) | Gateway namespace | Adds `subscription` label to gateway latency metrics |
 
 !!! note "Prerequisites for Operator-Managed Telemetry"
-    The operator-managed telemetry feature requires:
+    The Tenant reconciler telemetry feature requires:
 
     - **OpenShift Service Mesh (Istio)** 2.4+ — for Istio Telemetry CRD
     - **Kuadrant/RHCL** — for TelemetryPolicy CRD and AuthPolicy header injection
     - **Gateway deployed** — Telemetry targets the gateway via selector
 
-    The operator checks for CRD availability before creating resources. If a CRD is not present, that resource is silently skipped.
+    The Tenant reconciler checks for CRD availability before creating resources. If a CRD is not present, that resource is silently skipped.
 
 !!! warning "AuthPolicy Header Dependency"
     The Istio Telemetry reads the `subscription` value from the `X-MaaS-Subscription` header, which must be injected by AuthPolicy:

docs/content/concepts/architecture.md

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ graph TB
 6. Only the hash and metadata (username, groups, name, `subscription` — the MaaSSubscription name bound at mint, `expiresAt`) are stored in PostgreSQL.
 7. The plaintext key is returned to the user **only in this minting response** (show-once), along with `expiresAt`; it is **not** exposed again on later reads. The diagram below stops at storage and does not show the HTTP response back to the user.
 
-Every key expires. With **operator-managed** MaaS, the cluster operator sets the maximum lifetime on the **`ModelsAsService`** CR: **`spec.apiKeys.maxExpirationDays`** (see [ModelsAsService CR](../install/maas-setup.md#modelsasservice-cr)). **`maas-api`** applies that cap as **`API_KEY_MAX_EXPIRATION_DAYS`** (for example 90 days by default when defaults apply). Omit **`expiresIn`** on create to use that maximum, or set a shorter **`expiresIn`** (e.g., `30d`, `90d`, `1h`) within the configured cap. The response always includes **`expiresAt`** (RFC3339).
+Every key expires. With **operator-managed** MaaS, the cluster operator sets the maximum lifetime on the **`Tenant`** CR: **`spec.apiKeys.maxExpirationDays`** (see [Tenant CR](../install/maas-setup.md#tenant-cr)). **`maas-api`** applies that cap as **`API_KEY_MAX_EXPIRATION_DAYS`** (for example 90 days by default when defaults apply). Omit **`expiresIn`** on create to use that maximum, or set a shorter **`expiresIn`** (e.g., `30d`, `90d`, `1h`) within the configured cap. The response always includes **`expiresAt`** (RFC3339).
 
 ```mermaid
 graph TB
