Commit 7361ec8

Yuriy Teodorovych authored and committed
Merge branch 'main' into yt-sad-admin-cache
2 parents 03a8e73 + f27c5a8 commit 7361ec8

10 files changed

Lines changed: 258 additions & 54 deletions

.github/hack/install-odh.sh

Lines changed: 3 additions & 1 deletion
@@ -208,7 +208,9 @@ EOF
   fi
 fi
 
-# 7. Apply DataScienceCluster (modelsAsService Unmanaged - MaaS deployed separately)
+# 7. Apply DataScienceCluster (KServe + ModelsAsService Managed)
+# The manifest filename retains "unmanaged" for backward compat; contents include
+# modelsAsService.managementState: Managed so the operator deploys maas-controller.
 echo "7. Applying DataScienceCluster..."
 if kubectl get datasciencecluster -A --no-headers 2>/dev/null | grep -q .; then
   echo " DataScienceCluster already exists, skipping"

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

Lines changed: 98 additions & 5 deletions
@@ -81,7 +81,29 @@ spec:
 datasource:
   kind: PrometheusDatasource
   name: kuadrant-prometheus-datasource
-query: 'count(count by (user) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) > 0)) or vector(0)'
+query: |-
+  count(
+    count by (user) (
+      (
+        (
+          sum by (user, subscription, limitador_namespace) (
+            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+          )
+          +
+          sum by (user, subscription, limitador_namespace) (
+            increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+          )
+        )
+        or
+        sum by (user, subscription, limitador_namespace) (
+          increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+        )
+      )
+      * on(user, subscription, limitador_namespace)
+      (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+      > 0
+    )
+  ) or vector(0)
 seriesNameFormat: Users
 successRate:
   kind: Panel
@@ -105,7 +127,41 @@ spec:
 datasource:
   kind: PrometheusDatasource
   name: kuadrant-prometheus-datasource
-query: '((sum(increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))) / ((sum(increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) + (sum(increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0))) > 0)) or vector(1)'
+query: |-
+  (
+    (
+      sum(
+        sum by (user, subscription, limitador_namespace) (
+          increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+        )
+        * on(user, subscription, limitador_namespace)
+        (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+      )
+    )
+    /
+    (
+      (
+        sum(
+          sum by (user, subscription, limitador_namespace) (
+            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+          )
+          * on(user, subscription, limitador_namespace)
+          (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+        )
+        +
+        (
+          sum(
+            sum by (user, subscription, limitador_namespace) (
+              increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+            )
+            * on(user, subscription, limitador_namespace)
+            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+          )
+          or vector(0)
+        )
+      ) > 0
+    )
+  ) or vector(1)
 seriesNameFormat: Success Rate
 tokenConsumptionByUser:
   kind: Panel
@@ -180,7 +236,15 @@ spec:
 query: |-
   round(
     sum by (user, subscription, model) (
-      sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+      (
+        (
+          sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+          +
+          sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+        )
+        or
+        sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+      )
       * on(user, subscription, limitador_namespace) group_left(model)
       (0 * max by (user, subscription, limitador_namespace, model) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
     )
@@ -228,7 +292,15 @@ spec:
 datasource:
   kind: PrometheusDatasource
   name: kuadrant-prometheus-datasource
-query: 'sum(increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0)'
+query: |-
+  sum(
+    sum by (user, subscription, limitador_namespace) (
+      increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+    )
+    * on(user, subscription, limitador_namespace)
+    (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+  )
+  or vector(0)
 seriesNameFormat: Errors
 totalRequests:
   kind: Panel
@@ -253,7 +325,28 @@ spec:
 datasource:
   kind: PrometheusDatasource
   name: kuadrant-prometheus-datasource
-query: '(sum(increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0)) + (sum(increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) or vector(0))'
+query: |-
+  (
+    sum(
+      sum by (user, subscription, limitador_namespace) (
+        increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+      )
+      * on(user, subscription, limitador_namespace)
+      (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+    )
+    or vector(0)
+  )
+  +
+  (
+    sum(
+      sum by (user, subscription, limitador_namespace) (
+        increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+      )
+      * on(user, subscription, limitador_namespace)
+      (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+    )
+    or vector(0)
+  )
 seriesNameFormat: Requests
 totalTokens:
   kind: Panel
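
Reviewer sketch (not part of the commit): the rewritten panel queries repeat two PromQL idioms. `(authorized + limited) or authorized` adds the two counters where both exist and falls back to the authorized count alone otherwise. Multiplying by `0 * max by (...) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1` keeps only the series that have a matching `authorized_hits` series for the selected model, without changing their values. The Python below mimics both on plain dictionaries keyed by `(user, subscription, limitador_namespace)`. The function names and sample values are hypothetical, and the model covers one-to-one vector matching only (no staleness, NaN handling, or `group_left`).

```python
from typing import Dict, Tuple

# Hypothetical key type: (user, subscription, limitador_namespace)
Key = Tuple[str, str, str]


def add_or_fallback(authorized: Dict[Key, float],
                    limited: Dict[Key, float]) -> Dict[Key, float]:
    """Mimic `(authorized + limited) or authorized`.

    `+` only yields a result for keys present on both sides;
    `or` then adds the authorized-only keys that `+` dropped.
    Keys that appear only in `limited` never show up.
    """
    combined = {k: authorized[k] + limited[k] for k in authorized if k in limited}
    for key, value in authorized.items():
        combined.setdefault(key, value)
    return combined


def keep_if_model_seen(counts: Dict[Key, float],
                       hits: Dict[Key, float]) -> Dict[Key, float]:
    """Mimic `counts * on(keys) (0 * max by (keys) (hits) + 1)`.

    The right-hand side evaluates to 1 for every key that has hits,
    so the product keeps the count unchanged and drops unmatched keys.
    """
    return {k: v * (0 * hits[k] + 1) for k, v in counts.items() if k in hits}


# Sample data (made up for illustration).
authorized = {("alice", "gold", "ns"): 5.0, ("bob", "free", "ns"): 2.0}
limited = {("alice", "gold", "ns"): 1.0, ("carol", "free", "ns"): 3.0}
hits_for_selected_model = {("alice", "gold", "ns"): 42.0}

total_calls = add_or_fallback(authorized, limited)
filtered_calls = keep_if_model_seen(total_calls, hits_for_selected_model)
```

In this model a user who was only ever rate limited (`carol`) is absent from the combined result, which appears to match what the `or` fallback in the queries does.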

docs/content/advanced-administration/observability.md

Lines changed: 10 additions & 9 deletions
@@ -144,18 +144,19 @@ The observability stack consists of:
 
 There are two ways to enable deployment-based observability:
 
-1. **Operator-managed** (recommended): Enable via ModelsAsService CR
+1. **Operator-managed** (recommended): Enable via Tenant CR
 2. **Kustomize-based**: Deploy manifests directly
 
 ### Option 1: Operator-Managed Telemetry
 
-When using the ODH/RHOAI operator, telemetry can be enabled via the ModelsAsService CR:
+When using the ODH/RHOAI operator, telemetry can be enabled via the Tenant CR (self-bootstrapped by `maas-controller` in the `models-as-a-service` namespace):
 
 ```yaml
-apiVersion: components.platform.opendatahub.io/v1alpha1
-kind: ModelsAsService
+apiVersion: maas.opendatahub.io/v1alpha1
+kind: Tenant
 metadata:
-  name: default-modelsasservice
+  name: default-tenant
+  namespace: models-as-a-service
 spec:
   telemetry:
     enabled: true # Enable TelemetryPolicy and Istio Telemetry
@@ -169,25 +170,25 @@ spec:
 Or patch an existing CR:
 
 ```bash
-kubectl patch modelsasservice default-modelsasservice --type=merge \
+kubectl patch tenant default-tenant -n models-as-a-service --type=merge \
   -p '{"spec":{"telemetry":{"enabled":true}}}'
 ```
 
-**What the operator creates when `telemetry.enabled: true`:**
+**What the Tenant reconciler creates when `telemetry.enabled: true`:**
 
 | Resource | Namespace | Purpose |
 |----------|-----------|---------|
 | TelemetryPolicy (`maas-telemetry`) | Gateway namespace | Adds `user`, `subscription`, `model` labels to Limitador usage metrics |
 | Istio Telemetry (`latency-per-subscription`) | Gateway namespace | Adds `subscription` label to gateway latency metrics |
 
 !!! note "Prerequisites for Operator-Managed Telemetry"
-    The operator-managed telemetry feature requires:
+    The Tenant reconciler telemetry feature requires:
 
     - **OpenShift Service Mesh (Istio)** 2.4+ — for Istio Telemetry CRD
     - **Kuadrant/RHCL** — for TelemetryPolicy CRD and AuthPolicy header injection
     - **Gateway deployed** — Telemetry targets the gateway via selector
 
-    The operator checks for CRD availability before creating resources. If a CRD is not present, that resource is silently skipped.
+    The Tenant reconciler checks for CRD availability before creating resources. If a CRD is not present, that resource is silently skipped.
 
 !!! warning "AuthPolicy Header Dependency"
     The Istio Telemetry reads the `subscription` value from the `X-MaaS-Subscription` header, which must be injected by AuthPolicy:

docs/content/concepts/architecture.md

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ graph TB
 6. Only the hash and metadata (username, groups, name, `subscription` — the MaaSSubscription name bound at mint, `expiresAt`) are stored in PostgreSQL.
 7. The plaintext key is returned to the user **only in this minting response** (show-once), along with `expiresAt`; it is **not** exposed again on later reads. The diagram below stops at storage and does not show the HTTP response back to the user.
 
-Every key expires. With **operator-managed** MaaS, the cluster operator sets the maximum lifetime on the **`ModelsAsService`** CR: **`spec.apiKeys.maxExpirationDays`** (see [ModelsAsService CR](../install/maas-setup.md#modelsasservice-cr)). **`maas-api`** applies that cap as **`API_KEY_MAX_EXPIRATION_DAYS`** (for example 90 days by default when defaults apply). Omit **`expiresIn`** on create to use that maximum, or set a shorter **`expiresIn`** (e.g., `30d`, `90d`, `1h`) within the configured cap. The response always includes **`expiresAt`** (RFC3339).
+Every key expires. With **operator-managed** MaaS, the cluster operator sets the maximum lifetime on the **`Tenant`** CR: **`spec.apiKeys.maxExpirationDays`** (see [Tenant CR](../install/maas-setup.md#tenant-cr)). **`maas-api`** applies that cap as **`API_KEY_MAX_EXPIRATION_DAYS`** (for example 90 days by default when defaults apply). Omit **`expiresIn`** on create to use that maximum, or set a shorter **`expiresIn`** (e.g., `30d`, `90d`, `1h`) within the configured cap. The response always includes **`expiresAt`** (RFC3339).
 
 ```mermaid
 graph TB
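
Reviewer sketch (not part of the commit): the expiry rule in the changed paragraph of `architecture.md` can be pictured roughly as follows. This is a minimal illustration, not the `maas-api` implementation. It assumes `expiresIn` uses only the `h` and `d` suffixes shown in the docs, and that a value above the cap is rejected (the paragraph does not say whether `maas-api` rejects or clamps). `parse_expires_in` and `resolve_expires_at` are hypothetical names.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: only the suffixes that appear in the docs (`1h`, `30d`, `90d`).
_UNITS = {"h": "hours", "d": "days"}


def parse_expires_in(value: str) -> timedelta:
    """Parse a duration such as '30d' or '1h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hd])", value)
    if match is None:
        raise ValueError(f"unsupported expiresIn value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def resolve_expires_at(now: datetime,
                       max_expiration_days: int,
                       expires_in: Optional[str] = None) -> str:
    """Return an RFC3339 expiresAt, using the cap when expiresIn is omitted."""
    cap = timedelta(days=max_expiration_days)
    lifetime = cap if expires_in is None else parse_expires_in(expires_in)
    if lifetime > cap:
        # Assumed behaviour: reject rather than silently clamp.
        raise ValueError("expiresIn exceeds the configured maximum")
    return (now + lifetime).isoformat().replace("+00:00", "Z")


# Example with a fixed clock and the 90-day default mentioned in the docs.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
default_expiry = resolve_expires_at(now, 90)        # expiresIn omitted
short_expiry = resolve_expires_at(now, 90, "1h")    # shorter than the cap
```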
