docs/content/configuration-and-management/maas-controller-overview.md
3 additions & 3 deletions
@@ -57,7 +57,7 @@ flowchart TB

 **Summary:** You declare intent with MaaS CRs; the controller turns that into Gateway/Kuadrant resources that attach to the same HTTPRoute and backend (e.g. KServe LLMInferenceService).

-The **MaaS API** GET /v1/models endpoint uses MaaSModelRef CRs as its primary source: it lists them in the API namespace, then **validates access** by probing each model’s `/v1/models` endpoint with the client’s **Authorization header** (passed through as-is). Only models that return 2xx or 405 are included. So the catalogue returned to the client is the set of MaaSModelRef objects the controller reconciles, filtered to those the client can actually access. No token exchange is performed; the header is forwarded as-is. (Once minting is in place, this may be revisited.)
+The **MaaS API** GET /v1/models endpoint uses MaaSModelRef CRs as its primary source: it reads them cluster-wide (all namespaces), then **validates access** by probing each model’s `/v1/models` endpoint with the client’s **Authorization header** (passed through as-is). Only models that return 2xx or 405 are included. So the catalogue returned to the client is the set of MaaSModelRef objects the controller reconciles, filtered to those the client can actually access. No token exchange is performed; the header is forwarded as-is.

 ---
@@ -213,7 +213,7 @@ flowchart LR
 Deploy --> Examples
 ```

-- **Namespace**: Controller and default MaaS CRs live in **opendatahub** (configurable).
+- **Namespaces**: MaaS API and controller default to **opendatahub** (configurable). MaaSAuthPolicy and MaaSSubscription default to **models-as-a-service** (configurable). MaaSModelRef must live in the **same namespace** as the model it references (e.g. **llm**).
 - **Install**: `./scripts/deploy.sh` installs the full stack including the controller. Optionally run `./scripts/install-examples.sh` for sample MaaSModelRef, MaaSAuthPolicy, and MaaSSubscription.

 ---
@@ -233,7 +233,7 @@ The Kuadrant AuthPolicy validates API keys via the MaaS API and validates user t
 | Topic | Summary |
 |-------|---------|
 | **What** | MaaS Controller = control plane that reconciles MaaSModelRef, MaaSAuthPolicy, and MaaSSubscription into Gateway API and Kuadrant resources. |
-| **Where** | Single controller in `maas-controller`; CRs and generated resources can live in opendatahub or other namespaces. |
+| **Where** | Single controller in `opendatahub`; MaaSAuthPolicy / MaaSSubscription default to `models-as-a-service`; MaaSModelRef and generated Kuadrant policies target their model’s namespace. |
 | **How** | Three reconcilers watch MaaS CRs (and related resources); each creates/updates HTTPRoutes, AuthPolicies, or TokenRateLimitPolicies. |
 | **Identity bridge** | AuthPolicy exposes all user groups as a comma-separated `groups_str`; TokenRateLimitPolicy uses `groups_str.split(",").exists(...)` for subscription matching (the “string trick”). |
 | **Deploy** | Run `./scripts/deploy.sh`; optionally install examples. |
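
The identity-bridge row can be illustrated with a small Python sketch. This is not the controller's code: the real check is a CEL expression whose predicate is elided in the table (`exists(...)`), so the plain membership test below is an assumption, as are the helper names `to_groups_str` and `matches_subscription`.

```python
from typing import Iterable, Set


def to_groups_str(groups: Iterable[str]) -> str:
    """Flatten a user's groups into one comma-separated string (the "string trick")."""
    return ",".join(groups)


def matches_subscription(groups_str: str, subscription_groups: Set[str]) -> bool:
    """Approximate groups_str.split(",").exists(g, <predicate>).

    Assumption: the elided predicate is membership in the subscription's groups.
    """
    return any(group in subscription_groups for group in groups_str.split(","))


groups_str = to_groups_str(["free-user", "developers"])
print(matches_subscription(groups_str, {"free-user"}))     # True
print(matches_subscription(groups_str, {"premium-user"}))  # False
```

One consequence of flattening to a string is that a group name containing a comma would be split into two entries, so this approach presumably relies on group names being comma-free.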
-- The API reads MaaSModelRefs from the informer cache, maps each to an API model (`id`, `url`, `ready`, `kind`, etc.)
+- The API reads MaaSModelRefs cluster-wide, maps each to an API model (`id`, `url`, `ready`, `kind`, etc.)
 - **Access validation**: Probes each model's `/v1/models` endpoint with the request's Authorization header. Only models that return 2xx or 405 are included.
 - **Kind on the wire**: Each model in the GET /v1/models response carries a `kind` field from `spec.modelRef.kind`
@@ -92,6 +92,6 @@ To support a new backend type (a new **kind** in `spec.modelRef`):
 ## Summary

 - **modelRef** is the backend reference (kind, name, optional namespace), analogous to [Gateway API BackendRef](https://gateway-api.sigs.k8s.io/reference/spec/#backendref).
-- **Listing:** Always from MaaSModelRef cache; no kind-specific listing logic.
+- **Listing:** Always from MaaSModelRef resources cluster-wide; no kind-specific listing logic.
 - **Access validation:** Same probe (GET endpoint with the request's Authorization header as-is) for all kinds unless kind-specific probes are added later.
 - **New kinds:** Implement in the controller (resolve referent, set status.endpoint and status.phase); extend the API only if the new kind cannot use the same probe path or needs different enrichment.
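
The access-validation rule (keep a model only when its probe returns 2xx or 405) can be sketched in Python as below. This is a minimal illustration, not the MaaS API's implementation: the `Probe` callable, the `fake_probe` stand-in, and the example URLs are hypothetical, and a real probe would issue an HTTP GET with the caller's Authorization header forwarded unchanged.

```python
from typing import Callable, Dict, List

# Hypothetical probe signature: (model_url, authorization_header) -> HTTP status code.
Probe = Callable[[str, str], int]


def is_accessible(status: int) -> bool:
    """A model is listed when the probe returns any 2xx status or 405."""
    return 200 <= status <= 299 or status == 405


def filter_accessible(
    models: List[Dict[str, str]], authorization: str, probe: Probe
) -> List[Dict[str, str]]:
    """Keep only the models whose endpoint accepts the caller's header as-is."""
    return [m for m in models if is_accessible(probe(m["url"], authorization))]


def fake_probe(url: str, authorization: str) -> int:
    """Stand-in for a real HTTP GET; returns canned status codes per URL."""
    canned = {
        "https://example.test/model-a": 200,
        "https://example.test/model-b": 401,
        "https://example.test/model-c": 405,
    }
    return canned[url]


models = [
    {"id": "model-a", "url": "https://example.test/model-a"},
    {"id": "model-b", "url": "https://example.test/model-b"},
    {"id": "model-c", "url": "https://example.test/model-c"},
]
visible = filter_accessible(models, "Bearer example-token", fake_probe)
print([m["id"] for m in visible])  # ['model-a', 'model-c']
```

Passing the probe in as a parameter keeps the filter independent of model kind, which matches the summary's point that all kinds share one probe unless kind-specific probes are added later.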
docs/content/configuration-and-management/model-listing-flow.md
6 additions & 7 deletions
@@ -2,7 +2,7 @@

 This document describes how the **GET /v1/models** endpoint discovers and returns the list of available models.

-The list is **based on MaaSModelRef** custom resources: the API returns models that are registered as MaaSModelRefs in its configured namespace.
+The list is **based on MaaSModelRef** custom resources: the API considers MaaSModelRef objects cluster-wide (all namespaces), then filters by access.

 ## Overview
@@ -17,9 +17,9 @@ Each entry includes an `id`, **`url`** (the model’s endpoint), a `ready` flag,

 ## MaaSModelRef flow

-When the [MaaS controller](https://github.com/opendatahub-io/models-as-a-service/tree/main/maas-controller) is installed and the API is configured with a MaaSModelRef lister and namespace, the flow is:
+When the [MaaS controller](https://github.com/opendatahub-io/models-as-a-service/tree/main/maas-controller) is installed and the API is configured with a MaaSModelRef lister, the flow is:

-1. The MaaS API lists all **MaaSModelRef** custom resources in its configured namespace (e.g. `opendatahub`). It reads them from an **in-memory cache** in the maas-api component (maintained by a Kubernetes informer), so it does not call the Kubernetes API on every request.
+1. The MaaS API discovers **MaaSModelRef** custom resources **cluster-wide** (all namespaces) without calling the Kubernetes API on every request.

 2. For each MaaSModelRef, it reads **id** (`metadata.name`), **url** (`status.endpoint`), **ready** (`status.phase == "Ready"`), and related metadata. The controller has populated `status.endpoint` and `status.phase` from the underlying LLMInferenceService (for llmisvc) or HTTPRoute/Gateway.
@@ -55,7 +55,7 @@ sequenceDiagram

 - **Consistent with gateway**: The same model names and routes are used for inference; the list matches what the gateway will accept for that client.

-If the API is not configured with a MaaSModelRef lister and namespace, or if listing fails (e.g. CRD not installed, no RBAC, or server error), the API returns an empty list or an error.
+If the API is not configured with a MaaSModelRef lister, or if listing fails (e.g. CRD not installed, no RBAC, or server error), the API returns an empty list or an error.

 ## Subscription Filtering and Aggregation
@@ -174,7 +174,7 @@ To have models appear via the **MaaSModelRef** flow:
 kind: MaaSModelRef
 metadata:
   name: my-model-name # This becomes the model "id" in GET /v1/models
-  namespace: opendatahub
+  namespace: llm # Same namespace as the LLMInferenceService
   annotations:
     openshift.io/display-name: "My Model" # optional: human-readable name
     openshift.io/description: "A general-purpose LLM" # optional: description
@@ -184,9 +184,8 @@ To have models appear via the **MaaSModelRef** flow:
   modelRef:
     kind: LLMInferenceService
     name: my-llm-isvc-name
-    namespace: llm

-4. The controller reconciles the MaaSModelRef and sets `status.endpoint` and `status.phase`. The MaaS API (in the same namespace) will then include this model in GET /v1/models when it lists MaaSModelRef CRs.
+4. The controller reconciles the MaaSModelRef and sets `status.endpoint` and `status.phase`. The MaaS API will then include this model in GET /v1/models when it lists MaaSModelRef CRs.

 You can use the [maas-system samples](https://github.com/opendatahub-io/models-as-a-service/tree/main/docs/samples/maas-system) as a template; the install script deploys LLMInferenceService + MaaSModelRef + MaaSAuthPolicy + MaaSSubscription together so dependencies resolve correctly.
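
The field mapping described in this file (`id` from `metadata.name`, `url` from `status.endpoint`, `ready` from `status.phase == "Ready"`) can be sketched in Python. This is an illustration under assumptions rather than the MaaS API's code: the function name `to_api_model`, the plain-dict representation of the custom resource, and the example endpoint URL are hypothetical, and `kind` is taken from `spec.modelRef.kind` as the overview document describes.

```python
from typing import Any, Dict


def to_api_model(ref: Dict[str, Any]) -> Dict[str, Any]:
    """Map a MaaSModelRef (as a plain dict) to a GET /v1/models entry."""
    status = ref.get("status", {})  # absent until the controller has reconciled
    return {
        "id": ref["metadata"]["name"],
        "url": status.get("endpoint", ""),
        "ready": status.get("phase") == "Ready",
        "kind": ref["spec"]["modelRef"]["kind"],
    }


reconciled = {
    "metadata": {"name": "my-model-name", "namespace": "llm"},
    "spec": {"modelRef": {"kind": "LLMInferenceService", "name": "my-llm-isvc-name"}},
    # Hypothetical endpoint; the real value is set by the controller.
    "status": {"endpoint": "https://gateway.example.test/llm/my-model-name", "phase": "Ready"},
}
pending = {
    "metadata": {"name": "new-model", "namespace": "llm"},
    "spec": {"modelRef": {"kind": "LLMInferenceService", "name": "new-isvc"}},
}

print(to_api_model(reconciled)["ready"])  # True
print(to_api_model(pending)["ready"])     # False
```

A MaaSModelRef that has not yet been reconciled has no status, so in this sketch it maps to an empty `url` and `ready` set to false rather than raising an error.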
docs/content/install/model-setup.md
3 additions & 3 deletions
@@ -91,11 +91,11 @@ kubectl get pods -n llm
 **Validate MaaSModelRef status**: The MaaS controller populates `status.endpoint` and `status.phase` on each MaaSModelRef from the LLMInferenceService. The MaaSModelRef `status.endpoint` should match the URL exposed by the LLMInferenceService (via the gateway). Verify:

 ```bash
-# Check MaaSModelRef status (use opendatahub for ODH, redhat-ods-applications for RHOAI)
-kubectl get maasmodelref -n opendatahub -o wide
+# Check MaaSModelRef status (same namespace as the LLMInferenceService, e.g. llm)
+kubectl get maasmodelref -n llm -o wide

 # Verify status.endpoint is populated and phase is Ready
-kubectl get maasmodelref -n opendatahub -o jsonpath='{range .items[*]}{.metadata.name}: phase={.status.phase} endpoint={.status.endpoint}{"\n"}{end}'
+kubectl get maasmodelref -n llm -o jsonpath='{range .items[*]}{.metadata.name}: phase={.status.phase} endpoint={.status.endpoint}{"\n"}{end}'

 # Compare with LLMInferenceService: status.endpoint should match the URL from LLMIS status.addresses or status.url
 kubectl get llminferenceservice -n llm -o yaml | grep "url:"
docs/content/reference/crds/maas-model-ref.md
1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # MaaSModelRef

-Identifies an AI/ML model on the cluster. The MaaS API lists models from MaaSModelRef resources (using `status.endpoint` and `status.phase`).
+Identifies an AI/ML model on the cluster. Create MaaSModelRef in the **same namespace** as the backend (`LLMInferenceService`, `ExternalModel`, etc.). The MaaS API lists models from MaaSModelRef resources cluster-wide (using `status.endpoint` and `status.phase`).
maas-controller/README.md
9 additions & 9 deletions
@@ -35,8 +35,8 @@ Models with no MaaSAuthPolicy or MaaSSubscription are denied at the gateway leve
 ### CRDs and what they generate

 As MaaS API and controller are conventionally deployed in the operator namespace (e.g., `opendatahub`), MaaS CRs need to be separated so that they can be managed with lower cluster privileges. Therefore,
-- **MaasModelRef** is located in the same namespace as the **HTTPRoute** and **LLMInderenceService** it refers to; and
-- **MaaSAuthPolicy** and **MaaSSubscription** are located in a dedicated subscription namespace (default: `models-as-a-service`). Set `--maas-subscription-namespace` or the `MAAS_SUBSCRIPTION_NAMESPACE` env var in `maas-controller` deployment to use another namespace. MaaS controller will only watch and reconcile those CRs this configured namespace.
+- **MaaSModelRef** is located in the same namespace as the **HTTPRoute** and **LLMInferenceService** it refers to; and
+- **MaaSAuthPolicy** and **MaaSSubscription** are located in a dedicated subscription namespace (default: `models-as-a-service`). Set `--maas-subscription-namespace` or the `MAAS_SUBSCRIPTION_NAMESPACE` env var in `maas-controller` deployment to use another namespace. MaaS controller will only watch and reconcile those CRs in this configured namespace.

 | You create | Controller generates | Per | Targets |
-# MaaSAuthPolicy in opendatahub namespace references model in llm namespace
+# MaaSAuthPolicy in models-as-a-service namespace references model in llm namespace
 apiVersion: maas.opendatahub.io/v1alpha1
 kind: MaaSAuthPolicy
 metadata:
   name: my-policy
-  namespace: opendatahub
+  namespace: models-as-a-service
 spec:
   modelRefs:
     - name: my-model
@@ -81,7 +81,7 @@ spec:
 - name: my-group
 ```

-The controller creates a Kuadrant **AuthPolicy** in the `llm` namespace (where the model and HTTPRoute exist), not in `opendatahub` (where the MaaSAuthPolicy lives).
+The controller creates a Kuadrant **AuthPolicy** in the `llm` namespace (where the model and HTTPRoute exist), not in `models-as-a-service` (where the MaaSAuthPolicy lives).

 **Same model name, different namespaces:**
@@ -99,7 +99,7 @@ spec:

 This creates two separate AuthPolicies: one in `team-a`, one in `team-b`.

-**Model list API:** When the MaaS controller is installed, the MaaS API **GET /v1/models** endpoint lists models by reading **MaaSModelRef** CRs (in the API's namespace). Each MaaSModelRef's `metadata.name` becomes the model `id`, and `status.endpoint` / `status.phase` supply the URL and readiness. So the set of MaaSModelRef objects is the source of truth for "which models are available" in MaaS. See [docs/content/configuration-and-management/model-listing-flow.md](../docs/content/configuration-and-management/model-listing-flow.md) in the repo for the full flow.
+**Model list API:** When the MaaS controller is installed, the MaaS API **GET /v1/models** endpoint lists models by reading **MaaSModelRef** CRs cluster-wide (all namespaces). Each MaaSModelRef's `metadata.name` becomes the model `id`, and `status.endpoint` / `status.phase` supply the URL and readiness. So the set of MaaSModelRef objects is the source of truth for "which models are available" in MaaS. See [docs/content/configuration-and-management/model-listing-flow.md](../docs/content/configuration-and-management/model-listing-flow.md) in the repo for the full flow.
- `MaaSAuthPolicy/premium-simulator-access` (group: `premium-user`) and `MaaSSubscription/premium-simulator-subscription` (1000 tokens/min) in `models-as-a-service`

 Replace `free-user` and `premium-user` in the example CRs with groups from your identity provider.
@@ -285,7 +285,7 @@ Then verify:

 ```bash
 # Check CRs
-kubectl get maasmodelref -n opendatahub
+kubectl get maasmodelref -n llm
 kubectl get maasauthpolicy,maassubscription -n models-as-a-service