Commit f4078fb

Merge branch 'main' into jr_55467

Resolved conflicts by keeping jr_55467's security improvements:

- has() safety checks in CEL expressions
- userId-based cache keys (collision-resistant)
- Conditional apiKeyValidation with when clause
- string() function for proper JSON group serialization

2 parents: 1a321f1 + 9dc081d

20 files changed: +354 additions, -230 deletions

docs/content/configuration-and-management/group-membership-known-issues.md

Lines changed: 0 additions & 170 deletions
This file was deleted.

docs/content/configuration-and-management/maas-controller-overview.md

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,9 @@
  This document describes the **MaaS Controller**: what was built, how it fits into the Models-as-a-Service (MaaS) stack, and how the pieces work together. It is intended for presentations, onboarding, and technical deep-dives.

+ !!! todo "Documentation cleanup"
+     TODO: Clean up this documentation.
+
  ---

  ## 1. What Is the MaaS Controller?

docs/content/configuration-and-management/maas-models.md

Lines changed: 5 additions & 1 deletion
@@ -2,7 +2,11 @@
  MaaS uses **MaaSModelRef** to identify model servers that live on the cluster. Each MaaSModelRef is a reference to a model server—it holds the information MaaS needs to perform authentication, authorization, and rate limiting.

- By using a single unified object (MaaSModelRef) for all model types, MaaS can handle different kinds of model servers—each with its own backend and lifecycle—through one consistent interface. The controller uses a **provider paradigm** to distinguish between types: each model type (for example, LLMInferenceService, external APIs) has a provider that knows how to reconcile and resolve that type. Today, vLLM (via LLMInferenceService) is the supported provider; additional providers may be added in the future.
+ By using a single unified object (MaaSModelRef) for all model types, MaaS can handle different kinds of model servers—each with its own backend and lifecycle—through one consistent interface. The controller uses a **provider paradigm** to distinguish between types: each model type (for example, LLMInferenceService, external APIs) has a provider that knows how to reconcile and resolve that type.
+
+ **Supported LLMs:** Most model families should work; an official validated list is in progress.
+
+ **Supported inference services:** vLLM through LLMInferenceService (KServe) is the initial supported release for on-cluster models; additional backends are planned for future releases.

  ## The Model Reference

docs/content/configuration-and-management/model-setup.md

Lines changed: 16 additions & 2 deletions
@@ -7,9 +7,23 @@ This guide explains how to configure models so they appear in the MaaS platform
  ## Supported model types

- MaaS is planning support for multiple model types through a **provider paradigm**: each MaaSModelRef references a model backend by `kind` (e.g., `LLMInferenceService`, `ExternalModel`). The controller uses provider-specific logic to reconcile and resolve each type.
+ MaaS distinguishes between **supported LLMs** (the model weights/architectures) and **supported inference services** (the runtime backends).

- **LLMInferenceService** will be initially supported. The initial release focuses on using KServe for on-cluster models. This guide describes the configuration differences between the default LLMInferenceService and the MaaS-enabled one to help users understand the differences.
+ ### Supported LLMs
+
+ Most LLM model families should work (e.g., Llama, Mistral, Qwen, GPT-style models). We are working on an official validated list. If you encounter issues with a specific model, please report them.
+
+ ### Supported inference services
+
+ MaaS uses a **provider paradigm**: each MaaSModelRef references a model backend by `kind` (e.g., `LLMInferenceService`, `ExternalModel`). The controller uses provider-specific logic to reconcile and resolve each type. Supported inference runtimes include:
+
+ | Inference service | Status |
+ |-------------------|--------|
+ | **vLLM** (via LLMInferenceService / KServe) | Initial supported release; the primary supported backend for on-cluster models. |
+ | **KServe** (LLMInferenceService) | Runtime framework; vLLM workloads run through LLMInferenceService. |
+ | **Additional backends** | Planned for future releases. |
+
+ This guide describes the configuration differences between the default LLMInferenceService and the MaaS-enabled one.
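To make the provider paradigm concrete, here is an illustrative sketch of a MaaSModelRef pointing at an LLMInferenceService. This is a hypothetical example: the `apiVersion` and the field names under `spec` are assumptions for illustration, not the documented CRD schema. Consult the sample manifests under `docs/samples/` for the authoritative structure.

```yaml
# Hypothetical sketch: apiVersion and spec field names are assumed,
# not taken from the actual CRD. See the repo's sample manifests
# (docs/samples/maas-system/free/maas/maas-model.yaml) for the real schema.
apiVersion: maas.opendatahub.io/v1alpha1   # assumed API group/version
kind: MaaSModelRef
metadata:
  name: facebook-opt-125m-simulated
  namespace: llm
spec:
  modelRef:
    kind: LLMInferenceService              # provider is selected by kind
    name: facebook-opt-125m-simulated
# The controller's provider for this kind resolves the backend and
# populates status.endpoint and status.phase once the workload is ready.
```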

  ## How the model list is built

docs/content/install/maas-setup.md

Lines changed: 28 additions & 1 deletion
@@ -32,7 +32,17 @@ postgresql://USERNAME:PASSWORD@HOSTNAME:PORT/DATABASE?sslmode=require
  ./scripts/setup-database.sh
  ```

- Use `NAMESPACE=redhat-ods-applications` for RHOAI. The full `scripts/deploy.sh` script also creates PostgreSQL automatically when deploying MaaS.
+ **Setting the namespace:** The script defaults to `opendatahub`. Set the `NAMESPACE` environment variable if your MaaS deployment uses a different namespace:
+
+ ```bash
+ # RHOAI uses redhat-ods-applications
+ NAMESPACE=redhat-ods-applications ./scripts/setup-database.sh
+
+ # Custom namespace
+ NAMESPACE=my-maas-namespace ./scripts/setup-database.sh
+ ```
+
+ The full `scripts/deploy.sh` script also creates PostgreSQL automatically when deploying MaaS.

  !!! note "Restarting maas-api"
      If you add or update the Secret after the DataScienceCluster already has modelsAsService in managed state, restart the maas-api deployment to pick up the config:
@@ -54,6 +64,20 @@ The Gateway must exist before enabling modelsAsService in your DataScienceCluste
  ./scripts/setup-authorino-tls.sh
  ```

+ **Setting the namespace:** The script defaults to `kuadrant-system` (ODH with Kuadrant). Set `AUTHORINO_NAMESPACE` for RHOAI, which uses RHCL:
+
+ ```bash
+ AUTHORINO_NAMESPACE=rh-connectivity-link ./scripts/setup-authorino-tls.sh
+ ```
+
+ !!! note "Required annotations"
+     The Gateway **must** include these annotations for MaaS to work correctly:
+
+     | Annotation | Purpose |
+     |------------|---------|
+     | `opendatahub.io/managed: "false"` | Read by **maas-controller**: allows it to manage AuthPolicies and related resources, and prevents the ODH Model Controller from overwriting them. |
+     | `security.opendatahub.io/authorino-tls-bootstrap: "true"` | Used by the ODH platform (not maas-controller) to create the EnvoyFilter for Gateway → Authorino TLS when Authorino uses a TLS listener. Required when Authorino TLS is enabled (see [TLS Configuration](../configuration-and-management/tls-configuration.md)). |

  ```yaml
  CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
  # Use default ingress cert for HTTPS, or set CERT_NAME to your TLS secret name
@@ -66,6 +90,9 @@ kind: Gateway
  metadata:
    name: maas-default-gateway
    namespace: openshift-ingress
+   annotations:
+     opendatahub.io/managed: "false"
+     security.opendatahub.io/authorino-tls-bootstrap: "true"
  spec:
    gatewayClassName: openshift-default
    listeners:

docs/content/install/model-setup.md

Lines changed: 70 additions & 1 deletion
@@ -14,12 +14,81 @@ Our sample models are packaged as Kustomize overlays that deploy:
  For more detail on each resource, see [Access and Quota Overview](../configuration-and-management/subscription-overview.md).

  !!! tip "Create llm namespace (optional)"
-     Models deploy to the `llm` namespace. If it does not exist, create it first (idempotent—safe to run even if it already exists):
+     Our example models deploy to the `llm` namespace. If it does not exist, create it before deploying the samples below (idempotent—safe to run even if it already exists):

      ```bash
      kubectl create namespace llm --dry-run=client -o yaml | kubectl apply -f -
      ```

+ ## Understanding the Deployment Flow
+
+ Deploying a model through MaaS follows a specific order. Each resource depends on the previous one. The following walkthrough deploys the **simulator model** step by step so you can see what each resource does.
+
+ Set the project root (run from the repository root):
+
+ ```bash
+ PROJECT_DIR=$(git rev-parse --show-toplevel)
+ ```
+
+ ### Step 1: Deploy the LLMInferenceService (Model)
+
+ The LLMInferenceService is the actual inference workload. It must exist first and use the `maas-default-gateway` gateway reference so traffic flows through MaaS for authentication and rate limiting.
+
+ ```bash
+ kustomize build ${PROJECT_DIR}/docs/samples/maas-system/free/llm/ | kubectl apply -f -
+ ```
+
+ This deploys the simulator workload (a lightweight mock that generates responses without a real LLM). The resource is named `facebook-opt-125m-simulated` in the `llm` namespace. Verify it is ready:
+
+ ```bash
+ kubectl get llminferenceservice -n llm
+ kubectl get pods -n llm
+ ```
+
+ ### Step 2: Deploy the MaaSModelRef
+
+ The MaaSModelRef registers the model with MaaS so it appears in the catalog and the `/v1/models` API. It references the LLMInferenceService by name. The maas-controller watches MaaSModelRefs and populates `status.endpoint` and `status.phase` from the underlying LLMInferenceService.
+
+ ```bash
+ kubectl apply -f ${PROJECT_DIR}/docs/samples/maas-system/free/maas/maas-model.yaml
+ ```
+
+ After a short moment, the controller reconciles. Verify that status is populated:
+
+ ```bash
+ kubectl get maasmodelref -n llm facebook-opt-125m-simulated -o jsonpath='{.status.phase}' && echo
+ kubectl get maasmodelref -n llm facebook-opt-125m-simulated -o jsonpath='{.status.endpoint}' && echo
+ ```
+
+ **Expected output:** `status.phase` should be `Ready` and `status.endpoint` should be a non-empty URL. If either is missing, wait briefly and retry—the controller may still be reconciling (see [Verify Model Deployment](#verify-model-deployment) below).
+
+ ### Step 3: Deploy the MaaSSubscription
+
+ The MaaSSubscription defines token rate limits (quotas) for groups. It references the MaaSModelRef by name and namespace. This controls how many tokens each group can consume per model.
+
+ Create the `models-as-a-service` namespace if it does not exist, then apply:
+
+ ```bash
+ kubectl create namespace models-as-a-service --dry-run=client -o yaml | kubectl apply -f -
+ kubectl apply -f ${PROJECT_DIR}/docs/samples/maas-system/free/maas/maas-subscription.yaml
+ ```
+
+ This sample grants `system:authenticated` (all authenticated users) a limit of 100 tokens per minute for the simulator model.
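As an aside, here is a minimal sketch of what such a subscription manifest might contain. It is hypothetical: the `apiVersion` and the field names (`modelRef`, `groups`, `tokensPerMinute`) are assumptions, not the documented schema; the sample file applied above is the authoritative version.

```yaml
# Hypothetical sketch: field names below are assumed, not taken from the
# actual CRD schema. See docs/samples/maas-system/free/maas/maas-subscription.yaml
# for the real structure.
apiVersion: maas.opendatahub.io/v1alpha1   # assumed API group/version
kind: MaaSSubscription
metadata:
  name: free-tier
  namespace: models-as-a-service
spec:
  modelRef:                                # references the MaaSModelRef
    name: facebook-opt-125m-simulated
    namespace: llm
  groups:
    - name: system:authenticated           # all authenticated users
      tokensPerMinute: 100                 # quota from the sample above
```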
+ ### Step 4: Deploy the MaaSAuthPolicy
+
+ The MaaSAuthPolicy defines who can access the model. It references the MaaSModelRef by name and namespace. Without this, requests to the model are denied even if the user has a subscription.
+
+ ```bash
+ kubectl apply -f ${PROJECT_DIR}/docs/samples/maas-system/free/maas/maas-auth-policy.yaml
+ ```
+
+ This sample grants access to `system:authenticated`. The maas-controller creates per-model AuthPolicies and TokenRateLimitPolicies that enforce this.
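For orientation, a sketch of what such an access policy might look like, under the same caveat as before: the `apiVersion` and `spec` field names are assumptions for illustration only, and the applied sample file is the authoritative version.

```yaml
# Hypothetical sketch: apiVersion and spec field names are assumed.
# See docs/samples/maas-system/free/maas/maas-auth-policy.yaml for the
# real schema.
apiVersion: maas.opendatahub.io/v1alpha1   # assumed API group/version
kind: MaaSAuthPolicy
metadata:
  name: free-access
  namespace: models-as-a-service
spec:
  modelRef:                                # the MaaSModelRef this policy guards
    name: facebook-opt-125m-simulated
    namespace: llm
  allowedGroups:
    - system:authenticated                 # all authenticated users
```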
+ ---
+
+ You have now deployed the full simulator stack manually. The sections below deploy all required objects (Model, ModelRef, Subscription, AuthPolicy) together using a single Kustomize command for each sample.

  ## Deploy Sample Models

  ### Simulator Model (CPU)

docs/mkdocs.yml

Lines changed: 5 additions & 0 deletions
@@ -75,9 +75,14 @@ nav:
    - Quota and Access Configuration: configuration-and-management/quota-and-access-configuration.md
    - Token Management: configuration-and-management/token-management.md
    - TLS Configuration: configuration-and-management/tls-configuration.md
+   - Subscription Known Issues: configuration-and-management/subscription-known-issues.md
    - Models:
      - Model Setup (On Cluster): configuration-and-management/model-setup.md
      - Model Listing Flow: configuration-and-management/model-listing-flow.md
+     - Model Access Behavior: configuration-and-management/model-access-behavior.md
+     - MaaS Model Kinds: configuration-and-management/maas-model-kinds.md
+   - MaaS Controller:
+     - Controller Overview: configuration-and-management/maas-controller-overview.md
    - Advanced Administration:
      - Observability: advanced-administration/observability.md
      - Limitador Persistence: advanced-administration/limitador-persistence.md
