Resolved conflicts by keeping jr_55467's security improvements:
- has() safety checks in CEL expressions
- userId-based cache keys (collision-resistant)
- Conditional apiKeyValidation with when clause
- string() function for proper JSON group serialization
docs/content/configuration-and-management/maas-controller-overview.md (3 additions, 0 deletions)
This document describes the **MaaS Controller**: what was built, how it fits into the Models-as-a-Service (MaaS) stack, and how the pieces work together. It is intended for presentations, onboarding, and technical deep-dives.
docs/content/configuration-and-management/maas-models.md (5 additions, 1 deletion)
MaaS uses **MaaSModelRef** to identify model servers that live on the cluster. Each MaaSModelRef is a reference to a model server—it holds the information MaaS needs to perform authentication, authorization, and rate limiting.
By using a single unified object (MaaSModelRef) for all model types, MaaS can handle different kinds of model servers—each with its own backend and lifecycle—through one consistent interface. The controller uses a **provider paradigm** to distinguish between types: each model type (for example, LLMInferenceService, external APIs) has a provider that knows how to reconcile and resolve that type.
**Supported LLMs:** Most model families should work; an official validated list is in progress.
**Supported inference services:** vLLM through LLMInferenceService (KServe) is the initial supported release for on-cluster models; additional backends are planned for future releases.
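To make the provider paradigm concrete, here is an illustrative sketch: two MaaSModelRefs exposing different backend kinds through the same interface. The `apiVersion` and spec field names here are assumptions for illustration, not the authoritative CRD schema.

```yaml
# Hypothetical sketch; apiVersion and field names are assumptions,
# not the authoritative MaaSModelRef schema.
apiVersion: maas.opendatahub.io/v1alpha1
kind: MaaSModelRef
metadata:
  name: my-vllm-model
  namespace: llm
spec:
  modelRef:
    kind: LLMInferenceService   # the initial supported provider (vLLM)
    name: my-vllm-model
---
apiVersion: maas.opendatahub.io/v1alpha1
kind: MaaSModelRef
metadata:
  name: my-external-model
  namespace: llm
spec:
  modelRef:
    kind: ExternalModel         # hypothetical future provider kind
    name: my-external-model
```

The referenced backend's `kind` is what selects the provider-specific reconcile and resolve logic.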
docs/content/configuration-and-management/model-setup.md (16 additions, 2 deletions)

This guide explains how to configure models so they appear in the MaaS platform.
## Supported model types
MaaS distinguishes between **supported LLMs** (the model weights/architectures) and **supported inference services** (the runtime backends).
### Supported LLMs
Most LLM model families should work (e.g., Llama, Mistral, Qwen, GPT-style models). We are working on an official validated list. If you encounter issues with a specific model, please report them.
### Supported inference services
MaaS uses a **provider paradigm**: each MaaSModelRef references a model backend by `kind` (e.g., `LLMInferenceService`, `ExternalModel`). The controller uses provider-specific logic to reconcile and resolve each type. Supported inference runtimes include:
| Inference service | Status |
|-------------------|--------|
| **vLLM** (via LLMInferenceService / KServe) | Initial supported release. This is the primary supported backend for on-cluster models. |
| **KServe** (LLMInferenceService) | Runtime framework. vLLM workloads run through LLMInferenceService. |
| **Additional backends** | Planned for future releases. |
This guide describes the configuration differences between the default LLMInferenceService and the MaaS-enabled one.
**Setting the namespace:** The script defaults to `opendatahub`. Set the `NAMESPACE` environment variable if your MaaS deployment uses a different namespace; use `NAMESPACE=redhat-ods-applications` for RHOAI. The full `scripts/deploy.sh` script also creates PostgreSQL automatically when deploying MaaS.
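For example, a sketch of the RHOAI invocation (assuming you run from the repository root):

```bash
# Deploy MaaS into the RHOAI applications namespace instead of the
# default `opendatahub` namespace.
NAMESPACE=redhat-ods-applications ./scripts/deploy.sh
```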
!!! note "Restarting maas-api"
If you add or update the Secret after the DataScienceCluster already has modelsAsService in managed state, restart the maas-api deployment to pick up the config:
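A rollout restart is one way to do this (sketch; this assumes the deployment is named `maas-api` and runs in the `opendatahub` namespace, so adjust `-n` to your MaaS namespace):

```bash
# Restart maas-api so it re-reads the updated Secret, then wait
# for the new pods to become ready.
kubectl rollout restart deployment/maas-api -n opendatahub
kubectl rollout status deployment/maas-api -n opendatahub
```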
The Gateway must exist before enabling modelsAsService in your DataScienceCluster.

```bash
./scripts/setup-authorino-tls.sh
```
**Setting the namespace:** The script defaults to `kuadrant-system` (ODH with Kuadrant). Set `AUTHORINO_NAMESPACE` for RHOAI, which uses RHCL:
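For example (invocation sketch; the namespace placeholder is hypothetical, so substitute the namespace your RHCL Authorino instance actually runs in):

```bash
# RHOAI with RHCL: override the default kuadrant-system namespace.
AUTHORINO_NAMESPACE=<your-rhcl-namespace> ./scripts/setup-authorino-tls.sh
```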
The Gateway **must** include these annotations for MaaS to work correctly:
| Annotation | Purpose |
|------------|---------|
| `opendatahub.io/managed: "false"` | Read by **maas-controller**: allows it to manage AuthPolicies and related resources; prevents the ODH Model Controller from overwriting them. |
| `security.opendatahub.io/authorino-tls-bootstrap: "true"` | Used by the ODH platform (not maas-controller) to create the EnvoyFilter for Gateway → Authorino TLS when Authorino uses a TLS listener. Required when Authorino TLS is enabled (see [TLS Configuration](../configuration-and-management/tls-configuration.md)). |
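Applied to a Gateway manifest, the annotations above look like this. This is a sketch only: the Gateway name, namespace, and gateway class here are assumptions for illustration, not values confirmed by this guide.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: maas-default-gateway        # assumed name for illustration
  namespace: opendatahub            # adjust to your deployment
  annotations:
    # Let maas-controller manage AuthPolicies; keep the ODH Model
    # Controller from overwriting them.
    opendatahub.io/managed: "false"
    # Have the ODH platform create the EnvoyFilter for Gateway ->
    # Authorino TLS (required when Authorino TLS is enabled).
    security.opendatahub.io/authorino-tls-bootstrap: "true"
spec:
  gatewayClassName: openshift-default   # assumption; use your class
  # listeners omitted for brevity
```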
```bash
CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
# Use default ingress cert for HTTPS, or set CERT_NAME to your TLS secret name
```
docs/content/install/model-setup.md (70 additions, 1 deletion)

Our sample models are packaged as Kustomize overlays that deploy:
For more detail on each resource, see [Access and Quota Overview](../configuration-and-management/subscription-overview.md).
!!! tip "Create llm namespace (optional)"
Our example models deploy to the `llm` namespace. If it does not exist, create it before deploying the samples below (idempotent—safe to run even if it already exists):
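One idempotent way to do this with standard kubectl:

```bash
# Creates the namespace if missing; re-applying is a no-op.
kubectl create namespace llm --dry-run=client -o yaml | kubectl apply -f -
```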
Deploying a model through MaaS follows a specific order. Each resource depends on the previous one. The following walkthrough deploys the **simulator model** step by step so you can see what each resource does.
Set the project root (run from the repository root):
```bash
PROJECT_DIR=$(git rev-parse --show-toplevel)
```
### Step 1: Deploy the LLMInferenceService (Model)
The LLMInferenceService is the actual inference workload. It must exist first and use the `maas-default-gateway` gateway reference so traffic flows through MaaS for authentication and rate limiting.
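A minimal sketch of that gateway reference is shown below. Everything other than the resource name, namespace, and gateway name is an assumption; consult the KServe LLMInferenceService reference and the sample manifests for the real schema.

```yaml
# Hypothetical sketch, not the full sample manifest.
apiVersion: serving.kserve.io/v1alpha1
kind: LLMInferenceService
metadata:
  name: facebook-opt-125m-simulated
  namespace: llm
spec:
  router:
    gateway:
      refs:
        - name: maas-default-gateway     # traffic flows through MaaS
          namespace: openshift-ingress   # assumption; adjust as needed
```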
This deploys the simulator workload (a lightweight mock that generates responses without a real LLM). The resource is named `facebook-opt-125m-simulated` in the `llm` namespace. Verify it is ready:
```bash
kubectl get llminferenceservice -n llm
kubectl get pods -n llm
```
### Step 2: Deploy the MaaSModelRef
The MaaSModelRef registers the model with MaaS so it appears in the catalog and the `/v1/models` API. It references the LLMInferenceService by name. The maas-controller watches MaaSModelRefs and populates `status.endpoint` and `status.phase` from the underlying LLMInferenceService.
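Sketched against the walkthrough's names, such a registration might look like the following. The `apiVersion` and spec field names are assumptions, not the authoritative schema.

```yaml
apiVersion: maas.opendatahub.io/v1alpha1   # hypothetical
kind: MaaSModelRef
metadata:
  name: facebook-opt-125m-simulated
  namespace: llm
spec:
  modelRef:
    kind: LLMInferenceService            # provider selected by kind
    name: facebook-opt-125m-simulated    # the Step 1 workload
```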
After a short moment, the controller reconciles. Verify status is populated:
```bash
kubectl get maasmodelref -n llm facebook-opt-125m-simulated -o jsonpath='{.status.phase}' && echo
kubectl get maasmodelref -n llm facebook-opt-125m-simulated -o jsonpath='{.status.endpoint}' && echo
```
**Expected output:** `status.phase` should be `Ready` and `status.endpoint` should be a non-empty URL. If either is missing, wait briefly and retry—the controller may still be reconciling (see [Verify Model Deployment](#verify-model-deployment) below).
### Step 3: Deploy the MaaSSubscription
The MaaSSubscription defines token rate limits (quotas) for groups. It references the MaaSModelRef by name and namespace. This controls how many tokens each group can consume per model.
Create the `models-as-a-service` namespace if it does not exist, then apply:
This sample grants `system:authenticated` (all authenticated users) a limit of 100 tokens per minute for the simulator model.
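In spirit, the sample subscription looks something like this. The field names and resource name are illustrative assumptions; see the actual sample manifest for the real schema.

```yaml
apiVersion: maas.opendatahub.io/v1alpha1   # hypothetical
kind: MaaSSubscription
metadata:
  name: simulator-subscription             # illustrative name
  namespace: models-as-a-service
spec:
  modelRef:
    name: facebook-opt-125m-simulated
    namespace: llm
  limits:
    - group: system:authenticated   # all authenticated users
      tokens: 100
      window: 1m                    # 100 tokens per minute
```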
### Step 4: Deploy the MaaSAuthPolicy
The MaaSAuthPolicy defines who can access the model. It references the MaaSModelRef by name and namespace. Without this, requests to the model are denied even if the user has a subscription.
This sample grants access to `system:authenticated`. The maas-controller creates per-model AuthPolicies and TokenRateLimitPolicies that enforce this.
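A sketch of such a policy, with hypothetical field and resource names (consult the sample manifest for the real schema):

```yaml
apiVersion: maas.opendatahub.io/v1alpha1   # hypothetical
kind: MaaSAuthPolicy
metadata:
  name: simulator-auth                     # illustrative name
  namespace: models-as-a-service
spec:
  modelRef:
    name: facebook-opt-125m-simulated
    namespace: llm
  allowedGroups:
    - system:authenticated   # who may call the model
```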
---
You have now deployed the full simulator stack manually. The sections below deploy all required objects (Model, ModelRef, Subscription, AuthPolicy) together using a single Kustomize command for each sample.