Commit 5d9bd72

ryancham715, jrhyness, and claude authored
docs: remove reliance on subscription header in inference and models endpoint (#614)
## Merge criteria:

- [ ] The commits are squashed in a cohesive manner and have meaningful messages.
- [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that the changes work

## Summary by CodeRabbit

* **Documentation**
  * Clarified API key authentication flow: keys now bind to a specific subscription at creation time
  * Updated guidance on when to use the `X-MaaS-Subscription` header: required only for user tokens with multiple subscriptions, not for API key-based inference
  * Revised model listing and inference examples to reflect the new subscription binding behavior for API keys
  * Updated authentication flow diagrams and troubleshooting guidance to reflect current behavior

---------

Co-authored-by: Jim Rhyness <jrhyness@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
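
The header rule summarized above can be sketched as a small shell helper. This is only an illustration of the documented behavior, not code from the repository: the function name `maas_list_headers` is hypothetical, and the `sk-oai-` prefix check is a client-side convenience that assumes API keys always use the `sk-oai-*` form shown in the docs.

```shell
# Hypothetical helper: print the request headers for GET /v1/models.
# API keys (sk-oai-*) already carry their subscription, so only the
# Authorization header is emitted. For user tokens, an optional second
# argument adds X-MaaS-Subscription to filter to one subscription.
maas_list_headers() {
  local credential="$1" subscription="${2:-}"
  printf '%s\n' "Authorization: Bearer ${credential}"
  case "$credential" in
    sk-oai-*) ;;  # subscription was bound to the key at mint time
    *)
      if [ -n "$subscription" ]; then
        printf '%s\n' "X-MaaS-Subscription: ${subscription}"
      fi
      ;;
  esac
}
```

Each printed line can be passed to `curl` with `-H`. A subscription argument given alongside an API key is ignored here, which mirrors the guidance that clients do not send the header with API keys.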
1 parent 95b149d commit 5d9bd72

11 files changed: +68 -53 lines changed
docs/content/architecture.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -214,7 +214,7 @@ The MaaSAuthPolicy delegates to the MaaS API for key validation and subscription
214214
2. MaaS API validates the key (format, not revoked, not expired) and returns username, groups, and subscription.
215215
3. Authorino calls MaaS API to check subscription (groups, username, requested subscription from the key).
216216
4. If the user lacks access to the requested subscription → error (403).
217-
5. On success, returns selected subscription; Authorino caches the result (e.g., 60s TTL). AuthPolicy may inject `X-MaaS-Subscription` for downstream rate limiting.
217+
5. On success, returns selected subscription; Authorino caches the result (e.g., 60s TTL). AuthPolicy may inject `X-MaaS-Subscription` **server-side** for downstream rate limiting and metrics. Clients do not send this header on inference; subscription comes from the API key record created at mint time.
218218

219219
```mermaid
220220
graph TB

docs/content/configuration-and-management/maas-controller-overview.md

Lines changed: 3 additions & 6 deletions
@@ -220,14 +220,11 @@ flowchart LR
220220

221221
## 9. Authentication (Current Behavior)
222222

223-
For **GET /v1/models**, the API forwards the client’s **Authorization** header as-is to each model endpoint (no token exchange). For inference, until MaaS API token minting is in place, use the **OpenShift token**:
223+
For **GET /v1/models**, the maas-api forwards the client’s **Authorization** header as-is to each model endpoint (no token exchange). You can use an **OpenShift token** or an **API key** (`sk-oai-*`). With a user token, you may send `X-MaaS-Subscription` to filter when you have access to multiple subscriptions.
224224

225-
```bash
226-
export TOKEN=$(oc whoami -t)
227-
curl -H "Authorization: Bearer $TOKEN" "https://<gateway-host>/llm/<model-name>/v1/chat/completions" -d '...'
228-
```
225+
For **model inference** (requests to `…/llm/<model>/v1/chat/completions` and similar), use an **API key** created via `POST /v1/api-keys` only. Each key is bound to one MaaSSubscription at mint time.
229226

230-
The Kuadrant AuthPolicy validates this token via **Kubernetes TokenReview** and derives user/groups for authorization and for the identity passed to TokenRateLimitPolicy (including `groups_str`).
227+
The Kuadrant AuthPolicy validates API keys via the MaaS API and validates user tokens via `Kubernetes TokenReview`, deriving user/groups for authorization and for TokenRateLimitPolicy (including `groups_str`).
231228

232229
---
233230
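
The inference rule above (API key only) could be checked on the client before a request is sent. The sketch below is an assumption-laden convenience, not part of MaaS: `maas_require_api_key` is a hypothetical name, and it only tests the `sk-oai-` prefix. Real validation (format, revocation, expiry) happens server-side in the MaaS API.

```shell
# Hypothetical pre-flight check: inference endpoints accept API keys only.
# Returns 0 when the credential looks like an API key (sk-oai-*),
# otherwise prints a hint to stderr and returns 1.
maas_require_api_key() {
  case "$1" in
    sk-oai-?*) return 0 ;;
    *)
      echo "inference requires an API key (sk-oai-*); mint one via POST /v1/api-keys" >&2
      return 1
      ;;
  esac
}
```

A script might call `maas_require_api_key "$API_KEY" || exit 1` before issuing the chat/completions request, so that an OpenShift token is rejected locally with a clear message.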

docs/content/configuration-and-management/model-listing-flow.md

Lines changed: 0 additions & 2 deletions
@@ -64,8 +64,6 @@ The `/v1/models` endpoint automatically filters models based on your authenticat
6464
#### API Key Authentication (Bearer sk-oai-*)
6565
When using an API key, the subscription is automatically determined from the key:
6666
- Returns **only** models from the subscription bound to the API key at mint time
67-
- The `X-MaaS-Subscription` header is automatically injected by the gateway
68-
- **No manual headers required**
6967

7068
```bash
7169
# API key bound to "premium-subscription"

docs/content/configuration-and-management/quota-and-access-configuration.md

Lines changed: 0 additions & 6 deletions
@@ -222,12 +222,6 @@ When a user belongs to multiple groups that each have a subscription, the access
222222

223223
## Troubleshooting
224224

225-
### 403 Forbidden: "must specify X-MaaS-Subscription"
226-
227-
**Cause:** User has multiple subscriptions and did not send the header.
228-
229-
**Fix:** Add `X-MaaS-Subscription: <subscription-name>` to the request.
230-
231225
### 403 Forbidden: "no access to subscription"
232226

233227
**Cause:** User requested a subscription they do not belong to (group membership).

docs/content/configuration-and-management/subscription-overview.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ MaaSAuthPolicy and MaaSSubscription are namespace-scoped to `models-as-a-service
66

77
```mermaid
88
flowchart TD
9-
User([User / App]) -- "Request (Model + SubID)" --> Gateway{MaaS API Gateway}
9+
User([User / App]) -- "Request (API key + model)" --> Gateway{MaaS API Gateway}
1010
1111
subgraph Validation ["Dual-Check Gate"]
1212
direction LR

docs/content/configuration-and-management/token-management.md

Lines changed: 7 additions & 4 deletions
@@ -107,7 +107,7 @@ sequenceDiagram
107107
DB-->>MaaS: username, groups, subscription, status
108108
MaaS->>MaaS: Check status (active/revoked/expired)
109109
MaaS-->>AuthPolicy: 4. valid: true, userId, groups, subscription
110-
AuthPolicy->>AuthPolicy: Subscription check, inject headers (incl. X-MaaS-Subscription), rate limits
110+
AuthPolicy->>AuthPolicy: Subscription check, inject headers, rate limits
111111
AuthPolicy->>Model: 5. Authorized request (identity headers)
112112
Model-->>Gateway: Response
113113
Gateway-->>User: Response
@@ -138,9 +138,12 @@ flowchart LR
138138

139139
This means you can:
140140

141-
1. **Authenticate with OpenShift or OIDC** — use your existing identity and the same token you would use for inference.
142-
2. **Use an API key** — use your `sk-oai-*` key in the Authorization header.
143-
3. **Call `/v1/models` immediately** — see only the models you can access, without creating an API key first (if using OpenShift token).
141+
1. **Authenticate with OpenShift or OIDC** — use your existing identity token for `GET /v1/models` (optional `X-MaaS-Subscription` when you have multiple subscriptions).
142+
2. **Use an API key** — use your `sk-oai-*` key in the Authorization header for listing and for inference.
143+
3. **Call `/v1/models` immediately** — see only the models you can access, without creating an API key first (if using an OpenShift token).
144+
145+
!!! note "Inference vs listing"
146+
Inference (calls to each model’s chat/completions URL) requires an API key in `Authorization: Bearer` only. Do not send `X-MaaS-Subscription` on inference—the subscription is the one bound at API key mint time. `GET /v1/models` accepts either an API key or an OpenShift token; with a user token, `X-MaaS-Subscription` remains supported for filtering.
144147

145148
---
146149

docs/content/install/validation.md

Lines changed: 4 additions & 9 deletions
@@ -33,7 +33,7 @@ API_KEY_RESPONSE=$(curl -sSk \
3333
-H "Authorization: Bearer $(oc whoami -t)" \
3434
-H "Content-Type: application/json" \
3535
-X POST \
36-
-d '{"name": "validation-key", "description": "Key for validation", "expiresIn": "1h"}' \
36+
-d '{"name": "validation-key", "description": "Key for validation", "expiresIn": "1h", "subscription": "simulator-subscription"}' \
3737
"${HOST}/maas-api/v1/api-keys") && \
3838
API_KEY=$(echo $API_KEY_RESPONSE | jq -r .key) && \
3939
echo "API key obtained: ${API_KEY:0:20}..."
@@ -43,21 +43,16 @@ echo "API key obtained: ${API_KEY:0:20}..."
4343
The plaintext API key is returned **only at creation time**. We do not store the API key, so there is no way to retrieve it again. Store it securely when it is displayed. If you run into errors, see [Troubleshooting](troubleshooting.md).
4444

4545
!!! note
46-
For more information about API keys, see [Understanding Token Management](../configuration-and-management/token-management.md).
46+
`subscription` is the MaaSSubscription metadata name to bind (here `simulator-subscription` matches the [maas-system](https://github.com/opendatahub-io/models-as-a-service/tree/main/docs/samples/maas-system) free sample). Use your own name or omit the field to auto-select by `spec.priority`. For details, see [Understanding Token Management](../configuration-and-management/token-management.md).
4747
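
To make the optional `subscription` field concrete, here is a minimal sketch of building the request body with and without it. The helper name `maas_key_body` is hypothetical, and the `printf` approach assumes the name and subscription contain no quotes or backslashes; for arbitrary input, a JSON-aware tool would be safer.

```shell
# Hypothetical helper: build the JSON body for POST /v1/api-keys.
# With a second argument the key is pinned to that MaaSSubscription;
# without it the field is omitted so the server auto-selects one.
# Assumes both values are plain names (no quotes or backslashes).
maas_key_body() {
  local name="$1" subscription="${2:-}"
  if [ -n "$subscription" ]; then
    printf '{"name": "%s", "subscription": "%s"}\n' "$name" "$subscription"
  else
    printf '{"name": "%s"}\n' "$name"
  fi
}
```

The output can be passed to `curl` with `-d "$(maas_key_body validation-key simulator-subscription)"`.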

4848
### 3. List Available Models
4949

50-
Set the subscription name (required when your API key matches multiple subscriptions; use the name from your MaaSSubscription CR):
51-
52-
```bash
53-
export MaaS_SUBSCRIPTION="simulator-subscription" # or your subscription name
54-
```
50+
Each API key is bound to one MaaSSubscription at creation time. `GET /v1/models` with an API key does not require `X-MaaS-Subscription`—the list is scoped to that subscription. (With an OpenShift user token instead of an API key, you can optionally send `X-MaaS-Subscription` to filter when you have access to multiple subscriptions.)
5551

5652
```bash
5753
MODELS=$(curl -sSk ${HOST}/maas-api/v1/models \
5854
-H "Content-Type: application/json" \
59-
-H "Authorization: Bearer $API_KEY" \
60-
-H "X-MaaS-Subscription: ${MaaS_SUBSCRIPTION}" | jq -r .) && \
55+
-H "Authorization: Bearer $API_KEY" | jq -r .) && \
6156
echo $MODELS | jq . && \
6257
MODEL_NAME=$(echo $MODELS | jq -r '.data[0].id') && \
6358
MODEL_URL=$(echo $MODELS | jq -r '.data[0].url') && \

docs/content/user-guide/self-service-model-access.md

Lines changed: 4 additions & 2 deletions
@@ -38,7 +38,7 @@ API_KEY_RESPONSE=$(curl -sSk \
3838
-H "Authorization: Bearer ${OC_TOKEN}" \
3939
-H "Content-Type: application/json" \
4040
-X POST \
41-
-d '{"name": "my-api-key", "description": "Key for model access", "expiresIn": "90d"}' \
41+
-d '{"name": "my-api-key", "description": "Key for model access", "expiresIn": "90d", "subscription": "simulator-subscription"}' \
4242
"${MAAS_API_URL}/maas-api/v1/api-keys")
4343

4444
API_KEY=$(echo $API_KEY_RESPONSE | jq -r .key)
@@ -48,7 +48,7 @@ echo "Key prefix: ${API_KEY:0:16}..."
4848
echo "Bound subscription: ${SUBSCRIPTION}"
4949
```
5050

51-
To pin a specific subscription, add it to the JSON body, for example: `"subscription": "my-team-subscription"`.
51+
Replace `simulator-subscription` with your MaaSSubscription metadata name, or remove the `subscription` field to bind the **highest-priority** subscription you can access.
5252
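
The auto-selection described above can be pictured with a short sketch. It assumes that "highest priority" means the largest `spec.priority` value and that input arrives as one `name priority` pair per line; both the input format and the sample names are hypothetical, and the source does not say how ties are broken, so this sketch makes no promise there either.

```shell
# Hypothetical illustration of auto-selection: read "name priority"
# pairs on stdin and print the name with the largest priority value.
# Assumption: a larger spec.priority number means higher priority.
pick_highest_priority() {
  sort -k2,2nr | head -n 1 | cut -d' ' -f1
}
```

For example, `printf '%s\n' 'free-sub 10' 'premium-sub 50' | pick_highest_priority` selects `premium-sub`. The actual selection happens inside maas-api when the `subscription` field is omitted.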

5353
!!! warning "API key shown only once"
5454
The plaintext API key is returned **only at creation time**. We do not store the API key, so there is no way to retrieve it again. Store it securely when it is displayed. If you run into errors, see [Troubleshooting](../install/troubleshooting.md).
@@ -110,6 +110,8 @@ echo $MODEL_INFO | jq .
110110

111111
## Making Inference Requests
112112

113+
Use **only** your API key in `Authorization: Bearer`. The subscription is fixed when the key was created.
114+
113115
### Basic Chat Completion
114116

115117
Make a simple chat completion request:

maas-api/README.md

Lines changed: 25 additions & 10 deletions
@@ -193,7 +193,8 @@ API_KEY_RESPONSE=$(curl -sSk \
193193
-X POST \
194194
-d '{
195195
"name": "my-api-key",
196-
"description": "Production API key for my application"
196+
"description": "Production API key for my application",
197+
"subscription": "simulator-subscription"
197198
}' \
198199
"${HOST}/maas-api/v1/api-keys")
199200

@@ -208,14 +209,18 @@ API_KEY_RESPONSE=$(curl -sSk \
208209
-d '{
209210
"name": "my-short-lived-key",
210211
"description": "30-day test key",
211-
"expiresIn": "30d"
212+
"expiresIn": "30d",
213+
"subscription": "simulator-subscription"
212214
}' \
213215
"${HOST}/maas-api/v1/api-keys")
214216

215217
echo $API_KEY_RESPONSE | jq -r .
216218
API_KEY=$(echo $API_KEY_RESPONSE | jq -r .key)
217219
```
218220

221+
> [!NOTE]
222+
> Replace `simulator-subscription` with your `MaaSSubscription` metadata name. To rely on **auto-selection** instead, remove the `subscription` field; maas-api then picks the accessible subscription with the highest `spec.priority`.
223+
219224
> [!IMPORTANT]
220225
> The plaintext API key is shown ONLY ONCE at creation time. Store it securely - it cannot be retrieved again.
221226
@@ -306,22 +311,26 @@ For production deployments, see the [Database Prerequisites](../docs/content/ins
306311

307312
#### Listing models with subscription filtering
308313

309-
The `/v1/models` endpoint supports subscription filtering and aggregation:
314+
The `/v1/models` endpoint supports subscription filtering and aggregation. Use an **OpenShift token** or an **API key** in `Authorization: Bearer`. With a **user token**, optional `X-MaaS-Subscription` filters to one subscription when you have access to several. With an **API key**, the subscription is fixed at key mint time—no client `X-MaaS-Subscription` is needed for listing.
310315

311316
HOST="$(kubectl get gateway -l app.kubernetes.io/instance=maas-default-gateway -n openshift-ingress -o jsonpath='{.items[0].status.addresses[0].value}')"
312317

313318
# List models from all accessible subscriptions
314319
curl ${HOST}/v1/models \
315320
-H "Content-Type: application/json" \
316-
-H "Authorization: Bearer $TOKEN" \
317-
-H "X-MaaS-Return-All-Models: true" | jq .
321+
-H "Authorization: Bearer $TOKEN" | jq .
318322

319323
# List models from a specific subscription
320324
curl ${HOST}/v1/models \
321325
-H "Content-Type: application/json" \
322326
-H "Authorization: Bearer $TOKEN" \
323327
-H "X-MaaS-Subscription: my-subscription" | jq .
324328

329+
# List models from the subscription bound to an API key
330+
curl ${HOST}/v1/models \
331+
-H "Content-Type: application/json" \
332+
-H "Authorization: Bearer $API_KEY" | jq .
333+
325334
**Subscription Aggregation**: When the same model (same ID and URL) is accessible via multiple subscriptions, it appears once in the response with an array of all subscriptions providing access:
326335

327336
{
@@ -340,14 +349,20 @@ The `/v1/models` endpoint supports subscription filtering and aggregation:
340349

341350
#### Calling the model and hitting the rate limit
342351

343-
Using model discovery:
352+
Inference requires an API key (mint with `POST /v1/api-keys` using your OpenShift token). Send **only** `Authorization: Bearer <api-key>`; subscription is taken from the key at mint time.
353+
354+
Using model discovery (maas-api URL matches the [validation guide](../docs/content/install/validation.md); model `url` values come from the list response):
344355

345356
```shell
346-
HOST="$(kubectl get gateway -l app.kubernetes.io/instance=maas-default-gateway -n openshift-ingress -o jsonpath='{.items[0].status.addresses[0].value}')"
357+
CLUSTER_DOMAIN=$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
358+
MAAS_API="https://maas.${CLUSTER_DOMAIN}/maas-api"
359+
API_KEY=$(curl -sSk -H "Authorization: Bearer $(oc whoami -t)" -H "Content-Type: application/json" \
360+
-X POST -d '{"name":"rate-limit-demo","subscription":"simulator-subscription"}' \
361+
"${MAAS_API}/v1/api-keys" | jq -r .key)
347362

348-
MODELS=$(curl ${HOST}/v1/models \
363+
MODELS=$(curl -sSk "${MAAS_API}/v1/models" \
349364
-H "Content-Type: application/json" \
350-
-H "Authorization: Bearer $TOKEN" | jq . -r)
365+
-H "Authorization: Bearer ${API_KEY}" | jq . -r)
351366

352367
echo $MODELS | jq .
353368
MODEL_URL=$(echo $MODELS | jq -r '.data[0].url')
@@ -356,7 +371,7 @@ MODEL_NAME=$(echo $MODELS | jq -r '.data[0].id')
356371
for i in {1..16}
357372
do
358373
curl -sSk -o /dev/null -w "%{http_code}\n" \
359-
-H "Authorization: Bearer $TOKEN" \
374+
-H "Authorization: Bearer ${API_KEY}" \
360375
-d "{
361376
\"model\": \"${MODEL_NAME}\",
362377
\"prompt\": \"Not really understood prompt\",

maas-controller/README.md

Lines changed: 21 additions & 10 deletions
@@ -182,12 +182,16 @@ deny-unsubscribed (0): matches "NOT in premium-user AND NOT in free-user"
182182

183183
## Authentication
184184

185-
Until API token minting is in place, the controller uses **OpenShift tokens directly** for inference:
185+
Create API keys with `POST /v1/api-keys` on the maas-api (authenticate with your OpenShift token). Each key is bound to one MaaSSubscription at mint time: set `"subscription": "<name>"` in the JSON body, or omit it and the platform selects the **highest-priority** accessible subscription (`MaaSSubscription.spec.priority`).
186186

187187
```bash
188-
export TOKEN=$(oc whoami -t)
189-
curl -H "Authorization: Bearer $TOKEN" \
190-
"https://<gateway-host>/llm/<model-name>/v1/chat/completions" \
188+
MAAS_API="https://<gateway-host>/maas-api"
189+
API_KEY=$(curl -sSk -H "Authorization: Bearer $(oc whoami -t)" -H "Content-Type: application/json" \
190+
-X POST -d '{"name":"demo","subscription":"<maas-subscription-name>"}' \
191+
"${MAAS_API}/v1/api-keys" | jq -r .key)
192+
193+
curl -sSk "https://<gateway-host>/llm/<model-name>/v1/chat/completions" \
194+
-H "Authorization: Bearer ${API_KEY}" \
191195
-H "Content-Type: application/json" \
192196
-d '{"model":"<model>","messages":[{"role":"user","content":"Hello"}],"max_tokens":10}'
193197
```
@@ -289,22 +293,29 @@ kubectl get authpolicy,tokenratelimitpolicy -n llm
289293
290294
# Test inference (set GATEWAY_HOST and TOKEN once)
291295
GATEWAY_HOST="maas.$(kubectl get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')"
296+
MAAS_API="https://${GATEWAY_HOST}/maas-api"
292297
TOKEN=$(oc whoami -t)
293298
294-
# Regular model: 401 without auth, 200 with auth (user must be in free-user)
299+
# Regular tier: log in as a user in free-user, then mint a key for simulator-subscription
300+
FREE_API_KEY=$(curl -sSk -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
301+
-X POST -d '{"name":"readme-free","subscription":"simulator-subscription"}' \
302+
"${MAAS_API}/v1/api-keys" | jq -r .key)
303+
295304
curl -sSk -o /dev/null -w "%{http_code}\n" "https://${GATEWAY_HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions" \
296305
-H "Content-Type: application/json" -d '{"model":"facebook/opt-125m","messages":[{"role":"user","content":"Hi"}],"max_tokens":5}'
297306
curl -sSk -o /dev/null -w "%{http_code}\n" "https://${GATEWAY_HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions" \
298-
-H "Authorization: Bearer $TOKEN" \
299-
-H "x-maas-subscription: simulator-subscription" \
307+
-H "Authorization: Bearer $FREE_API_KEY" \
300308
-H "Content-Type: application/json" -d '{"model":"facebook/opt-125m","messages":[{"role":"user","content":"Hi"}],"max_tokens":5}'
301309
302-
# Premium model: 401 without auth, 200 with auth (user must be in premium-user)
310+
# Premium tier: log in as a user in premium-user, mint a key for premium-simulator-subscription, then call the premium route
311+
PREMIUM_API_KEY=$(curl -sSk -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
312+
-X POST -d '{"name":"readme-premium","subscription":"premium-simulator-subscription"}' \
313+
"${MAAS_API}/v1/api-keys" | jq -r .key)
314+
303315
curl -sSk -o /dev/null -w "%{http_code}\n" "https://${GATEWAY_HOST}/llm/premium-simulated-simulated-premium/v1/chat/completions" \
304316
-H "Content-Type: application/json" -d '{"model":"facebook/opt-125m","messages":[{"role":"user","content":"Hi"}],"max_tokens":5}'
305317
curl -sSk -o /dev/null -w "%{http_code}\n" "https://${GATEWAY_HOST}/llm/premium-simulated-simulated-premium/v1/chat/completions" \
306-
-H "Authorization: Bearer $TOKEN" \
307-
-H "x-maas-subscription: premium-simulator-subscription" \
318+
-H "Authorization: Bearer $PREMIUM_API_KEY" \
308319
-H "Content-Type: application/json" -d '{"model":"facebook/opt-125m","messages":[{"role":"user","content":"Hi"}],"max_tokens":5}'
309320
```
310321
