This guide provides step-by-step instructions for configuring MaaSModelRef, MaaSAuthPolicy, and MaaSSubscription. For conceptual overview, see Access and Quota Overview and MaaS Models.
- MaaS platform installed — See Install MaaS Components
- LLMInferenceService for your model (or external model endpoints)
- Cluster admin or equivalent permissions to create CRs in the
models-as-a-serviceand model namespaces
!!! note "Deploy a sample model"
Command to deploy simulator model as free-model:
```bash
kustomize build 'https://github.com/opendatahub-io/models-as-a-service//docs/samples/maas-system/free/llm?ref=main' | \
sed 's/facebook-opt-125m-simulated/free-model/g' | kubectl apply -f -
```
Before running the configuration steps, here is the flow you will set up:
-
Register models — Create a MaaSModelRef for each model so MaaS knows about it and can expose it through the API. The controller reconciles each ref and populates the endpoint.
-
Grant access — Create MaaSAuthPolicy resources that define which groups or users can use which models. A user must match a policy to see and call a model.
-
Define subscriptions — Create MaaSSubscription resources that define quota (token rate limits) for groups or users. A user must have both access (policy) and quota (subscription) to use a model.
-
Validate — Confirm the CRs are reconciled, policies are enforced, and you can list models and run inference.
Set the namespace and name of your LLMInferenceService (used in the commands below):
MODEL_NS=llm
MODEL_NAME=free-model # From the Prerequisites deploy; or: kubectl get llminferenceservice -n $MODEL_NS -o jsonpath='{.items[0].metadata.name}'Create a MaaSModelRef for each model you want to expose through MaaS. The MaaSModelRef must be in the same namespace as the LLMInferenceService. The spec.modelRef.name must match the LLMInferenceService name.
kubectl apply -f - <<EOF
apiVersion: maas.opendatahub.io/v1alpha1
kind: MaaSModelRef
metadata:
name: ${MODEL_NAME}-ref
namespace: ${MODEL_NS}
spec:
modelRef:
kind: LLMInferenceService
name: ${MODEL_NAME}
EOFVerify the controller has reconciled and set the endpoint:
kubectl get maasmodelref -n ${MODEL_NS}
# Check status.endpoint and status.phaseCreate an MaaSAuthPolicy to define which groups/users can access which models:
kubectl apply -f - <<EOF
# MaaSAuthPolicy must be deployed to the models-as-a-service namespace
apiVersion: maas.opendatahub.io/v1alpha1
kind: MaaSAuthPolicy
metadata:
name: free-access
namespace: models-as-a-service
spec:
modelRefs:
- name: ${MODEL_NAME}-ref
namespace: ${MODEL_NS}
subjects:
groups:
- name: free-users
users: []
EOFVerify the MaaSAuthPolicy is active (AuthPolicy created and enforced):
# List generated AuthPolicy (may take a moment for controller to reconcile)
kubectl get authpolicy -n ${MODEL_NS} -l maas.opendatahub.io/model=${MODEL_NAME}-ref
# Wait for AuthPolicy to be enforced (re-run if the controller has not reconciled yet)
AUTH_POLICY=$(kubectl get authpolicy -n ${MODEL_NS} -l maas.opendatahub.io/model=${MODEL_NAME}-ref -o jsonpath='{.items[0].metadata.name}')
[[ -n "$AUTH_POLICY" ]] && kubectl wait --for=condition=Enforced=true authpolicy/${AUTH_POLICY} -n ${MODEL_NS} --timeout=120sMultiple policies per model: You can create multiple MaaSAuthPolicies that reference the same model. The controller aggregates them — a user matching any policy gets access.
Create a MaaSSubscription to define per-model token rate limits for owner groups:
kubectl apply -f - <<EOF
# MaaSSubscription must be deployed to the models-as-a-service namespace
apiVersion: maas.opendatahub.io/v1alpha1
kind: MaaSSubscription
metadata:
name: free-subscription
namespace: models-as-a-service
spec:
owner:
groups:
- name: free-users
users: []
modelRefs:
- name: ${MODEL_NAME}-ref
namespace: ${MODEL_NS}
tokenRateLimits:
- limit: 100
window: 1m
priority: 10
EOFVerify the MaaSSubscription is active (TokenRateLimitPolicy created and enforced):
# List generated TokenRateLimitPolicy (may take a moment for controller to reconcile)
kubectl get tokenratelimitpolicy -n ${MODEL_NS} -l maas.opendatahub.io/model=${MODEL_NAME}-ref
# Wait for TokenRateLimitPolicy to be enforced (re-run if the controller has not reconciled yet)
TRLP=$(kubectl get tokenratelimitpolicy -n ${MODEL_NS} -l maas.opendatahub.io/model=${MODEL_NAME}-ref -o jsonpath='{.items[0].metadata.name}')
[[ -n "$TRLP" ]] && kubectl wait --for=condition=Enforced=true tokenratelimitpolicy/${TRLP} -n ${MODEL_NS} --timeout=120s!!! note "Namespace requirements"
Both MaaSAuthPolicy and MaaSSubscription must be installed in the models-as-a-service namespace. Each modelRefs entry must specify the namespace where the MaaSModelRef lives (e.g. llm).
!!! warning "Using the users field"
The subjects.users (MaaSAuthPolicy) and owner.users (MaaSSubscription) fields should be used only for Service Accounts and similar programmatic identities, not for many individual human users. Having too many distinct users can cause cardinality issues in rate limiting and policy enforcement. Prefer groups for human users.
Premium example with higher limits:
kubectl apply -f - <<EOF
# MaaSSubscription must be deployed to the models-as-a-service namespace
apiVersion: maas.opendatahub.io/v1alpha1
kind: MaaSSubscription
metadata:
name: premium-subscription
namespace: models-as-a-service
spec:
owner:
groups:
- name: premium-users
users: []
modelRefs:
- name: ${MODEL_NAME}-ref
namespace: ${MODEL_NS}
tokenRateLimits:
- limit: 100000
window: 24h
priority: 20
tokenMetadata:
organizationId: "premium-org"
costCenter: "ai-team"
EOFCheck CRs and generated policies:
kubectl get maasmodelref -n ${MODEL_NS}
kubectl get maasauthpolicy,maassubscription -n models-as-a-service
kubectl get authpolicy,tokenratelimitpolicy -n ${MODEL_NS}Create groups and add users (OpenShift example):
# Create groups if they do not exist
oc adm groups new free-users 2>/dev/null || true
oc adm groups new premium-users 2>/dev/null || true
# Add users to each group
oc adm groups add-users free-users alice@example.com
oc adm groups add-users premium-users bob@example.comCreate API keys and verify access:
Create an API key as a user in free-users and another as a user in premium-users. Follow the Validation guide to:
- Get the gateway endpoint and create an API key for each user
- List models and run inference with each API key
- Test with both groups to confirm access and different rate limits — free-users (100 tokens/min) vs premium-users (100,000 tokens/24h)
To grant a user access to a subscription, add them to the appropriate Kubernetes group:
# Create groups if they do not exist
oc adm groups new free-users 2>/dev/null || true
oc adm groups new premium-users 2>/dev/null || true
# Add users (method depends on your IdP; OpenShift example)
oc adm groups add-users free-users alice@example.com
oc adm groups add-users premium-users bob@example.comUsers will get subscription access on their next request (after group membership propagates).
When a user belongs to multiple groups that each have a subscription, the access depends on the API key used. A subscription is bound to each API key at minting (explicit or highest priority). See Understanding Token Management.
Cause: User requested a subscription they do not belong to (group membership).
Fix: Ensure the user is in a group listed in the subscription's spec.owner.groups.
Cause: User exceeded token rate limit for the model.
Fix: Wait for the rate limit window to reset, or upgrade to a subscription with higher limits.
Cause: MaaSModelRef missing, not reconciled, or access probe failed.
Fix:
- Verify MaaSModelRef exists in the model namespace (e.g.
llm) and hasstatus.phase: Ready - Check MaaSAuthPolicy in
models-as-a-serviceincludes the user's groups and references the MaaSModelRef with correctname(e.g.${MODEL_NAME}-ref) andnamespace - Ensure MaaSSubscription in
models-as-a-serviceexists for the model and user's groups
Cause: Kuadrant controller may need to re-sync.
Fix:
kubectl delete pod -l control-plane=controller-manager -n kuadrant-system
kubectl wait --for=condition=Enforced=true tokenratelimitpolicy/<policy-name> -n llm --timeout=2m- Access and Quota Overview — How policies and subscriptions work together
- MaaS Models — Conceptual overview
- Token Management
- Validation