Skip to content

feat: enable configurable Authorino caching for AuthPolicy evaluators#621

Merged
openshift-merge-bot[bot] merged 13 commits intoopendatahub-io:mainfrom
jrhyness:jr_55467
Mar 30, 2026
Merged

feat: enable configurable Authorino caching for AuthPolicy evaluators#621
openshift-merge-bot[bot] merged 13 commits intoopendatahub-io:mainfrom
jrhyness:jr_55467

Conversation

@jrhyness
Copy link
Copy Markdown
Contributor

@jrhyness jrhyness commented Mar 26, 2026

Description

Add Kuadrant AuthPolicy caching for apiKeyValidation metadata and all OPA authorization rules (auth-valid, subscription-valid, require-group-membership). Make cache TTLs configurable via controller environment variables.

Changes:

  • Add --metadata-cache-ttl and --authz-cache-ttl flags to maas-controller
  • Add cache blocks to apiKeyValidation metadata (HTTP call to /api-keys/validate)
  • Update subscription-info metadata cache to use configurable TTL
  • Add cache blocks to auth-valid, subscription-valid, and require-group-membership OPA authorization evaluators
  • Add METADATA_CACHE_TTL and AUTHZ_CACHE_TTL environment variables to deployment
  • Add params.env configuration for ODH overlay deployments
  • Add comprehensive tests for cache configuration and key isolation
  • Add security tests verifying cache keys prevent cross-principal leakage
  • Document cache configuration and security properties

Cache Key Design:

  • apiKeyValidation: keyed by API key value
  • subscription-info: keyed by username|groups|subscription|model
  • auth-valid: keyed by auth-type|identity|model
  • subscription-valid: same as subscription-info (cache coherence)
  • require-group-membership: keyed by username|groups|model

All cache keys include sufficient dimensions to prevent cross-user, cross-API-key, and cross-model cache sharing.

Default TTL: 60 seconds for both metadata and authorization caches Configurable via deployment/overlays/odh/params.env

How Has This Been Tested?

Tests Added

  1. TestMaaSAuthPolicyReconciler_CachingConfiguration (4 test cases)
  • Purpose: Verifies cache blocks are generated with configurable TTLs
  • What it tests:
    • Default TTLs (60s for both metadata and authz)
    • Custom metadata TTL (300s)
    • Custom authz TTL (30s)
    • Both custom TTLs together
  • Coverage:
    • All 5 evaluators have cache blocks present
    • TTL values match configuration
    • Cache key selectors are non-empty
    • Tests both metadata evaluators (apiKeyValidation, subscription-info)
    • Tests all 3 authorization evaluators (auth-valid, subscription-valid, require-group-membership)
  1. TestMaaSAuthPolicyReconciler_CacheKeyIsolation (5 subtests)
  • Purpose: Security-critical test verifying cache keys prevent cross-principal leakage
  • What it tests:
    • apiKeyValidation: Includes API key from request headers
    • subscription-info: Includes username, groups, and model namespace/name
    • auth-valid: Distinguishes API keys vs K8s tokens, includes identity and model
    • subscription-valid: Matches subscription-info key exactly (cache coherence)
    • require-group-membership: Includes username, groups, and model
  • Coverage:
    • Verifies required components are present in cache keys (username, groups, model, etc.)
    • Confirms groups are joined for stable string representation
    • Validates cache coherence between metadata and authorization
  1. TestMaaSAuthPolicyReconciler_CacheKeyModelIsolation (4 subtests)
  • Purpose: Verifies different models generate different cache keys
  • What it tests:
    • Creates two models (llm-1, llm-2) and reconciles both
    • For each of 4 evaluators (subscription-info, auth-valid, subscription-valid, require-group-membership):
      • Cache keys differ between models
      • Each model's cache key contains its own model name
  • Coverage:
    • Prevents cross-model cache sharing
    • Confirms model namespace/name is part of cache key

Test Results

✅ All existing tests continue to pass (48 tests total)
✅ New cache tests: 13 subtests across 3 test functions
✅ 100% pass rate

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • New Features

    • Configurable cache TTLs for metadata and authorization via METADATA_CACHE_TTL and AUTHZ_CACHE_TTL (defaults: 60s); authorization TTL is capped to metadata TTL and logs a startup warning if exceeded.
  • Documentation

    • Added an Authorino caching guide with deployment examples, cache-key composition, tuning/monitoring, and operational/security notes; README updated with the new env vars.
  • Behavior Change

    • API key validation now returns the internal key ID as the user identifier.
  • Tests

    • Added reconciliation and cache-key isolation tests covering TTL behavior, capping, and selector composition.

Add Kuadrant AuthPolicy caching for apiKeyValidation metadata and all OPA
authorization rules (auth-valid, subscription-valid, require-group-membership).
Make cache TTLs configurable via controller environment variables.

Changes:
- Add --metadata-cache-ttl and --authz-cache-ttl flags to maas-controller
- Add cache blocks to apiKeyValidation metadata (HTTP call to /api-keys/validate)
- Update subscription-info metadata cache to use configurable TTL
- Add cache blocks to auth-valid, subscription-valid, and require-group-membership
  OPA authorization evaluators
- Add METADATA_CACHE_TTL and AUTHZ_CACHE_TTL environment variables to deployment
- Add params.env configuration for ODH overlay deployments
- Add comprehensive tests for cache configuration and key isolation
- Add security tests verifying cache keys prevent cross-principal leakage
- Document cache configuration and security properties

Cache Key Design:
- apiKeyValidation: keyed by API key value
- subscription-info: keyed by username|groups|subscription|model
- auth-valid: keyed by auth-type|identity|model
- subscription-valid: same as subscription-info (cache coherence)
- require-group-membership: keyed by username|groups|model

All cache keys include sufficient dimensions to prevent cross-user,
cross-API-key, and cross-model cache sharing.

Default TTL: 60 seconds for both metadata and authorization caches
Configurable via deployment/overlays/odh/params.env

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-ci openshift-ci bot requested review from SB159 and nirrozenbaum March 26, 2026 14:02
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds configurable TTLs for Authorino metadata and OPA authorization caches via new flags/env vars (METADATA_CACHE_TTL, AUTHZ_CACHE_TTL) and wires them into the MaaS controller. The reconciler now emits cache blocks for metadata evaluators (apiKeyValidation, subscription-info) and authorization evaluators (auth-valid, subscription-valid, require-group-membership) with composed cache keys including token/identity, subscription, model, and groups. Manifests, overlay params, CLI flags, extensive unit tests for TTL and cache-key behavior, a docs page on Authorino caching, and API-key UUID validation were added.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Security Issues Requiring Action

  1. Raw bearer token embedded in cache key — sensitive data exposure (CWE-200, CWE-209)
  • Action: Replace raw bearer token usage in generated cache keys with a one-way digest (HMAC-SHA256 with a server-side secret or SHA-256). Ensure only the digest is emitted in policies and templates; never log or surface raw tokens.
  1. Authz TTL > Metadata TTL currently only logged and capped at runtime — race/consistency risk (CWE-613)
  • Action: Enforce the cap at configuration validation time: either reject configs where AUTHZ > METADATA (fail fast) or normalize and persist the effective value. If auto-normalizing, emit a startup-level warning and record the effective TTL used to allow deterministic cache behavior.
  1. TTL bounds validation incomplete — permissive acceptance (CWE-20)
  • Action: Add strict bounds (min = 0 allowed to disable caching; set a conservative max, e.g., 86400). Fail startup on out-of-range values with clear error messages. Add unit tests for boundary conditions and invalid inputs.
  1. Cache-key collision between API-key and user-level entitlements risk — incorrect cache reuse (CWE-290, CWE-327)
  • Action: Ensure selectors unambiguously include a per-key unique identifier (database UUID or hashed key fingerprint) for API-key-scoped entitlements. Do not rely solely on username. Add unit tests that create multiple keys for the same user with different entitlements and assert unique cache keys.
  1. New UUID parsing requirement for API key IDs increases input rejection surface (CWE-20)
  • Action: Document the UUID requirement in API docs and error messages. Verify provisioning paths produce UUIDs; provide a migration/compatibility path or feature-flag if backward compatibility is required.
  1. Hashing/auditing impact — potential telemetry/log correlation risk (CWE-532)
  • Action: Audit logs and telemetry to ensure digests/fingerprints are not logged alongside plaintext tokens or other identifiers enabling correlation. Treat HMAC keys/secrets as high-value and restrict access.

Implement each item in code, tests, and documentation; add unit tests verifying hashed token use, TTL bounds enforcement, authz cap behavior, and cache-key uniqueness across multiple keys per user.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the primary change: enabling configurable Authorino caching for AuthPolicy evaluators through new TTL flags and environment variables.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go`:
- Around line 347-348: The cache key construction in the selector expression
that concatenates raw username/groups/subscription with "|" and "," is not
collision-safe; replace this fragile string join in the selector (the expression
referencing auth.metadata.apiKeyValidation.username,
auth.metadata.apiKeyValidation.groups, auth.identity.user.username,
auth.identity.user.groups, the x-maas-subscription header, and
ref.Namespace/ref.Name) with a collision-resistant representation — e.g., build
a structured payload of the relevant fields and either Base64/URL-safe encode
the JSON or compute a stable hash (sha256) of that structured object and use the
hash as the key; alternatively encode each group element unambiguously
(length-prefix or percent-encode each element) before joining so different group
lists cannot collide.
- Around line 343-350: The cache TTL must not outlive the subscription-info
metadata TTL; change the "ttl" in the "cache" map (currently set to
r.AuthzCacheTTL) to use the minimum of r.AuthzCacheTTL and r.MetadataCacheTTL
(e.g. ttl := min(r.AuthzCacheTTL, r.MetadataCacheTTL)) so cached allow entries
cannot survive past metadata refresh, and add a startup/validation check (where
the reconciler/controller is initialized) that rejects or logs an error and
fails fast if r.AuthzCacheTTL > r.MetadataCacheTTL to prevent misconfiguration;
reference the "cache" map where "ttl" is set and the r.AuthzCacheTTL and
r.MetadataCacheTTL configuration fields.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 884007a9-6a42-48ca-8c6a-d74cf473ff53

📥 Commits

Reviewing files that changed from the base of the PR and between 461f66d and 22529ed.

📒 Files selected for processing (8)
  • README.md
  • deployment/base/maas-controller/manager/manager.yaml
  • deployment/overlays/odh/kustomization.yaml
  • deployment/overlays/odh/params.env
  • docs/content/configuration-and-management/authorino-caching.md
  • maas-controller/cmd/manager/main.go
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller.go
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller_test.go

Comment on lines +343 to +350
// Cache authorization result keyed by subscription selection inputs.
// Uses same key dimensions as subscription-info metadata to ensure cache coherence.
// Key format: "username|groups|requested-subscription|model"
"cache": map[string]interface{}{
"key": map[string]interface{}{
"selector": fmt.Sprintf(`(auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.username : auth.identity.user.username) + "|" + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.groups : auth.identity.user.groups).join(",") + "|" + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.subscription : ("x-maas-subscription" in request.headers ? request.headers["x-maas-subscription"] : "")) + "|%s/%s"`, ref.Namespace, ref.Name),
},
"ttl": r.AuthzCacheTTL,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Major: subscription-valid is not actually cache-coherent with subscription-info.

The key matches subscription-info, but the TTL does not. If AUTHZ_CACHE_TTL is greater than METADATA_CACHE_TTL, a cached allow for the same username/groups/subscription/model tuple can survive after subscription-info has already refreshed to a deny. That extends entitlement revocation beyond the metadata TTL and turns the "coherence" guarantee in the comment into a stale-authorization window (CWE-863). Reuse the metadata TTL here, or reject configs where authz TTL exceeds metadata TTL.

Possible fix
-				"ttl": r.AuthzCacheTTL,
+				"ttl": r.MetadataCacheTTL,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Cache authorization result keyed by subscription selection inputs.
// Uses same key dimensions as subscription-info metadata to ensure cache coherence.
// Key format: "username|groups|requested-subscription|model"
"cache": map[string]interface{}{
"key": map[string]interface{}{
"selector": fmt.Sprintf(`(auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.username : auth.identity.user.username) + "|" + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.groups : auth.identity.user.groups).join(",") + "|" + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.subscription : ("x-maas-subscription" in request.headers ? request.headers["x-maas-subscription"] : "")) + "|%s/%s"`, ref.Namespace, ref.Name),
},
"ttl": r.AuthzCacheTTL,
// Cache authorization result keyed by subscription selection inputs.
// Uses same key dimensions as subscription-info metadata to ensure cache coherence.
// Key format: "username|groups|requested-subscription|model"
"cache": map[string]interface{}{
"key": map[string]interface{}{
"selector": fmt.Sprintf(`(auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.username : auth.identity.user.username) + "|" + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.groups : auth.identity.user.groups).join(",") + "|" + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.subscription : ("x-maas-subscription" in request.headers ? request.headers["x-maas-subscription"] : "")) + "|%s/%s"`, ref.Namespace, ref.Name),
},
"ttl": r.MetadataCacheTTL,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go` around
lines 343 - 350, The cache TTL must not outlive the subscription-info metadata
TTL; change the "ttl" in the "cache" map (currently set to r.AuthzCacheTTL) to
use the minimum of r.AuthzCacheTTL and r.MetadataCacheTTL (e.g. ttl :=
min(r.AuthzCacheTTL, r.MetadataCacheTTL)) so cached allow entries cannot survive
past metadata refresh, and add a startup/validation check (where the
reconciler/controller is initialized) that rejects or logs an error and fails
fast if r.AuthzCacheTTL > r.MetadataCacheTTL to prevent misconfiguration;
reference the "cache" map where "ttl" is set and the r.AuthzCacheTTL and
r.MetadataCacheTTL configuration fields.

Prevent stale authorization decisions by ensuring authorization caches
cannot outlive their dependent metadata caches.

Problem:
Authorization evaluators depend on metadata evaluators:
- auth-valid depends on apiKeyValidation metadata
- subscription-valid depends on subscription-info metadata
- require-group-membership depends on apiKeyValidation metadata (groups)

If AUTHZ_CACHE_TTL > METADATA_CACHE_TTL, authorization decisions could
be cached based on expired metadata, leading to incorrect allow/deny
decisions after metadata expires.

Solution:
- Add authzCacheTTL() helper that returns min(AuthzCacheTTL, MetadataCacheTTL)
- Update all 3 authorization caches to use authzCacheTTL() instead of AuthzCacheTTL
- Add validation in SetupWithManager() that logs a warning if authz TTL
  exceeds metadata TTL
- Add test case verifying TTL capping behavior (authz=300, metadata=60 → effective=60)
- Document TTL capping behavior and recommendation

This ensures authorization cache entries never outlive the metadata they
depend on, maintaining cache coherence and preventing authorization bugs.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
maas-controller/pkg/controller/maas/maasauthpolicy_controller.go (1)

355-364: ⚠️ Potential issue | 🟠 Major

Major: delimiter-joined cache keys are still collision-prone (CWE-116).

Line 361 and Line 412 still serialize raw username, groups, and subscription with | / ,. Distinct tuples can collapse to the same key (["a,b","c"] and ["a","b,c"]), and the same group set in a different order misses the cache. A caller with delimiter-bearing identity or header values can therefore reuse another tuple's cached allow/deny. Encode each field and each group element unambiguously, or hash a structured payload. subscription-info uses the same serializer, so fix all three selectors together.

Also applies to: 406-415

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go` around
lines 355 - 364, The cache key construction that concatenates raw username,
groups and subscription into a delimited string (the "selector" value built
using auth.metadata.apiKeyValidation.*, auth.identity.user.*,
request.headers["x-maas-subscription"], and ref.Namespace/ref.Name) is
collision-prone; change it to produce an unambiguous key by either (a) building
a structured payload (e.g., {username:string, groups:sorted[] ,
subscription:string, model:"%s/%s"}) and computing a stable digest (SHA256/hex
or base64) of its JSON representation, or (b) URL/base64-encoding each field and
each group element after sorting groups and joining them, then combining those
encoded parts; apply the same fix to the other selector(s) that serialize
subscription-info metadata so all selectors use the same stable,
collision-resistant encoding and keep r.authzCacheTTL() unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go`:
- Around line 78-87: The TTL getters currently allow negative values to flow
into generated cache.ttl fields; update SetupWithManager in
MaaSAuthPolicyReconciler to validate r.AuthzCacheTTL and r.MetadataCacheTTL and
return an error (reject < 0) before controller startup, and also harden
authzCacheTTL() to defensively clamp negatives to zero (or return the
non-negative minimum) so no negative TTL can be emitted; reference the
SetupWithManager method for initial validation and the authzCacheTTL() function
for the defensive fallback.
- Around line 226-234: The metadata.apiKeyValidation evaluator is currently
unconditional and you are caching the raw Authorization header as the cache key;
constrain the evaluator and cache key to only apply to real API keys by gating
metadata.apiKeyValidation (and the cache entry that uses
request.headers.authorization.replace("Bearer ", "")) with a when clause
matching the API-key pattern (e.g. request.headers.authorization matches
"^Bearer sk-oai-.*") or skip caching when the header does not match; also fix
all cache key constructions that concatenate user-controlled fields
(username|groups, username, groups, etc.) by replacing delimiter-concatenation
with a safe, unambiguous encoding (e.g. JSON-serialize the key parts or use
length-prefixed components) wherever cache keys are built; finally validate and
reject negative TTLs by enforcing AuthzCacheTTL and MetadataCacheTTL >= 0 before
assigning to cache.ttl so negative values cannot propagate into the cache
configuration.

---

Duplicate comments:
In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go`:
- Around line 355-364: The cache key construction that concatenates raw
username, groups and subscription into a delimited string (the "selector" value
built using auth.metadata.apiKeyValidation.*, auth.identity.user.*,
request.headers["x-maas-subscription"], and ref.Namespace/ref.Name) is
collision-prone; change it to produce an unambiguous key by either (a) building
a structured payload (e.g., {username:string, groups:sorted[] ,
subscription:string, model:"%s/%s"}) and computing a stable digest (SHA256/hex
or base64) of its JSON representation, or (b) URL/base64-encoding each field and
each group element after sorting groups and joining them, then combining those
encoded parts; apply the same fix to the other selector(s) that serialize
subscription-info metadata so all selectors use the same stable,
collision-resistant encoding and keep r.authzCacheTTL() unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: b75956a5-9103-4ebd-9f6c-c49968207f8b

📥 Commits

Reviewing files that changed from the base of the PR and between 22529ed and 7e21602.

📒 Files selected for processing (3)
  • docs/content/configuration-and-management/authorino-caching.md
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller.go
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller_test.go
✅ Files skipped from review due to trivial changes (1)
  • docs/content/configuration-and-management/authorino-caching.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller_test.go

jrhyness and others added 4 commits March 26, 2026 11:56
…tion

Add robust cache TTL configuration with validation and conditional
metadata evaluation to improve security and prevent configuration errors.

Changes:
- Add negative TTL validation in SetupWithManager()
  - Reject negative metadata cache TTL at controller startup
  - Reject negative authorization cache TTL at controller startup
  - Log warning when authz TTL exceeds metadata TTL

- Add defensive clamping in authzCacheTTL()
  - Clamp negative values to 0 (defense in depth)
  - Return min(authz, metadata) to prevent stale auth decisions

- Add conditional execution for apiKeyValidation metadata
  - Only run for API key requests (Authorization: Bearer sk-oai-*)
  - Skip for Kubernetes token requests
  - Prevents unnecessary cache pollution and API calls

- Add comprehensive test coverage
  - TestMaaSAuthPolicyReconciler_NegativeTTLRejection
  - Validates positive, zero, and negative TTL handling

- Update documentation
  - Document cache key format and collision risks
  - Explain apiKeyValidation conditional behavior
  - Add operational tuning guidance

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Replace username-based cache keys with database-assigned UUIDs for API
keys to eliminate collision attack vector. This provides cryptographic
collision resistance without requiring CEL base64/hash functions.

Changes:
- maas-api: Set ValidationResult.UserID to database UUID (api_keys.id)
  - Previously used username (user-controllable, collision risk)
  - Now uses immutable database-assigned UUID
  - Updated tests to expect UUID in userID field

- maas-controller: Update cache keys to use userId instead of username
  - subscription-info metadata: userId|groups|subscription|model
  - subscription-valid authz: userId|groups|subscription|model
  - require-group-membership authz: userId|groups|model
  - For API keys: userId = database UUID (collision-resistant)
  - For K8s tokens: userId = validated username (already safe)

- Documentation: Explain UUID-based collision resistance
  - API keys: Database UUIDs eliminate user-controlled collisions
  - K8s tokens: Validated usernames (DNS-1123 format, no delimiters)
  - Remaining edge case: Group ordering (low impact)

Security Impact:
- Eliminates cache poisoning attacks via crafted usernames
- User cannot create collision by registering username like "alice|groups"
- UUID uniqueness guarantees isolation even if groups contain delimiters

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Update cache key isolation tests to explicitly verify that:
- subscription-info cache keys include userId for API keys
- subscription-valid cache keys include userId for API keys
- Both still support username fallback for K8s tokens

This ensures tests validate the UUID-based collision resistance
added in commit 1e16400 instead of just checking for "username"
substring which passes even when using userId.

Changes:
- TestMaaSAuthPolicyReconciler_CacheKeyIsolation:
  - subscription-info test: Check for both userId AND username
  - subscription-valid test: Check for both userId AND username
  - Verifies API key path (userId) and K8s token path (username)

Rationale:
Previous tests checked `contains(cacheKey, "username")` which passed
because the CEL expression contains "username" in the K8s fallback
path, but didn't verify we're using userId for API keys.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix K8s token auth failures by using has() instead of checking .valid
before accessing auth.metadata.apiKeyValidation fields. When apiKeyValidation
doesn't run (for K8s tokens due to the when clause), accessing .valid throws
a CEL error before the ternary can evaluate the fallback path.

The issue: K8s tokens were returning empty model lists because cache key
CEL expressions failed to evaluate, preventing subscription-info metadata
from being called properly.

Changes:
- All cache keys: has(auth.metadata.apiKeyValidation) instead of .valid == true
- subscription-info HTTP body: same has() check
- Response headers (X-MaaS-*): same has() check

Pattern change:
  BEFORE: auth.metadata.apiKeyValidation.valid == true ? ...userId : ...username
  AFTER:  has(auth.metadata.apiKeyValidation) ? ...userId : ...username

This matches the pattern already used in auth-valid cache key which works
correctly for both API keys and K8s tokens.

Fixes: E2E test failures for K8s token /v1/models queries
- test_explicit_subscription_header
- test_user_token_returns_all_models
- test_service_account_token_multiple_subs_*

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@jrhyness jrhyness closed this Mar 26, 2026
@jrhyness jrhyness reopened this Mar 26, 2026
@jrhyness
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
maas-controller/pkg/controller/maas/maasauthpolicy_controller.go (1)

283-285: ⚠️ Potential issue | 🟠 Major

Major: these selectors still serialize cache keys ambiguously (CWE-116).

Raw | concatenation plus .join(",") is not injective. For the same userId, requestedSubscription, and model, groups ["team,admin"] and ["team","admin"] collapse to the same subscription-info / subscription-valid / require-group-membership key, so a cached allow/deny can be replayed for a different membership tuple. x-maas-subscription is also concatenated raw. Encode each user-controlled component unambiguously (length-prefix, JSON, or a stable hash of a structured payload) before joining; the new comments currently overstate the collision resistance.

As per coding guidelines, **: REVIEW PRIORITIES: Security vulnerabilities.

Also applies to: 383-385, 435-437

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go` around
lines 283 - 285, The selector string construction that concatenates
user-controlled values with raw "|" and .join(",") (the map key for "selector"
near the selector = fmt.Sprintf(...) usage, and similar constructions at the
other noted occurrences) produces ambiguous cache keys and must be replaced with
an unambiguous encoding: build a structured payload of the three user-controlled
components (userId, groups array, subscription/x-maas-subscription) and
serialize it deterministically (e.g., JSON with stable ordering, or
length-prefixed components, or a stable hash like SHA256 of the JSON) before
embedding into the selector; update the code that currently formats the selector
string (the fmt.Sprintf(...) call that also uses ref.Namespace/ref.Name and
r.MetadataCacheTTL) to use the chosen encoding so groups like ["team,admin"] vs
["team","admin"] and headers cannot collide—apply the same change to the other
occurrences mentioned (the blocks around the other selector constructions).
🧹 Nitpick comments (1)
maas-api/internal/api_keys/service_test.go (1)

61-61: Enforce UUID identity invariant in assertions to protect cache-key isolation

Lines 61 and 173 only check equality with fixture IDs; they do not verify the UUID-format invariant the security model depends on. Add format assertions and UUID-like fixture IDs so collision-resistance regressions are caught.

Suggested test hardening
@@
-	keyID := "test-key-id"
+	keyID := "11111111-1111-1111-1111-111111111111"
@@
-	assert.Equal(t, keyID, result.UserID) // UserID is the database-assigned key ID (UUID)
+	assert.Equal(t, keyID, result.UserID)
+	assert.Regexp(t, `^[0-9a-fA-F-]{36}$`, result.UserID)
@@
-	keyID := "no-groups-key"
+	keyID := "22222222-2222-2222-2222-222222222222"
@@
-	assert.Equal(t, keyID, result.UserID) // UserID is the database-assigned key ID (UUID)
+	assert.Equal(t, keyID, result.UserID)
+	assert.Regexp(t, `^[0-9a-fA-F-]{36}$`, result.UserID)

As per coding guidelines, **: REVIEW PRIORITIES: 1. Security vulnerabilities (provide severity, exploit scenario, and remediation code).

Also applies to: 173-173

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@maas-api/internal/api_keys/service_test.go` at line 61, The test currently
only checks equality of fixture IDs (assert.Equal(t, keyID, result.UserID)) but
does not assert the UUID format invariant; update the tests (the assertion
around result.UserID and the similar assertion at the other occurrence) to
validate that the value is a well-formed UUID (e.g., using the project's UUID
parser/validator or a regex) and change the fixtures to use UUID-like strings so
the test verifies both equality and UUID format to preserve cache-key collision
resistance; locate and modify the assertions referencing result.UserID and the
other occurrence at the later assertion to add the UUID-format check alongside
the equality check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@maas-api/internal/api_keys/service.go`:
- Line 245: Locate the place in service.go where the struct field is set with
"UserID: metadata.ID" and add strict validation: ensure metadata.ID is non-empty
and a syntactically valid UUID (e.g., use your UUID parsing/validation helper or
uuid.Parse) before assigning; if validation fails, return an error and abort the
operation (fail-closed) instead of proceeding to create cache keys or persist
data. Update the surrounding function (the API key creation/upsert routine that
sets UserID) to perform this check early and propagate a clear error to the
caller so malformed/empty metadata.ID cannot be used for cache keys or
authorization decisions.

In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller_test.go`:
- Around line 546-600: The test
TestMaaSAuthPolicyReconciler_NegativeTTLRejection currently never uses
expectSetupFail so it doesn't assert that SetupWithManager rejects negative
TTLs; update the test to either call the real validation used by
SetupWithManager or move the TTL validation into a shared helper (e.g.,
ValidateCacheTTLs or ensureSetupValid) and call that helper from both
SetupWithManager and this test. Specifically, add a validation function
referenced by MaaSAuthPolicyReconciler (or expose the existing check used in
SetupWithManager) and in the table-driven test assert that for cases with
expectSetupFail==true the validation returns an error (or SetupWithManager
against a lightweight/fake manager returns an error), and for
expectSetupFail==false it succeeds; keep using authzCacheTTL() only for its
clamping behavior but not as the sole validation assertion.

In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go`:
- Around line 463-466: The "X-MaaS-Group" header currently constructs JSON by
string-concatenation in the plain.expression (the `'["' + ... .join('","') +
'"]'` piece), which allows injection and malformed JSON; update the expression
that populates the "X-MaaS-Group" header (the plain.expression value) to emit
properly encoded JSON by applying CEL's JSON formatting/escaping (e.g. use the
`@json/JSON` formatter or an equivalent to serialize the groups array) to the
actual groups array expression ((has(auth.metadata) &&
has(auth.metadata.apiKeyValidation)) ? auth.metadata.apiKeyValidation.groups :
auth.identity.user.groups) rather than joining strings, so empty arrays
serialize as [] and special characters are escaped.

---

Duplicate comments:
In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go`:
- Around line 283-285: The selector string construction that concatenates
user-controlled values with raw "|" and .join(",") (the map key for "selector"
near the selector = fmt.Sprintf(...) usage, and similar constructions at the
other noted occurrences) produces ambiguous cache keys and must be replaced with
an unambiguous encoding: build a structured payload of the three user-controlled
components (userId, groups array, subscription/x-maas-subscription) and
serialize it deterministically (e.g., JSON with stable ordering, or
length-prefixed components, or a stable hash like SHA256 of the JSON) before
embedding into the selector; update the code that currently formats the selector
string (the fmt.Sprintf(...) call that also uses ref.Namespace/ref.Name and
r.MetadataCacheTTL) to use the chosen encoding so groups like ["team,admin"] vs
["team","admin"] and headers cannot collide—apply the same change to the other
occurrences mentioned (the blocks around the other selector constructions).

---

Nitpick comments:
In `@maas-api/internal/api_keys/service_test.go`:
- Line 61: The test currently only checks equality of fixture IDs
(assert.Equal(t, keyID, result.UserID)) but does not assert the UUID format
invariant; update the tests (the assertion around result.UserID and the similar
assertion at the other occurrence) to validate that the value is a well-formed
UUID (e.g., using the project's UUID parser/validator or a regex) and change the
fixtures to use UUID-like strings so the test verifies both equality and UUID
format to preserve cache-key collision resistance; locate and modify the
assertions referencing result.UserID and the other occurrence at the later
assertion to add the UUID-format check alongside the equality check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 59dbbaf9-94f8-418b-a6b1-0099ba6a1924

📥 Commits

Reviewing files that changed from the base of the PR and between 7e21602 and f637c75.

📒 Files selected for processing (5)
  • docs/content/configuration-and-management/authorino-caching.md
  • maas-api/internal/api_keys/service.go
  • maas-api/internal/api_keys/service_test.go
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller.go
  • maas-controller/pkg/controller/maas/maasauthpolicy_controller_test.go

Comment on lines 463 to 466
"X-MaaS-Group": map[string]interface{}{
"plain": map[string]interface{}{
"expression": `'["' + (auth.metadata.apiKeyValidation.valid == true ? auth.metadata.apiKeyValidation.groups : auth.identity.user.groups).join('","') + '"]'`,
"expression": `'["' + ((has(auth.metadata) && has(auth.metadata.apiKeyValidation)) ? auth.metadata.apiKeyValidation.groups : auth.identity.user.groups).join('","') + '"]'`,
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the file and verify it exists
find . -type f -name "maasauthpolicy_controller.go" 2>/dev/null

Repository: opendatahub-io/models-as-a-service

Length of output: 144


🏁 Script executed:

# Read the specific lines mentioned in the review (463-466) with context
if [ -f "maas-controller/pkg/controller/maas/maasauthpolicy_controller.go" ]; then
  echo "=== Lines 450-480 for context ==="
  sed -n '450,480p' maas-controller/pkg/controller/maas/maasauthpolicy_controller.go | cat -n
fi

Repository: opendatahub-io/models-as-a-service

Length of output: 1814


🏁 Script executed:

# Also check if there are other similar patterns in the file that might use hand-rolled JSON
rg -n 'join.*"' maas-controller/pkg/controller/maas/maasauthpolicy_controller.go -A 2 -B 2

Repository: opendatahub-io/models-as-a-service

Length of output: 2651


🏁 Script executed:

# Search for where apiKeyValidation.groups is populated
rg -n "apiKeyValidation.*groups|groups.*=" maas-controller/pkg/controller/maas/maasauthpolicy_controller.go | head -20

Repository: opendatahub-io/models-as-a-service

Length of output: 3160


🏁 Script executed:

# Check other files in the controller directory that might set these values
fd . maas-controller/pkg/controller/maas -type f -name "*.go" | head -10

Repository: opendatahub-io/models-as-a-service

Length of output: 248


🏁 Script executed:

# Look for any comment or code that validates group names
rg -i "group.*validat|sanitize.*group|escape.*group" maas-controller/pkg/controller/maas/ -A 2 -B 2

Repository: opendatahub-io/models-as-a-service

Length of output: 6164


🏁 Script executed:

# Search for where groups come from - look for struct definitions or auth setup
rg -n "type.*Auth|apiKeyValidation" maas-controller/pkg/controller/maas/maasauthpolicy_controller.go -A 5 | head -30

Repository: opendatahub-io/models-as-a-service

Length of output: 1467


🏁 Script executed:

# Check the response body at line 268 to see if groups are encoded safely
sed -n '260,280p' maas-controller/pkg/controller/maas/maasauthpolicy_controller.go | cat -n

Repository: opendatahub-io/models-as-a-service

Length of output: 1697


🏁 Script executed:

# Look for CEL JSON encoding functions available
rg -n "json|JSON" maas-controller/pkg/controller/maas/maasauthpolicy_controller.go | head -15

Repository: opendatahub-io/models-as-a-service

Length of output: 557


Major: X-MaaS-Group header is hand-rolled JSON (CWE-116).

Line 465 uses '["' + groups.join('","') + '"]' to construct JSON. This lacks proper encoding: empty groups becomes [""] (invalid), and a group value like foo","admin injects literal quotes, adding unintended array elements. Downstream JSON parsers can be tricked into accepting forged group memberships. Use proper JSON encoding (CEL's @json formatter or string escaping) instead of string concatenation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@maas-controller/pkg/controller/maas/maasauthpolicy_controller.go` around
lines 463 - 466, The "X-MaaS-Group" header currently constructs JSON by
string-concatenation in the plain.expression (the `'["' + ... .join('","') +
'"]'` piece), which allows injection and malformed JSON; update the expression
that populates the "X-MaaS-Group" header (the plain.expression value) to emit
properly encoded JSON by applying CEL's JSON formatting/escaping (e.g. use the
`@json/JSON` formatter or an equivalent to serialize the groups array) to the
actual groups array expression ((has(auth.metadata) &&
has(auth.metadata.apiKeyValidation)) ? auth.metadata.apiKeyValidation.groups :
auth.identity.user.groups) rather than joining strings, so empty arrays
serialize as [] and special characters are escaped.

jrhyness and others added 2 commits March 26, 2026 14:19
Three improvements based on CodeRabbit review:

1. UUID Validation (Defense-in-Depth):
   - Add strict UUID validation in ValidateAPIKey before using metadata.ID
   - Fail-closed: return error instead of proceeding with malformed IDs
   - Prevents malformed IDs from being used in cache keys or authz decisions
   - Updated tests to use realistic UUIDs instead of human-readable IDs

2. Negative TTL Test Coverage:
   - Extract ValidateCacheTTLs() helper function from SetupWithManager
   - Update TestMaaSAuthPolicyReconciler_NegativeTTLRejection to actually
     test the validation logic used by SetupWithManager
   - Test now verifies expectSetupFail cases properly reject negative TTLs

3. X-MaaS-Group JSON Construction (Injection Prevention):
   - Replace string concatenation ('["' + ... + '"]') with CEL string() conversion
   - Properly escapes special characters in group names
   - Prevents JSON injection if group names contain quotes or brackets
   - Empty arrays now serialize as [] instead of potentially malformed strings

All tests pass after changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…p header

Replace string concatenation with proper JSON serialization in the static
maas-api AuthPolicy to prevent JSON injection vulnerabilities.

Before (vulnerable to JSON injection):
  expression: '''["'' + auth.metadata.apiKeyValidation.groups.join(''","'') + ''"]'''

After (secure):
  expression: string(auth.metadata.apiKeyValidation.groups)

The string() function in CEL properly serializes arrays to JSON with proper
escaping, preventing injection of malformed JSON or special characters.

This matches the fix already applied to controller-generated AuthPolicies
in commit 00db174, but the static maas-api AuthPolicy was missed.

Addresses CodeRabbit security finding (lines 463-466 comment).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deployment/base/maas-api/policies/auth-policy.yaml (1)

70-76: ⚠️ Potential issue | 🔴 Critical

Critical: CEL string() expression produces invalid JSON for downstream parser

The expression on line 75 uses string(auth.metadata.apiKeyValidation.groups). CEL's string() function serializes arrays as Go-style representation (e.g., [group1 group2]), not valid JSON.

The downstream consumer at maas-api/internal/token/handler.go:40 explicitly calls json.Unmarshal([]byte(header), &groups), which strictly requires valid JSON format per its documentation: ["group1","group2","group3"]. The parser will fail with:

failed to parse header as JSON array: invalid character 'g' looking for beginning of value

This breaks API-key authentication entirely. The code at line 82 correctly uses @tostr transformer for the OpenShift groups path; apply the same approach here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployment/base/maas-api/policies/auth-policy.yaml` around lines 70 - 76, The
X-MaaS-Group header currently serializes groups with
string(auth.metadata.apiKeyValidation.groups) which emits Go-style arrays and
breaks downstream json.Unmarshal; change the plain expression in the
X-MaaS-Group block to use the CEL to-string transformer (same approach as the
OpenShift groups entry) so the value is emitted as valid JSON (e.g.,
auth.metadata.apiKeyValidation.groups@tostr or equivalent transformer syntax
used elsewhere), leaving header name and priority unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@deployment/base/maas-api/policies/auth-policy.yaml`:
- Around line 70-76: The X-MaaS-Group header currently serializes groups with
string(auth.metadata.apiKeyValidation.groups) which emits Go-style arrays and
breaks downstream json.Unmarshal; change the plain expression in the
X-MaaS-Group block to use the CEL to-string transformer (same approach as the
OpenShift groups entry) so the value is emitted as valid JSON (e.g.,
auth.metadata.apiKeyValidation.groups@tostr or equivalent transformer syntax
used elsewhere), leaving header name and priority unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: e9835b6e-1618-433a-8bc1-0bd7421c777d

📥 Commits

Reviewing files that changed from the base of the PR and between 00db174 and b558679.

📒 Files selected for processing (1)
  • deployment/base/maas-api/policies/auth-policy.yaml

@jrhyness
Copy link
Copy Markdown
Contributor Author

/retest

…JSON

CodeRabbit finding: string() produces Go-style arrays like [group1 group2]
instead of valid JSON. Use @tostr transformer (same as OpenShift groups path)
to emit proper JSON format ["group1","group2"] for json.Unmarshal.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@jrhyness
Copy link
Copy Markdown
Contributor Author

@coderabbitai The cache key collision concerns are partially mitigated by using database UUIDs for API keys (commit 1e16400), which eliminates the primary attack vector of user-controlled username collisions. The residual risk with delimiter-containing groups is documented in the Authorino caching guide and accepted as a trade-off given CEL's lack of built-in encoding functions (base64/hash/URL encoding).

The current implementation:

  • ✅ API keys: Use database UUIDs (not user-controllable, collision-resistant)
  • ✅ K8s tokens: Validated usernames (DNS-1123 format, no delimiters allowed)
  • ⚠️ Groups: Delimiter-based joining (documented residual risk)

This approach prioritizes the most severe attack vector (username manipulation) while accepting lower-impact risks in real-world OIDC/LDAP environments where group names rarely contain delimiter characters.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

Tip

For best results, initiate chat on the files or code changes.

@jrhyness: Acknowledged — the layered mitigation makes sense given CEL's constraints.

The threat model decomposition is sound:

  • API keys → UUID: Removes the highest-severity vector entirely. A database-assigned UUID is out of the attacker's control, so no crafted userId|groups|... string can produce a collision.
  • K8s tokens → DNS-1123: The format constraint ([a-z0-9-.], max 63 chars per label) structurally excludes |, \n, and other common delimiter characters, so delimiter injection is not possible at the identity layer.
  • Groups → documented residual: The remaining risk requires a group name that contains the chosen delimiter and is present in a target user's group membership — a combination that is unlikely in well-managed OIDC/LDAP directories and is now explicitly documented.

One low-cost hardening option worth noting for the future: if Authorino ever exposes ext_authz-style header injection or a custom OPA function, a simple hash over the concatenated key components would eliminate the residual risk without CEL changes. Not blocking for this PR.


📝 Learnings were identified and not saved (knowledge base disabled). Enable

@jrhyness jrhyness added 3.4 ready-for-review All changes are complete and CI checks have passed. labels Mar 26, 2026
Resolved conflicts by keeping jr_55467's security improvements:
- has() safety checks in CEL expressions
- userId-based cache keys (collision-resistant)
- Conditional apiKeyValidation with when clause
- string() function for proper JSON group serialization
@jrhyness
Copy link
Copy Markdown
Contributor Author

/retest

1 similar comment
@jrhyness
Copy link
Copy Markdown
Contributor Author

/retest

Copy link
Copy Markdown
Contributor

@jland-redhat jland-redhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple questions

svc, store := createTestService(t)

// Create a valid API key
keyID := "last-used-key"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we change these? I do think it made it a little more obvious what they were for the old way.

}, nil
}

// Validate metadata.ID is a syntactically valid UUID (fail-closed defense-in-depth)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little confused as to why changes are being made in MaaS API at all.

The cacheing should be purely through Kudarant so on the AuthPolicies right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Rabbit was concerned about how the cache key was originally created - using the user name - which could be manipulated to gain unauthorized access. A mitigation was to use the user id , but that was also the user name. Changing it to a uuid let to these maas-api changes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai, do you want to elaborate?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrhyness — happy to elaborate for jland-redhat's benefit.

Why maas-api changes are needed even though caching lives in Kuadrant

The Authorino apiKeyValidation metadata evaluator calls the maas-api /api-keys/validate endpoint and stores the JSON response. Fields from that response are then used by downstream Authorino evaluators (auth-valid, subscription-valid, require-group-membership) — both in their CEL expressions and in their cache.key.selector strings.

The chain looks like this:

Bearer sk-oai-… → Authorino apiKeyValidation → HTTP call → maas-api /api-keys/validate
                                                           ↓
                                             ValidationResult JSON (UserID, Username, Groups…)
                                                           ↓
                               Authorino evaluator cache key selectors consume these fields

Original security problem (CWE-706 / cache key collision)

If UserID was set to metadata.Username — a value that is chosen by the user at API-key creation time — an attacker could register a username that collides with another user's identity string. Because Authorino's cache is keyed on that value, a crafted username could cause a cache hit against a different user's cached authorization result, potentially returning a valid authorization decision for a model or subscription the attacker shouldn't have access to.

The fix

Changing UserID to metadata.ID (the database-assigned UUID) breaks that attack:

Property metadata.Username metadata.ID (UUID)
User-controlled ✅ Yes ❌ No
Unique guarantee Weak (app-level) Strong (DB uuid_generate_v4())
Collision-resistant No Yes (2¹²² space)
Safe in cache key selector No Yes

The UUID validation (uuid.Parse) added in the same diff is defense-in-depth (CWE-20): it ensures that even if a database integrity issue somehow produced a malformed ID, a non-UUID string would never silently propagate into Authorino's cache key selectors where it could cause unpredictable key collisions or CEL expression errors.

Summary for jland-redhat

The caching mechanism is purely Kuadrant/Authorino, but the data those cache keys are built from comes from the maas-api validation response. Making the cache keys secure required making the source data (the UserID field) tamper-resistant — which required the small change in maas-api.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting...

Not an edge case I would have thought of

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@jland-redhat
Copy link
Copy Markdown
Contributor

/retest

@jland-redhat
Copy link
Copy Markdown
Contributor

/override ci/prow/e2e-test-maas

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 29, 2026

@jland-redhat: Overrode contexts on behalf of jland-redhat: ci/prow/e2e-test-maas

Details

In response to this:

/override ci/prow/e2e-test-maas

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

jland-redhat
jland-redhat previously approved these changes Mar 29, 2026
@jland-redhat
Copy link
Copy Markdown
Contributor

/retest

@jrhyness
Copy link
Copy Markdown
Contributor Author

/retest

1 similar comment
@jland-redhat
Copy link
Copy Markdown
Contributor

/retest

@jland-redhat
Copy link
Copy Markdown
Contributor

/override ci/prow/e2e-test-maas

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 30, 2026

@jland-redhat: Overrode contexts on behalf of jland-redhat: ci/prow/e2e-test-maas

Details

In response to this:

/override ci/prow/e2e-test-maas

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Copy Markdown
Contributor

@jland-redhat jland-redhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@openshift-ci openshift-ci bot added the lgtm label Mar 30, 2026
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 30, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jland-redhat, jrhyness

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 30, 2026

@jrhyness: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-test-maas 97865c6 link false /test e2e-test-maas

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 4f06bcf into opendatahub-io:main Mar 30, 2026
16 of 17 checks passed
mynhardtburger added a commit to mynhardtburger/models-as-a-service that referenced this pull request Apr 2, 2026
…d of username

The cache key CEL expression was changed from apiKeyValidation.username
to apiKeyValidation.userId in PR opendatahub-io#621, but the test assertion was not
updated to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Mynhardt Burger <mynhardt@gmail.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3.4 approved lgtm ready-for-review All changes are complete and CI checks have passed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants