feat(auth,prompts,inference): multi-tenancy MVP for MaaS deployments by franciscojavierarceo · Pull Request #5614 · ogx-ai/ogx

franciscojavierarceo · 2026-04-24T13:30:24Z

Summary

Implements Phase 1 multi-tenancy support for MaaS, llm-d, and vLLM deployments.

attribute_headers on UpstreamHeaderAuthConfig — maps multiple HTTP headers to attribute categories (e.g., X-MaaS-Group → teams, X-MaaS-Subscription → namespaces). Values merge with the existing attributes_header field. Enables MaaS Authorino integration where identity is spread across multiple upstream headers.
Prompts migrated from KVStore to AuthorizedSqlStore — prompts now have row-level access control via owner_principal and access_attributes, matching the pattern used by conversations, responses, and other stateful resources. Breaking change: existing KV-stored prompts must be recreated.
fairness_header_attribute on vLLM config — injects x-gateway-inference-fairness-id on outgoing API calls from the authenticated user's attributes. Used by llm-d EPP Flow Control for per-tenant fair scheduling. Implemented as a _get_extra_request_headers() hook on OpenAIMixin so the pattern is reusable by other providers.
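To make the `attribute_headers` behaviour concrete, here is a minimal sketch of how several upstream headers could be folded into attribute categories and merged with values from the existing `attributes_header`. The PR text only states that the values merge. The helper name `build_attributes`, the `X-User-Attributes` header name, the JSON encoding of the `attributes_header` payload, and the comma-separated format of the mapped headers are all assumptions made for illustration, not the actual implementation in `core/server/auth_providers.py`.

```python
import json


def build_attributes(headers, attribute_headers, attributes_header=None):
    """Hypothetical helper: collect user attributes from upstream headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    merged: dict[str, list[str]] = {}

    def add(category, values):
        # Append values to a category, skipping empties and duplicates.
        bucket = merged.setdefault(category, [])
        for value in values:
            if value and value not in bucket:
                bucket.append(value)

    # Assumption: the existing attributes_header carries a JSON object
    # mapping category -> list of values.
    if attributes_header:
        raw = lowered.get(attributes_header.lower())
        if raw:
            for category, values in json.loads(raw).items():
                add(category, values if isinstance(values, list) else [values])

    # Assumption: each mapped header holds a comma-separated list of values.
    for header_name, category in attribute_headers.items():
        raw = lowered.get(header_name.lower())
        if raw:
            add(category, [part.strip() for part in raw.split(",")])

    return merged


attrs = build_attributes(
    headers={
        "X-MaaS-Group": "ml-team, research",
        "X-MaaS-Subscription": "premium",
        "X-User-Attributes": '{"teams": ["ml-team"], "roles": ["user"]}',
    },
    attribute_headers={"X-MaaS-Group": "teams", "X-MaaS-Subscription": "namespaces"},
    attributes_header="X-User-Attributes",
)
print(attrs)
```

In this sketch the `teams` category ends up with values from both sources, which is one plausible reading of "values merge"; the real merge semantics (ordering, de-duplication, precedence) may differ.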

Files changed (12 files, +486 / -133)

| Area | Files |
| --- | --- |
| Auth | `core/datatypes.py`, `core/server/auth_providers.py` |
| Prompts | `core/prompts/prompts.py`, `core/storage/datatypes.py`, `core/stack.py` |
| Inference | `providers/remote/inference/vllm/config.py`, `providers/remote/inference/vllm/vllm.py`, `providers/utils/inference/openai_mixin.py` |
| Docs | `docs/docs/providers/inference/remote_vllm.mdx` (auto-generated) |
| Tests | `tests/unit/server/test_auth_upstream_header.py`, `tests/unit/providers/inference/test_remote_vllm.py`, `tests/unit/prompts/prompts/conftest.py` |

Design decisions

_get_extra_request_headers() hook vs. direct injection — the RFC suggested injecting the fairness header directly in the vLLM adapter. Instead, this PR adds a hook on OpenAIMixin that any OpenAI-compatible provider can override, using the SDK's extra_headers kwarg on create() calls.
Agent state KV key prefixing skipped — persistence_store in providers/inline/responses/builtin/impl.py is dead code (initialized, but never read or written). Actual state uses ResponsesStore, which is already backed by AuthorizedSqlStore.
set_default_version crash safety — new default is set before clearing old defaults, so a crash mid-operation leaves two defaults (recoverable) rather than zero (data loss).
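The hook-versus-direct-injection decision above can be illustrated with a simplified sketch. It does not use the real OpenAI SDK or the real `OpenAIMixin`: `FakeCompletions`, `SketchOpenAIMixin`, `SketchVLLMAdapter`, the `chat()` method, and the way user attributes are passed in are hypothetical stand-ins. Only the hook name `_get_extra_request_headers`, the `extra_headers` kwarg, the `fairness_header_attribute` setting, and the `x-gateway-inference-fairness-id` header come from the PR description. Using the first attribute value as the fairness ID is also an assumption.

```python
class FakeCompletions:
    """Stand-in for an SDK resource; records what create() receives."""

    def __init__(self):
        self.last_kwargs = None

    def create(self, **kwargs):
        self.last_kwargs = kwargs
        return {"ok": True}


class SketchOpenAIMixin:
    """Hypothetical, simplified stand-in for OpenAIMixin."""

    def __init__(self, completions):
        self.completions = completions

    def _get_extra_request_headers(self, user_attributes):
        # Default hook: no extra headers. Providers may override this.
        return {}

    def chat(self, user_attributes, **params):
        extra = self._get_extra_request_headers(user_attributes)
        if extra:
            # Forward provider-specific headers via the extra_headers kwarg.
            params["extra_headers"] = extra
        return self.completions.create(**params)


class SketchVLLMAdapter(SketchOpenAIMixin):
    """Hypothetical vLLM-style provider that overrides the hook."""

    def __init__(self, completions, fairness_header_attribute=None):
        super().__init__(completions)
        self.fairness_header_attribute = fairness_header_attribute

    def _get_extra_request_headers(self, user_attributes):
        if not self.fairness_header_attribute:
            return {}
        values = (user_attributes or {}).get(self.fairness_header_attribute) or []
        if not values:
            return {}
        # Assumption: the first value of the attribute is the fairness ID.
        return {"x-gateway-inference-fairness-id": values[0]}


completions = FakeCompletions()
adapter = SketchVLLMAdapter(completions, fairness_header_attribute="namespaces")
adapter.chat({"namespaces": ["tenant-a"]}, model="m", messages=[])
print(completions.last_kwargs["extra_headers"])
```

The point of the design is that the base class owns the call path while each provider only decides which headers to contribute, so another OpenAI-compatible provider could add its own headers without touching the request code.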
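The ordering argument for `set_default_version` can be shown with a small in-memory sketch. The real code operates on rows in AuthorizedSqlStore; here a list of dicts stands in for those rows, and the field names (`prompt_id`, `version`, `is_default`), the `crash_after_set` flag, and the `defaults` helper are hypothetical and exist only to simulate a failure between the two steps.

```python
def set_default_version(rows, prompt_id, version, crash_after_set=False):
    """Mark one version as default; `rows` is a list of dicts standing in for SQL rows."""
    target = next(
        (r for r in rows if r["prompt_id"] == prompt_id and r["version"] == version),
        None,
    )
    if target is None:
        raise ValueError(f"prompt {prompt_id} has no version {version}")

    # Step 1: set the new default first.
    target["is_default"] = True
    if crash_after_set:
        raise RuntimeError("simulated crash between the two steps")

    # Step 2: clear every other default for this prompt.
    for r in rows:
        if r["prompt_id"] == prompt_id and r is not target:
            r["is_default"] = False


def defaults(rows, prompt_id):
    """Return the versions currently flagged as default for a prompt."""
    return sorted(
        r["version"] for r in rows if r["prompt_id"] == prompt_id and r["is_default"]
    )


rows = [
    {"prompt_id": "p1", "version": 1, "is_default": True},
    {"prompt_id": "p1", "version": 2, "is_default": False},
]

try:
    set_default_version(rows, "p1", 2, crash_after_set=True)
except RuntimeError:
    pass
after_crash = defaults(rows, "p1")  # both versions flagged, never zero

set_default_version(rows, "p1", 2)  # retrying completes the operation
after_retry = defaults(rows, "p1")
print(after_crash, after_retry)
```

With the opposite order (clear first, then set), the same simulated crash would leave the prompt with no default at all, which is the state the PR avoids.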

Known limitations (Phase 2)

The following resources still use plain KVStore with no tenant isolation:

| Resource | Risk | Notes |
| --- | --- | --- |
| Connectors | High | May store auth credentials; migrate to AuthorizedSqlStore |
| Batch state | High | Contains job results with user data |
| Vector store metadata | High | All vector_io providers (FAISS, Chroma, Qdrant, etc.) |
| Distribution registry | Medium | Leaks provider info; arguably admin-only |
| Quota tracking | Low | Per-client rate limiting; should not be tenant-scoped |

Test plan

Unit tests: uv run pytest tests/unit/ -x --tb=short (56 new + existing tests pass)
Pre-commit hooks: uv run pre-commit run --all-files
Integration tests (replay): uv run --no-sync ./scripts/integration-tests.sh --stack-config server:ci-tests --setup gpt --suite responses
Verify attribute_headers with MaaS Authorino headers
Verify fairness_header_attribute sends header to llm-d EPP

Generated with Claude Code

…oyments

Add multi-header identity mapping for upstream gateway auth (attribute_headers), migrate prompts from KVStore to AuthorizedSqlStore for tenant-scoped access control, and add llm-d fairness header propagation through a per-request header hook in OpenAIMixin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

…rness header tests

Reorder set_default_version to set the new default before clearing old ones, preventing a crash from leaving zero defaults. Add unit tests for the vLLM fairness header injection via _get_extra_request_headers covering all code paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

mergify · 2026-04-24T18:07:21Z

This pull request has merge conflicts that must be resolved before it can be merged. @franciscojavierarceo please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

franciscojavierarceo and others added 2 commits April 23, 2026 23:02

mergify Bot added the needs-rebase label Apr 24, 2026

franciscojavierarceo wants to merge 2 commits into ogx-ai:main from franciscojavierarceo:worktree-multi-tenancy-mvp
