145 changes: 145 additions & 0 deletions docs/my-website/docs/proxy/model_access.md
@@ -113,6 +113,151 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \

### [API Reference](https://litellm-api.up.railway.app/#/team%20management/new_team_team_new_post)

## **Per-Member Model Overrides (Team-Scoped Defaults)**

:::info

Requires `TEAM_MODEL_OVERRIDES=true` environment variable or `litellm.team_model_overrides_enabled = True`.

:::

By default, every team member can access all models in `team.models`. With per-member model overrides, you can:

- Set **`default_models`** on a team — the models every member gets by default
- Set **`models`** on individual team members — additional models only they can access

A member's **effective models** = `default_models` ∪ `member.models`. If neither is set, falls back to `team.models` (full backward compatibility).

### Enable the Feature

Add to your `config.yaml`:

```yaml
environment_variables:
  TEAM_MODEL_OVERRIDES: "true"
```

### 1. Create a Team with Default Models

```shell
curl -L 'http://localhost:4000/team/new' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
"team_alias": "engineering",
"models": ["gpt-4", "gpt-4o-mini", "gpt-4o"],
"default_models": ["gpt-4o-mini"]
}'
```

Comment on lines +144 to +150

P1 access_group_ids silently bypasses per-member model restrictions

The docs describe the feature as providing fine-grained per-member model access control. However, can_team_access_model in auth_checks.py falls back to the team's access_group_ids after the per-member effective-models check fails, without any restriction from the user's effective_models set. The code comment acknowledges this:

access groups are a team-level concept and are NOT restricted by per-member model overrides. If a team has access_group_ids configured, any member can access models from those groups regardless of their effective_models set.

This means:

  • An admin grants User A only ["gpt-4o-mini"] via per-member overrides
  • The team also has access_group_ids = ["premium-models"] which includes ["gpt-4", "claude-3"]
  • User A can access gpt-4 and claude-3 through the access-group bypass even though they're not in their effective model set

The docs should explicitly call this out so admins understand that access_group_ids is an additive team-wide grant that overrides per-member restrictions. Without this note, an admin who configures per-member restrictions for compliance or cost-control may unknowingly leave a bypass path open.
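The bypass described in the comment above can be illustrated with a minimal sketch. It assumes a simplified in-memory model: `ACCESS_GROUPS` and `member_can_call` are hypothetical names, not the actual `auth_checks.py` code, and the real check also handles aliases, wildcards, and the "empty list means allow all" convention, which are left out here.

```python
from typing import Dict, List

# Hypothetical access-group registry: group id -> models the group grants.
ACCESS_GROUPS: Dict[str, List[str]] = {"premium-models": ["gpt-4", "claude-3"]}


def member_can_call(
    model: str,
    effective_models: List[str],
    team_access_group_ids: List[str],
) -> bool:
    """Two-step check: the member's effective models first, then access groups."""
    if model in effective_models:
        return True
    # The fallback is team-wide: it never consults the member's effective set.
    return any(model in ACCESS_GROUPS.get(g, []) for g in team_access_group_ids)


# User A is restricted to gpt-4o-mini by per-member overrides.
user_a_models = ["gpt-4o-mini"]
print(member_can_call("gpt-4", user_a_models, ["premium-models"]))  # True
print(member_can_call("gpt-4", user_a_models, []))  # False
```

With the access group attached to the team, the restricted member still reaches `gpt-4`; without it, the request is denied.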

Comment on lines +144 to +152

P2 Docs omit service-key behaviour and capping semantics

The docs state:

A member's effective models = default_models ∪ member.models. If neither is set, falls back to team.models.

Two behaviours are not documented:

  1. Capping: the union is further intersected with team.models (the team pool). If default_models = ["gpt-4"] but team.models = ["gpt-4o-mini"], the effective set is [] → falls back to ["gpt-4o-mini"].
  2. Service/bot keys (team-level keys with no user_id): current code restricts them to default_models at runtime (see inline comment on get_effective_team_models). The docs claim "Zero extra database queries on the auth hot path" and "full backward compatibility", which is only accurate for keys that are user-scoped. Service key users enabling this feature will experience a silent regression (see related comment on auth_checks.py).
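The capping behaviour in point 1 can be reproduced with a standalone restatement of `compute_effective_models` from this PR. The body below follows the diff, except that the membership test uses a precomputed set; treat it as a sketch for checking the arithmetic rather than the code that ships.

```python
from typing import List


def compute_effective_models(
    team_defaults: List[str],
    member_models: List[str],
    team_pool: List[str],
) -> List[str]:
    # Ordered, de-duplicated union of defaults and per-member overrides.
    effective = list(dict.fromkeys(team_defaults + member_models))
    if not effective:
        return team_pool
    if team_pool:
        pool = set(team_pool)
        effective = [m for m in effective if m in pool]
        if not effective:
            # Everything was capped away: fall back to the team pool.
            return team_pool
    return effective


# default_models is entirely outside the pool, so the pool is returned.
print(compute_effective_models(["gpt-4"], [], ["gpt-4o-mini"]))
# ['gpt-4o-mini']

# Normal case: union of default and override, both inside the pool.
print(compute_effective_models(["gpt-4o-mini"], ["gpt-4o"], ["gpt-4", "gpt-4o-mini", "gpt-4o"]))
# ['gpt-4o-mini', 'gpt-4o']
```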

- `models` — the full pool of models the team is allowed to use
- `default_models` — the subset every member gets by default (must be a subset of `models`)

### 2. Add Members with Per-User Overrides

```shell
# Alice gets the default (gpt-4o-mini only)
curl -L 'http://localhost:4000/team/member_add' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "<team-id>",
"member": {"role": "user", "user_id": "alice"}
}'

# Bob gets gpt-4o in addition to the default
curl -L 'http://localhost:4000/team/member_add' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "<team-id>",
"member": {"role": "user", "user_id": "bob", "models": ["gpt-4o"]}
}'
```

| Member | Override | Effective Models |
|--------|----------|-----------------|
| Alice | none | `["gpt-4o-mini"]` |
| Bob | `["gpt-4o"]` | `["gpt-4o-mini", "gpt-4o"]` |
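The table can be checked by hand with a few lines of Python. This is only an illustration of the `default_models ∪ member.models` rule using the values from steps 1 and 2; the variable names are made up for the example and the proxy does not expose them.

```python
# Values taken from the team created in step 1 and the members added in step 2.
team_models = ["gpt-4", "gpt-4o-mini", "gpt-4o"]
default_models = ["gpt-4o-mini"]
member_overrides = {"alice": [], "bob": ["gpt-4o"]}

effective_by_member = {}
for user_id, overrides in member_overrides.items():
    # Ordered union of the team defaults and this member's overrides...
    union = list(dict.fromkeys(default_models + overrides))
    # ...restricted to models that are in the team pool.
    effective_by_member[user_id] = [m for m in union if m in team_models]

for user_id, models in effective_by_member.items():
    print(user_id, models)
# alice ['gpt-4o-mini']
# bob ['gpt-4o-mini', 'gpt-4o']
```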

### 3. Generate Keys and Test

```shell
# Generate key for Bob
curl -L 'http://localhost:4000/key/generate' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{"team_id": "<team-id>", "user_id": "bob"}'
```

<Tabs>
<TabItem label="Allowed (Bob → gpt-4o)" value="allowed">

```shell
curl -L 'http://localhost:4000/chat/completions' \
-H 'Authorization: Bearer <bob-key>' \
-H 'Content-Type: application/json' \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

Returns `200 OK` — `gpt-4o` is in Bob's effective set.

</TabItem>
<TabItem label="Denied (Bob → gpt-4)" value="denied">

```shell
curl -L 'http://localhost:4000/chat/completions' \
-H 'Authorization: Bearer <bob-key>' \
-H 'Content-Type: application/json' \
-d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```

Returns `401 Unauthorized` — `gpt-4` is in the team pool but not in Bob's effective set.

</TabItem>
</Tabs>

### 4. Update Member Overrides

```shell
# Add gpt-4 to Bob's overrides
curl -L 'http://localhost:4000/team/member_update' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "<team-id>",
"user_id": "bob",
"models": ["gpt-4o", "gpt-4"]
}'

# Remove all overrides (Bob falls back to default_models only)
curl -L 'http://localhost:4000/team/member_update' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
"team_id": "<team-id>",
"user_id": "bob",
"models": []
}'
```

### Validation Rules

| Rule | Error |
|------|-------|
| `default_models` must be a subset of `team.models` | `400` on `/team/new` and `/team/update` |
| Member `models` must be a subset of `team.models` | `400` on `/team/member_add` and `/team/member_update` |
| Key `models` must be a subset of effective models | `403` on `/key/generate` |
| Narrowing `team.models` auto-prunes stale `default_models` | Automatic on `/team/update` |
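As a rough illustration of the subset and auto-prune rules, the sketch below uses two hypothetical helpers, `check_subset` and `prune_defaults`. They are not the proxy's validators: the proxy responds with HTTP status codes, while this sketch raises `ValueError`, and skipping the check for an empty team pool is an assumption based on the "empty list means allow all" convention.

```python
from typing import List


def check_subset(field: str, values: List[str], team_models: List[str]) -> None:
    """Raise ValueError if any value is missing from a non-empty team pool."""
    # Assumption: an empty team pool means "allow all", so there is nothing to check.
    if not team_models:
        return
    extra = [m for m in values if m not in team_models]
    if extra:
        raise ValueError(f"{field} contains models not in team.models: {extra}")


def prune_defaults(default_models: List[str], new_team_models: List[str]) -> List[str]:
    """Drop defaults that are no longer present after team.models is narrowed."""
    if not new_team_models:
        return list(default_models)
    return [m for m in default_models if m in new_team_models]


check_subset("default_models", ["gpt-4o-mini"], ["gpt-4", "gpt-4o-mini"])  # passes

try:
    check_subset("member.models", ["claude-3"], ["gpt-4", "gpt-4o-mini"])
except ValueError as err:
    print(err)

print(prune_defaults(["gpt-4o-mini", "gpt-4o"], ["gpt-4o-mini"]))
# ['gpt-4o-mini']
```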

### Backward Compatibility

When the feature flag is off **or** when neither `default_models` nor member `models` is configured:

- `get_effective_team_models()` returns `team.models` unchanged
- All existing teams and keys work exactly as before
- Zero extra database queries on the auth hot path


## **View Available Fallback Models**

@@ -0,0 +1,5 @@
-- AlterTable: Add default_models to LiteLLM_TeamTable
ALTER TABLE "LiteLLM_TeamTable" ADD COLUMN IF NOT EXISTS "default_models" TEXT[] DEFAULT ARRAY[]::TEXT[];

-- AlterTable: Add models to LiteLLM_TeamMembership
ALTER TABLE "LiteLLM_TeamMembership" ADD COLUMN IF NOT EXISTS "models" TEXT[] DEFAULT ARRAY[]::TEXT[];
2 changes: 2 additions & 0 deletions litellm-proxy-extras/litellm_proxy_extras/schema.prisma
@@ -142,6 +142,7 @@ model LiteLLM_TeamTable {
policies String[] @default([])
model_id Int? @unique // id for LiteLLM_ModelTable -> stores team-level model aliases
allow_team_guardrail_config Boolean @default(false) // if true, team admin can configure guardrails for this team
default_models String[] @default([]) // NEW: team-wide defaults
litellm_organization_table LiteLLM_OrganizationTable? @relation(fields: [organization_id], references: [organization_id])
litellm_model_table LiteLLM_ModelTable? @relation(fields: [model_id], references: [id])
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
@@ -595,6 +596,7 @@ model LiteLLM_TeamMembership {
team_id String
spend Float @default(0.0)
budget_id String?
models String[] @default([]) // NEW: per-user model overrides
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
@@id([user_id, team_id])
}
1 change: 1 addition & 0 deletions litellm/__init__.py
@@ -218,6 +218,7 @@
os.getenv("LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES", False)
) # When True, routes OpenAI /v1/messages requests to chat/completions instead of the Responses API
retry = True
team_model_overrides_enabled = os.getenv("TEAM_MODEL_OVERRIDES", "").lower() == "true"
### AUTH ###
api_key: Optional[str] = None
openai_key: Optional[str] = None
17 changes: 17 additions & 0 deletions litellm/proxy/_types.py
@@ -1611,6 +1611,16 @@ class Member(MemberBase):
] = Field(
description="The role of the user within the team. 'admin' users can manage team settings and members, 'user' is a regular team member"
)
models: Optional[List[str]] = Field(
default=None,
description="Specific models this member can access within the team. If provided, these will be used in addition to the team's default models.",
)
tpm_limit: Optional[int] = Field(
default=None, description="Tokens per minute limit for this team member"
)
rpm_limit: Optional[int] = Field(
default=None, description="Requests per minute limit for this team member"
)


class OrgMember(MemberBase):
@@ -1642,6 +1652,7 @@ class TeamBase(LiteLLMPydanticObjectBase):
blocked: bool = False
router_settings: Optional[dict] = None
access_group_ids: Optional[List[str]] = None
default_models: List[str] = []


class NewTeamRequest(TeamBase):
@@ -1719,6 +1730,7 @@ class UpdateTeamRequest(LiteLLMPydanticObjectBase):
model_aliases: Optional[dict] = None
guardrails: Optional[List[str]] = None
policies: Optional[List[str]] = None
default_models: Optional[List[str]] = None
object_permission: Optional[LiteLLM_ObjectPermissionBase] = None
team_member_budget: Optional[float] = None
team_member_budget_duration: Optional[str] = None
@@ -2387,6 +2399,8 @@ class LiteLLM_VerificationTokenView(LiteLLM_VerificationToken):
team_alias: Optional[str] = None
team_tpm_limit: Optional[int] = None
team_rpm_limit: Optional[int] = None
team_member_models: Optional[List[str]] = None
team_default_models: Optional[List[str]] = None
team_max_budget: Optional[float] = None
team_soft_budget: Optional[float] = None
team_models: List = []
@@ -3607,6 +3621,7 @@ class LiteLLM_TeamMembership(LiteLLMPydanticObjectBase):
team_id: str
budget_id: Optional[str] = None
spend: Optional[float] = 0.0
models: List[str] = []
litellm_budget_table: Optional[LiteLLM_BudgetTable]

def safe_get_team_member_rpm_limit(self) -> Optional[int]:
@@ -3729,6 +3744,7 @@ class TeamMemberDeleteRequest(MemberDeleteRequest):
class TeamMemberUpdateRequest(TeamMemberDeleteRequest):
max_budget_in_team: Optional[float] = None
role: Optional[Literal["admin", "user"]] = None
models: Optional[List[str]] = None
tpm_limit: Optional[int] = Field(
default=None, description="Tokens per minute limit for this team member"
)
@@ -3739,6 +3755,7 @@ class TeamMemberUpdateRequest(TeamMemberDeleteRequest):

class TeamMemberUpdateResponse(MemberUpdateResponse):
team_id: str
models: Optional[List[str]] = None
max_budget_in_team: Optional[float] = None
tpm_limit: Optional[int] = None
rpm_limit: Optional[int] = None
93 changes: 83 additions & 10 deletions litellm/proxy/auth/auth_checks.py
@@ -9,6 +9,7 @@
3. If end_user ('user' passed to /chat/completions, /embeddings endpoint) is in budget
"""
import asyncio
import os
import re
import time
from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional, Union, cast
@@ -409,20 +410,16 @@ async def common_checks( # noqa: PLR0915
     # 2. If team can call model
     if _model and team_object:
         with tracer.trace("litellm.proxy.auth.common_checks.can_team_access_model"):
-            if not await can_team_access_model(
+            # can_team_access_model returns Literal[True] or raises ProxyException
+            await can_team_access_model(
                 model=_model,
                 team_object=team_object,
                 llm_router=llm_router,
                 team_model_aliases=valid_token.team_model_aliases
                 if valid_token
                 else None,
-            ):
-                raise ProxyException(
-                    message=f"Team not allowed to access model. Team={team_object.team_id}, Model={_model}. Allowed team models = {team_object.models}",
-                    type=ProxyErrorTypes.team_model_access_denied,
-                    param="model",
-                    code=status.HTTP_401_UNAUTHORIZED,
-                )
+                valid_token=valid_token,
+            )

# Require trace id for agent keys when agent has require_trace_id_on_calls_by_agent
if valid_token is not None and valid_token.agent_id:
@@ -2690,29 +2687,105 @@ def can_org_access_model(
)


def compute_effective_models(
    team_defaults: List[str],
    member_models: List[str],
    team_pool: List[str],
) -> List[str]:
    """
    Core computation shared by the auth hot-path and key-generation.

    effective = union(team_defaults, member_models), capped by team_pool.
    - If neither defaults nor overrides are set, falls back to team_pool (backward compat).
    - If cap empties the list (all overrides/defaults are stale), falls back to team_pool
      rather than returning [] (which would mean "allow all"). This is a deliberate security
      trade-off: a member with entirely stale overrides gets team-pool access (the team's
      restriction is still enforced) instead of unrestricted access. Admins should clean up
      stale overrides via /team/member_update when narrowing team.models.
    - team_pool=[] means "allow all", so the cap is skipped.
    """
    # dict.fromkeys preserves insertion order while deduplicating
    effective = list(dict.fromkeys(team_defaults + member_models))

    if not effective:
        return team_pool

    if team_pool:
        effective = [m for m in effective if m in set(team_pool)]
        if not effective:
            return team_pool

Comment on lines +2710 to +2717

P1 Stale member overrides silently escalate to full team access

When a member has per-user model overrides set (e.g. ["gpt-4o"]), but ALL of those models are subsequently removed from team.models, compute_effective_models falls back to team_pool — the full team model list. This means the member quietly gains access to every model the team has, which is broader than their originally intended restricted set.

Concretely:

  • team.models = ["gpt-4", "gpt-3.5"]
  • member override: ["gpt-4o"] (now stale — gpt-4o removed from team)
  • effective = ["gpt-4o"] → capped by team_pool → [] → falls back to ["gpt-4", "gpt-3.5"]

The comment justifies this as "NOT [] which = allow-all", but team_pool can itself be a large set. An admin revoking a specific model override may expect the member to lose all team access until explicitly reassigned, not to receive a broad promotion.

If this fallback is intentional, add a clear doc comment explaining the privilege trade-off and consider whether team_pool fallback should be configurable (e.g. a stricter mode that returns [] = deny vs fall-through-to-team-pool).
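To make the suggestion concrete, here is a sketch of what a stricter mode could look like. The `strict` parameter and `StaleOverridesError` are hypothetical and do not exist in the PR. Raising is used instead of returning `[]` because, per the docstring in the diff, an empty list is read as "allow all".

```python
from typing import List


class StaleOverridesError(Exception):
    """Hypothetical signal: every configured model fell outside the team pool."""


def compute_effective_models_strict(
    team_defaults: List[str],
    member_models: List[str],
    team_pool: List[str],
    strict: bool = False,
) -> List[str]:
    effective = list(dict.fromkeys(team_defaults + member_models))
    if not effective:
        return team_pool
    if team_pool:
        pool = set(team_pool)
        effective = [m for m in effective if m in pool]
        if not effective:
            if strict:
                # Deny instead of silently widening access to the whole pool.
                raise StaleOverridesError("all configured models are stale")
            return team_pool  # behaviour in the PR
    return effective


team_pool = ["gpt-4", "gpt-3.5"]
stale_override = ["gpt-4o"]

# Current behaviour: the member ends up with the full team pool.
print(compute_effective_models_strict([], stale_override, team_pool))
# ['gpt-4', 'gpt-3.5']

# Hypothetical strict mode: the caller can turn this into a denial.
try:
    compute_effective_models_strict([], stale_override, team_pool, strict=True)
except StaleOverridesError as err:
    print("denied:", err)
```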

    return effective


def get_effective_team_models(
    team_object: Optional[LiteLLM_TeamTable],
    valid_token: Optional[UserAPIKeyAuth] = None,
) -> List[str]:
    """
    Returns the effective list of models for a team member.
    The union of:
    - team_object.default_models (OR valid_token.team_default_models if available)
    - team_membership.models (OR valid_token.team_member_models if available)

    Capped by team_object.models. Falls back to team_object.models when empty.
    """
    if not (
        litellm.team_model_overrides_enabled
        or os.getenv("TEAM_MODEL_OVERRIDES", "").lower() == "true"
    ):
Comment on lines +2733 to +2736

P2 os.getenv() called on every auth check when feature flag is False

litellm.team_model_overrides_enabled is already initialised at module load time from the same environment variable. When the flag is False, the short-circuit or still falls through and calls os.getenv("TEAM_MODEL_OVERRIDES", "") on every request that reaches can_team_access_model.

This is the hot auth path — even a cheap os.getenv call multiplied across thousands of requests per second adds up. Consider reading the env-var only once and storing it on litellm.team_model_overrides_enabled (which already exists), then checking only litellm.team_model_overrides_enabled here:

Suggested change
-    if not (
-        litellm.team_model_overrides_enabled
-        or os.getenv("TEAM_MODEL_OVERRIDES", "").lower() == "true"
-    ):
+    if not litellm.team_model_overrides_enabled:

If dynamic env-var toggling at runtime is intentionally supported, document it explicitly so the cost is clearly understood.
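A minimal sketch of the read-once pattern suggested here, written as a standalone module rather than against `litellm` itself. `read_flag`, `TEAM_MODEL_OVERRIDES_ENABLED`, and `overrides_enabled` are illustrative names; only the environment variable name and the `"true"` parsing come from the diff.

```python
import os


def read_flag(name: str) -> bool:
    """Parse a boolean env var the way the diff does: "true", case-insensitive."""
    return os.getenv(name, "").lower() == "true"


# Evaluated once, when the module is first imported.
TEAM_MODEL_OVERRIDES_ENABLED: bool = read_flag("TEAM_MODEL_OVERRIDES")


def overrides_enabled() -> bool:
    """Hot-path check: a plain global read, with no os.getenv call per request."""
    return TEAM_MODEL_OVERRIDES_ENABLED


print(overrides_enabled())
```

The trade-off is the one the comment raises: once the flag is cached, changing the environment variable at runtime has no effect until the process restarts or the cached value is reassigned explicitly.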

        return team_object.models if team_object else []

    # Get from team defaults. Prefer team_object (authoritative, fresh from DB/cache)
    # over valid_token (snapshot from key creation time, may be stale).
    # Use `is not None` instead of truthiness so that an explicit empty list []
    # (meaning "no defaults") is not confused with "field missing".
    team_defaults: List[str] = []
    if team_object and team_object.default_models is not None:
        team_defaults = team_object.default_models
    elif valid_token and valid_token.team_default_models is not None:
        team_defaults = valid_token.team_default_models

    # Get from member specific overrides
    member_models: List[str] = []
    if valid_token and valid_token.team_member_models is not None:
        member_models = valid_token.team_member_models

    team_pool = team_object.models if team_object else []

    return compute_effective_models(team_defaults, member_models, team_pool)
Comment on lines +2721 to +2762

P1 Service keys (no user_id) silently restricted by default_models at runtime

get_effective_team_models has no exemption for service/bot keys (team keys without a user_id). When team_model_overrides_enabled=True and a team has default_models configured (e.g. ["gpt-4o-mini"]), the auth path computes:

  • team_defaults = team_object.default_models → ["gpt-4o-mini"]
  • member_models = [] — because the SQL JOIN on tm.user_id = v.user_id doesn't match for keys where v.user_id IS NULL, so valid_token.team_member_models = None
  • effective = compute_effective_models(["gpt-4o-mini"], [], full_team_pool) = ["gpt-4o-mini"]

This means service keys are silently restricted to default_models only, even though they were intended to have full team.models access. The key-generation path correctly exempts service keys (and data.user_id guard), but the runtime auth path in can_team_access_model calls get_effective_team_models(team_object, valid_token) unconditionally.

Contrast with the comment at the key-generation call-site:

service/bot keys (no user_id) use the full team.models pool, preserving pre-feature behavior

The same logic needs to be applied in get_effective_team_models. When valid_token has no user_id and no membership row, fall back to team_pool:

# In get_effective_team_models, after the feature-flag check:
# Service keys (no user_id) have no membership row, so skip per-member logic
# and use team_pool directly for backward compatibility.
if valid_token is not None and valid_token.user_id is None:
    return team_object.models if team_object else []

This is a backward-incompatible regression (per repo rules) that activates the moment an admin enables the feature flag and sets default_models on any team that also uses service keys.

Rule Used: What: avoid backwards-incompatible changes without... (source)
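The proposed exemption can be exercised in isolation with stand-in types. `Team` and `Token` below are hypothetical simplifications of `LiteLLM_TeamTable` and `UserAPIKeyAuth`, the feature flag is assumed to be on, and the union-and-cap logic is condensed from `compute_effective_models`; this is a sketch of the suggested behaviour, not the merged code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Team:  # hypothetical stand-in for LiteLLM_TeamTable
    models: List[str]
    default_models: List[str] = field(default_factory=list)


@dataclass
class Token:  # hypothetical stand-in for UserAPIKeyAuth
    user_id: Optional[str] = None
    team_member_models: Optional[List[str]] = None


def effective_models(team: Team, token: Optional[Token]) -> List[str]:
    # Service keys have no user_id and no membership row: keep the full pool.
    if token is not None and token.user_id is None:
        return team.models
    overrides = (token.team_member_models or []) if token else []
    merged = list(dict.fromkeys(team.default_models + overrides))
    # Cap by the team pool unless the pool is empty ("allow all").
    capped = [m for m in merged if m in team.models] if team.models else merged
    # An empty result falls back to the team pool, as in the PR.
    return capped or team.models


team = Team(models=["gpt-4", "gpt-4o-mini", "gpt-4o"], default_models=["gpt-4o-mini"])

print(effective_models(team, Token()))  # service key: full pool
print(effective_models(team, Token(user_id="bob", team_member_models=["gpt-4o"])))
```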



 async def can_team_access_model(
     model: Union[str, List[str]],
     team_object: Optional[LiteLLM_TeamTable],
     llm_router: Optional[Router],
     team_model_aliases: Optional[Dict[str, str]] = None,
+    valid_token: Optional[UserAPIKeyAuth] = None,
 ) -> Literal[True]:
     """
     Returns True if the team can access a specific model.

     1. First checks native team-level model permissions (current implementation)
     2. If not allowed natively, falls back to access_group_ids on the team
     """
+    effective_models = get_effective_team_models(team_object, valid_token)
     try:
         return _can_object_call_model(
             model=model,
             llm_router=llm_router,
-            models=team_object.models if team_object else [],
+            models=effective_models,
             team_model_aliases=team_model_aliases,
             team_id=team_object.team_id if team_object else None,
             object_type="team",
         )
     except ProxyException:
-        # Fallback: check team's access_group_ids
+        # Fallback: check team's access_group_ids.
+        # Note: access groups are a team-level concept and are NOT restricted by
+        # per-member model overrides. If a team has access_group_ids configured,
+        # any member can access models from those groups regardless of their
+        # effective_models set. This is by design: access groups grant team-wide
+        # access, while default_models/member.models control the team's own model list.
         team_access_group_ids = (
             (team_object.access_group_ids or []) if team_object else []
         )