feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers#24335
ishaan-jaff wants to merge 31 commits into main from …
Conversation
…ssion is set (toolset scope fix)
…deleteMCPToolset API functions
Greptile Summary

This PR introduces MCP Toolsets — named, curated subsets of tools drawn from one or more MCP servers that can be assigned to API keys and teams.

Key changes:
Issues found:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/_experimental/mcp_server/mcp_server_manager.py | Adds in-memory TTL caching (_toolset_perm_cache, _toolset_name_cache) for toolset DB lookups, resolve_toolset_tool_permissions, get_toolset_by_name_cached, and invalidate_toolset_cache; also fixes allow_all_keys server bleed-in when an explicit permission boundary is set. Cache design is sound but only the ASGI route path uses get_toolset_by_name_cached — the responses-API path bypasses it. |
| litellm/proxy/_experimental/mcp_server/server.py | Adds _mcp_active_toolset_id ContextVar (server-side only), _merge_toolset_permissions (union-merges toolset tool perms into the key's object_permission for normal MCP requests), and _apply_toolset_scope (replaces object_permission entirely for toolset-namespaced routes); strips x-mcp-toolset-id client header. The ContextVar approach correctly prevents client header injection. |
| litellm/proxy/proxy_server.py | Adds _stream_mcp_asgi_response (fixes the pre-existing buffered-response/SSE issue), toolset_mcp_route (/toolset/{name}/mcp), and toolset fallback in dynamic_mcp_route. _stream_mcp_asgi_response has two issues: asyncio.get_event_loop() should be asyncio.get_running_loop(), and body_iter() can deadlock if the handler task raises mid-stream (no sentinel is pushed to body_queue on task failure). |
| litellm/responses/mcp/litellm_proxy_mcp_handler.py | Adds toolset-name resolution in the responses-API path: when a server name is unknown, it tries to resolve it as a toolset and calls _apply_toolset_scope. However, it calls get_mcp_toolset_by_name directly (uncached DB query) instead of using global_mcp_server_manager.get_toolset_by_name_cached, violating the no-direct-DB-in-critical-path rule. |
| litellm/proxy/management_endpoints/mcp_management_endpoints.py | Adds CRUD endpoints for MCP toolsets. Admin-only guard on POST/PUT/DELETE is correct. GET /toolset access-control logic correctly distinguishes None (no restriction) from [] (explicitly empty), fixing the previous-thread-flagged empty-list bug. GET /toolset/{id} has no authorization check but was flagged in a previous review thread. |
| litellm/proxy/_experimental/mcp_server/toolset_db.py | New thin DB access layer for LiteLLM_MCPToolsetTable — create, get, list, update, delete. Clean and straightforward; JSON serialization/deserialization for the tools field is handled correctly. |
| litellm/proxy/management_helpers/object_permission_utils.py | Extends key-creation validation to reject toolset assignments when the key's team doesn't allow those toolsets. Logic mirrors the existing server/access-group validation pattern and correctly raises HTTP 403 for violations. |
| tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_toolset_scope.py | New test file: TestApplyToolsetScope has meaningful mock-based async tests. TestFetchMCPToolsetsAccess is effectively a no-op — it only asserts Python truthiness of local variables and never exercises the actual endpoint logic, providing no regression protection. |
| litellm/types/mcp_server/mcp_toolset.py | New Pydantic/TypedDict type definitions for MCPToolset, NewMCPToolsetRequest, UpdateMCPToolsetRequest, and MCPToolsetTool. Clean and minimal. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Client request] --> B{Route?}
B -->|/toolset/name/mcp| C[toolset_mcp_route]
B -->|/name/mcp unknown server| D[dynamic_mcp_route fallback]
B -->|/mcp direct| E[handle_streamable_http_mcp]
B -->|/v1/responses| F[LiteLLM_Proxy_MCP_Handler]
C --> G[get_toolset_by_name_cached - 60s TTL]
G --> H[Set _mcp_active_toolset_id ContextVar]
H --> I[_stream_mcp_asgi_response]
D --> J[get_toolset_by_name_cached - 60s TTL]
J --> H
E --> K{active_toolset_id set?}
K -->|Yes| L[_apply_toolset_scope\nReplace object_permission\nwith toolset tools only]
K -->|No| M[_merge_toolset_permissions\nUnion mcp_toolsets into\nobject_permission]
L --> N[filter_tools_by_key_team_permissions]
M --> N
F --> O{Name is known server?}
O -->|No| P[get_mcp_toolset_by_name\nDirect DB call - no cache]
P --> Q[_apply_toolset_scope]
O -->|Yes| R[Normal server flow]
style P fill:#ffcccc,stroke:#ff0000
style I fill:#ffe0b2,stroke:#ff9800
```
Last reviewed commit: "fix(mcp): cache tool..."
```python
toolset_id_header = next(
    (
        v.decode()
        for k, v in scope.get("headers", [])
        if k == b"x-mcp-toolset-id"
    ),
    None,
)
if toolset_id_header and user_api_key_auth is not None:
    user_api_key_auth = await _apply_toolset_scope(
        user_api_key_auth, toolset_id_header
    )
```
x-mcp-toolset-id header is client-injectable — privilege escalation
handle_streamable_http_mcp reads the x-mcp-toolset-id header directly from the raw ASGI scope["headers"], which is fully client-controlled. An external client can send this header on a direct request to /mcp and trick the server into scoping their key's permissions to any toolset they know the ID of.
_apply_toolset_scope then replaces the key's mcp_servers and mcp_tool_permissions with the toolset's values — with no check that the calling key is actually authorized to use that toolset. This allows a key with no MCP access to gain tool access by:
- Fetching a toolset ID (visible via `GET /v1/mcp/toolset` — open to all authenticated users)
- Sending `x-mcp-toolset-id: <id>` directly to `/mcp`
The header was designed to be set server-side (by toolset_mcp_route/dynamic_mcp_route) but is indistinguishable from a client-supplied one.
Recommended fix: Before applying the scope, verify that the toolset ID is in user_api_key_auth.object_permission.mcp_toolsets, or pass the toolset ID via a ContextVar instead of an HTTP header so it cannot be set by clients.
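The ContextVar alternative can be sketched as follows. This is a simplified, standalone illustration: only the idea of `_mcp_active_toolset_id` comes from the PR; the route and handler names here are hypothetical.

```python
# Server-side only: clients cannot influence a ContextVar, unlike a header.
import asyncio
import contextvars
from typing import Optional

_active_toolset_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "_active_toolset_id", default=None
)

async def toolset_route(toolset_id: str) -> str:
    # Trusted route handler sets the var before delegating to the MCP handler.
    token = _active_toolset_id.set(toolset_id)
    try:
        return await handle_mcp_request()
    finally:
        _active_toolset_id.reset(token)

async def handle_mcp_request() -> str:
    # The MCP handler reads the var; a direct /mcp request sees None here
    # no matter what headers the client sent.
    toolset_id = _active_toolset_id.get()
    return f"scoped to {toolset_id}" if toolset_id else "no toolset scope"

async def main():
    print(await toolset_route("ts-123"))   # set via the trusted route
    print(await handle_mcp_request())      # direct request: no scope

asyncio.run(main())
```

Because the variable can only be set by server code that already resolved and authorized the toolset, the client-injection vector disappears entirely.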
```python
# Add authorization check before applying scope
if toolset_id_header and user_api_key_auth is not None:
    op = user_api_key_auth.object_permission
    allowed_toolsets = getattr(op, "mcp_toolsets", None) or []
    if toolset_id_header not in allowed_toolsets:
        # Toolset not authorized for this key
        raise MCPError(ErrorCode.INVALID_REQUEST, "Toolset not authorized")
    user_api_key_auth = await _apply_toolset_scope(
        user_api_key_auth, toolset_id_header
    )
```

```python
    """
    from litellm.proxy._experimental.mcp_server.toolset_db import list_mcp_toolsets
    from litellm.proxy.proxy_server import prisma_client

    if not toolset_ids or prisma_client is None:
        return {}

    try:
        toolsets = await list_mcp_toolsets(prisma_client, toolset_ids=toolset_ids)
        tool_permissions: Dict[str, List[str]] = {}
        for toolset in toolsets:
            for tool in toolset.tools:
                # Stored tool_names may include the server prefix (e.g.
                # "server_alias__tool_name"). filter_tools_by_key_team_permissions
                # compares against the unprefixed name, so strip it here.
                raw_name = tool["tool_name"]
                unprefixed, _ = split_server_prefix_from_name(raw_name)
                tool_permissions.setdefault(tool["server_id"], [])
                if unprefixed not in tool_permissions[tool["server_id"]]:
                    tool_permissions[tool["server_id"]].append(unprefixed)
        return tool_permissions
    except Exception as e:
        verbose_logger.warning(f"Failed to resolve toolset permissions: {str(e)}")
        return {}
```
DB query on every MCP request violates critical-path rule
resolve_toolset_tool_permissions issues a find_many against LiteLLM_MCPToolsetTable for every MCP tool request that has mcp_toolsets set on the key (called from both _merge_toolset_permissions and _apply_toolset_scope). This violates the project rule that the critical request path should not contain direct DB queries.
Similarly, get_mcp_toolset_by_name is called on every request in both toolset_mcp_route and the toolset fallback in dynamic_mcp_route in proxy_server.py (lines ~13513, ~13613).
Toolset definitions are essentially static configuration — they should be cached (e.g., in user_api_key_cache with a TTL) just like MCP server definitions are cached in the in-memory registry. Loading them on every request will degrade performance under load.
Rule Used: What: Avoid creating new database requests or Rout... (source)
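Since toolset definitions are near-static, a small in-memory TTL cache keeps them off the hot path. The sketch below is illustrative of the approach (in the spirit of the `_toolset_name_cache` this PR adds), not the PR's actual implementation:

```python
# Minimal TTL cache: entries expire after ttl_seconds and fall back to the DB.
import time
from typing import Any, Dict, Optional, Tuple

class TTLCache:
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: caller re-queries the DB
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic(), value)

    def invalidate(self, key: str) -> None:
        # Called from the CRUD endpoints on create/update/delete.
        self._store.pop(key, None)

cache = TTLCache(ttl_seconds=60.0)
cache.set("toolset:devtools", {"toolset_id": "ts-1"})
print(cache.get("toolset:devtools"))  # hit until the TTL lapses
```

With a 60-second TTL, steady-state traffic issues at most one DB query per toolset name per minute, and writes can call `invalidate` for immediate consistency.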
```python
# Admins with no explicit restriction see all toolsets
if _user_has_admin_view(user_api_key_dict):
    op = user_api_key_dict.object_permission
    if op is None or getattr(op, "mcp_toolsets", None) is None:
        return await list_mcp_toolsets(prisma_client)

op = user_api_key_dict.object_permission
allowed_ids = (getattr(op, "mcp_toolsets", None) or []) if op else []
return await list_mcp_toolsets(
    prisma_client, toolset_ids=allowed_ids if allowed_ids else None
)
```
Non-admin users with no toolset restriction can list all toolsets
The access-control logic for GET /v1/mcp/toolset has a subtle flaw: when allowed_ids is an empty list (either because op is None, or mcp_toolsets is None or []), the code passes toolset_ids=None to list_mcp_toolsets, which returns all toolsets.
This means:
- A non-admin key with `object_permission.mcp_toolsets = []` (explicitly restricted to zero toolsets) can still enumerate every toolset.
- A non-admin key with no `object_permission` at all also sees all toolsets.
The condition allowed_ids if allowed_ids else None conflates "no restriction configured" with "restriction to empty set". The distinction matters for non-admins.
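The conflation can be shown in isolation. In this sketch, `filter_toolsets` is a stand-in for `list_mcp_toolsets`' filtering behavior, not the real function:

```python
# None means "no restriction configured"; [] means "restricted to nothing".
from typing import List, Optional

ALL_TOOLSETS = ["ts-1", "ts-2", "ts-3"]

def filter_toolsets(allowed: Optional[List[str]]) -> List[str]:
    if allowed is None:               # no restriction configured
        return list(ALL_TOOLSETS)
    return [t for t in ALL_TOOLSETS if t in allowed]  # [] yields nothing

def buggy(allowed: Optional[List[str]]) -> List[str]:
    # The flawed pattern collapses [] into None before filtering.
    return filter_toolsets(allowed if allowed else None)

assert filter_toolsets([]) == []          # correct: explicitly restricted
assert buggy([]) == ALL_TOOLSETS          # bug: empty list leaks everything
assert filter_toolsets(None) == ALL_TOOLSETS
```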
```python
# Current (buggy):
return await list_mcp_toolsets(
    prisma_client, toolset_ids=allowed_ids if allowed_ids else None
)

# Suggested — return empty list when explicitly restricted to nothing:
# Only pass None (= all toolsets) when op is None or mcp_toolsets is None (unrestricted)
op = user_api_key_dict.object_permission
raw_toolsets = getattr(op, "mcp_toolsets", None) if op else None
return await list_mcp_toolsets(
    prisma_client, toolset_ids=raw_toolsets  # None → all, [] → empty
)
```

```python
@router.get(
    "/toolset/{toolset_id}",
    description="Get a specific MCP toolset by ID",
)
@management_endpoint_wrapper
async def fetch_mcp_toolset(
    toolset_id: str,
    user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
):
    prisma_client = get_prisma_client_or_throw(
        "Database not connected. Connect a database to your proxy"
    )
    toolset = await get_mcp_toolset(prisma_client, toolset_id)
    if toolset is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail={"error": f"Toolset '{toolset_id}' not found."},
        )
    return toolset
```
GET /toolset/{toolset_id} has no access control check
Any authenticated API key (including non-admin keys) can fetch the full contents of any toolset by ID. While UUIDs are hard to guess, once a key enumerates IDs via GET /v1/mcp/toolset (which returns all toolsets for unconfigured keys — see above), they can read the tool definitions for any toolset.
Consider adding an authorization check: either restrict to admin-only, or verify the calling key's object_permission.mcp_toolsets contains the requested ID.
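A framework-free sketch of the suggested check; the names (`is_admin`, `allowed_toolsets`) mirror the endpoint's concepts but are assumptions here, and the real implementation would raise an HTTP 403 rather than return a bool:

```python
# Non-admins may only read toolsets explicitly granted to their key.
from typing import List, Optional

def can_read_toolset(
    is_admin: bool,
    allowed_toolsets: Optional[List[str]],
    toolset_id: str,
) -> bool:
    if is_admin:
        return True
    # A key with no mcp_toolsets configured gets no read access by ID.
    return allowed_toolsets is not None and toolset_id in allowed_toolsets

assert can_read_toolset(True, None, "ts-1")           # admin: always allowed
assert not can_read_toolset(False, None, "ts-1")      # unconfigured key: deny
assert can_read_toolset(False, ["ts-1"], "ts-1")
assert not can_read_toolset(False, ["ts-2"], "ts-1")
```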
```python
        )

        # Inject the resolved toolset_id as a header so handle_streamable_http_mcp
        # can apply toolset-scoped permissions regardless of the key's own permissions.
        scope = dict(request.scope)
        scope["headers"] = list(scope.get("headers", [])) + [
            (b"x-mcp-toolset-id", toolset.toolset_id.encode()),
        ]
        scope["path"] = "/mcp"

        response_status = 200
        response_headers: list = []
        response_body = b""

        async def custom_send(message):
            nonlocal response_status, response_headers, response_body
            if message["type"] == "http.response.start":
                response_status = message["status"]
                response_headers = message.get("headers", [])
            elif message["type"] == "http.response.body":
                response_body += message.get("body", b"")

        await handle_streamable_http_mcp(
            scope, receive=request.receive, send=custom_send
        )

        from starlette.responses import Response

        headers_dict = {k.decode(): v.decode() for k, v in response_headers}
        return Response(
            content=response_body,
            status_code=response_status,
            headers=headers_dict,
            media_type=headers_dict.get("content-type", "application/json"),
        )

    except HTTPException as e:
        raise e
    except Exception as e:
        verbose_proxy_logger.error(
            f"Error handling toolset MCP route for {toolset_name}: {str(e)}"
        )
        raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
```
Buffered response breaks MCP streaming / SSE for toolset routes
Both toolset_mcp_route (lines ~13545–13575) and the toolset fallback inside dynamic_mcp_route (lines ~13620–13643) accumulate the entire ASGI response body into a bytes buffer before constructing a Response object:
```python
response_body = b""
async def custom_send(message):
    ...
    elif message["type"] == "http.response.body":
        response_body += message.get("body", b"")
await handle_streamable_http_mcp(...)
return Response(content=response_body, ...)
```

MCP's StreamableHTTP transport uses Server-Sent Events (SSE) — the handler sends incremental chunks. Buffering all chunks before returning means:
- No data reaches the client until the full MCP session completes.
- Long-running sessions or SSE notification streams will appear to hang.
- Memory grows unboundedly for active sessions.
Note: the same pattern exists in the regular dynamic_mcp_route (pre-existing in the codebase) — but extending it to the new toolset routes compounds the issue. The correct approach is to pass the original ASGI send callable through so chunks are forwarded incrementally, similar to how the main /mcp route works.
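The pass-through idea can be shown with plain asyncio. This toy sketch forwards chunks through a queue as they arrive instead of concatenating them; the handler and names are simulated stand-ins, and a real fix would forward the original ASGI `send` (or yield from a `StreamingResponse`):

```python
# Bridging send that forwards chunks as they arrive, instead of buffering.
import asyncio

async def mcp_handler(send):
    # Simulates an SSE-style MCP handler emitting incremental chunks.
    for i in range(3):
        await send({
            "type": "http.response.body",
            "body": f"event {i}\n".encode(),
            "more_body": True,
        })
    await send({"type": "http.response.body", "body": b"", "more_body": False})

async def stream_response(handler):
    queue: asyncio.Queue = asyncio.Queue()

    async def bridging_send(message):
        body = message.get("body", b"")
        if body:
            await queue.put(body)      # forward each chunk immediately
        if not message.get("more_body", False):
            await queue.put(None)      # EOF sentinel

    task = asyncio.create_task(handler(bridging_send))
    chunks = []
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        chunks.append(chunk)           # in real code: yield to the client here
    await task
    return chunks

print(asyncio.run(stream_response(mcp_handler)))
```

Each chunk becomes available to the consumer as soon as the handler emits it, so long-lived SSE sessions stream instead of hanging, and memory stays bounded by the queue depth rather than the full body size.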
…fixes SSE streaming)
…ion, and access control
…-request DB calls
```python
    get_mcp_toolset_by_name,
)
from litellm.proxy.proxy_server import prisma_client

if prisma_client is not None and user_api_key_auth is not None:
    toolset = await get_mcp_toolset_by_name(prisma_client, name)
```
Uncached DB query in critical request path
get_mcp_toolset_by_name is called directly here on every responses-API request that passes a toolset name, bypassing the 60-second TTL cache (global_mcp_server_manager.get_toolset_by_name_cached) that was added in this same PR precisely to avoid per-request DB hits. Under load this will issue one find_first per toolset name per request.
```diff
-    get_mcp_toolset_by_name,
-)
-from litellm.proxy.proxy_server import prisma_client
-
-if prisma_client is not None and user_api_key_auth is not None:
-    toolset = await get_mcp_toolset_by_name(prisma_client, name)
+if prisma_client is not None and user_api_key_auth is not None:
+    toolset = await global_mcp_server_manager.get_toolset_by_name_cached(prisma_client, name)
```
Rule Used: What: In critical path of request, there should be... (source)
| """ | ||
| from starlette.responses import StreamingResponse | ||
|
|
||
| headers_ready: asyncio.Future = asyncio.get_event_loop().create_future() |
asyncio.get_event_loop() deprecated inside async context
asyncio.get_event_loop() is deprecated in Python 3.10+ when called inside a running coroutine and will emit a DeprecationWarning. The correct API inside an async function is asyncio.get_running_loop(), which always returns the currently-running loop without fallback behavior.
```diff
-    headers_ready: asyncio.Future = asyncio.get_event_loop().create_future()
+    headers_ready: asyncio.Future = asyncio.get_running_loop().create_future()
```
```python
async def body_iter():
    try:
        while True:
            chunk = await body_queue.get()
            if chunk is None:
                break
            yield chunk
    finally:
        if not handler_task.done():
            handler_task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await handler_task
```
body_iter will deadlock if the handler task raises after headers are sent
If handler_task raises an unhandled exception after headers_ready is resolved (i.e., mid-stream), bridging_send will never be called with more_body=False, so body_queue.put(None) (the EOF sentinel on line 13531) is never enqueued. The body_queue.get() on line 13550 will then block indefinitely, hanging the client connection.
A straightforward fix is to attach a done_callback to the task that pushes the sentinel into the queue on failure:
```python
def _on_task_done(fut: asyncio.Task) -> None:
    if fut.cancelled() or fut.exception():
        # Ensure body_iter() unblocks on handler failure
        body_queue.put_nowait(None)

handler_task = asyncio.create_task(handle_fn(scope, receive, bridging_send))
handler_task.add_done_callback(_on_task_done)
```

```python
        raw_toolsets: Optional[List[str]] = []
        # raw_toolsets is [] → return nothing
        assert raw_toolsets is not None
        assert not raw_toolsets  # empty list → return []

    def test_none_toolsets_returns_all(self):
        """Key where mcp_toolsets is absent (None) should return all toolsets."""
        raw_toolsets: Optional[List[str]] = None
        # raw_toolsets is None → no restriction → return all
        assert raw_toolsets is None

    def test_populated_toolsets_filters(self):
        """Key with explicit toolset IDs should only see those."""
        raw_toolsets: Optional[List[str]] = ["ts-1", "ts-2"]
        assert raw_toolsets is not None
        assert len(raw_toolsets) == 2
```
TestFetchMCPToolsetsAccess tests only assert Python language semantics
The three tests in this class (test_empty_toolsets_returns_empty, test_none_toolsets_returns_all, test_populated_toolsets_filters) never call fetch_mcp_toolsets or any production code — they only assert properties of local variables (e.g. assert not raw_toolsets, assert raw_toolsets is None). These will always pass regardless of any regression in the actual endpoint's access-control logic.
The coverage gap is significant: the existing thread already flagged that non-admin keys with mcp_toolsets=[] would incorrectly see all toolsets in a prior implementation. These tests would not have caught that bug. Consider replacing them with tests that actually call the fetch_mcp_toolsets handler (or its inner logic) with mocked DB/auth, similar to how TestApplyToolsetScope is structured above.
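A sketch of what a meaningful test could look like. `fetch_toolsets` here is a hypothetical stand-in for the endpoint's inner logic; a real test would import and call the actual handler with mocked DB and auth:

```python
# Regression test pattern: exercise the decision logic through a faked DB
# instead of asserting on local variables.
import asyncio
from typing import List, Optional
from unittest.mock import AsyncMock

async def fetch_toolsets(db_list, allowed: Optional[List[str]]) -> List[str]:
    # Mirrors the fixed endpoint semantics: None → all, [] → nothing.
    return await db_list(toolset_ids=allowed)

def test_empty_list_returns_nothing():
    db_list = AsyncMock(return_value=[])
    result = asyncio.run(fetch_toolsets(db_list, allowed=[]))
    # The DB layer must be asked for an empty set, not for everything.
    db_list.assert_awaited_once_with(toolset_ids=[])
    assert result == []

test_empty_list_returns_nothing()
print("ok")
```

Unlike the truthiness assertions above, this test fails the moment the handler collapses `[]` back into `None` before hitting the DB layer.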
Rule Used: What: Flag any modifications to existing tests and... (source)
Feat - Add MCP Toolsets on LiteLLM
Pre-Submission checklist
- I have added at least 1 test in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement - see details)
- I have run `make test-unit`
- My PR was reviewed by `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
Changes
Adds MCP Toolsets — a way to bundle a named subset of tools from one or more MCP servers and assign that bundle to keys/teams.
Problem: Today you can grant a key access to an entire MCP server (all 50 tools). There's no way to say "this key gets only `get_build_logs` and `run_pipeline` from CircleCI, and `read_wiki_structure` from DeepWiki."

What this adds:
Backend:

- `LiteLLM_MCPToolsetTable` DB table + prisma migration (`toolset_id`, `toolset_name`, `tools: [{server_id, tool_name}]`)
- `mcp_toolsets: String[]` added to `LiteLLM_ObjectPermissionTable`
- `POST/GET/PUT/DELETE /v1/mcp/toolset` management endpoints
- `_apply_toolset_scope()` — sets `object_permission.mcp_servers` and `mcp_tool_permissions` to exactly the toolset's tools when a toolset route is hit
- `allow_all_keys` servers (e.g. test servers) were bleeding into toolset-scoped requests even when `mcp_servers` was explicitly set. Fixed by skipping `allow_all_server_ids` injection when there's an explicit permission boundary.
- Toolset names referenced via `server_url` (e.g. `litellm_proxy/mcp/devtooling-prod`) were not resolved to actual toolsets, causing all tools to be returned. Now resolved via `get_mcp_toolset_by_name` before tool fetch.
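The two scoping modes described above (union-merge on normal MCP requests vs. full replacement on toolset routes) can be sketched as follows. The dict shapes and function bodies are illustrative simplifications, not the actual `object_permission` model:

```python
# Merge vs. replace semantics for toolset-derived tool permissions.
from typing import Dict, List

def merge_toolset_permissions(
    key_perms: Dict[str, List[str]],
    toolset_perms: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    # /mcp with mcp_toolsets on the key: toolset tools are ADDED to the key's own.
    merged = {server: list(tools) for server, tools in key_perms.items()}
    for server, tools in toolset_perms.items():
        existing = merged.setdefault(server, [])
        existing.extend(t for t in tools if t not in existing)
    return merged

def apply_toolset_scope(toolset_perms: Dict[str, List[str]]) -> Dict[str, List[str]]:
    # /toolset/{name}/mcp: the key's own permissions are REPLACED entirely.
    return {server: list(tools) for server, tools in toolset_perms.items()}

key = {"circleci": ["get_build_logs"]}
toolset = {"circleci": ["run_pipeline"], "deepwiki": ["read_wiki_structure"]}

print(merge_toolset_permissions(key, toolset))
# {'circleci': ['get_build_logs', 'run_pipeline'], 'deepwiki': ['read_wiki_structure']}
print(apply_toolset_scope(toolset))
# {'circleci': ['run_pipeline'], 'deepwiki': ['read_wiki_structure']}
```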
UI:

- `MCPToolsetsTab` — list/create/edit/delete toolsets from the MCP page
- `MCPServerSelector` — toolsets now appear as a third category (purple) alongside servers (blue) and access groups (green)
- Key/team creation flows pick up `mcp_toolsets` from the combined selector and send it in `object_permission`
- `MCPServerPermissions` read view shows assigned toolsets
- Each toolset is reachable at a `litellm_proxy/mcp/{name}` URL
Docs:

- `docs/mcp_toolsets.md` with ASCII diagram, step-by-step UI screenshots, API examples, and management curl examples