fix(router): clean pattern_router state on upsert/delete by Aarkin7 · Pull Request #29601 · BerriAI/litellm

Aarkin7 · 2026-06-03T17:28:34Z

Relevant issues

Linear ticket

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added meaningful tests
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible; it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Screenshots / Proof of Fix

The bug lives in internal router state, so the user-visible symptom is "the api_key I just rotated still works for my wildcard models." The runbook below reproduces it on litellm_internal_staging and shows it gone on this branch. It needs two real OpenAI keys, because the only honest way to prove the old key is out of rotation is to revoke it upstream and watch for 401s.

1. Save two keys locally; OLD_KEY is the one you'll revoke later

export OLD_KEY=sk-...
export NEW_KEY=sk-...

2. Drop a wildcard model into litellm/proxy/dev_config.yaml pointing at OLD_KEY

model_list:
  - model_name: openai/*
    litellm_params:
      model: openai/*
      api_key: os.environ/OLD_KEY

3. Start the proxy

python litellm/proxy/proxy_cli.py --config litellm/proxy/dev_config.yaml --detailed_debug --reload --use_v2_migration_resolver 2>&1 | tee litellm.log

4. Sanity check that OLD_KEY works through the wildcard route

curl -sS -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'

5. Grab the deployment's model_id so you can target it for the rotation

curl -sS http://localhost:4000/model/info -H "Authorization: Bearer sk-1234" \
  | jq -r '.data[] | select(.model_name == "openai/*") | .model_info.id'

6. Rotate the key via the admin endpoint, pasting the id from step 5

curl -sS -X POST http://localhost:4000/model/update \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d "{\"model_id\":\"<id from step 5>\",\"litellm_params\":{\"model\":\"openai/*\",\"api_key\":\"$NEW_KEY\"}}"

7. Revoke OLD_KEY on the OpenAI dashboard so it can no longer authenticate

8. Fire 20 requests through the wildcard and tally the status codes

for i in $(seq 1 20); do
  curl -sS -o /dev/null -w "%{http_code}\n" -X POST http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer sk-1234" \
    -H "Content-Type: application/json" \
    -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
done | sort | uniq -c

On litellm_internal_staging, roughly half of those 20 responses come back as 401 because the rotated-out OLD_KEY is still living inside pattern_router.patterns and the load balancer keeps round-robining onto it. On this branch every response should be a 200, and grepping litellm.log for the old key's value after step 6 should turn up nothing.
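The "roughly half" figure follows from two entries sharing one wildcard pattern. A minimal simulation, assuming strictly even alternation between the stale and the fresh entry (the proxy's real selection strategy may differ, so the live split is only approximate). `STALE`, `FRESH`, and `simulate` are illustrative names, not LiteLLM code:

```python
from collections import Counter
from itertools import cycle

# Hypothetical stand-ins: the entry that was never removed and the entry
# appended by the update. Neither dict mirrors LiteLLM's real deployment shape.
STALE = {"api_key": "OLD_KEY"}
FRESH = {"api_key": "NEW_KEY"}


def simulate(entries, revoked_key, n_requests=20):
    """Alternate evenly over entries; a revoked key yields 401, otherwise 200."""
    picker = cycle(entries)
    codes = (
        401 if next(picker)["api_key"] == revoked_key else 200
        for _ in range(n_requests)
    )
    return Counter(codes)


# Before the fix both entries are in rotation; after it only the fresh one is.
before_fix = simulate([STALE, FRESH], revoked_key="OLD_KEY")
after_fix = simulate([FRESH], revoked_key="OLD_KEY")
print(before_fix)
print(after_fix)
```

Under that assumption the tally is 10 failures and 10 successes with the stale entry present, and 20 successes once it is gone.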

Type

🐛 Bug Fix

Changes

PatternMatchRouter.add_pattern was append-only, and neither Router.upsert_deployment nor Router.delete_deployment ever removed the existing entry. So every time an admin edited or deleted a wildcard model, the old deployment dict (including its old api_key) stayed in pattern_router.patterns, and the load balancer kept round-robining onto it until the proxy restarted. The same leak hit provider_default_deployment_ids and the per-team team_pattern_routers.

Added PatternMatchRouter.remove_deployment(model_id) and a private Router._remove_deployment_from_wildcard_state(model_id) that cleans up across all three. Both are wired into upsert_deployment and delete_deployment right alongside the existing index-map cleanup, so the change stays narrow.
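For readers who have not opened the diff, here is a minimal sketch of the contract described above. It is not the code in this PR: `SketchPatternRouter` is a hypothetical stand-in, the `model_info.id` lookup is an assumed deployment shape, and treating the `int` return value as "number of entries removed" is an assumption as well.

```python
from typing import Dict, List, Optional


class SketchPatternRouter:
    """Illustrative stand-in, not LiteLLM's PatternMatchRouter."""

    def __init__(self) -> None:
        # regex string -> deployments registered under that pattern
        self.patterns: Dict[str, List[dict]] = {}

    def add_pattern(self, regex: str, deployment: dict) -> None:
        # Append-only: re-adding the same id never replaces the old entry.
        self.patterns.setdefault(regex, []).append(deployment)

    def remove_deployment(self, model_id: Optional[str]) -> int:
        """Drop every entry whose model_info.id equals model_id; return the count."""
        if not model_id:  # falsy guard: None or "" is a no-op
            return 0
        removed = 0
        for regex in list(self.patterns):  # copy the keys, we delete while looping
            entries = self.patterns[regex]
            kept = [
                d for d in entries
                if (d.get("model_info") or {}).get("id") != model_id
            ]
            removed += len(entries) - len(kept)
            if kept:
                self.patterns[regex] = kept
            else:
                del self.patterns[regex]  # drop regex keys that became empty
        return removed


router = SketchPatternRouter()
router.add_pattern("openai/.*", {"model_info": {"id": "m1"}, "api_key": "OLD_KEY"})
removed = router.remove_deployment("m1")  # upsert path: clean up first ...
router.add_pattern("openai/.*", {"model_info": {"id": "m1"}, "api_key": "NEW_KEY"})
keys = [d["api_key"] for d in router.patterns["openai/.*"]]  # ... then re-add
```

Without the `remove_deployment` call, `keys` would hold both the old and the new key, which is the leak in miniature.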

Six unit tests in tests/local_testing/test_router_pattern_matching.py pin the new method's contract, and six integration tests in tests/test_litellm/test_router.py cover the actual upsert/delete paths, including team-scoped wildcards and api_key rotation as the regression test.

PatternMatchRouter.add_pattern was append-only, and neither Router.upsert_deployment nor Router.delete_deployment removed the existing entry. Rotated-out api_keys stayed in the routing rotation for wildcard deployments (model_name with `*`) until proxy restart, silently defeating key rotation as an admin operation. The same leak applied to provider_default_deployment_ids and per-team pattern routers, and the patterns list grew without bound on every edit.

codecov · 2026-06-03T17:32:48Z

Codecov Report

❌ Patch coverage is 90.00000% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| litellm/router_utils/pattern_match_deployments.py | 80.00% | 3 Missing ⚠️ |

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-06-03T17:33:17Z

Greptile Summary

This PR fixes a stale-state bug in wildcard routing where PatternMatchRouter was append-only: editing or deleting a wildcard deployment via upsert_deployment / delete_deployment left the old entry (and its api_key) sitting in pattern_router.patterns, causing the load balancer to keep round-robining onto revoked credentials until proxy restart.

Adds PatternMatchRouter.remove_deployment(model_id), which purges all pattern entries matching the given id and drops now-empty regex keys; wires it into Router._remove_deployment_from_wildcard_state, which also cleans team_pattern_routers (removing empty team routers entirely) and provider_default_deployment_ids.
Hooks _remove_deployment_from_wildcard_state into both upsert_deployment and delete_deployment, exactly alongside the existing index-map cleanup.
Adds 12 tests (6 unit, 6 integration) covering key rotation, idempotent upserts, multi-regex span, team-scoped wildcards, and empty-router cleanup, all in-memory with no network calls.
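The three-way cleanup summarized in the bullets above can be sketched as follows. This is a simplified model under stated assumptions, not the helper in litellm/router.py: pattern routers are reduced to plain `regex -> [model ids]` dicts, `provider_default_deployment_ids` is assumed to be a flat list of ids, and `WildcardState`, `_purge`, and `remove_from_wildcard_state` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

PatternMap = Dict[str, List[str]]  # regex -> deployment ids (simplified)


@dataclass
class WildcardState:
    """Hypothetical bundle of the three structures named in the summary."""
    pattern_router: PatternMap = field(default_factory=dict)
    team_pattern_routers: Dict[str, PatternMap] = field(default_factory=dict)
    provider_default_deployment_ids: List[str] = field(default_factory=list)


def _purge(patterns: PatternMap, model_id: str) -> None:
    """Remove model_id from every pattern and drop patterns left empty."""
    for regex in list(patterns):
        patterns[regex] = [i for i in patterns[regex] if i != model_id]
        if not patterns[regex]:
            del patterns[regex]


def remove_from_wildcard_state(state: WildcardState, model_id: Optional[str]) -> None:
    if not model_id:  # no-op on a falsy id
        return
    _purge(state.pattern_router, model_id)
    for team_id in list(state.team_pattern_routers):
        _purge(state.team_pattern_routers[team_id], model_id)
        if not state.team_pattern_routers[team_id]:
            del state.team_pattern_routers[team_id]  # pop empty team routers
    state.provider_default_deployment_ids = [
        i for i in state.provider_default_deployment_ids if i != model_id
    ]


state = WildcardState(
    pattern_router={"openai/.*": ["m1", "m2"]},
    team_pattern_routers={"team-a": {"openai/.*": ["m1"]}},
    provider_default_deployment_ids=["m1", "m2"],
)
remove_from_wildcard_state(state, "m1")
```

Because every step filters by id rather than by position, calling the helper twice with the same id is harmless, which is what makes it safe to run on both the upsert and the delete path.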

Confidence Score: 4/5

Safe to merge; the fix is narrow, well-tested, and targets a clear state-management gap with no risk of breaking existing non-wildcard routing paths.

The core change is correct and the test suite is thorough. Two small gaps exist: the remove_deployment type annotation accepts None at runtime but declares str, which will fail static type checking; and the wildcard cleanup is skipped when deployment_id is present on the router but absent from the fast-mapping index, which could reproduce the stale-credential accumulation in an inconsistent-state scenario. Neither affects the happy path.

The upsert_deployment block in litellm/router.py (lines 8679-8691) is worth a second look for the nested-guard edge case. The type annotation in litellm/router_utils/pattern_match_deployments.py line 78 is a straightforward one-word fix.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/router_utils/pattern_match_deployments.py | Adds `remove_deployment(model_id)` to `PatternMatchRouter`; logic is correct but the type annotation accepts `str` while the implementation (and test) also handles `None` |
| litellm/router.py | Adds `_remove_deployment_from_wildcard_state` and wires it correctly into both `upsert_deployment` and `delete_deployment`; cleanup of `team_pattern_routers` and `provider_default_deployment_ids` is complete |
| tests/local_testing/test_router_pattern_matching.py | Six new unit tests covering remove_deployment (single id, empty regex cleanup, multi-regex span, noop for unknown, falsy guard, missing model_info tolerance); all in-memory, no network calls |
| tests/test_litellm/test_router.py | Six new integration tests for upsert/delete on wildcard deployments including api_key rotation regression, idempotency, team-scoped wildcards, and empty team-router cleanup; all use internal state checks with no real API calls |

Comments Outside Diff (1)

litellm/router.py, line 8679-8691 (link)

Stale wildcard state when deployment_id is not in the fast-mapping index

_remove_deployment_from_wildcard_state is only called inside if removal_idx is not None:, which is itself nested inside if deployment_id in deployment_fast_mapping:. If a wildcard deployment exists on the router (_deployment_on_router is not None) but is somehow absent from model_id_to_deployment_index_map (e.g. after index corruption or a partially-failed prior upsert), the old pattern_router entry is never cleaned up before add_deployment appends the new one — reproducing the exact stale-credential accumulation this PR aims to fix. Moving _remove_deployment_from_wildcard_state one level up (alongside the outer _deployment_on_router is not None check) would close this gap.
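The guard placement can be illustrated with a toy router. This is a sketch under assumptions, not the code at the cited lines: `SketchRouter` and its attributes are hypothetical, the fast-mapping index is reduced to a set of ids, and the "on router" lookup deliberately scans the model list instead of the index so that the inconsistent state is reachable.

```python
class SketchRouter:
    """Toy model of the upsert path; every name here is illustrative."""

    def __init__(self):
        self.model_list = []   # deployments on the router
        self.index = set()     # stand-in for the fast-mapping index (membership only)
        self.wildcards = []    # stand-in for pattern_router entries

    def add(self, dep):
        self.model_list.append(dep)
        self.index.add(dep["id"])
        self.wildcards.append(dep)  # append-only, like add_pattern

    def _on_router(self, model_id):
        return next((d for d in self.model_list if d["id"] == model_id), None)

    def _clean_wildcards(self, model_id):
        self.wildcards = [d for d in self.wildcards if d["id"] != model_id]

    def upsert(self, dep, hoisted):
        model_id = dep["id"]
        if self._on_router(model_id) is not None:
            if hoisted:
                # Suggested placement: runs whenever the old deployment exists.
                self._clean_wildcards(model_id)
            if model_id in self.index:
                self.index.discard(model_id)
                if not hoisted:
                    # Nested placement: skipped when the index lost the id.
                    self._clean_wildcards(model_id)
            self.model_list = [d for d in self.model_list if d["id"] != model_id]
        self.add(dep)


def stale_entries(hoisted):
    router = SketchRouter()
    router.add({"id": "m1", "api_key": "OLD_KEY"})
    router.index.discard("m1")  # simulate the index diverging from the model list
    router.upsert({"id": "m1", "api_key": "NEW_KEY"}, hoisted=hoisted)
    return sum(d["api_key"] == "OLD_KEY" for d in router.wildcards)


nested_result = stale_entries(hoisted=False)
hoisted_result = stale_entries(hoisted=True)
```

With the nested guard one stale entry survives the upsert; with the hoisted call none does. When the index and the model list agree, both placements behave the same.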

Reviews (1): Last reviewed commit: "fix(router): clean pattern_router state ..."

greptile-apps · 2026-06-03T17:33:21Z

            self.patterns[regex] = []
        self.patterns[regex].append(llm_deployment)

+    def remove_deployment(self, model_id: str) -> int:


The type annotation says model_id: str but the method also handles None (the falsy guard, and the test exercises it directly with None). This will cause a mypy error at the call site in test_remove_deployment_with_falsy_id_is_noop_even_when_entries_have_no_id. Widening to Optional[str] matches the actual contract.

Suggested change

-    def remove_deployment(self, model_id: str) -> int:
+    def remove_deployment(self, model_id: Optional[str]) -> int:

…state router_code_coverage.py greps test files for AST Call nodes and flagged the helper as untested because the existing coverage only exercised it transitively through upsert/delete. Adds two direct tests that pin the helper's contract (cleans across global pattern router, per-team routers with empty-router pop, and provider_default_deployment_ids; noop on falsy model_id)

Widen PatternMatchRouter.remove_deployment annotation to Optional[str]; the implementation already handles None via the falsy guard and the unit test exercises it directly. Move _remove_deployment_from_wildcard_state up one level in upsert_deployment so it runs whenever the prior deployment is on the router, not only when the model_id is present in the fast-mapping index. The scenario is currently unreachable (get_deployment shares the same index), but the cleanup is idempotent so this is defensive against any future divergence between those code paths.

greptile-apps Bot reviewed Jun 3, 2026


Aarkin7 added 2 commits June 3, 2026 23:03

Aarkin7 wants to merge 3 commits into BerriAI:litellm_internal_staging from Aarkin7:litellm_fix_pattern_router_leak
