Releases: BerriAI/litellm
v1.65.4.dev8
What's Changed
- fix: claude haiku cache read pricing per token by @hewliyang in #9834
- Add service annotations to litellm-helm chart by @mlhynfield in #9840
- Reflect key and team update in UI by @crisshaker in #9825
- Add user alias to API endpoint by @Jacobh2 in #9859
- Update Azure Phi-4 pricing by @emerzon in #9862
- feat: add enterpriseWebSearch tool for vertex-ai by @qvalentin in #9856
- VertexAI non-jsonl file storage support by @krrishdholakia in #9781
- [Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) by @ishaan-jaff in #9853
- [Feat SSO] Debug route - allow admins to debug SSO JWT fields by @ishaan-jaff in #9835
New Contributors
- @hewliyang made their first contribution in #9834
- @mlhynfield made their first contribution in #9840
- @crisshaker made their first contribution in #9825
- @qvalentin made their first contribution in #9856
Full Changelog: v1.65.4.dev6...v1.65.4.dev8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4.dev8
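Once the container is running, the proxy exposes an OpenAI-compatible API on port 4000. Below is a minimal sketch of calling its /chat/completions route with the OpenAI Python SDK; the model alias `gpt-4o` and the virtual key `sk-1234` are placeholders you would replace with values configured on your proxy.

```python
# Minimal sketch: call the LiteLLM proxy started with the docker command above.
# Assumes the proxy listens on localhost:4000, a model alias "gpt-4o" is configured,
# and "sk-1234" is a valid LiteLLM virtual key (all placeholders for illustration).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",  # point the SDK at the proxy instead of api.openai.com
    api_key="sk-1234",                 # LiteLLM virtual key, not a provider API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(response.choices[0].message.content)
```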
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 271.9459418950253 | 6.118160191369328 | 0.0 | 1829 | 0 | 215.48997299998973 | 3681.300501999999 |
Aggregated | Passed ✅ | 240.0 | 271.9459418950253 | 6.118160191369328 | 0.0 | 1829 | 0 | 215.48997299998973 | 3681.300501999999 |
v1.65.4.dev6
What's Changed
- build: bump litellm-proxy-extras version by @krrishdholakia in #9771
- Update model_prices by @aoaim in #9768
- Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in #9772
- Add inference providers support for Hugging Face (#8258) (#9738) by @krrishdholakia in #9773
- [UI Bug fix] Don't show duplicate models on Team Admin models page by @ishaan-jaff in #9775
- [UI QA/Bug Fix] - Don't change team, key, org, model values on scroll by @ishaan-jaff in #9776
- [UI Polish] - Polish login screen by @ishaan-jaff in #9778
- Litellm 04 05 2025 release notes by @krrishdholakia in #9785
- feat: add offline swagger docs by @devdev999 in #7653
- fix(gemini/transformation.py): handle file_data being passed in by @krrishdholakia in #9786
- Realtime API Cost tracking by @krrishdholakia in #9795
- fix(vertex_ai.py): move to only passing in accepted keys by vertex ai response schema by @krrishdholakia in #8992
- fix(databricks/chat/transformation.py): remove reasoning_effort from … by @krrishdholakia in #9811
- Handle pydantic base model in message tool calls + Handle tools = [] + handle fireworks ai w/ 'strict' param in function call + support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 by @krrishdholakia in #9774
- Allow passing `thinking` param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) by @krrishdholakia in #9386 (see the sketch after this list)
- [Feat] LiteLLM Tag/Policy Management by @ishaan-jaff in #9813
- Remove redundant `apk update` in Dockerfiles (cc #5016) by @PeterDaveHello in #9055
- [Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling by @ishaan-jaff in #9830
- [Security Fix CVE-2024-6825] Fix remote code execution in post call rules by @ishaan-jaff in #9826
- Bump next from 14.2.25 to 14.2.26 in /ui/litellm-dashboard by @dependabot in #9716
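For the `thinking` pass-through added in #9386 above, here is a minimal sketch of sending that param through the LiteLLM Python SDK against a proxy. The proxy URL, virtual key, model alias, and token budget are placeholders, and routing via the `litellm_proxy/` prefix is an assumption about the intended client-side setup.

```python
# Minimal sketch (assumptions noted above): forward the `thinking` param to an
# Anthropic-style reasoning model served behind a LiteLLM proxy.
import litellm

response = litellm.completion(
    model="litellm_proxy/claude-3-7-sonnet",  # placeholder alias served by the proxy
    api_base="http://localhost:4000",         # placeholder proxy URL
    api_key="sk-1234",                        # placeholder virtual key
    messages=[{"role": "user", "content": "Think step by step: what is 17 * 23?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},  # forwarded to the provider
)
print(response.choices[0].message.content)
```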
New Contributors
Full Changelog: v1.65.4-nightly...v1.65.4.dev6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4.dev6
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 264.1081772527121 | 6.162437450043016 | 0.0 | 1844 | 0 | 200.65376200000173 | 5098.356198000033 |
Aggregated | Passed ✅ | 230.0 | 264.1081772527121 | 6.162437450043016 | 0.0 | 1844 | 0 | 200.65376200000173 | 5098.356198000033 |
v1.65.4-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4-stable
pip install LiteLLM Proxy
pip install litellm==1.65.4.post1
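Besides running the proxy, the same package can be used directly as a Python SDK. A minimal sketch, assuming `OPENAI_API_KEY` is set in the environment; the model name is a placeholder.

```python
# Minimal sketch of a direct SDK call after `pip install litellm`.
# Assumes OPENAI_API_KEY is set in the environment; the model name is a placeholder.
import litellm

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```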
What's Changed
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9618
- fix(logging): add json formatting for uncaught exceptions (#9615) by @krrishdholakia in #9619
- fix: wrong indentation of ttlSecondsAfterFinished in chart by @Dbzman in #9611
- Fix anthropic thinking + response_format by @krrishdholakia in #9594
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9625
- fix(openrouter/chat/transformation.py): raise informative message for openrouter key error by @krrishdholakia in #9626
- [Reliability] - Reduce DB Deadlocks by storing spend updates in Redis and then committing to DB by @ishaan-jaff in #9608
- [Refactor] - Use single class for managing DB update spend transactions by @ishaan-jaff in #9600
- Add bedrock latency optimized inference support + Vertex AI Multimodal embedding cost tracking by @krrishdholakia in #9623
- build(pyproject.toml): add new dev dependencies - for type checking by @krrishdholakia in #9631
- install prisma migration files - connects litellm proxy to litellm's prisma migration files by @krrishdholakia in #9637
- update docs for openwebui by @tan-yong-sheng in #9636
- Add gemini audio input support + handle special tokens in sagemaker response by @krrishdholakia in #9640
- [Docs - Release notes v0] v1.65.0-stable by @ishaan-jaff in #9643
- [Feat] - MCP improvements, add support for using SSE MCP servers by @ishaan-jaff in #9642
- [FIX] - Add password to sync sentinel client by @jmarshall-medallia in #9622
- fix: Anthropic prompt caching on GCP Vertex AI by @sammcj in #9605
- Fixes Databricks llama3.3-70b endpoint and add databricks claude 3.7 sonnet endpoint by @anton164 in #9661
- fix(docs): update xAI Grok vision model reference by @colesmcintosh in #9286
- docs(gemini): fix typo by @GabrielLoiseau in #9581
- Update all_caches.md by @KPCOFGS in #9562
- [Bug fix] - Sagemaker endpoint with inference component streaming by @ishaan-jaff in #9515
- Revert "Correct Databricks llama3.3-70b endpoint and add databricks c… by @krrishdholakia in #9668
- Revert "fix: Anthropic prompt caching on GCP Vertex AI" by @krrishdholakia in #9670
- [Refactor] - Expose litellm.messages.acreate() and litellm.messages.create() to make LLM API calls in Anthropic API spec by @ishaan-jaff in #9567
- Openrouter streaming fixes + Anthropic 'file' message support by @krrishdholakia in #9667
- fix(cost_calculator.py): allows checking received + sent model name w… by @krrishdholakia in #9669
- Revert "Revert "Correct Databricks llama3.3-70b endpoint and add databricks c…" by @krrishdholakia in #9676
- Update model_prices_and_context_window.json add gemini-2.5-pro-exp-03-25 by @superpoussin22 in #9650
- fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 by @krrishdholakia in #9671
- UI (new_usage.tsx): Report 'total_tokens' + report success/failure calls by @krrishdholakia in #9675
- [Reliability] - Ensure new Redis + DB architecture tracks spend accurately by @ishaan-jaff in #9673
- [Bug fix] - Service accounts - only apply `service_account_settings.enforced_params` on service accounts by @ishaan-jaff in #9683
- UI - New Usage Tab fixes by @krrishdholakia in #9696
- [Reliability Fixes] - Ensure no deadlocks occur when updating `DailyUserSpendTransaction` by @ishaan-jaff in #9690
- Virtual key based policies in Aim Guardrails by @hxtomer in #9499
- fix(streaming_handler.py): fix completion start time tracking + Anthropic 'reasoning_effort' param mapping by @krrishdholakia in #9688
- Litellm user daily activity allow non admin usage by @krrishdholakia in #9695
- fix(model_management_endpoints.py): fix allowing team admins to update team models by @krrishdholakia in #9697
- Add support for max_completion_tokens to the Cohere chat transformati… by @simha104 in #9701
- fix(gemini/): add gemini/ route embedding optional param mapping support by @krrishdholakia in #9677
- Add Google AI Studio `/v1/files` upload API support by @krrishdholakia in #9645
- [Docs] High Availability Setup (Resolve DB Deadlocks) by @ishaan-jaff in #9714
- Bump image-size from 1.1.1 to 1.2.1 in /docs/my-website by @dependabot in #9708
- [Bug fix] Azure o-series tool calling by @ishaan-jaff in #9694
- [Reliability Fix] - Use Redis for PodLock Manager instead of PG (ensures no deadlocks occur) by @ishaan-jaff in #9715
- Ban hardcoded numbers - merge of #9513 by @krrishdholakia in #9709
- [Feat] Add VertexAI gemini-2.0-flash by @Dobiasd in #9723
- Fix: Use request body in curl log for Gemini streaming mode by @fengjiajie in #9736
- LiteLLM Minor Fixes & Improvements (04/02/2025) by @krrishdholakia in #9725
- fix:Gemini Flash 2.0 implementation is not returning the logprobs by @sajdakabir in #9713
- UI Improvements + Fixes - remove 'default key' on user signup + fix showing user models available for personal key creation by @krrishdholakia in #9741
- Fix prompt caching for Anthropic tool calls by @aorwall in #9706
- passthrough kwargs during acompletion, and unwrap extra_body for openrouter by @adrianlyjak in #9747
- [Feat] UI - Test Key v2 page - allow testing image endpoints + polish the page by @ishaan-jaff in #9748
- [Feat] Allow assigning SSO users to teams on MSFT SSO by @ishaan-jaff in #9745
- Fix VertexAI Credential Caching issue by @krrishdholakia in #9756
- [Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism by @ishaan-jaff in #9759
- fix(router.py): support reusable credentials via passthrough router by @krrishdholakia in #9758
- Allow team members to see team models by @krrishdholakia in #9742
- fix(xai/chat/transformation.py): filter out 'name' param for xai non-… by @krrishdholakia in #9761
- Gemini image generation output support by @krrishdholakia in #9646
- [Fix] issue that metadata key exist, but value is None by @chaosddp in #9764
- fix(asr-groq): add groq whisper models to model cost map by @liuhu in #9648
- Update model_prices_and_context_window.json by @caramulrooney in #9620
- [Reliability] Emit operational metrics for new DB Transaction architecture by @ishaan-jaff in #9719
- [Security feature] Allow adding authentication on /metrics endpoints by @ishaan-jaff in #9766
- [Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error by @ishaan-jaff in #9760
- Fix prisma migrate deploy to use correct directory by @krrishdholakia in #9767
- Add DBRX Anthropic w/ thinking + response_format support by @krrishdholakia in #9744
- build: bump litellm-proxy-extras version by @krrishdholakia in #9771
- Update model_prices by @aoaim in #9768
- Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in https://gith...
v1.65.4-nightly
What's Changed
- UI Improvements + Fixes - remove 'default key' on user signup + fix showing user models available for personal key creation by @krrishdholakia in #9741
- Fix prompt caching for Anthropic tool calls by @aorwall in #9706
- passthrough kwargs during acompletion, and unwrap extra_body for openrouter by @adrianlyjak in #9747
- [Feat] UI - Test Key v2 page - allow testing image endpoints + polish the page by @ishaan-jaff in #9748
- [Feat] Allow assigning SSO users to teams on MSFT SSO by @ishaan-jaff in #9745
- Fix VertexAI Credential Caching issue by @krrishdholakia in #9756
- [Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism by @ishaan-jaff in #9759
- fix(router.py): support reusable credentials via passthrough router by @krrishdholakia in #9758
- Allow team members to see team models by @krrishdholakia in #9742
- fix(xai/chat/transformation.py): filter out 'name' param for xai non-… by @krrishdholakia in #9761
- Gemini image generation output support by @krrishdholakia in #9646
- [Fix] issue that metadata key exist, but value is None by @chaosddp in #9764
- fix(asr-groq): add groq whisper models to model cost map by @liuhu in #9648
- Update model_prices_and_context_window.json by @caramulrooney in #9620
- [Reliability] Emit operational metrics for new DB Transaction architecture by @ishaan-jaff in #9719
- [Security feature] Allow adding authentication on /metrics endpoints by @ishaan-jaff in #9766
- [Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error by @ishaan-jaff in #9760
- Fix prisma migrate deploy to use correct directory by @krrishdholakia in #9767
- Add DBRX Anthropic w/ thinking + response_format support by @krrishdholakia in #9744
New Contributors
- @aorwall made their first contribution in #9706
- @adrianlyjak made their first contribution in #9747
- @chaosddp made their first contribution in #9764
- @liuhu made their first contribution in #9648
- @caramulrooney made their first contribution in #9620
Full Changelog: v1.65.3.dev5...v1.65.4-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 223.7736452877303 | 6.155321270706562 | 0.0 | 1842 | 0 | 181.8848560000106 | 4326.022138999974 |
Aggregated | Passed ✅ | 200.0 | 223.7736452877303 | 6.155321270706562 | 0.0 | 1842 | 0 | 181.8848560000106 | 4326.022138999974 |
v1.65.3.dev5
Full Changelog: v1.65.3-nightly...v1.65.3.dev5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3.dev5
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 257.21842034089684 | 6.187487225938014 | 0.0 | 1851 | 0 | 214.41921800010277 | 1901.027192000015 |
Aggregated | Passed ✅ | 240.0 | 257.21842034089684 | 6.187487225938014 | 0.0 | 1851 | 0 | 214.41921800010277 | 1901.027192000015 |
v1.65.3-nightly.post1
Full Changelog: v1.65.3-nightly...v1.65.3-nightly.post1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3-nightly.post1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 229.62687951995775 | 6.194402083103671 | 0.0 | 1854 | 0 | 166.3033429999814 | 4963.725549999992 |
Aggregated | Passed ✅ | 190.0 | 229.62687951995775 | 6.194402083103671 | 0.0 | 1854 | 0 | 166.3033429999814 | 4963.725549999992 |
v1.65.3-nightly
What's Changed
- Litellm user daily activity allow non admin usage by @krrishdholakia in #9695
- fix(model_management_endpoints.py): fix allowing team admins to update team models by @krrishdholakia in #9697
- Add support for max_completion_tokens to the Cohere chat transformati… by @simha104 in #9701
- fix(gemini/): add gemini/ route embedding optional param mapping support by @krrishdholakia in #9677
- Add Google AI Studio `/v1/files` upload API support by @krrishdholakia in #9645
- [Docs] High Availability Setup (Resolve DB Deadlocks) by @ishaan-jaff in #9714
- Bump image-size from 1.1.1 to 1.2.1 in /docs/my-website by @dependabot in #9708
- [Bug fix] Azure o-series tool calling by @ishaan-jaff in #9694
- [Reliability Fix] - Use Redis for PodLock Manager instead of PG (ensures no deadlocks occur) by @ishaan-jaff in #9715
- Ban hardcoded numbers - merge of #9513 by @krrishdholakia in #9709
- [Feat] Add VertexAI gemini-2.0-flash by @Dobiasd in #9723
- Fix: Use request body in curl log for Gemini streaming mode by @fengjiajie in #9736
- LiteLLM Minor Fixes & Improvements (04/02/2025) by @krrishdholakia in #9725
- fix:Gemini Flash 2.0 implementation is not returning the logprobs by @sajdakabir in #9713
New Contributors
- @simha104 made their first contribution in #9701
- @Dobiasd made their first contribution in #9723
- @sajdakabir made their first contribution in #9713
Full Changelog: v1.65.2.dev1...v1.65.3-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 280.0203592110988 | 6.144626182419757 | 0.0 | 1838 | 0 | 216.55763899997282 | 5015.033350000011 |
Aggregated | Passed ✅ | 250.0 | 280.0203592110988 | 6.144626182419757 | 0.0 | 1838 | 0 | 216.55763899997282 | 5015.033350000011 |
v1.65.2.dev1
What's Changed
- Openrouter streaming fixes + Anthropic 'file' message support by @krrishdholakia in #9667
- fix(cost_calculator.py): allows checking received + sent model name w… by @krrishdholakia in #9669
- Revert "Revert "Correct Databricks llama3.3-70b endpoint and add databricks c…" by @krrishdholakia in #9676
- Update model_prices_and_context_window.json add gemini-2.5-pro-exp-03-25 by @superpoussin22 in #9650
- fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 by @krrishdholakia in #9671
- UI (new_usage.tsx): Report 'total_tokens' + report success/failure calls by @krrishdholakia in #9675
- [Reliability] - Ensure new Redis + DB architecture tracks spend accurately by @ishaan-jaff in #9673
- [Bug fix] - Service accounts - only apply `service_account_settings.enforced_params` on service accounts by @ishaan-jaff in #9683
- UI - New Usage Tab fixes by @krrishdholakia in #9696
- [Reliability Fixes] - Ensure no deadlocks occur when updating `DailyUserSpendTransaction` by @ishaan-jaff in #9690
- Virtual key based policies in Aim Guardrails by @hxtomer in #9499
- fix(streaming_handler.py): fix completion start time tracking + Anthropic 'reasoning_effort' param mapping by @krrishdholakia in #9688
Full Changelog: v1.65.1-nightly...v1.65.2.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.2.dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 228.05634387180868 | 6.2817559363541715 | 0.0 | 1880 | 0 | 183.1938070000092 | 4938.761445000011 |
Aggregated | Passed ✅ | 200.0 | 228.05634387180868 | 6.2817559363541715 | 0.0 | 1880 | 0 | 183.1938070000092 | 4938.761445000011 |
v1.65.1-nightly
What's Changed
- Litellm fix db testing by @krrishdholakia in #9593
- Litellm new UI build by @krrishdholakia in #9601
- Support max_completion_tokens on Mistral by @Cmancuso in #9589
- Revert "Support max_completion_tokens on Mistral" by @krrishdholakia in #9604
- fix(mistral_chat_transformation.py): add missing comma by @krrishdholakia in #9606
- Support discovering gemini, anthropic, xai models by calling their `/v1/model` endpoint by @krrishdholakia in #9530
- Connect UI to "LiteLLM_DailyUserSpend" spend table - enables usage tab to work at 1m+ spend logs by @krrishdholakia in #9603
- Update README.md by @krrishdholakia in #9616
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9618
- fix(proxy_server.py): get master key from environment, if not set in … by @krrishdholakia in #9617
- fix(logging): add json formatting for uncaught exceptions (#9615) by @krrishdholakia in #9619
- fix: wrong indentation of ttlSecondsAfterFinished in chart by @Dbzman in #9611
- Fix anthropic thinking + response_format by @krrishdholakia in #9594
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9625
- fix(openrouter/chat/transformation.py): raise informative message for openrouter key error by @krrishdholakia in #9626
- [Reliability] - Reduce DB Deadlocks by storing spend updates in Redis and then committing to DB by @ishaan-jaff in #9608
- Add bedrock latency optimized inference support + Vertex AI Multimodal embedding cost tracking by @krrishdholakia in #9623
- build(pyproject.toml): add new dev dependencies - for type checking by @krrishdholakia in #9631
- install prisma migration files - connects litellm proxy to litellm's prisma migration files by @krrishdholakia in #9637
- update docs for openwebui by @tan-yong-sheng in #9636
- Add gemini audio input support + handle special tokens in sagemaker response by @krrishdholakia in #9640
- [Docs - Release notes v0] v1.65.0-stable by @ishaan-jaff in #9643
- [Feat] - MCP improvements, add support for using SSE MCP servers by @ishaan-jaff in #9642
- [FIX] - Add password to sync sentinel client by @jmarshall-medallia in #9622
- fix: Anthropic prompt caching on GCP Vertex AI by @sammcj in #9605
- Fixes Databricks llama3.3-70b endpoint and add databricks claude 3.7 sonnet endpoint by @anton164 in #9661
- fix(docs): update xAI Grok vision model reference by @colesmcintosh in #9286
- docs(gemini): fix typo by @GabrielLoiseau in #9581
- Update all_caches.md by @KPCOFGS in #9562
- [Bug fix] - Sagemaker endpoint with inference component streaming by @ishaan-jaff in #9515
- Revert "Correct Databricks llama3.3-70b endpoint and add databricks c… by @krrishdholakia in #9668
- Revert "fix: Anthropic prompt caching on GCP Vertex AI" by @krrishdholakia in #9670
- [Refactor] - Expose litellm.messages.acreate() and litellm.messages.create() to make LLM API calls in Anthropic API spec by @ishaan-jaff in #9567
New Contributors
- @Cmancuso made their first contribution in #9589
- @Dbzman made their first contribution in #9611
- @tan-yong-sheng made their first contribution in #9636
- @jmarshall-medallia made their first contribution in #9622
- @GabrielLoiseau made their first contribution in #9581
- @KPCOFGS made their first contribution in #9562
Full Changelog: v1.64.1.dev1...v1.65.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.1-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 261.03979166611845 | 6.112143157921839 | 0.0 | 1827 | 0 | 196.8891020000001 | 5075.201525000011 |
Aggregated | Passed ✅ | 220.0 | 261.03979166611845 | 6.112143157921839 | 0.0 | 1827 | 0 | 196.8891020000001 | 5075.201525000011 |
v1.65.0-stable
What's Changed
- Fix route check for non-proxy admins on jwt auth by @krrishdholakia in #9454
- docs(predibase): fix typo by @luisegarduno in #9464
- build(deps): bump next from 14.2.21 to 14.2.25 in /ui/litellm-dashboard by @dependabot in #9458
- [Feat] Add OpenAI Web Search Tool Call Support - Initial support by @ishaan-jaff in #9465
- Refactor vertex ai passthrough routes - fixes unpredictable behaviour w/ auto-setting default_vertex_region on router model add by @krrishdholakia in #9467
- [Feat] Add testing for `litellm.supports_web_search()` and render supports_web_search on model hub by @ishaan-jaff in #9469 (see the sketch after this list)
- Litellm dev 03 22 2025 release note by @krrishdholakia in #9475
- build: add new vertex text embedding model by @krrishdholakia in #9476
- enables viewing all wildcard models on /model/info by @krrishdholakia in #9473
- Litellm redis semantic caching by @tylerhutcherson in #9356
- Log 'api_base' on spend logs by @krrishdholakia in #9509
- [Fix] Use StandardLoggingPayload for GCS Pub Sub Logging Integration by @ishaan-jaff in #9508
- [Feat] Support for exposing MCP tools on litellm proxy by @ishaan-jaff in #9426
- fix(invoke_handler.py): remove hard coded final usage chunk on bedrock streaming usage by @krrishdholakia in #9512
- Add vertexai topLogprobs support by @krrishdholakia in #9518
- Update model_prices_and_context_window.json by @superpoussin22 in #9459
- fix vertex ai multimodal embedding translation by @krrishdholakia in #9471
- ci(publish-migrations.yml): add action for publishing prisma db migrations by @krrishdholakia in #9537
- [Feat - New Model] Add VertexAI `gemini-2.0-flash-lite` and Google AI Studio `gemini-2.0-flash-lite` by @ishaan-jaff in #9523
- Support `litellm.api_base` for vertex_ai + gemini/ across completion, embedding, image_generation by @krrishdholakia in #9516
- Nova Canvas complete image generation tasks (#9177) by @krrishdholakia in #9525
- [Feature]: Support for Fine-Tuned Vertex AI LLMs by @ishaan-jaff in #9542
- feat(prisma-migrations): add baseline db migration file by @krrishdholakia in #9565
- Add Daily User Spend Aggregate view - allows UI Usage tab to work > 1m rows by @krrishdholakia in #9538
- Support Gemini audio token cost tracking + fix openai audio input token cost tracking by @krrishdholakia in #9535
- [Reliability Fixes] - Gracefully handle exceptions when DB is having an outage by @ishaan-jaff in #9533
- [Reliability Fix] - Allow Pods to startup + passing /health/readiness when `allow_requests_on_db_unavailable: True` and DB is down by @ishaan-jaff in #9569
- Add OpenAI gpt-4o-transcribe support by @krrishdholakia in #9517
- Allow viewing keyinfo on request logs by @krrishdholakia in #9568
- Allow team admins to add/update/delete models on UI + show api base and model id on request logs by @krrishdholakia in #9572
- Litellm fix db testing by @krrishdholakia in #9593
- Litellm new UI build by @krrishdholakia in #9601
- Support max_completion_tokens on Mistral by @Cmancuso in #9589
- Revert "Support max_completion_tokens on Mistral" by @krrishdholakia in #9604
- fix(mistral_chat_transformation.py): add missing comma by @krrishdholakia in #9606
- Support discovering gemini, anthropic, xai models by calling their `/v1/model` endpoint by @krrishdholakia in #9530
- Connect UI to "LiteLLM_DailyUserSpend" spend table - enables usage tab to work at 1m+ spend logs by @krrishdholakia in #9603
- Update README.md by @krrishdholakia in #9616
- fix(proxy_server.py): get master key from environment, if not set in … by @krrishdholakia in #9617
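Related to the `litellm.supports_web_search()` item (#9469) in the list above, here is a minimal sketch of gating web-search usage on that capability check; the model names are placeholders and the keyword-argument form is an assumption rather than something taken from the PR.

```python
# Minimal sketch: check litellm's web-search capability flag before enabling the tool.
# Model names are placeholders; the exact argument form is assumed, not taken from #9469.
import litellm

for model in ["openai/gpt-4o-search-preview", "openai/gpt-4o-mini"]:
    if litellm.supports_web_search(model=model):
        print(f"{model}: web search tool supported")
    else:
        print(f"{model}: no web search support reported")
```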
New Contributors
- @luisegarduno made their first contribution in #9464
- @Cmancuso made their first contribution in #9589
Full Changelog: v1.63.14-stable.patch1...v1.65.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.65.0-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 233.43193575834258 | 6.214443976298119 | 0.0 | 1858 | 0 | 180.17820199997914 | 4614.819022000006 |
Aggregated | Passed ✅ | 200.0 | 233.43193575834258 | 6.214443976298119 | 0.0 | 1858 | 0 | 180.17820199997914 | 4614.819022000006 |