Releases: BerriAI/litellm

v1.65.4.dev8

10 Apr 02:35

What's Changed

New Contributors

Full Changelog: v1.65.4.dev6...v1.65.4.dev8

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4.dev8
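
Once the container is up, the proxy accepts OpenAI-compatible requests on port 4000. A minimal sketch of a test call (the model name and the sk-1234 key are placeholders for whatever you have configured, not values shipped with this release):

# send a test request to the running proxy (placeholder model + key)
curl http://localhost:4000/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'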

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 271.9459418950253 | 6.118160191369328 | 0.0 | 1829 | 0 | 215.48997299998973 | 3681.300501999999 |
| Aggregated | Passed ✅ | 240.0 | 271.9459418950253 | 6.118160191369328 | 0.0 | 1829 | 0 | 215.48997299998973 | 3681.300501999999 |

v1.65.4.dev6

09 Apr 01:15

What's Changed

New Contributors

Full Changelog: v1.65.4-nightly...v1.65.4.dev6

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4.dev6

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 264.1081772527121 | 6.162437450043016 | 0.0 | 1844 | 0 | 200.65376200000173 | 5098.356198000033 |
| Aggregated | Passed ✅ | 230.0 | 264.1081772527121 | 6.162437450043016 | 0.0 | 1844 | 0 | 200.65376200000173 | 5098.356198000033 |

v1.65.4-stable

05 Apr 22:51

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4-stable

pip install LiteLLM Proxy

pip install litellm==1.65.4.post1
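
After the pip install, the proxy can also be started straight from the CLI instead of Docker; a minimal sketch, assuming a provider key is already available in the environment (the model name is only an example, not specific to this release):

# start the proxy locally on http://localhost:4000
export OPENAI_API_KEY=sk-...
litellm --model gpt-4o --port 4000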

What's Changed

  • Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9618
  • fix(logging): add json formatting for uncaught exceptions (#9615) by @krrishdholakia in #9619
  • fix: wrong indentation of ttlSecondsAfterFinished in chart by @Dbzman in #9611
  • Fix anthropic thinking + response_format by @krrishdholakia in #9594
  • Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9625
  • fix(openrouter/chat/transformation.py): raise informative message for openrouter key error by @krrishdholakia in #9626
  • [Reliability] - Reduce DB Deadlocks by storing spend updates in Redis and then committing to DB by @ishaan-jaff in #9608
  • [Refactor] - Use single class for managing DB update spend transactions by @ishaan-jaff in #9600
  • Add bedrock latency optimized inference support + Vertex AI Multimodal embedding cost tracking by @krrishdholakia in #9623
  • build(pyproject.toml): add new dev dependencies - for type checking by @krrishdholakia in #9631
  • install prisma migration files - connects litellm proxy to litellm's prisma migration files by @krrishdholakia in #9637
  • update docs for openwebui by @tan-yong-sheng in #9636
  • Add gemini audio input support + handle special tokens in sagemaker response by @krrishdholakia in #9640
  • [Docs - Release notes v0] v1.65.0-stable by @ishaan-jaff in #9643
  • [Feat] - MCP improvements, add support for using SSE MCP servers by @ishaan-jaff in #9642
  • [FIX] - Add password to sync sentinel client by @jmarshall-medallia in #9622
  • fix: Anthropic prompt caching on GCP Vertex AI by @sammcj in #9605
  • Fixes Databricks llama3.3-70b endpoint and add databricks claude 3.7 sonnet endpoint by @anton164 in #9661
  • fix(docs): update xAI Grok vision model reference by @colesmcintosh in #9286
  • docs(gemini): fix typo by @GabrielLoiseau in #9581
  • Update all_caches.md by @KPCOFGS in #9562
  • [Bug fix] - Sagemaker endpoint with inference component streaming by @ishaan-jaff in #9515
  • Revert "Correct Databricks llama3.3-70b endpoint and add databricks c… by @krrishdholakia in #9668
  • Revert "fix: Anthropic prompt caching on GCP Vertex AI" by @krrishdholakia in #9670
  • [Refactor] - Expose litellm.messages.acreate() and litellm.messages.create() to make LLM API calls in Anthropic API spec by @ishaan-jaff in #9567
  • Openrouter streaming fixes + Anthropic 'file' message support by @krrishdholakia in #9667
  • fix(cost_calculator.py): allows checking received + sent model name w… by @krrishdholakia in #9669
  • Revert "Revert "Correct Databricks llama3.3-70b endpoint and add databricks c…" by @krrishdholakia in #9676
  • Update model_prices_and_context_window.json add gemini-2.5-pro-exp-03-25 by @superpoussin22 in #9650
  • fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 by @krrishdholakia in #9671
  • UI (new_usage.tsx): Report 'total_tokens' + report success/failure calls by @krrishdholakia in #9675
  • [Reliability] - Ensure new Redis + DB architecture tracks spend accurately by @ishaan-jaff in #9673
  • [Bug fix] - Service accounts - only apply service_account_settings.enforced_params on service accounts by @ishaan-jaff in #9683
  • UI - New Usage Tab fixes by @krrishdholakia in #9696
  • [Reliability Fixes] - Ensure no deadlocks occur when updating DailyUserSpendTransaction by @ishaan-jaff in #9690
  • Virtual key based policies in Aim Guardrails by @hxtomer in #9499
  • fix(streaming_handler.py): fix completion start time tracking + Anthropic 'reasoning_effort' param mapping by @krrishdholakia in #9688
  • Litellm user daily activity allow non admin usage by @krrishdholakia in #9695
  • fix(model_management_endpoints.py): fix allowing team admins to update team models by @krrishdholakia in #9697
  • Add support for max_completion_tokens to the Cohere chat transformati… by @simha104 in #9701
  • fix(gemini/): add gemini/ route embedding optional param mapping support by @krrishdholakia in #9677
  • Add Google AI Studio /v1/files upload API support by @krrishdholakia in #9645
  • [Docs] High Availability Setup (Resolve DB Deadlocks) by @ishaan-jaff in #9714
  • Bump image-size from 1.1.1 to 1.2.1 in /docs/my-website by @dependabot in #9708
  • [Bug fix] Azure o-series tool calling by @ishaan-jaff in #9694
  • [Reliability Fix] - Use Redis for PodLock Manager instead of PG (ensures no deadlocks occur) by @ishaan-jaff in #9715
  • Ban hardcoded numbers - merge of #9513 by @krrishdholakia in #9709
  • [Feat] Add VertexAI gemini-2.0-flash by @Dobiasd in #9723
  • Fix: Use request body in curl log for Gemini streaming mode by @fengjiajie in #9736
  • LiteLLM Minor Fixes & Improvements (04/02/2025) by @krrishdholakia in #9725
  • fix:Gemini Flash 2.0 implementation is not returning the logprobs by @sajdakabir in #9713
  • UI Improvements + Fixes - remove 'default key' on user signup + fix showing user models available for personal key creation by @krrishdholakia in #9741
  • Fix prompt caching for Anthropic tool calls by @aorwall in #9706
  • passthrough kwargs during acompletion, and unwrap extra_body for openrouter by @adrianlyjak in #9747
  • [Feat] UI - Test Key v2 page - allow testing image endpoints + polish the page by @ishaan-jaff in #9748
  • [Feat] Allow assigning SSO users to teams on MSFT SSO by @ishaan-jaff in #9745
  • Fix VertexAI Credential Caching issue by @krrishdholakia in #9756
  • [Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism by @ishaan-jaff in #9759
  • fix(router.py): support reusable credentials via passthrough router by @krrishdholakia in #9758
  • Allow team members to see team models by @krrishdholakia in #9742
  • fix(xai/chat/transformation.py): filter out 'name' param for xai non-… by @krrishdholakia in #9761
  • Gemini image generation output support by @krrishdholakia in #9646
  • [Fix] issue that metadata key exist, but value is None by @chaosddp in #9764
  • fix(asr-groq): add groq whisper models to model cost map by @liuhu in #9648
  • Update model_prices_and_context_window.json by @caramulrooney in #9620
  • [Reliability] Emit operational metrics for new DB Transaction architecture by @ishaan-jaff in #9719
  • [Security feature] Allow adding authentication on /metrics endpoints by @ishaan-jaff in #9766
  • [Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error by @ishaan-jaff in #9760
  • Fix prisma migrate deploy to use correct directory by @krrishdholakia in #9767
  • Add DBRX Anthropic w/ thinking + response_format support by @krrishdholakia in #9744
  • build: bump litellm-proxy-extras version by @krrishdholakia in #9771
  • Update model_prices by @aoaim in #9768
  • Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in https://gith...

v1.65.4-nightly

05 Apr 17:50

What's Changed

  • UI Improvements + Fixes - remove 'default key' on user signup + fix showing user models available for personal key creation by @krrishdholakia in #9741
  • Fix prompt caching for Anthropic tool calls by @aorwall in #9706
  • passthrough kwargs during acompletion, and unwrap extra_body for openrouter by @adrianlyjak in #9747
  • [Feat] UI - Test Key v2 page - allow testing image endpoints + polish the page by @ishaan-jaff in #9748
  • [Feat] Allow assigning SSO users to teams on MSFT SSO by @ishaan-jaff in #9745
  • Fix VertexAI Credential Caching issue by @krrishdholakia in #9756
  • [Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism by @ishaan-jaff in #9759
  • fix(router.py): support reusable credentials via passthrough router by @krrishdholakia in #9758
  • Allow team members to see team models by @krrishdholakia in #9742
  • fix(xai/chat/transformation.py): filter out 'name' param for xai non-… by @krrishdholakia in #9761
  • Gemini image generation output support by @krrishdholakia in #9646
  • [Fix] issue that metadata key exist, but value is None by @chaosddp in #9764
  • fix(asr-groq): add groq whisper models to model cost map by @liuhu in #9648
  • Update model_prices_and_context_window.json by @caramulrooney in #9620
  • [Reliability] Emit operational metrics for new DB Transaction architecture by @ishaan-jaff in #9719
  • [Security feature] Allow adding authentication on /metrics endpoints by @ishaan-jaff in #9766
  • [Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error by @ishaan-jaff in #9760
  • Fix prisma migrate deploy to use correct directory by @krrishdholakia in #9767
  • Add DBRX Anthropic w/ thinking + response_format support by @krrishdholakia in #9744

New Contributors

Full Changelog: v1.65.3.dev5...v1.65.4-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4-nightly

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 223.7736452877303 | 6.155321270706562 | 0.0 | 1842 | 0 | 181.8848560000106 | 4326.022138999974 |
| Aggregated | Passed ✅ | 200.0 | 223.7736452877303 | 6.155321270706562 | 0.0 | 1842 | 0 | 181.8848560000106 | 4326.022138999974 |

v1.65.3.dev5

04 Apr 04:51

Full Changelog: v1.65.3-nightly...v1.65.3.dev5

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3.dev5

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 257.21842034089684 | 6.187487225938014 | 0.0 | 1851 | 0 | 214.41921800010277 | 1901.027192000015 |
| Aggregated | Passed ✅ | 240.0 | 257.21842034089684 | 6.187487225938014 | 0.0 | 1851 | 0 | 214.41921800010277 | 1901.027192000015 |

v1.65.3-nightly.post1

04 Apr 15:45

Full Changelog: v1.65.3-nightly...v1.65.3-nightly.post1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3-nightly.post1

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 229.62687951995775 | 6.194402083103671 | 0.0 | 1854 | 0 | 166.3033429999814 | 4963.725549999992 |
| Aggregated | Passed ✅ | 190.0 | 229.62687951995775 | 6.194402083103671 | 0.0 | 1854 | 0 | 166.3033429999814 | 4963.725549999992 |

v1.65.3-nightly

03 Apr 21:27

What's Changed

New Contributors

Full Changelog: v1.65.2.dev1...v1.65.3-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3-nightly

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 280.0203592110988 | 6.144626182419757 | 0.0 | 1838 | 0 | 216.55763899997282 | 5015.033350000011 |
| Aggregated | Passed ✅ | 250.0 | 280.0203592110988 | 6.144626182419757 | 0.0 | 1838 | 0 | 216.55763899997282 | 5015.033350000011 |

v1.65.2.dev1

02 Apr 06:14
23051d8

What's Changed

  • Openrouter streaming fixes + Anthropic 'file' message support by @krrishdholakia in #9667
  • fix(cost_calculator.py): allows checking received + sent model name w… by @krrishdholakia in #9669
  • Revert "Revert "Correct Databricks llama3.3-70b endpoint and add databricks c…" by @krrishdholakia in #9676
  • Update model_prices_and_context_window.json add gemini-2.5-pro-exp-03-25 by @superpoussin22 in #9650
  • fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 by @krrishdholakia in #9671
  • UI (new_usage.tsx): Report 'total_tokens' + report success/failure calls by @krrishdholakia in #9675
  • [Reliability] - Ensure new Redis + DB architecture tracks spend accurately by @ishaan-jaff in #9673
  • [Bug fix] - Service accounts - only apply service_account_settings.enforced_params on service accounts by @ishaan-jaff in #9683
  • UI - New Usage Tab fixes by @krrishdholakia in #9696
  • [Reliability Fixes] - Ensure no deadlocks occur when updating DailyUserSpendTransaction by @ishaan-jaff in #9690
  • Virtual key based policies in Aim Guardrails by @hxtomer in #9499
  • fix(streaming_handler.py): fix completion start time tracking + Anthropic 'reasoning_effort' param mapping by @krrishdholakia in #9688

Full Changelog: v1.65.1-nightly...v1.65.2.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.2.dev1

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 228.05634387180868 | 6.2817559363541715 | 0.0 | 1880 | 0 | 183.1938070000092 | 4938.761445000011 |
| Aggregated | Passed ✅ | 200.0 | 228.05634387180868 | 6.2817559363541715 | 0.0 | 1880 | 0 | 183.1938070000092 | 4938.761445000011 |

v1.65.1-nightly

01 Apr 05:39
bc5cc51

What's Changed

New Contributors

Full Changelog: v1.64.1.dev1...v1.65.1-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.1-nightly

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 220.0 | 261.03979166611845 | 6.112143157921839 | 0.0 | 1827 | 0 | 196.8891020000001 | 5075.201525000011 |
| Aggregated | Passed ✅ | 220.0 | 261.03979166611845 | 6.112143157921839 | 0.0 | 1827 | 0 | 196.8891020000001 | 5075.201525000011 |

v1.65.0-stable

30 Mar 06:12

What's Changed

New Contributors

Full Changelog: v1.63.14-stable.patch1...v1.65.0-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.65.0-stable
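
If you'd rather define models in a config file than store them in the DB, the usual pattern is to mount a config into the container and point the proxy at it; a minimal sketch (the model entry and env var below are placeholder assumptions, not part of this release):

# write a minimal proxy config (placeholder model entry)
cat <<'EOF' > litellm_config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
EOF

# mount the config and pass --config to the proxy
docker run \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-e OPENAI_API_KEY \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.65.0-stable \
--config /app/config.yaml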

Don't want to maintain your internal proxy? get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 233.43193575834258 | 6.214443976298119 | 0.0 | 1858 | 0 | 180.17820199997914 | 4614.819022000006 |
| Aggregated | Passed ✅ | 200.0 | 233.43193575834258 | 6.214443976298119 | 0.0 | 1858 | 0 | 180.17820199997914 | 4614.819022000006 |