Releases: BerriAI/litellm
v1.63.6-nightly
What's Changed
- pricing for jamba new models by @themrzmaster in #9032
- build(deps): bump jinja2 from 3.1.4 to 3.1.6 by @dependabot in #9014
- (docs) add section for contributing to litellm by @ishaan-jaff in #9107
- build: Add Makefile for LiteLLM project with test targets by @colesmcintosh in #8948
- (Docs) - Contributing to litellm by @ishaan-jaff in #9110
- Added tags, user_feedback and model_options to additional_keys which can be sent to athina by @vivek-athina in #8845
- fix missing comma by @niinpatel in #8746
- Update model_prices_and_context_window.json by @mounta11n in #8757
- Fix triton streaming completions bug by @minwhoo in #8386
- (docs) Update vertex.md old code example by @santibreo in #7736
- (Feat) - Allow adding Text-Completion OpenAI models through UI by @ishaan-jaff in #9102
- docs(pr-template): update unit test command in checklist by @colesmcintosh in #9119
- [UI SSO Bug fix] - Correctly use PROXY_LOGOUT_URL when set by @ishaan-jaff in #9117
- Validate model_prices_and_context_window.json with a test, clarify possible mode values + ensure consistent use of mode by @utkashd in #8956
- JWT Auth Fix - [Bug]: JWT access with Groups not working when team is assigned All Proxy Models access by @ishaan-jaff in #8934
New Contributors
- @mounta11n made their first contribution in #8757
- @minwhoo made their first contribution in #8386
- @santibreo made their first contribution in #7736
- @utkashd made their first contribution in #8956
Full Changelog: v1.63.3.dev1...v1.63.6-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.6-nightly
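Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. The snippet below is a minimal sketch of calling it with the OpenAI Python SDK; the sk-1234 key and the gpt-4o model alias are placeholders rather than values shipped with this release — use whatever virtual key and model names are configured on your proxy.

from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy started above.
client = OpenAI(
    base_url="http://localhost:4000",  # proxy port from -p 4000:4000
    api_key="sk-1234",                 # placeholder: your proxy/virtual key
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any model alias configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(response.choices[0].message.content)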
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 258.1422458623163 | 6.0939161635327785 | 0.0 | 1823 | 0 | 213.26022699997793 | 2549.854018000019 |
Aggregated | Passed ✅ | 240.0 | 258.1422458623163 | 6.0939161635327785 | 0.0 | 1823 | 0 | 213.26022699997793 | 2549.854018000019 |
v1.63.5-nightly
What's Changed
- fix(team_endpoints.py): ensure 404 raised when team not found + fix setting tags on keys by @krrishdholakia in #9038
- build(model_prices_and_context_window.json): update azure o1 mini pri… by @krrishdholakia in #9046
- Support master key rotations by @krrishdholakia in #9041
- (Feat) - add pricing for eu.amazon.nova models by @ishaan-jaff in #9056
- docs: Add project page for pgai by @Askir in #8576
- Mark several Claude models as being able to accept PDF inputs by @minhduc0711 in #9054
- (UI) - Keys Page - Show 100 Keys Per Page, Use full height, increase width of key alias by @ishaan-jaff in #9064
- (UI) Logs Page - Keep expanded log in focus on LiteLLM UI by @ishaan-jaff in #9061
- (Docs) OpenWeb x LiteLLM Docker compose + Instructions on spend tracking + logging by @ishaan-jaff in #9059
- (UI) - Allow adding Cerebras, Sambanova, Perplexity, Fireworks, Openrouter, TogetherAI Models on Admin UI by @ishaan-jaff in #9069
- UI - new API Playground for testing LiteLLM translation by @krrishdholakia in #9073
- Bug fix - String data: stripped from entire content in streamed Gemini responses by @ishaan-jaff in #9070
- (UI) - Minor improvements to logs page by @ishaan-jaff in #9076
- Bug fix: support bytes.IO when handling audio files for transcription by @tvishwanadha in #9071
- Fix batches api cost tracking + Log batch models in spend logs / standard logging payload by @krrishdholakia in #9077
- (UI) - Fix, Allow Filter Keys by Team Alias, Key Alias and Org by @ishaan-jaff in #9083
- (Clean up) - Allow switching off storing Error Logs in DB by @ishaan-jaff in #9084
- (UI) - Fix show correct count of internal user keys on Users Page by @ishaan-jaff in #9082
- New stable release notes by @krrishdholakia in #9085
- Litellm dev 03 08 2025 p3 by @krrishdholakia in #9089
- feat: prioritize api_key over tenant_id for more Azure AD token provi… by @krrishdholakia in #8701
- Fix incorrect streaming response by @5aaee9 in #9081
- Support openrouter reasoning_content on streaming by @krrishdholakia in #9094
- pricing for jamba new models by @themrzmaster in #9032
New Contributors
- @Askir made their first contribution in #8576
- @tvishwanadha made their first contribution in #9071
- @5aaee9 made their first contribution in #9081
Full Changelog: v1.63.3-nightly...v1.63.5-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.5-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 265.2487556257438 | 6.181834559182228 | 0.0 | 1849 | 0 | 214.44034500001408 | 3942.616398000041 |
Aggregated | Passed ✅ | 250.0 | 265.2487556257438 | 6.181834559182228 | 0.0 | 1849 | 0 | 214.44034500001408 | 3942.616398000041 |
v1.63.3.dev1
What's Changed
- fix(team_endpoints.py): ensure 404 raised when team not found + fix setting tags on keys by @krrishdholakia in #9038
- build(model_prices_and_context_window.json): update azure o1 mini pri… by @krrishdholakia in #9046
- Support master key rotations by @krrishdholakia in #9041
- (Feat) - add pricing for eu.amazon.nova models by @ishaan-jaff in #9056
- docs: Add project page for pgai by @Askir in #8576
- Mark several Claude models as being able to accept PDF inputs by @minhduc0711 in #9054
- (UI) - Keys Page - Show 100 Keys Per Page, Use full height, increase width of key alias by @ishaan-jaff in #9064
- (UI) Logs Page - Keep expanded log in focus on LiteLLM UI by @ishaan-jaff in #9061
- (Docs) OpenWeb x LiteLLM Docker compose + Instructions on spend tracking + logging by @ishaan-jaff in #9059
- (UI) - Allow adding Cerebras, Sambanova, Perplexity, Fireworks, Openrouter, TogetherAI Models on Admin UI by @ishaan-jaff in #9069
- UI - new API Playground for testing LiteLLM translation by @krrishdholakia in #9073
- Bug fix - String data: stripped from entire content in streamed Gemini responses by @ishaan-jaff in #9070
- (UI) - Minor improvements to logs page by @ishaan-jaff in #9076
- Bug fix: support bytes.IO when handling audio files for transcription by @tvishwanadha in #9071
- Fix batches api cost tracking + Log batch models in spend logs / standard logging payload by @krrishdholakia in #9077
- (UI) - Fix, Allow Filter Keys by Team Alias, Key Alias and Org by @ishaan-jaff in #9083
- (Clean up) - Allow switching off storing Error Logs in DB by @ishaan-jaff in #9084
- (UI) - Fix show correct count of internal user keys on Users Page by @ishaan-jaff in #9082
- New stable release notes by @krrishdholakia in #9085
- Litellm dev 03 08 2025 p3 by @krrishdholakia in #9089
- feat: prioritize api_key over tenant_id for more Azure AD token provi… by @krrishdholakia in #8701
- Fix incorrect streaming response by @5aaee9 in #9081
- Support openrouter reasoning_content on streaming by @krrishdholakia in #9094
New Contributors
- @Askir made their first contribution in #8576
- @tvishwanadha made their first contribution in #9071
- @5aaee9 made their first contribution in #9081
Full Changelog: v1.63.3-nightly...v1.63.3.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.3.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 228.5953675353703 | 6.234609422669878 | 0.0 | 1866 | 0 | 180.65118199996277 | 3985.566232999986 |
Aggregated | Passed ✅ | 200.0 | 228.5953675353703 | 6.234609422669878 | 0.0 | 1866 | 0 | 180.65118199996277 | 3985.566232999986 |
v1.63.2-stable
Full Changelog: v1.61.20-stable...v1.63.2-stable
New Models / Updated Models
- Add supports_pdf_input: true for specific Bedrock Claude models
LLM Translation
- Support /openai/ passthrough for Assistant endpoints
- Bedrock Claude - fix amazon anthropic claude 3 tool calling transformation on invoke route
- Bedrock Claude - response_format support for claude on invoke route
- Bedrock - pass description if set in response_format
- Bedrock - Fix passing response_format: {"type": "text"}
- OpenAI - Handle sending image_url as str to openai
- Deepseek - Fix deepseek 'reasoning_content' error
- Caching - Support caching on reasoning content
- Bedrock - handle thinking blocks in assistant message
- Anthropic - Return signature on anthropic streaming + migrate to signature field instead of signature_delta
- Support format param for specifying image type
- Anthropic - /v1/messages endpoint - thinking param support (see the sketch after this list); note: this refactors the [BETA] unified /v1/messages endpoint to just work for the Anthropic API.
- Vertex AI - handle $id in response schema when calling vertex ai
- Support
Spend Tracking Improvements
Management Endpoints / UI
- Allow team/org filters to be searchable on the Create Key Page
- Add created_by and updated_by fields to Keys table
- Show 'user_email' on key table on UI
- (Feat) - Show Error Logs on LiteLLM UI
- UI - Allow admin to control default model access for internal users
- (UI) - Allow Internal Users to View their own logs
- (UI) Fix session handling with cookies
- Keys Page - Show 100 Keys Per Page, Use full height, increase width of key alias
Logging / Guardrail Integrations
Performance / Loadbalancing / Reliability improvements
General Proxy Improvements
- Raise BadRequestError when unknown model passed in request
- Enforce model access restrictions on Azure OpenAI proxy route
- Reliability fix - Handle emoji’s in text - fix orjson error
- Model Access Patch - don't overwrite litellm.anthropic_models when running auth checks
- Enable setting timezone information in docker image
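Since the unified /v1/messages route now follows the Anthropic API spec, a request that enables the thinking param can be sent straight to the proxy. A minimal sketch, assuming a proxy on localhost:4000, a placeholder virtual key sk-1234, and a claude-3-7-sonnet-20250219 model alias configured on the proxy (all three are assumptions, not values from this release):

import requests

# POST to the Anthropic-spec /v1/messages route on the LiteLLM proxy.
resp = requests.post(
    "http://localhost:4000/v1/messages",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
    json={
        "model": "claude-3-7-sonnet-20250219",    # placeholder model alias
        "max_tokens": 2048,
        # thinking budget must stay below max_tokens per the Anthropic spec
        "thinking": {"type": "enabled", "budget_tokens": 1024},
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
)
print(resp.json())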
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.63.2-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 223.19371836864636 | 6.25209576552295 | 0.0033451555727784642 | 1869 | 1 | 89.92210900004238 | 1948.821826000028 |
Aggregated | Passed ✅ | 190.0 | 223.19371836864636 | 6.25209576552295 | 0.0033451555727784642 | 1869 | 1 | 89.92210900004238 | 1948.821826000028 |
v1.63.3-nightly
What's Changed
- Fix redis cluster mode for routers by @ogunoz in #9010
- [Feat] - Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) by @ishaan-jaff in #9029
- (AWS Secret Manager) - Using K/V pairs in 1 AWS Secret by @ishaan-jaff in #9039
- (Docs) connect litellm to open web ui by @ishaan-jaff in #9040
- Added PDL project by @vazirim in #8925
- (UI) - Allow adding EU OpenAI models by @ishaan-jaff in #9042
New Contributors
Full Changelog: v1.63.2-nightly...v1.63.3-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.3-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 274.33505649537244 | 6.144475001880859 | 0.0 | 1837 | 0 | 199.62131199997657 | 3623.5841269999582 |
Aggregated | Passed ✅ | 220.0 | 274.33505649537244 | 6.144475001880859 | 0.0 | 1837 | 0 | 199.62131199997657 | 3623.5841269999582 |
v1.63.2-nightly
What's Changed
- Return signature on bedrock converse thinking + Fix {} empty dictionary on streaming + thinking by @krrishdholakia in #9023
- (Refactor) /v1/messages to follow simpler logic for Anthropic API spec by @ishaan-jaff in #9013
Full Changelog: v1.63.0-nightly...v1.63.2-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.2-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 283.0173457426872 | 6.168530673577194 | 0.0 | 1846 | 0 | 214.4760310000038 | 4984.3768089999685 |
Aggregated | Passed ✅ | 250.0 | 283.0173457426872 | 6.168530673577194 | 0.0 | 1846 | 0 | 214.4760310000038 | 4984.3768089999685 |
v1.63.0.dev5
What's Changed
- Return signature on bedrock converse thinking + Fix {} empty dictionary on streaming + thinking by @krrishdholakia in #9023
- (Refactor) /v1/messages to follow simpler logic for Anthropic API spec by @ishaan-jaff in #9013
Full Changelog: v1.63.0-nightly...v1.63.0.dev5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.0.dev5
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 278.42101090109276 | 6.116149255066882 | 0.0 | 1830 | 0 | 214.94648899999902 | 4750.29671599998 |
Aggregated | Passed ✅ | 250.0 | 278.42101090109276 | 6.116149255066882 | 0.0 | 1830 | 0 | 214.94648899999902 | 4750.29671599998 |
v1.63.0.dev1
Full Changelog: v1.63.0-nightly...v1.63.0.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.0.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 209.86284151312142 | 6.250523763835477 | 0.0 | 1867 | 0 | 163.62763399996538 | 3461.6653150000047 |
Aggregated | Passed ✅ | 190.0 | 209.86284151312142 | 6.250523763835477 | 0.0 | 1867 | 0 | 163.62763399996538 | 3461.6653150000047 |
v1.63.0-nightly
What's Changed
- Fix #7629 - Add tzdata package to Dockerfile (#8915) by @krrishdholakia in #9009
- Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] by @krrishdholakia in #9021
- Support format param for specifying image type by @krrishdholakia in #9019
v1.63.0 fixes the Anthropic 'thinking' response on streaming to return the signature block (Github Issue).
It also moves the response structure from signature_delta to signature, to be the same as Anthropic (Anthropic Docs).
Diff
"message": {
...
"reasoning_content": "The capital of France is Paris.",
"thinking_blocks": [
{
"type": "thinking",
"thinking": "The capital of France is Paris.",
- "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+ "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
}
]
}
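If client code was reading signature_delta, a small helper can accept both field names while older and newer responses are in circulation. A minimal sketch in Python; the message dict mirrors the shape in the diff above, and the helper name is illustrative:

def get_thinking_signatures(message: dict) -> list[str]:
    # Collect signatures from thinking blocks, accepting old and new field names.
    signatures = []
    for block in message.get("thinking_blocks") or []:
        if block.get("type") != "thinking":
            continue
        # v1.63.0+ returns "signature"; earlier releases used "signature_delta"
        sig = block.get("signature") or block.get("signature_delta")
        if sig:
            signatures.append(sig)
    return signatures

message = {
    "reasoning_content": "The capital of France is Paris.",
    "thinking_blocks": [
        {
            "type": "thinking",
            "thinking": "The capital of France is Paris.",
            "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+...",
        }
    ],
}
print(get_thinking_signatures(message))  # ['EqoBCkgIARABGAIiQL2UoU0b1OHYi+...']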
Full Changelog: v1.62.4-nightly...v1.63.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.0-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 272.1226933173393 | 6.127690671911355 | 0.0 | 1834 | 0 | 217.38513100001455 | 3752.371346000018 |
Aggregated | Passed ✅ | 250.0 | 272.1226933173393 | 6.127690671911355 | 0.0 | 1834 | 0 | 217.38513100001455 | 3752.371346000018 |
v1.62.4-nightly
What's Changed
- Fix deepseek 'reasoning_content' error by @krrishdholakia in #8963
- (UI) Fix session handling with cookies by @ishaan-jaff in #8969
- (UI) - Improvements to session handling logic by @ishaan-jaff in #8970
- fix(route_llm_request.py): move to using common router, for client-side credentials by @krrishdholakia in #8966
- Litellm dev 03 01 2025 p2 by @krrishdholakia in #8944
- Support caching on reasoning content + other fixes by @krrishdholakia in #8973
- fix(common_utils.py): handle $id in response schema when calling vert… by @krrishdholakia in #8991
- (bug fix) - Fix Cache Health Check for Redis when redis_version is float by @ishaan-jaff in #8979
- (UI) - Security Improvement, move to JWT Auth for Admin UI Sessions by @ishaan-jaff in #8995
- Litellm dev 03 04 2025 p3 by @krrishdholakia in #8997
- fix(base_aws_llm.py): remove region name before sending in args by @krrishdholakia in #8998
Full Changelog: v1.62.1-nightly...v1.62.4-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.62.4-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 255.24015655585677 | 6.161171624266898 | 0.0 | 1844 | 0 | 200.43409900000597 | 1911.432934000004 |
Aggregated | Passed ✅ | 230.0 | 255.24015655585677 | 6.161171624266898 | 0.0 | 1844 | 0 | 200.43409900000597 | 1911.432934000004 |