Releases: BerriAI/litellm

v1.63.6-nightly

11 Mar 06:31

What's Changed

New Contributors

Full Changelog: v1.63.3.dev1...v1.63.6-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.6-nightly
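
Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A quick smoke test, assuming you have configured at least one model and a master key of sk-1234 (both are placeholders, not part of this release):

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "ping"}]
  }'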

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 258.1422458623163 | 6.0939161635327785 | 0.0 | 1823 | 0 | 213.26022699997793 | 2549.854018000019 |
| Aggregated | Passed ✅ | 240.0 | 258.1422458623163 | 6.0939161635327785 | 0.0 | 1823 | 0 | 213.26022699997793 | 2549.854018000019 |

v1.63.5-nightly

10 Mar 22:15
0fcce63

What's Changed

New Contributors

Full Changelog: v1.63.3-nightly...v1.63.5-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.5-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 265.2487556257438 | 6.181834559182228 | 0.0 | 1849 | 0 | 214.44034500001408 | 3942.616398000041 |
| Aggregated | Passed ✅ | 250.0 | 265.2487556257438 | 6.181834559182228 | 0.0 | 1849 | 0 | 214.44034500001408 | 3942.616398000041 |

v1.63.3.dev1

10 Mar 20:29

What's Changed

New Contributors

Full Changelog: v1.63.3-nightly...v1.63.3.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.3.dev1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 228.5953675353703 | 6.234609422669878 | 0.0 | 1866 | 0 | 180.65118199996277 | 3985.566232999986 |
| Aggregated | Passed ✅ | 200.0 | 228.5953675353703 | 6.234609422669878 | 0.0 | 1866 | 0 | 180.65118199996277 | 3985.566232999986 |

v1.63.2-stable

09 Mar 03:12

Full Changelog: v1.61.20-stable...v1.63.2-stable

  1. New Models / Updated Models

    1. Add supports_pdf_input: true for specific Bedrock Claude models
  2. LLM Translation

    1. Support /openai/ passthrough for Assistant endpoints
    2. Bedrock Claude - Fix Amazon Anthropic Claude 3 tool-calling transformation on the invoke route
    3. Bedrock Claude - response_format support for Claude on the invoke route
    4. Bedrock - Pass description if set in response_format
    5. Bedrock - Fix passing response_format: {"type": "text"}
    6. OpenAI - Handle sending image_url as a str to OpenAI
    7. Deepseek - Fix Deepseek 'reasoning_content' error
    8. Caching - Support caching on reasoning content
    9. Bedrock - Handle thinking blocks in assistant messages
    10. Anthropic - Return signature on Anthropic streaming and migrate to the signature field instead of signature_delta
    11. Support the format param for specifying image type
    12. Anthropic - /v1/messages endpoint - thinking param support. Note: this refactors the [BETA] unified /v1/messages endpoint to work only with the Anthropic API (a request sketch follows this list).
    13. Vertex AI - Handle $id in the response schema when calling Vertex AI
  3. Spend Tracking Improvements

    1. Batches API - Fix cost calculation to run on retrieve_batch
    2. Batches API - Log batch models in spend logs / standard logging payload
  4. Management Endpoints / UI

    1. Allow team/org filters to be searchable on the Create Key page
    2. Add created_by and updated_by fields to the Keys table
    3. Show 'user_email' in the key table on the UI
    4. (Feat) - Show error logs on the LiteLLM UI
    5. UI - Allow admins to control default model access for internal users
    6. (UI) - Allow internal users to view their own logs
    7. (UI) Fix session handling with cookies
    8. Keys page - Show 100 keys per page, use full height, and increase the width of the key alias column
  5. Logging / Guardrail Integrations

    1. Fix Prometheus metrics with custom metrics
  6. Performance / Load Balancing / Reliability Improvements

    1. Cooldowns - Support cooldowns on models called with client-side credentials
    2. Tag-based Routing - Ensure tag-based routing works across all endpoints (/embeddings, /image_generation, etc.)
  7. General Proxy Improvements

    1. Raise BadRequestError when an unknown model is passed in a request
    2. Enforce model access restrictions on the Azure OpenAI proxy route
    3. Reliability fix - Handle emojis in text - fixes an orjson error
    4. Model access patch - Don't overwrite litellm.anthropic_models when running auth checks
    5. Enable setting timezone information in the Docker image (a sketch follows the docker run command below)
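
For item 2.12, a minimal request sketch against the refactored /v1/messages route through the proxy. The payload follows Anthropic's Messages API shape; the model name, master key, and token budgets are illustrative placeholders, not values from these notes:

curl http://localhost:4000/v1/messages \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet-20250219",
    "max_tokens": 2048,
    "thinking": {"type": "enabled", "budget_tokens": 1024},
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'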

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.63.2-stable
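
Item 7.5 above adds timezone support to the image. A sketch of how that would typically be wired up, assuming the conventional tzdata-style TZ environment variable (the variable name is an assumption, not confirmed by these notes):

# TZ is assumed to be the standard tzdata environment variable
docker run \
-e STORE_MODEL_IN_DB=True \
-e TZ=America/New_York \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.63.2-stable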

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 223.19371836864636 | 6.25209576552295 | 0.0033451555727784642 | 1869 | 1 | 89.92210900004238 | 1948.821826000028 |
| Aggregated | Passed ✅ | 190.0 | 223.19371836864636 | 6.25209576552295 | 0.0033451555727784642 | 1869 | 1 | 89.92210900004238 | 1948.821826000028 |

v1.63.3-nightly

07 Mar 15:55

What's Changed

New Contributors

Full Changelog: v1.63.2-nightly...v1.63.3-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.3-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 220.0 | 274.33505649537244 | 6.144475001880859 | 0.0 | 1837 | 0 | 199.62131199997657 | 3623.5841269999582 |
| Aggregated | Passed ✅ | 220.0 | 274.33505649537244 | 6.144475001880859 | 0.0 | 1837 | 0 | 199.62131199997657 | 3623.5841269999582 |

v1.63.2-nightly

06 Mar 18:24

What's Changed

  • Return signature on bedrock converse thinking + Fix {} empty dictionary on streaming + thinking by @krrishdholakia in #9023
  • (Refactor) /v1/messages to follow simpler logic for Anthropic API spec by @ishaan-jaff in #9013

Full Changelog: v1.63.0-nightly...v1.63.2-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.2-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 283.0173457426872 | 6.168530673577194 | 0.0 | 1846 | 0 | 214.4760310000038 | 4984.3768089999685 |
| Aggregated | Passed ✅ | 250.0 | 283.0173457426872 | 6.168530673577194 | 0.0 | 1846 | 0 | 214.4760310000038 | 4984.3768089999685 |

v1.63.0.dev5

06 Mar 16:49

What's Changed

  • Return signature on bedrock converse thinking + Fix {} empty dictionary on streaming + thinking by @krrishdholakia in #9023
  • (Refactor) /v1/messages to follow simpler logic for Anthropic API spec by @ishaan-jaff in #9013

Full Changelog: v1.63.0-nightly...v1.63.0.dev5

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.0.dev5

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 278.42101090109276 | 6.116149255066882 | 0.0 | 1830 | 0 | 214.94648899999902 | 4750.29671599998 |
| Aggregated | Passed ✅ | 250.0 | 278.42101090109276 | 6.116149255066882 | 0.0 | 1830 | 0 | 214.94648899999902 | 4750.29671599998 |

v1.63.0.dev1

06 Mar 16:20

Full Changelog: v1.63.0-nightly...v1.63.0.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.0.dev1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 209.86284151312142 | 6.250523763835477 | 0.0 | 1867 | 0 | 163.62763399996538 | 3461.6653150000047 |
| Aggregated | Passed ✅ | 190.0 | 209.86284151312142 | 6.250523763835477 | 0.0 | 1867 | 0 | 163.62763399996538 | 3461.6653150000047 |

v1.63.0-nightly

06 Mar 05:07
f6535ae

What's Changed

v1.63.0 fixes the Anthropic 'thinking' response on streaming so that it returns the signature block (see the linked GitHub issue).

It also renames signature_delta to signature in the response structure, matching Anthropic's format (see the linked Anthropic docs).

Diff

"message": {
    ...
    "reasoning_content": "The capital of France is Paris.",
    "thinking_blocks": [
        {
            "type": "thinking",
            "thinking": "The capital of France is Paris.",
-            "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+            "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
        }
    ]
}
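
One way to see the renamed field end to end is to request a thinking-enabled completion through the proxy and extract the signature with jq. The model name, master key, and thinking budget below are placeholder assumptions; the response path mirrors the structure shown in the diff above:

curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet-20250219",
    "max_tokens": 2048,
    "thinking": {"type": "enabled", "budget_tokens": 1024},
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }' | jq '.choices[0].message.thinking_blocks[0].signature'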

Full Changelog: v1.62.4-nightly...v1.63.0-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.63.0-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 272.1226933173393 | 6.127690671911355 | 0.0 | 1834 | 0 | 217.38513100001455 | 3752.371346000018 |
| Aggregated | Passed ✅ | 250.0 | 272.1226933173393 | 6.127690671911355 | 0.0 | 1834 | 0 | 217.38513100001455 | 3752.371346000018 |

v1.62.4-nightly

05 Mar 23:55

What's Changed

Full Changelog: v1.62.1-nightly...v1.62.4-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.62.4-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 255.24015655585677 | 6.161171624266898 | 0.0 | 1844 | 0 | 200.43409900000597 | 1911.432934000004 |
| Aggregated | Passed ✅ | 230.0 | 255.24015655585677 | 6.161171624266898 | 0.0 | 1844 | 0 | 200.43409900000597 | 1911.432934000004 |