Releases: BerriAI/litellm

v1.69.0-stable

11 May 02:24
beae5cf

What's Changed

Full Changelog: v1.69.0-nightly...v1.69.0-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.69.0-stable
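
Once the container is up, you can smoke-test the proxy against the /chat/completions route used in the load tests below. This is a minimal sketch, assuming you have already configured a model and a virtual key; the model name and key shown here are placeholders, not values shipped with the release.

# Placeholder model name and virtual key - substitute your own values
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-<your-virtual-key>" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]}'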

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 240.0 | 264.33108534405653 | 6.12787888551344 | 0.0 | 1834 | 0 | 216.09041499999648 | 1326.1799069999824 |
| Aggregated | Passed ✅ | 240.0 | 264.33108534405653 | 6.12787888551344 | 0.0 | 1834 | 0 | 216.09041499999648 | 1326.1799069999824 |

v1.69.0-nightly

11 May 01:55

What's Changed

New Contributors

Full Changelog: v1.68.2-nightly...v1.69.0-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.69.0-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 250.0 | 292.69430995024163 | 6.184694862389694 | 0.0 | 1849 | 0 | 216.9113210000262 | 60025.948276999996 |
| Aggregated | Passed ✅ | 250.0 | 292.69430995024163 | 6.184694862389694 | 0.0 | 1849 | 0 | 216.9113210000262 | 60025.948276999996 |

v1.68.2.dev6

09 May 23:14

What's Changed

New Contributors

Full Changelog: v1.68.2-nightly...v1.68.2.dev6

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.2.dev6

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 190.0 | 210.63173736431506 | 6.257034907859717 | 0.0 | 1872 | 0 | 166.34112399992773 | 1685.74146200001 |
| Aggregated | Passed ✅ | 190.0 | 210.63173736431506 | 6.257034907859717 | 0.0 | 1872 | 0 | 166.34112399992773 | 1685.74146200001 |

v1.68.2-nightly

09 May 21:18

What's Changed

New Contributors

Full Changelog: v1.68.1.dev4...v1.68.2-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.2-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 190.0 | 223.07673508503882 | 6.209370359620187 | 0.0033419646714855688 | 1858 | 1 | 75.31227999999146 | 4978.849046000022 |
| Aggregated | Passed ✅ | 190.0 | 223.07673508503882 | 6.209370359620187 | 0.0033419646714855688 | 1858 | 1 | 75.31227999999146 | 4978.849046000022 |

v1.68.1.dev4

08 May 18:22
416429e

What's Changed

New Contributors

Full Changelog: v1.68.1-nightly...v1.68.1.dev4

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1.dev4

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 190.0 | 233.10816080888745 | 6.241336822394705 | 0.0 | 1868 | 0 | 166.93079599997418 | 5406.457653000075 |
| Aggregated | Passed ✅ | 190.0 | 233.10816080888745 | 6.241336822394705 | 0.0 | 1868 | 0 | 166.93079599997418 | 5406.457653000075 |

v1.68.1.dev2

06 May 22:19

Full Changelog: v1.68.1.dev1...v1.68.1.dev2

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1.dev2

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 240.0 | 271.34034604220733 | 6.1752223755996924 | 0.0 | 1848 | 0 | 206.34432800000013 | 5012.736279000023 |
| Aggregated | Passed ✅ | 240.0 | 271.34034604220733 | 6.1752223755996924 | 0.0 | 1848 | 0 | 206.34432800000013 | 5012.736279000023 |

v1.68.1.dev1

06 May 18:25

What's Changed

New Contributors

Full Changelog: v1.68.0-nightly...v1.68.1.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1.dev1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 210.0 | 244.34719839029643 | 6.203411663807808 | 0.0 | 1855 | 0 | 183.31073700005618 | 5362.244745999988 |
| Aggregated | Passed ✅ | 210.0 | 244.34719839029643 | 6.203411663807808 | 0.0 | 1855 | 0 | 183.31073700005618 | 5362.244745999988 |

v1.68.1-nightly

07 May 05:00
3a73309

What's Changed

  • Add bedrock llama4 pricing + handle llama4 templating on bedrock invoke route by @krrishdholakia in #10582

Full Changelog: v1.68.1.dev2...v1.68.1-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 210.0 | 234.26202141345593 | 6.161378945167915 | 0.0 | 1843 | 0 | 179.4365540000058 | 3332.6730800000064 |
| Aggregated | Passed ✅ | 210.0 | 234.26202141345593 | 6.161378945167915 | 0.0 | 1843 | 0 | 179.4365540000058 | 3332.6730800000064 |

v1.68.0-nightly

04 May 06:30

What's Changed

  • [Contributor PR] Support Llama-api as an LLM provider (#10451) by @ishaan-jaff in #10538
  • UI - fix(model_management_endpoints.py): allow team admin to update model info + fix request logs - handle expanding other rows when existing row selected + fix(organization_endpoints.py): enable proxy admin with 'all-proxy-model' access to create new org with specific models by @krrishdholakia in #10539
  • [Bug Fix] UnicodeDecodeError: 'charmap' on Windows during litellm import by @ishaan-jaff in #10542
  • fix(converse_transformation.py): handle meta llama tool call response by @krrishdholakia in #10541

Full Changelog: v1.67.6.dev1...v1.68.0-nightly

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.0-nightly

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 180.0 | 210.99923315604772 | 6.1894793990457675 | 0.0 | 1852 | 0 | 166.69672900002297 | 3755.0343799999837 |
| Aggregated | Passed ✅ | 180.0 | 210.99923315604772 | 6.1894793990457675 | 0.0 | 1852 | 0 | 166.69672900002297 | 3755.0343799999837 |

v1.68.0-stable

03 May 16:01

What's Changed

  • Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' by @krrishdholakia in #10351
  • [Feat] Add logging callback support for /moderations API by @ishaan-jaff in #10390
  • [Reliability fix] Redis transaction buffer - ensure all redis queues are periodically flushed by @ishaan-jaff in #10393
  • [Bug Fix] Responses API - fix for handling multiturn responses API sessions by @ishaan-jaff in #10415
  • build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website by @dependabot in #10419
  • docs: Fix link formatting in GitHub PR template by @user202729 in #10417
  • docs: Improve documentation of phoenix logging by @user202729 in #10416
  • [Feat Security] - Allow blocking web crawlers by @ishaan-jaff in #10420
  • [Feat] Add support for using Bedrock Knowledge Bases with LiteLLM /chat/completions requests by @ishaan-jaff in #10413
  • Revert "build(deps): bump axios, @docusaurus/core, @docusaurus/plugin-google-gtag, @docusaurus/plugin-ideal-image and @docusaurus/preset-classic in /docs/my-website" by @ishaan-jaff in #10421
  • fix google studio url by @nonZero in #10095
  • [New model] Add openai/computer-use-preview cost tracking / pricing by @ishaan-jaff in #10422
  • fix(langsmith.py): respect langsmith batch size param by @krrishdholakia in #10411
  • Support x-litellm-api-key header param + allow key at max budget to call non-llm api endpoints by @krrishdholakia in #10392 (see the sketch after this list)
  • Update fireworks ai pricing by @krrishdholakia in #10425
  • Schedule budget resets at expectable times (#10331) by @krrishdholakia in #10333
  • Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits by @krrishdholakia in #10424
  • Contributor PR - Support OPENAI_BASE_URL in addition to OPENAI_API_BASE (#9995) by @ishaan-jaff in #10423
  • New feature: Add Python client library for LiteLLM Proxy by @msabramo in #10445
  • Add key-level multi-instance tpm/rpm/max parallel request limiting by @krrishdholakia in #10458
  • [UI] Allow adding triton models on LiteLLM UI by @ishaan-jaff in #10456
  • [Feat] Vector Stores/KnowledgeBases - Allow defining Vector Store Configs by @ishaan-jaff in #10448
  • Add low-level interface to client library for doing HTTP requests by @msabramo in #10452
  • Correctly re-raise 504 errors and Add gpt-4o-mini-tts support by @krrishdholakia in #10462
  • UI - Fix filtering on key alias + support global sorting on keys by @krrishdholakia in #10455
  • [Bug Fix] Ensure Non-Admin virtual keys can access /mcp routes by @ishaan-jaff in #10473
  • [Fixes] Azure OpenAI OIDC - allow using litellm defined params for OIDC Auth by @ishaan-jaff in #10394
  • Add supports_pdf_input: true to Claude 3.7 bedrock models by @RupertoM in #9917
  • Add llamafile as a provider (#10203) by @peteski22 in #10482
  • Fix mcp.md in documentation by @1995parham in #10493
  • docs(realtime): yaml config example for realtime model by @kmontocam in #10489
  • Fix return finish_reason = "tool_calls" for gemini tool calling by @krrishdholakia in #10485
  • Add user + team based multi-instance rate limiting by @krrishdholakia in #10497
  • mypy tweaks by @msabramo in #10490
  • Add vertex ai meta llama 4 support + handle tool call result in content for vertex ai by @krrishdholakia in #10492
  • Fix and rewrite of token_counter by @happyherp in #10409
  • [Fix + Refactor] Trigger Soft Budget Webhooks When Key Crosses Threshold by @ishaan-jaff in #10491
  • [Bug Fix] Ensure Web Search / File Search costs are only added when the response includes the tool call by @ishaan-jaff in #10476
  • Fixes for test_team_budget_metrics and test_generate_and_update_key by @S1LV3RJ1NX in #10500
  • [Feat] KnowledgeBase/Vector Store - Log StandardLoggingVectorStoreRequest for requests made when a vector store is used by @ishaan-jaff in #10509
  • Don't depend on uvloop on windows (#10060) by @ishaan-jaff in #10483
  • fix: PydanticDeprecatedSince20: Support for class-based config is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. by @Elijas in #9372
  • [Feat] Show Vector Store / KB Request on LiteLLM Logs Page by @ishaan-jaff in #10514
  • Fix pytest event loop warning (#9641) by @msabramo in #10512
  • UI - fix adding vertex models with reusable credentials + fix pagination on keys table + fix showing org budgets on table by @krrishdholakia in #10528
  • Playwright test for team admin (#10366) by @krrishdholakia in #10470
  • [QA] Bedrock Vector Stores Integration - Allow using with registry + in OpenAI API spec with tools by @ishaan-jaff in #10516
  • UI - allow reassigning team to other org by @krrishdholakia in #10527
  • [Models/ LLM Credentials] Fix edit credentials modal by @NANDINI-star in #10519
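
To illustrate the x-litellm-api-key header support from #10392 above: a minimal sketch of passing a virtual key through that header rather than the standard Authorization header. The key and model values are placeholders, and the exact precedence between the two headers is not covered here.

# Hypothetical request using the x-litellm-api-key header (placeholder key and model)
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: sk-<your-virtual-key>" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hello"}]}'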

New Contributors

Full Changelog: v1.67.4-stable...v1.67.7-stable