Releases: BerriAI/litellm

v1.82.1-nightly.1

11 Mar 07:58


v1.82.1-focus-dev

12 Mar 09:16


What's Changed

  • fix handling of ResponseApplyPatchToolCall in completion bridge by @jtsaw in #20913
  • fix(router): break retry loop on non-retryable errors by @AtharvaJaiswal005 in #21370
  • fix(proxy): fix invalid OpenAPI schema for /spend/calculate and /credentials endpoints by @AtharvaJaiswal005 in #21369
  • fix: preserve usage/cached_tokens in Responses API streaming bridge by @KeremTurgutlu in #22194
  • fix(caching): inject default_in_memory_ttl in DualCache async_set_cache and async_set_cache_pipeline by @pnookala-godaddy in #22241
  • fix: apply server root path to mapped passthrough route matching by @umut-polat in #22310
  • fix(responses): merge parallel function_call items into single assist… by @Varad2001 in #23116
  • fix: handle month overflow in duration_in_seconds for multi-month durations by @jnMetaCode in #23099
  • fix: use correct divisor when averaging TTFT in lowest-latency routing by @jnMetaCode in #23100
  • fix(fireworks): strip duplicate /v1 from models endpoint URL by @s-zx in #23113
  • fix(sagemaker): Add role assumption support for embedding endpoint by @jymmi in #20435
  • merge main by @Sameerlite in #23252
  • merge main by @Sameerlite in #23253
  • Litellm oss staging 03 02 2026 by @krrishdholakia in #22628
  • oss staging 03/09/2026 by @krrishdholakia in #23164
  • Litellm oss staging 02 18 2026 by @krrishdholakia in #23222
  • fix(vertex_ai): strip LiteLLM-internal keys from extra_body before merging to Gemini request by @Sameerlite in #23131
  • fix(openai): preserve reasoning_effort summary field for Responses API by @Sameerlite in #23151
  • fix(bedrock): populate completion_tokens_details in Responses API by @Sameerlite in #23243

Full Changelog: v1.82.1-nightly...v1.82.1-focus-dev

v1.82.1.rc.1

12 Mar 05:52
94b0020


What's Changed

  • fix(gemini): preserve $ref in JSON Schema for Gemini 2.0+ by @Chesars in #21597
  • fix(transcription): move duration to _hidden_params to match OpenAI response spec by @Chesars in #22208
  • fix(anthropic): map reasoning_effort to output_config for Claude 4.6 models by @Chesars in #22220
  • feat(vertex): add gemini-3.1-flash-image-preview to model DB by @emerzon in #22223
  • perf(spendlogs): optimize old spendlog deletion cron job by @Harshit28j in #21930
  • Fix converse handling for parallel_tool_calls by @Sameerlite in #22267
  • [Fix] Preserve forwarding server side called tools by @Sameerlite in #22260
  • Fix free models working from UI by @Sameerlite in #22258
  • Add v1 for anthropic responses transformation by @Sameerlite in #22087
  • [Bug] Add ChatCompletionImageObject in OpenAIChatCompletionAssistantMessage by @Sameerlite in #22155
  • Fix: poetry lock by @Sameerlite in #22293
  • Enable local file support for OCR by @noahnistler in #22133
  • fix(mcp): Strip stale mcp-session-id to prevent 400 errors across proxy workers by @gavksingh in #21417
  • [Feature] Access group CRUD: Bidirectional team/key sync by @yuneng-jiang in #22253
  • Add LLMClientCache regression tests for httpx client eviction safety by @ryan-crabbe in #22306
  • fix(images): forward extra_headers on OpenAI code path in image_generation() by @Chesars in #22300
  • feat(models): add gpt-audio-1.5 to model cost map by @Chesars in #22303
  • feat(models): add gpt-realtime-1.5 to model cost map by @Chesars in #22304
  • fix(images): pass model_info/metadata in image_edit for custom pricing by @Chesars in #22307
  • fix(chatgpt): fix tool_calls streaming indexes by @Chesars in #21498
  • fix(openai): correct supported_openai_params for GPT-5 model family by @Chesars in #21576
  • fix(openai): correct supported params for gpt-5-search models by @Chesars in #21574
  • fix(azure_ai): resolve api_base from env var in Document Intelligence OCR by @Chesars in #21581
  • fix(models): function calling for PublicAI Apertus models by @Chesars in #21582
  • fix(vertex_ai): pass through native Gemini imageConfig params for image generation by @Chesars in #21585
  • fix(openrouter): use provider-reported usage in streaming without stream_options by @Chesars in #21592
  • fix(moonshot): preserve image_url blocks in multimodal messages by @Chesars in #21595
  • fix(types): remove StreamingChoices from ModelResponse, use ModelResponseStream by @Chesars in #21629
  • fix(responses): use output_index for parallel tool call streaming indices by @Chesars in #21337
  • Tests: add llmclientcache regression tests by @ryan-crabbe in #22313
  • Add deprecation dates for xAI grok-2-vision-1212 and grok-3-mini models by @Chesars in #20102
  • fix(containers): Fix Python 3.10 compatibility for OpenAIContainerConfig by @Chesars in #19786
  • fix(count_tokens): include system and tools in token counting API requests by @Chesars in #22301
  • fix(helicone): add Gemini and Vertex AI support to HeliconeLogger by @Chesars in #19288
  • fix(register_model): handle openrouter models without '/' in name by @Chesars in #19792
  • feat(model_prices): add OpenRouter native models to model cost map by @Chesars in #20520
  • fix(adapter): double-stripping of model names with provider-matching prefixes by @Chesars in #20516
  • docs: add OpenRouter Opus 4.6 to model map and update Claude Opus 4.6 docs by @Chesars in #20525
  • [Fix] Include timestamps in /project/list response by @yuneng-jiang in #22323
  • [Feature] UI - Projects: Add Projects page with list and create flows by @yuneng-jiang in #22315
  • Fix/claude code plugin schema by @rahulrd25 in #22271
  • Add Prometheus child_exit cleanup for gunicorn workers by @ryan-crabbe in #22324
  • docs: update AssemblyAI docs with Universal-3 Pro, Speech Understanding, and LLM Gateway by @dylan-duan-aai in #21130
  • feat: add in_flight_requests metric to /health/backlog + prometheus by @ishaan-jaff in #22319
  • fix(test): update realtime guardrail test assertions for voice violation behavior by @jquinter in #22332
  • fix(test): update Azure pass-through test after Responses API routing change by @jquinter in #22334
  • fix(db): add missing migration for LiteLLM_ClaudeCodePluginTable by @jquinter in #22335
  • fix(bedrock): restore parallel_tool_calls mapping in map_openai_params by @jquinter in #22333
  • [Feat] Agent RBAC Permission Fix - Ensure Internal Users cannot create agents by @ishaan-jaff in #22329
  • fix(mcp): update test mocks for renamed filter_server_ids_by_ip_with_info by @jquinter in #22327
  • fix: Add PROXY_ADMIN role to system user for key rotation by @milan-berri in #21896
  • fix: populate user_id and user_info for admin users in /user/info by @milan-berri in #22239
  • fix(caching): store task references in LLMClientCache._remove_key by @shivaaang in #22143
  • fix(image_generation): propagate extra_headers to Upstream by @ZeroClover in #22026
  • [Fix] Pass MCP auth headers from request into tool fetch for /v1/responses and chat completions by @shivamrawat1 in #22291
  • fix: shorten guardrail benchmark result filenames for Windows long path support by @demoray in #22039
  • Remove Apache 2 license from SKILL.md by @rasmi in #22322
  • fix(mcp): default available_on_public_internet to true by @ishaan-jaff in #22331
  • fix(bedrock): filter internal json_tool_call when mixed with real tools by @jquinter in #21107
  • fix(jwt): OIDC discovery URLs, roles array handling, dot-notation error hints by @ishaan-jaff in #22336
  • perf: streaming latency improvements — 4 targeted hot-path fixes by @ishaan-jaff in #22346
  • [Test] UI - CostTrackingSettings: Add comprehensive Vitest coverage by @yuneng-jiang in #22354
  • [Feature] Key list endpoint: Add project_id and access_group_id filters by @yuneng-jiang in #22356
  • [Feature] UI - Projects: Add Project Details Page by @yuneng-jiang in #22360
  • [Feature] UI - Projects: Add project keys table and project dropdown to key create/edit by @yuneng-jiang in #22373
  • Litellm health check tokens by @Harshit28j in #22299
  • Doc: add security vulnerability scan report to v1.81.14 release notes by @Harshit28j in #22385
  • feat: ability to trace metrics datadog by @Harshit28j in #22103
  • feat(ci): add duplicate issue detection and auto-close bot by @jquinter in #22034
  • Litellm aws edge case by @Harshit28j in #22384
  • Litellm presidio stream v3 by @Harshit28j in #22283
  • fix: prevent update_price_and_context_window workflow from running in forks by @Chesars in #18478
  • fix(ci): remove duplicate env key in scan_duplicate_issues workflow by @Chesars in #22405
  • fix(lint): suppress PLR0915 in complex transform methods by @jquinter in #22328
  • fix: atomic RPM rate limiting in model rate limit check by @jquinter in #22002
  • test(ci): add secret scan test and CI job by @jquinter in #22193
  • fix(proxy): isolate get_config failures from model sync loop by @jquinter in #22224
  • fix tts metrics issues by @Harshit28j in #20632
  • [Release Fix] by @ishaan-jaff in #22411
    *...
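Among the fixes above, the atomic RPM rate-limit check (#22002) addresses a classic race: if "read count, compare, then increment" runs as separate steps, two concurrent requests can both pass a limit with one slot left. A minimal sketch of the increment-and-check-under-one-lock pattern — illustrative only, not LiteLLM's actual rate-limiting code:

```python
import threading

class RPMLimiter:
    """Atomic requests-per-minute check: the counter is read, compared,
    and incremented under a single lock, so no two requests can both
    claim the last available slot in a window."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0
        self.window = None  # current minute bucket
        self._lock = threading.Lock()

    def try_acquire(self, minute: int) -> bool:
        with self._lock:
            if self.window != minute:
                # new minute: reset the counter
                self.window, self.count = minute, 0
            if self.count >= self.limit:
                return False
            self.count += 1
            return True
```

In a multi-worker proxy the same idea is usually expressed with an atomic store operation (e.g. a Redis `INCR`) rather than an in-process lock.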

v1.82.1-silent-dev2

11 Mar 04:17
fdb0a46


What's Changed

  • fix handling of ResponseApplyPatchToolCall in completion bridge by @jtsaw in #20913
  • fix(router): break retry loop on non-retryable errors by @AtharvaJaiswal005 in #21370
  • fix(proxy): fix invalid OpenAPI schema for /spend/calculate and /credentials endpoints by @AtharvaJaiswal005 in #21369
  • fix: preserve usage/cached_tokens in Responses API streaming bridge by @KeremTurgutlu in #22194
  • fix(caching): inject default_in_memory_ttl in DualCache async_set_cache and async_set_cache_pipeline by @pnookala-godaddy in #22241
  • fix: apply server root path to mapped passthrough route matching by @umut-polat in #22310
  • fix(responses): merge parallel function_call items into single assist… by @Varad2001 in #23116
  • fix: handle month overflow in duration_in_seconds for multi-month durations by @jnMetaCode in #23099
  • fix: use correct divisor when averaging TTFT in lowest-latency routing by @jnMetaCode in #23100
  • fix(fireworks): strip duplicate /v1 from models endpoint URL by @s-zx in #23113
  • fix(sagemaker): Add role assumption support for embedding endpoint by @jymmi in #20435
  • merge main by @Sameerlite in #23252
  • merge main by @Sameerlite in #23253
  • Litellm oss staging 03 02 2026 by @krrishdholakia in #22628
  • oss staging 03/09/2026 by @krrishdholakia in #23164
  • Litellm oss staging 02 18 2026 by @krrishdholakia in #23222
  • fix(vertex_ai): strip LiteLLM-internal keys from extra_body before merging to Gemini request by @Sameerlite in #23131
  • fix(openai): preserve reasoning_effort summary field for Responses API by @Sameerlite in #23151
  • fix(bedrock): populate completion_tokens_details in Responses API by @Sameerlite in #23243
  • feat: add strategy to deployment for helmchart by @Harshit28j in #23214
  • feat: record silent metrics by @Harshit28j in #23209
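The TTFT averaging fix above (#23100) is a divisor bug: a running average of time-to-first-token must divide by the number of TTFT samples, not some other count such as total requests. A hedged sketch of the incremental-mean pattern — names are illustrative, not LiteLLM's lowest-latency routing internals:

```python
def update_avg_ttft(avg: float, n_samples: int, new_ttft: float) -> tuple[float, int]:
    """Fold one new TTFT sample into a running average.

    Uses the incremental mean formula so prior samples need not be
    re-summed; the divisor is the sample count, incremented first.
    """
    n_samples += 1
    avg += (new_ttft - avg) / n_samples
    return avg, n_samples
```

For samples 0.2s, 0.4s, 0.6s this yields a 0.4s average; dividing by the wrong count would skew deployment selection toward or away from slow deployments.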

Full Changelog: v1.82.1-nightly...v1.82.1-silent-dev2

v1.82.1-nightly

10 Mar 09:44
94b0020


What's Changed

  • fix(gemini): preserve $ref in JSON Schema for Gemini 2.0+ by @Chesars in #21597
  • fix(transcription): move duration to _hidden_params to match OpenAI response spec by @Chesars in #22208
  • fix(anthropic): map reasoning_effort to output_config for Claude 4.6 models by @Chesars in #22220
  • feat(vertex): add gemini-3.1-flash-image-preview to model DB by @emerzon in #22223
  • perf(spendlogs): optimize old spendlog deletion cron job by @Harshit28j in #21930
  • Fix converse handling for parallel_tool_calls by @Sameerlite in #22267
  • [Fix] Preserve forwarding server side called tools by @Sameerlite in #22260
  • Fix free models working from UI by @Sameerlite in #22258
  • Add v1 for anthropic responses transformation by @Sameerlite in #22087
  • [Bug] Add ChatCompletionImageObject in OpenAIChatCompletionAssistantMessage by @Sameerlite in #22155
  • Fix: poetry lock by @Sameerlite in #22293
  • Enable local file support for OCR by @noahnistler in #22133
  • fix(mcp): Strip stale mcp-session-id to prevent 400 errors across proxy workers by @gavksingh in #21417
  • [Feature] Access group CRUD: Bidirectional team/key sync by @yuneng-jiang in #22253
  • Add LLMClientCache regression tests for httpx client eviction safety by @ryan-crabbe in #22306
  • fix(images): forward extra_headers on OpenAI code path in image_generation() by @Chesars in #22300
  • feat(models): add gpt-audio-1.5 to model cost map by @Chesars in #22303
  • feat(models): add gpt-realtime-1.5 to model cost map by @Chesars in #22304
  • fix(images): pass model_info/metadata in image_edit for custom pricing by @Chesars in #22307
  • fix(chatgpt): fix tool_calls streaming indexes by @Chesars in #21498
  • fix(openai): correct supported_openai_params for GPT-5 model family by @Chesars in #21576
  • fix(openai): correct supported params for gpt-5-search models by @Chesars in #21574
  • fix(azure_ai): resolve api_base from env var in Document Intelligence OCR by @Chesars in #21581
  • fix(models): function calling for PublicAI Apertus models by @Chesars in #21582
  • fix(vertex_ai): pass through native Gemini imageConfig params for image generation by @Chesars in #21585
  • fix(openrouter): use provider-reported usage in streaming without stream_options by @Chesars in #21592
  • fix(moonshot): preserve image_url blocks in multimodal messages by @Chesars in #21595
  • fix(types): remove StreamingChoices from ModelResponse, use ModelResponseStream by @Chesars in #21629
  • fix(responses): use output_index for parallel tool call streaming indices by @Chesars in #21337
  • Tests: add llmclientcache regression tests by @ryan-crabbe in #22313
  • Add deprecation dates for xAI grok-2-vision-1212 and grok-3-mini models by @Chesars in #20102
  • fix(containers): Fix Python 3.10 compatibility for OpenAIContainerConfig by @Chesars in #19786
  • fix(count_tokens): include system and tools in token counting API requests by @Chesars in #22301
  • fix(helicone): add Gemini and Vertex AI support to HeliconeLogger by @Chesars in #19288
  • fix(register_model): handle openrouter models without '/' in name by @Chesars in #19792
  • feat(model_prices): add OpenRouter native models to model cost map by @Chesars in #20520
  • fix(adapter): double-stripping of model names with provider-matching prefixes by @Chesars in #20516
  • docs: add OpenRouter Opus 4.6 to model map and update Claude Opus 4.6 docs by @Chesars in #20525
  • [Fix] Include timestamps in /project/list response by @yuneng-jiang in #22323
  • [Feature] UI - Projects: Add Projects page with list and create flows by @yuneng-jiang in #22315
  • Fix/claude code plugin schema by @rahulrd25 in #22271
  • Add Prometheus child_exit cleanup for gunicorn workers by @ryan-crabbe in #22324
  • docs: update AssemblyAI docs with Universal-3 Pro, Speech Understanding, and LLM Gateway by @dylan-duan-aai in #21130
  • feat: add in_flight_requests metric to /health/backlog + prometheus by @ishaan-jaff in #22319
  • fix(test): update realtime guardrail test assertions for voice violation behavior by @jquinter in #22332
  • fix(test): update Azure pass-through test after Responses API routing change by @jquinter in #22334
  • fix(db): add missing migration for LiteLLM_ClaudeCodePluginTable by @jquinter in #22335
  • fix(bedrock): restore parallel_tool_calls mapping in map_openai_params by @jquinter in #22333
  • [Feat] Agent RBAC Permission Fix - Ensure Internal Users cannot create agents by @ishaan-jaff in #22329
  • fix(mcp): update test mocks for renamed filter_server_ids_by_ip_with_info by @jquinter in #22327
  • fix: Add PROXY_ADMIN role to system user for key rotation by @milan-berri in #21896
  • fix: populate user_id and user_info for admin users in /user/info by @milan-berri in #22239
  • fix(caching): store task references in LLMClientCache._remove_key by @shivaaang in #22143
  • fix(image_generation): propagate extra_headers to Upstream by @ZeroClover in #22026
  • [Fix] Pass MCP auth headers from request into tool fetch for /v1/responses and chat completions by @shivamrawat1 in #22291
  • fix: shorten guardrail benchmark result filenames for Windows long path support by @demoray in #22039
  • Remove Apache 2 license from SKILL.md by @rasmi in #22322
  • fix(mcp): default available_on_public_internet to true by @ishaan-jaff in #22331
  • fix(bedrock): filter internal json_tool_call when mixed with real tools by @jquinter in #21107
  • fix(jwt): OIDC discovery URLs, roles array handling, dot-notation error hints by @ishaan-jaff in #22336
  • perf: streaming latency improvements — 4 targeted hot-path fixes by @ishaan-jaff in #22346
  • [Test] UI - CostTrackingSettings: Add comprehensive Vitest coverage by @yuneng-jiang in #22354
  • [Feature] Key list endpoint: Add project_id and access_group_id filters by @yuneng-jiang in #22356
  • [Feature] UI - Projects: Add Project Details Page by @yuneng-jiang in #22360
  • [Feature] UI - Projects: Add project keys table and project dropdown to key create/edit by @yuneng-jiang in #22373
  • Litellm health check tokens by @Harshit28j in #22299
  • Doc: add security vulnerability scan report to v1.81.14 release notes by @Harshit28j in #22385
  • feat: ability to trace metrics datadog by @Harshit28j in #22103
  • feat(ci): add duplicate issue detection and auto-close bot by @jquinter in #22034
  • Litellm aws edge case by @Harshit28j in #22384
  • Litellm presidio stream v3 by @Harshit28j in #22283
  • fix: prevent update_price_and_context_window workflow from running in forks by @Chesars in #18478
  • fix(ci): remove duplicate env key in scan_duplicate_issues workflow by @Chesars in #22405
  • fix(lint): suppress PLR0915 in complex transform methods by @jquinter in #22328
  • fix: atomic RPM rate limiting in model rate limit check by @jquinter in #22002
  • test(ci): add secret scan test and CI job by @jquinter in #22193
  • fix(proxy): isolate get_config failures from model sync loop by @jquinter in #22224
  • fix tts metrics issues by @Harshit28j in #20632
  • [Release Fix] by @ishaan-jaff in #22411
    *...

v1.82.1-dev

10 Mar 13:42
81bd62e


What's Changed

  • fix handling of ResponseApplyPatchToolCall in completion bridge by @jtsaw in #20913
  • fix(router): break retry loop on non-retryable errors by @AtharvaJaiswal005 in #21370
  • fix(proxy): fix invalid OpenAPI schema for /spend/calculate and /credentials endpoints by @AtharvaJaiswal005 in #21369
  • fix: preserve usage/cached_tokens in Responses API streaming bridge by @KeremTurgutlu in #22194
  • fix(caching): inject default_in_memory_ttl in DualCache async_set_cache and async_set_cache_pipeline by @pnookala-godaddy in #22241
  • fix: apply server root path to mapped passthrough route matching by @umut-polat in #22310
  • fix(responses): merge parallel function_call items into single assist… by @Varad2001 in #23116
  • fix: handle month overflow in duration_in_seconds for multi-month durations by @jnMetaCode in #23099
  • fix: use correct divisor when averaging TTFT in lowest-latency routing by @jnMetaCode in #23100
  • fix(fireworks): strip duplicate /v1 from models endpoint URL by @s-zx in #23113
  • fix(sagemaker): Add role assumption support for embedding endpoint by @jymmi in #20435
  • merge main by @Sameerlite in #23252
  • merge main by @Sameerlite in #23253
  • Litellm oss staging 03 02 2026 by @krrishdholakia in #22628
  • oss staging 03/09/2026 by @krrishdholakia in #23164
  • Litellm oss staging 02 18 2026 by @krrishdholakia in #23222
  • fix(vertex_ai): strip LiteLLM-internal keys from extra_body before merging to Gemini request by @Sameerlite in #23131
  • fix(openai): preserve reasoning_effort summary field for Responses API by @Sameerlite in #23151
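The Fireworks fix above (#23113) belongs to a common class of URL-join bugs: when an `api_base` already ends in `/v1` and the endpoint path also begins with `/v1`, naive concatenation yields `/v1/v1/models`. A hedged sketch of the deduplication — the helper name is hypothetical, not LiteLLM's actual code:

```python
def join_api_path(api_base: str, path: str) -> str:
    """Join a provider base URL and endpoint path, stripping a
    duplicate /v1 segment at the join point."""
    base = api_base.rstrip("/")
    if base.endswith("/v1") and path.startswith("/v1"):
        path = path[len("/v1"):]
    return base + path
```

For example, joining `https://api.fireworks.ai/inference/v1` with `/v1/models` yields a single `/v1` segment, while a base without the suffix is left untouched.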

Full Changelog: v1.82.1-nightly...v1.82.1-dev

1.82.1-dev-2

10 Mar 13:07


v1.82.0.patch4

10 Mar 23:25


v1.82.rc.3

09 Mar 15:53


v1.82.rc.2

09 Mar 13:23
