Releases: etalab-ia/OpenGateLLM

0.4.8

26 Jun 14:20
3dfe415

Pre-release

What's Changed

  • chore: move admin schemas in dedicated admin schema folder by @leoguillaume in #928
  • fix: circular import by @leoguillaume in #929
  • refacto(keys): simplify decode api key logic by @leoguillaume in #930
  • chore: rename user_with_role to authenticated_user by @leoguillaume in #931
  • chore: remove useless fields of authenticated_user by @leoguillaume in #932
  • chore: add npm minimum release age by @kaaloo in #907
  • chore(deps-dev): bump tmp from 0.2.5 to 0.2.7 in /docs by @dependabot[bot] in #895
  • chore(doc): update generated documentation and release versions by @github-actions[bot] in #916
  • feat(users): add id suffix to user and organization attributes of user entity in user creation endpoint by @leoguillaume in #934
  • refacto(keys): refacto POST /v1/admin/keys endpoint toward clean architecture by @leoguillaume in #933
  • refacto(auth): refactoring of /login toward clean architecture by @leoguillaume in #937
  • fix(auth): post review changes by @leoguillaume in #938
  • fix(postgres): close session before calling LLMs in chat completion by @benjaminpilia in #940

Full Changelog: 0.4.7...0.4.8

0.4.7

12 Jun 11:20
bae5714

Features

  • Add copy button when creating an API key in key page of Playground UI (#896).
  • Call /metrics of vLLM provider to set health of models (#911).

Refactoring

  • Refactoring of DELETE /v1/admin/users/{user_id} endpoint toward clean architecture (#898).
  • Refactoring of POST /v1/rerank endpoint toward clean architecture (#905).

Bug fixes

  • Fix search by email in user admin page of Playground UI (#909).

Chore

  • Change default refresh time of ES index to 2s (#904).

Full Changelog: 0.4.6...0.4.7

0.4.6

04 Jun 12:18
8ec6af3

Features

  • Add the possibility to set storage limits for each role (#899).
  • Allow document chunk uploads of up to 20 MB per document (#902).

Refactoring

  • Refactoring of GET /v1/admin/users endpoint toward clean architecture (#893).

Bug fixes

  • Handle Mistral API non string content (#892).

Chore

  • Update generated documentation and release versions (#891).
  • Optimize CI/CD build (#900).
  • Add Docker ignore file and remove ARM build, which was blocking the GitHub Actions CI/CD (#901).

Full Changelog: 0.4.5...0.4.6

0.4.5

22 May 09:13
0722b46

Refactoring

  • Split GetModelsUseCase into two use cases (#890).
  • Refactoring of POST /v1/admin/users endpoint toward clean architecture (#867).

Bug Fixes

  • Fix context injection into Langfuse (#889).

Full Changelog: 0.4.4...0.4.5

0.4.4

19 May 16:08
9981a1c

This release fixes several side effects introduced in version 0.4.3, particularly around model bootstrapping.

Features

  • Support for srt, vtt, and verbose_json formats for the /v1/audio/transcriptions endpoint with the whisperx provider (#855, #859).
  • Added a database constraint to prevent deletion of a user who owns routers or providers (mitigates known issues, see release notes for 0.4.3) (#856).
  • New endpoint /health/models to check the health status of each model (#870). A more accurate method for detecting overload on a model will be added in a future release. Currently, detection is based only on Little's law.
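
Little's law states that the average number of requests in a system (L) equals the arrival rate (λ) multiplied by the average time a request spends in the system (W). The sketch below shows how an overload check could be derived from it. The function names, the `max_concurrency` threshold, and the sample numbers are illustrative assumptions, not the OpenGateLLM implementation.

```python
def estimated_in_flight(arrival_rate_per_s: float, mean_latency_s: float) -> float:
    """Little's law: L = lambda * W (average number of requests in the system)."""
    return arrival_rate_per_s * mean_latency_s


def is_overloaded(arrival_rate_per_s: float, mean_latency_s: float, max_concurrency: int) -> bool:
    """Flag a model as overloaded when the estimated in-flight requests exceed
    a hypothetical concurrency limit chosen by the operator."""
    return estimated_in_flight(arrival_rate_per_s, mean_latency_s) > max_concurrency


# Example: 4 requests/s with a 2.5 s mean latency gives about 10 in-flight requests.
print(estimated_in_flight(4.0, 2.5))
print(is_overloaded(4.0, 2.5, max_concurrency=8))
```

Such an estimate only reflects averages over the measurement window, which is consistent with the note above that a more accurate method is planned.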

Bug Fixes

  • Fixed various UI issues in the playground (#856, #860).
  • Ignored CVE-2026-33845 in Trivy scans until 2026-08-01 (#857).
  • Fixed workflow deployment chaining (#858).
  • Fix base_url configuration argument for Langfuse (#868).

Refactoring

  • Rename bootstrapadmin file (#864).
  • Rename userinforepo (#865).

Chore

Full Changelog: 0.4.3...0.4.4

0.4.3post1

07 May 13:29
af58217

This release fixes several side effects introduced in version 0.4.3, particularly around model bootstrapping.

Features

  • Support for srt, vtt, and verbose_json formats for the /v1/audio/transcriptions endpoint with the whisperx provider (#855, #859).
  • Added a database constraint to prevent deletion of a user who owns routers or providers (mitigates known issues, see release notes for 0.4.3) (#856).

Bug Fixes

  • Fixed various UI issues in the playground (#856, #860).
  • Ignored CVE-2026-33845 in Trivy scans until 2026-08-01 (#857).
  • Fixed workflow deployment chaining (#858).

Chore

  • Updated the documentation URL in the Albert API playground (#854).

Full Changelog: 0.4.3...0.4.3post1

0.4.3

30 Apr 17:07
4f8b615

This release introduces a major redesign of the OGL bootstrap process, with two significant changes.

  1. Configuration-defined models are only created during initial bootstrap

    Models defined in the configuration file are now created in the database only when the database is empty. If models already exist, the configuration is ignored.

    Installing routers and providers through the configuration file (models section) was incompatible with a dynamic model management approach. Since the database became the single source of truth, this led to confusing behaviors where it was difficult to determine when a router or provider defined in the configuration was actually applied.

    To simplify the process, the model-related configuration now acts solely as a bootstrap script for the initial API startup.

  2. Removal of the master user

    Previously, a configurable master user could be defined in the configuration file. This user had several responsibilities:

    • its password was used as the encryption key for API keys;
    • it owned models (routers/providers) created from the configuration;
    • it could create additional administrator accounts through the ADMIN permission.

    This behavior has now changed.

    The API key encryption key is now fully decoupled from any user account and must be configured through a new configuration entry: auth_secret_key.

    This improves security by separating responsibilities more clearly.

    Additionally, the master user behaved like a “ghost user”: it had no user_id, and its actions were not tracked in the database.

    To address this limitation, the API now automatically creates a bootstrap_admin user and a bootstrap_admin role with the ADMIN permission whenever no administrator exists in the database.

    The credentials are configured through:

    • auth_bootstrap_admin_email (default: admin)
    • auth_bootstrap_admin_password (default: changeme)
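
As a minimal sketch, a deployment script could refuse to start when these entries are missing or left at their defaults. The flat `settings` dictionary and the `find_insecure_auth_settings` helper are hypothetical: the actual layout of the configuration file is not described here, only the entry names and default values listed above.

```python
# Defaults as stated in these release notes.
DEFAULT_BOOTSTRAP_EMAIL = "admin"
DEFAULT_BOOTSTRAP_PASSWORD = "changeme"


def find_insecure_auth_settings(settings: dict) -> list[str]:
    """Return a list of problems found in a (hypothetical) flat settings dict."""
    problems = []
    if not settings.get("auth_secret_key"):
        problems.append("auth_secret_key is not set")
    email = settings.get("auth_bootstrap_admin_email", DEFAULT_BOOTSTRAP_EMAIL)
    if email == DEFAULT_BOOTSTRAP_EMAIL:
        problems.append("auth_bootstrap_admin_email uses the default value")
    password = settings.get("auth_bootstrap_admin_password", DEFAULT_BOOTSTRAP_PASSWORD)
    if password == DEFAULT_BOOTSTRAP_PASSWORD:
        problems.append("auth_bootstrap_admin_password uses the default value")
    return problems


# An empty configuration falls back to the defaults and is flagged three times.
for problem in find_insecure_auth_settings({}):
    print(problem)
```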

⚠️ Known issues

If some routers or providers were originally created using the legacy master user, the bootstrap_admin account will still be created automatically by the Alembic migration, even if administrator users already exist.

  1. Default credentials security risk

    You must change the default bootstrap_admin credentials immediately.

    Default values:

    • username: admin
    • password: changeme

    Keeping these defaults creates a critical security vulnerability by exposing a publicly known administrator account.

  2. Deleting the bootstrap_admin user

    Deleting the bootstrap_admin user also deletes all associated routers and providers.

    This becomes especially problematic if the embeddings model used by the vector store is deleted. In that case, the API can no longer start because:

    • the required embeddings model no longer exists;
    • models cannot be recreated from the configuration file once other models already exist in the database.

Recovery procedure

  1. Remove the ElasticSearch dependency from the configuration file;
  2. Start the API;
  3. Recreate the embeddings model manually;
  4. Re-enable ElasticSearch in the configuration;
  5. Restart the API.

This issue will be resolved once the RAG component is extracted into a dedicated OGL project.

Features

  • Add initial support for Langfuse as an alternative to PostgreSQL for usage monitoring (#812).
  • Add diarized transcription support with WhisperX (#832).

Deprecated

  • Configuration file arguments auth_master_username and auth_master_key are now deprecated (#779).
    The encryption key is replaced by the new auth_secret_key argument. The master user is now created as a real user on the first API startup (or when the database contains no admin user). This user, now called bootstrap_admin, is configured via the new auth_bootstrap_admin_username and auth_bootstrap_admin_password arguments in the configuration file.

Bug fixes

  • Fix Playground form selectors (#802, #828).
  • Ensure that passwords are no longer than 72 characters in Playground (#809).
  • Set default language code from "english" to "en" for audio transcription (#807, #819).
  • Replace concrete URLs (e.g. v1/document/123/chunks) in Prometheus time series with the route pattern (#824).
  • Reduce the default number of ES shards from 24 to 12 (#829).
  • Close PDF after reading (#833).

Refactoring

  • Refactored the admin user creation bootstrap script toward clean architecture (#799, #827).
  • Refactored /v1/admin/roles endpoints to better align with clean architecture principles, improving maintainability and code organization (#801, #817, #821, #808).
  • Refactored BaseModelProvider class toward clean architecture (#796, #826, #822).
  • Refactored the models creation bootstrap script toward clean architecture (#823).

Security

  • Add Trivy and Semgrep to CI/CD pipeline (#793, #797).
  • Fix all package versions (#830).
  • Add the possibility to disable Playground pages (#835).

Full Changelog: 0.4.2...0.4.3

0.4.2

24 Mar 07:21
1aa8825

This release improves the developer experience with a new CLI (see the make commands) and new documentation built with Astro Starlight. The new documentation is available here: https://docs.opengatellm.org/.

Features

  • Add a new CLI with Rich (#713)
    • The make create-user command is replaced by make create-admin, which creates an admin user to help you when you contribute to the OpenGateLLM codebase.
  • Add verbose_json output format for /v1/audio/transcriptions endpoint (#774)

Deprecated

  • The carbon usage key in the output of model responses is replaced by impacts after the EcoLogits update. This key will be removed in v0.5.0 (#791)

Bug fixes

  • Fix tool calling on Mistral on-prem providers (#773)
  • Support legacy metadata (#777)
  • Fix typo on GET /documents/{document_id}/chunks (#782)
  • fix(collections): public filter of GET /v1/collections by @leoguillaume in #789

Refactoring

  • New documentation with Astro Starlight (#762, #771, #770, #772, #776)
  • Refactored provider endpoints to better align with clean architecture principles, improving maintainability and code organization (#718, #713, #767, #778, #783)
  • Optimize integration tests duration (#780)

Security

New Contributors

  • @github-actions[bot] made their first contribution in #771

Full Changelog: 0.4.1...0.4.2

0.4.1post2

10 Mar 08:37

0.4.1

03 Mar 17:36
525e3fe

This release aims to realign RAG features with market standards, especially the OpenAI API, to provide a more predictable experience for client integrations.

Concretely, it strengthens the end-to-end RAG pipeline: more flexible ingestion (documents with or without a file), fine-grained chunk management, more powerful search (metadata filters, sorting, targeting by collections/documents), and native use of search as a tool in /v1/chat/completions.

At the same time, this version simplifies and modernizes the API surface by removing several legacy elements, aligning parameters with market conventions, and preparing for the breaking changes announced for v0.5.0.

Features

  • New endpoints to add chunks directly into a document (#660)

    • Add optional name argument to POST /v1/documents endpoint and make file argument optional.

      Now, you need to provide either name or file. If you provide both, name overrides the file name. If you don't provide file, the document is created without content; use POST /v1/documents/{document_id}/chunks to add content to it.

    • Add POST /v1/documents/{document_id}/chunks endpoint to add chunks to a document (empty or not).

      With this endpoint, you control the parsing and the chunking of the document. You can also add custom metadata for each chunk. The chunk ID of each chunk is determined by the order of the chunks in the request and incremented by the number of chunks already in the document.

    • Removed chunker argument in POST /v1/documents endpoint.

      Now, by default, the document is split into chunks using RecursiveCharacterTextSplitter. To split a document into chunks using a different chunker, use the disable_chunking argument and provide the chunks directly via POST /v1/documents/{document_id}/chunks.

    • Add DELETE /v1/documents/{document_id}/chunks/{chunk_id} endpoint to delete a chunk from a document.

    • Add GET /v1/documents/{document_id}/chunks endpoint to get chunks of a document.

    • Add GET /v1/documents/{document_id}/chunks/{chunk_id} endpoint to get a chunk from a document.
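
The chunk ID rule described above for POST /v1/documents/{document_id}/chunks (request order, offset by the number of chunks already in the document) can be sketched as follows. The `assign_chunk_ids` helper is hypothetical, and the ID of the first chunk in an empty document (`first_id`, set to 0 here) is an assumption, since it is not stated in these notes.

```python
def assign_chunk_ids(existing_chunk_count: int, new_chunks: list[str], first_id: int = 0) -> list[tuple[int, str]]:
    """Pair each new chunk with an ID based on its position in the request,
    shifted by the number of chunks already stored in the document.

    first_id is an assumed starting value for an empty document.
    """
    start = first_id + existing_chunk_count
    return [(start + position, chunk) for position, chunk in enumerate(new_chunks)]


# A document that already holds 3 chunks receives 2 more.
print(assign_chunk_ids(3, ["fourth chunk", "fifth chunk"]))
```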

  • Add metadata filters in search endpoint (#700)

    • Add metadata_filters argument to POST /v1/search endpoint to filter results by metadata. The metadata_filters argument is a list of filters. Each filter is a dictionary with the following keys:

      • key: the key of the metadata to filter by
      • type: the type of the filter: eq (equal to), sw (starts with), ew (ends with) or co (contains)
      • value: the value of the filter

      A filter can also be a compound filter with the following keys:

      • filters: the list of filters
      • operator: the operator to use for the compound filter: "and" (AND) or "or" (OR)
    • Replace collections argument by collection_ids argument in POST /v1/search endpoint to filter results by collection IDs.

      collections will be removed in v0.5.0. Now, you can pass an empty list to search in all your collections.

    • Add document_ids argument to POST /v1/search endpoint to filter results by document IDs.

      You can pass an empty list to search in all your documents.
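
A minimal sketch of building a POST /v1/search request body with these arguments is shown below. The helper functions and the sample metadata keys and values are hypothetical, and the exact nesting of compound filters inside metadata_filters is an assumption based on the key descriptions above; check the API reference before relying on it.

```python
import json

FILTER_TYPES = {"eq", "sw", "ew", "co"}  # equal to, starts with, ends with, contains
OPERATORS = {"and", "or"}


def simple_filter(key: str, filter_type: str, value: str) -> dict:
    """Build one metadata filter after checking its type."""
    if filter_type not in FILTER_TYPES:
        raise ValueError(f"unknown filter type: {filter_type}")
    return {"key": key, "type": filter_type, "value": value}


def compound_filter(operator: str, filters: list[dict]) -> dict:
    """Combine several filters with "and" or "or"."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    return {"operator": operator, "filters": filters}


payload = {
    "query": "annual report",
    "limit": 10,
    "collection_ids": [],  # empty list: search in all your collections
    "document_ids": [],    # empty list: search in all your documents
    "metadata_filters": [
        compound_filter("and", [
            simple_filter("author", "eq", "alice"),
            simple_filter("title", "sw", "Report"),
        ])
    ],
}
print(json.dumps(payload, indent=2))
```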

  • Add order_by and order_direction arguments to GET /v1/collections and GET /v1/documents endpoints to sort collections and documents by a field (#709)

  • Add built-in search tool to POST /v1/chat/completions endpoint to search for chunks in your collections and documents (#708)

    • Pass this tool like:
    { 
       "messages": [
          {
             "role": "user",
             "content": "What is the capital of France?"
          }
       ],
       "model": "openweight-large",
       "tools": [{
          "type": "search",
          "method": "semantic",
          "limit": 10,
          "offset": 0,
          "rff_k": 60,
          ...
       }]
    ...
    }

    Arguments are the same as for the POST /v1/search endpoint. The query is replaced by the last message in the request. This call style replaces the deprecated search argument of the POST /v1/chat/completions endpoint, which will be removed in v0.5.0.

  • Support vLLM rerank models (#719)

  • Add additional Prometheus metrics for inference (#768)

Removed

  • Rerank legacy arguments of /v1/rerank endpoint are removed (#705)

    • Since rerank is aligned with the Cohere API standard, we removed the legacy arguments prompt and input. These arguments were replaced by the query and documents arguments.
    • POST /v1/rerank endpoint is compatible with OpenWebUI integration.
  • Removed POST /v1/files and POST /v1/ocr-beta endpoints (#698)

    These endpoints were deprecated in v0.3.4 and replaced by POST /v1/documents endpoint.

  • k parameter in POST /v1/search endpoint is removed, replaced by limit argument. (#694)

  • Proconnect draft support is removed (#693)

Deprecated

  • collection argument of POST /v1/documents endpoint is deprecated, replaced by collection_id, removed in v0.5.0. (#660)
  • GET /v1/chunks/{document}/{chunk} endpoint is deprecated, replaced by GET /v1/documents/{document_id}/chunks/{chunk_id}, removed in v0.5.0. (#660)
  • GET /v1/chunks endpoint is deprecated, replaced by GET /v1/documents/{document_id}/chunks, removed in v0.5.0. (#660)
  • POST /v1/parse-beta endpoint is deprecated, removed in v0.5.0. (#698)
  • collections argument in POST /v1/search endpoint is deprecated, replaced by collection_ids, removed in v0.5.0. (#700)
  • search argument in POST /v1/chat/completions endpoint is deprecated, replaced by built-in search tool in tools argument, removed in v0.5.0. (#708)
  • prompt argument in POST /v1/search endpoint is deprecated, replaced by query argument, removed in v0.5.0. (#708)
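
For client code that still sends the old POST /v1/search argument names, a small migration helper could look like the sketch below. The `migrate_search_payload` function is hypothetical. It only renames keys according to the Removed and Deprecated entries above (k to limit, collections to collection_ids, prompt to query) and does not convert values, so check that the values themselves are still valid for the new arguments.

```python
# Old argument name -> replacement, as listed in these release notes.
SEARCH_RENAMES = {
    "collections": "collection_ids",
    "prompt": "query",
    "k": "limit",
}


def migrate_search_payload(payload: dict) -> dict:
    """Return a copy of a /v1/search payload using the new argument names.

    If both the old and the new name are present, the new name wins.
    """
    migrated = {name: value for name, value in payload.items() if name not in SEARCH_RENAMES}
    for old_name, new_name in SEARCH_RENAMES.items():
        if old_name in payload:
            migrated.setdefault(new_name, payload[old_name])
    return migrated


print(migrate_search_payload({"prompt": "capital of France", "k": 5, "collections": [12]}))
```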

Bug fixes

  • Fixed an issue affecting streaming chat responses in models to ensure proper real-time output delivery (#692)
    Version 0.4.0 introduced a streaming bug that has now been fixed. The handling of streaming and its related errors has been completely revised to use aiter_lines instead of aiter_raw, in order to ensure proper streaming formatting as reliably as possible.
  • Added proper threshold handling in the search module to improve result filtering behavior (#684)
  • Prometheus metrics now support gunicorn multi workers (#681)
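
The difference between raw and line-based iteration matters because raw chunks can end in the middle of a streamed line, while line-based iteration only yields complete lines. The sketch below illustrates the idea with a pure-Python line reassembler. It is not the project's code: `iter_complete_lines` is a hypothetical helper, and the sample chunks are invented. The names aiter_lines and aiter_raw match the response iterators of the httpx library, but that the project uses httpx is an assumption here.

```python
from typing import Iterable, Iterator


def iter_complete_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Reassemble complete text lines from byte chunks split at arbitrary points."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently held in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.rstrip(b"\r").decode("utf-8")
    if buffer:
        # Trailing data without a final newline.
        yield buffer.decode("utf-8")


# Raw chunks that cut two streamed events in the middle of a line.
raw_chunks = [b'data: {"a"', b': 1}\n\ndata: [DO', b'NE]\n\n']
for line in iter_complete_lines(raw_chunks):
    print(repr(line))
```

Forwarding the raw chunks as they arrive would pass the broken fragments to the client, whereas the line-based view always yields well-formed lines.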

Refactoring

  • Refactored routers endpoints to better align with clean architecture principles, improving maintainability and code organization (#658, #707, #712)
  • Refactored create providers endpoints to improve maintainability and code organization (#696)
  • Improved formatting of configuration errors for clearer and more actionable feedback (#672)
  • Refactored app creation to clean architecture (#691)

Chore

  • Removed router registry helper (#697)
  • Updated issue templates (#711)
  • Deleted import_circular_diagram.md (#699)

Full Changelog: 0.4.0...0.4.1