Releases · etalab-ia/OpenGateLLM

26 Jun 14:20

leoguillaume

0.4.8

3dfe415

0.4.8 Pre-release

What's Changed

chore: move admin schemas in dedicated admin schema folder by @leoguillaume in #928
fix: circular import by @leoguillaume in #929
refacto(keys): simplify decode api key logic by @leoguillaume in #930
chore: rename user_with_role to authenticated_user by @leoguillaume in #931
chore: remove useless fields of authenticated_user by @leoguillaume in #932
chore: add npm minimum release age by @kaaloo in #907
chore(deps-dev): bump tmp from 0.2.5 to 0.2.7 in /docs by @dependabot[bot] in #895
chore(doc): update generated documentation and release versions by @github-actions[bot] in #916
feat(users): add id suffix to user and organization attributes of user entity in user creation endpoint by @leoguillaume in #934
refacto(keys): refacto POST /v1/admin/keys endpoint toward clean architecture by @leoguillaume in #933
refacto(auth): refactoring of /login toward clean architecture by @leoguillaume in #937
fix(auth): post review changes by @leoguillaume in #938
fix(postgres): close session before calling LLMs in chat completion by @benjaminpilia in #940

Full Changelog: 0.4.7...0.4.8

Contributors

kaaloo, dependabot, and 2 other contributors

12 Jun 11:20

leoguillaume

0.4.7

bae5714

0.4.7 Latest

Features

Add copy button when creating an API key in key page of Playground UI (#896).
Call /metrics of vLLM provider to set health of models (#911).

Refacto

Refactoring of DELETE /v1/admin/users/{user_id} endpoint toward clean architecture (#898).
Refactoring of POST /v1/rerank endpoint toward clean architecture (#905).

Bug fixes

Fix search by email in user admin page of Playground UI (#909).

Chore

Change default refresh time of ES index to 2s (#904).

Full Changelog: 0.4.6...0.4.7

04 Jun 12:18

leoguillaume

0.4.6

8ec6af3

0.4.6

Features

Add the ability to set storage limits for each role (#899).
Add document chunk upload to max 20MB per document (#902).

Refacto

Refactoring of GET /v1/admin/users endpoint toward clean architecture (#893).

Bug fixes

Handle Mistral API non string content (#892).

Chore

Update generated documentation and release versions (#891).
Optimize CI/CD build (#900).
Add Docker ignore file and remove the ARM build, which was blocking the GitHub Actions CI/CD (#901).

Full Changelog: 0.4.5...0.4.6

22 May 09:13

leoguillaume

0.4.5

0722b46

0.4.5

Refactoring

Split GetModelsUseCase into two use cases (#890).
Refactoring of POST /v1/admin/users endpoint toward clean architecture (#867).

Bug Fixes

Fix context injection into Langfuse (#889).

Full Changelog: 0.4.4...0.4.5

19 May 16:08

leoguillaume

0.4.4

9981a1c

0.4.4

This release fixes several side effects introduced in version 0.4.3, particularly around model bootstrapping.

Features

Support for srt, vtt, and verbose_json formats for the /v1/audio/transcriptions endpoint with the whisperx provider (#855, #859).
Added a database constraint to prevent deletion of a user who owns routers or providers (mitigates known issues, see release notes for 0.4.3) (#856).
New endpoint /health/models to check the health status of each model (#870). A more accurate method to detect overload on a model will be added in a future release. Currently, detection is based only on Little's law.
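
For reference, Little's law states that the average number of requests in a system, L, equals the arrival rate λ multiplied by the average time a request spends in the system, W (L = λW). The release note does not describe how the /health/models endpoint applies it, so the following is only a minimal sketch of the idea; the function names and the `max_concurrency` threshold are hypothetical and not taken from the OpenGateLLM codebase.

```python
def requests_in_flight(arrival_rate_per_s: float, avg_latency_s: float) -> float:
    """Little's law: L = lambda * W (average number of requests in the system)."""
    if arrival_rate_per_s < 0 or avg_latency_s < 0:
        raise ValueError("rate and latency must be non-negative")
    return arrival_rate_per_s * avg_latency_s


def is_overloaded(arrival_rate_per_s: float, avg_latency_s: float, max_concurrency: float) -> bool:
    """Hypothetical check: the estimated load exceeds an assumed capacity."""
    return requests_in_flight(arrival_rate_per_s, avg_latency_s) > max_concurrency


if __name__ == "__main__":
    # 5 requests per second with a 2 second average latency.
    print(requests_in_flight(5.0, 2.0))
    print(is_overloaded(5.0, 2.0, max_concurrency=8))
```

An estimate of this kind only reflects averages, which may be why the note announces a more accurate method for a later release.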

Bug Fixes

Fixed various UI issues in the playground (#856, #860).
Ignored CVE-2026-33845 in Trivy scans until 2026-08-01 (#857).
Fixed workflow deployment chaining (#858).
Fix base_url configuration argument for Langfuse (#868).

Refactoring

Rename bootstrapadmin file (#864).
Rename userinforepo (#865).

Chore

Updated the documentation URL in the Albert API playground (#854, #862).
Ignore some CVEs in the Debian Docker image for Trivy scans, see trivignore file (#872, #873, #874).

Full Changelog: 0.4.3...0.4.4

07 May 13:29

leoguillaume

0.4.3post1

af58217

0.4.3post1

This release fixes several side effects introduced in version 0.4.3, particularly around model bootstrapping.

Features

Support for srt, vtt, and verbose_json formats for the /v1/audio/transcriptions endpoint with the whisperx provider (#855, #859).
Added a database constraint to prevent deletion of a user who owns routers or providers (mitigates known issues, see release notes for 0.4.3) (#856).

Bug Fixes

Fixed various UI issues in the playground (#856, #860).
Ignored CVE-2026-33845 in Trivy scans until 2026-08-01 (#857).
Fixed workflow deployment chaining (#858).

Chore

Updated the documentation URL in the Albert API playground (#854).

Full Changelog: 0.4.3...0.4.3post1

30 Apr 17:07

leoguillaume

0.4.3

4f8b615

0.4.3

This release introduces a major redesign of the OGL bootstrap process, with two significant changes.

Configuration-defined models are only created during initial bootstrap

Models defined in the configuration file are now created in the database only when the database is empty. If models already exist, the configuration is ignored.
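
The rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's bootstrap code: `InMemoryModelStore` stands in for the database, and `bootstrap_models` is a hypothetical name.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryModelStore:
    """Stand-in for the database table holding routers and providers."""
    models: list[str] = field(default_factory=list)


def bootstrap_models(store: InMemoryModelStore, configured_models: list[str]) -> bool:
    """Create the configured models only when the store is empty.

    Returns True if the configuration was applied, False if it was ignored.
    """
    if store.models:
        # Models already exist: the database is the source of truth.
        return False
    store.models.extend(configured_models)
    return True


if __name__ == "__main__":
    store = InMemoryModelStore()
    print(bootstrap_models(store, ["embeddings-small"]))  # first startup
    print(bootstrap_models(store, ["another-model"]))     # later startup
    print(store.models)
```

On the second call the configuration is ignored, so later edits to the models section have no effect on an already populated database.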

Installing routers and providers through the configuration file (models section) was incompatible with a dynamic model management approach. Since the database became the single source of truth, this led to confusing behaviors where it was difficult to determine when a router or provider defined in the configuration was actually applied.

To simplify the process, the model-related configuration now acts solely as a bootstrap script for the initial API startup.

Removal of the master user

Previously, a configurable master user could be defined in the configuration file. This user had several responsibilities:
- its password was used as the encryption key for API keys;
- it owned models (routers/providers) created from the configuration;
- it could create additional administrator accounts through the ADMIN permission.

This behavior has now changed.

The API key encryption key is now fully decoupled from any user account and must be configured through a new configuration entry: auth_secret_key.

This improves security by separating responsibilities more clearly.

Additionally, the master user behaved like a “ghost user”: it had no user_id, and its actions were not tracked in the database.

To address this limitation, the API now automatically creates a bootstrap_admin user and a bootstrap_admin role with the ADMIN permission whenever no administrator exists in the database.

The credentials are configured through:
- auth_bootstrap_admin_email (default: admin)
- auth_bootstrap_admin_password (default: changeme)
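
As a small aid when preparing these values, the sketch below generates a random secret and rejects the documented defaults. The release notes do not specify the expected format or length of auth_secret_key, so the URL-safe string produced here is an assumption, and both helper names are hypothetical.

```python
import secrets

# Default values quoted from the release notes.
DEFAULT_ADMIN = "admin"
DEFAULT_PASSWORD = "changeme"


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a random URL-safe string (assumed to be acceptable as auth_secret_key)."""
    return secrets.token_urlsafe(num_bytes)


def uses_default_credentials(admin: str, password: str) -> bool:
    """True if either bootstrap_admin credential is still a documented default."""
    return admin == DEFAULT_ADMIN or password == DEFAULT_PASSWORD


if __name__ == "__main__":
    print(generate_secret_key())
    print(uses_default_credentials("admin", "changeme"))
```

A check like `uses_default_credentials` could run in a deployment script before the API is first started.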

⚠️ Known issues

If some routers or providers were originally created using the legacy master user, the bootstrap_admin account will still be created automatically by the Alembic migration, even if administrator users already exist.

Default credentials security risk

You must change the default bootstrap_admin credentials immediately.

Default values:
- username: admin
- password: changeme
Keeping these defaults creates a critical security vulnerability by exposing a publicly known administrator account.

Deleting the bootstrap_admin user

Deleting the bootstrap_admin user also deletes all associated routers and providers.

This becomes especially problematic if the embeddings model used by the vector store is deleted. In that case, the API can no longer start because:
- the required embeddings model no longer exists;
- models cannot be recreated from the configuration file once other models already exist in the database.

Recovery procedure

1. Remove the ElasticSearch dependency from the configuration file;
2. Start the API;
3. Recreate the embeddings model manually;
4. Re-enable ElasticSearch in the configuration;
5. Restart the API.

This issue will be resolved once the RAG component is extracted into a dedicated OGL project.

Features

Initial support for Langfuse as an alternative to PostgreSQL for usage monitoring (#812).
Add diarized transcription support with WhisperX (#832).

Deprecated

Configuration file arguments auth_master_username and auth_master_key are now deprecated (#779).
The encryption key is replaced by the new auth_secret_key argument. The master user is now created as a real user on the first API startup (or when the database contains no admin user). This user, now called bootstrap_admin, is configured via the new auth_bootstrap_admin_username and auth_bootstrap_admin_password arguments in the configuration file.

Bug fixes

Fix Playground form selectors (#802, #828).
Ensure that passwords are no longer than 72 characters in Playground (#809).
Set default language code from "english" to "en" for audio transcription (#807, #819).
Remove URLs from Prometheus timeseries (v1/document/123/chunks), replace with route pattern (#824).
Reduce the default number of ES shards from 24 to 12 (#829).
Close pdf after reading (#833).

Refactoring

Refactored the admin user creation bootstrap script toward clean architecture (#799, #827).
Refactored /v1/admin/roles endpoints to better align with clean architecture principles, improving maintainability and code organization (#801, #817, #821, #808).
Refactored BaseModelProvider class toward clean architecture (#796, #826, #822).
Refactored the models creation bootstrap script toward clean architecture (#823).

Security

Add Trivy and Semgrep to CI/CD pipeline (#793, #797).
Fix all package versions (#830).
Add the ability to disable pages of the Playground (#835).

Full Changelog: 0.4.2...0.4.3

24 Mar 07:21

leoguillaume

0.4.2

1aa8825

0.4.2

This release improves the developer experience with a new CLI (see the make command) and new documentation built with Astro Starlight. The new documentation is available at https://docs.opengatellm.org/.

Features

Add a new CLI with Rich (#713)
- The make create-user command is replaced by make create-admin, which creates an admin user to help you when contributing to the OpenGateLLM codebase.
Add verbose_json output format for /v1/audio/transcriptions endpoint (#774)

Deprecated

The carbon usage key in the output of model responses is replaced by impacts after the ecologit update. The deprecated key will be removed in v0.5.0 (#791)

Bug fixes

Tool calling on Mistral On prem providers (#773)
Support legacy metadata (#777)
Fix typo on GET /documents/{document_id}/chunks (#782)
Fix public filter of GET /v1/collections (#789)

Refactoring

New documentation with Astro Starlight (#762, #771, #770, #772, #776)
Refactored provider endpoints to better align with clean architecture principles, improving maintainability and code organization (#718, #713, #767, #778, #783)
Optimize integration tests duration (#780)

Security

Add Semgrep workflows (#785, #794)

New Contributors

@github-actions[bot] made their first contribution in #771

Full Changelog: 0.4.1...0.4.2

Contributors

leoguillaume

10 Mar 08:37

leoguillaume

0.4.1post2

5829a33

0.4.1post2

Full Changelog: 0.4.1post1...0.4.1post2

03 Mar 17:36

leoguillaume

0.4.1

525e3fe

0.4.1

This release aims to realign RAG features with market standards, especially the OpenAI API, to provide a more predictable experience for client integrations.

Concretely, it strengthens the end-to-end RAG pipeline: more flexible ingestion (documents with or without a file), fine-grained chunk management, more powerful search (metadata filters, sorting, targeting by collections/documents), and native use of search as a tool in /v1/chat/completions.

At the same time, this version simplifies and modernizes the API surface by removing several legacy elements, aligning parameters with market conventions, and preparing for the breaking changes announced for v0.5.0.

Features

New endpoints to add chunks directly into a document (#660)
- Add optional name argument to POST /v1/documents endpoint and make file argument optional.
  
  Now, you need to provide either name or file. If you provide both, name overrides the file name. If you don't provide file, the document is created without content; use POST /v1/documents/{document_id}/chunks to add content to it.
- Add POST /v1/documents/{document_id}/chunks endpoint to add chunks to a document (empty or not).
  
  With this endpoint, you control the parsing and the chunking of the document. You can also add custom metadata for each chunk. The chunk ID of each chunk is determined by the order of the chunks in the request and incremented by the number of chunks already in the document.
- Removed chunker argument in POST /v1/documents endpoint.
  
  Now, by default, the document is split into chunks using RecursiveCharacterTextSplitter. To split a document into chunks using a different chunker, you can use the disable_chunking argument and directly provide the chunks by POST /v1/documents/{document_id}/chunks.
- Add DELETE /v1/documents/{document_id}/chunks/{chunk_id} endpoint to delete a chunk from a document.
- Add GET /v1/documents/{document_id}/chunks endpoint to get chunks of a document.
- Add GET /v1/documents/{document_id}/chunks/{chunk_id} endpoint to get a chunk from a document.
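
The chunk ID rule described for POST /v1/documents/{document_id}/chunks (request order, offset by the number of chunks already in the document) can be illustrated as follows. This is a sketch, not the server implementation: `assign_chunk_ids` is a hypothetical helper, and the notes do not say whether numbering starts at 0 or 1, so `first_id=1` is an assumption.

```python
def assign_chunk_ids(
    existing_chunk_count: int,
    new_chunks: list[str],
    first_id: int = 1,  # assumption: the starting index is not stated in the notes
) -> list[tuple[int, str]]:
    """Pair each new chunk with an ID that continues after the existing chunks."""
    return [
        (existing_chunk_count + first_id + position, content)
        for position, content in enumerate(new_chunks)
    ]


if __name__ == "__main__":
    # Empty document: IDs follow the order of the request.
    print(assign_chunk_ids(0, ["intro", "body"]))
    # Document that already holds two chunks: numbering continues.
    print(assign_chunk_ids(2, ["appendix"]))
```
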
Add metadata filters in search endpoint (#700)
- Add metadata_filters argument to POST /v1/search endpoint to filter results by metadata. The metadata_filters argument is a list of filters. Each filter is a dictionary with the following keys:
  - key: the key of the metadata to filter by
  - type: the type of the filter: eq (equal to), sw (starts with), ew (ends with) or co (contains)
  - value: the value of the filter

  A filter can also be a compound filter with the following keys:
  - filters: the list of filters
  - operator: the operator to use for the compound filter, either and (AND) or or (OR)
- Replace collections argument by collection_ids argument in POST /v1/search endpoint to filter results by collection IDs.
  
  collections will be removed in v0.5.0. Now, you can pass an empty list to search in all your collections.
- Add document_ids argument to POST /v1/search endpoint to filter results by document IDs.
  
  You can pass an empty list to search in all your documents.
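
To make the filter semantics concrete, the sketch below evaluates one simple or compound filter against a chunk's metadata on the client side. It is not the server's implementation: `matches` is a hypothetical helper, comparing values as strings is an assumption, and how the top-level list of filters is combined is not stated in the notes, so only a single filter is handled.

```python
from typing import Any


def matches(metadata: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Evaluate one simple or compound filter against a metadata dictionary."""
    if "filters" in flt:  # compound filter: combine the nested results
        results = [matches(metadata, sub) for sub in flt["filters"]]
        if flt["operator"] == "and":
            return all(results)
        if flt["operator"] == "or":
            return any(results)
        raise ValueError(f"unknown operator: {flt['operator']!r}")

    if flt["key"] not in metadata:
        return False
    # Assumption: values are compared as strings.
    actual, expected = str(metadata[flt["key"]]), str(flt["value"])
    checks = {
        "eq": actual == expected,
        "sw": actual.startswith(expected),
        "ew": actual.endswith(expected),
        "co": expected in actual,
    }
    if flt["type"] not in checks:
        raise ValueError(f"unknown filter type: {flt['type']!r}")
    return checks[flt["type"]]


if __name__ == "__main__":
    meta = {"author": "alice", "file": "report_2024.pdf"}
    compound = {
        "operator": "and",
        "filters": [
            {"key": "author", "type": "eq", "value": "alice"},
            {"key": "file", "type": "ew", "value": ".pdf"},
        ],
    }
    print(matches(meta, compound))
```
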
Add order_by and order_direction arguments to GET /v1/collections and GET /v1/documents endpoints to sort collections and documents by a field (#709)

Add search built-in tool to POST /v1/chat/completions endpoint to search for chunks in your collections and documents (#708)

Pass this tool like:

{ 
   "messages": [
      {
         "role": "user",
         "content": "What is the capital of France?"
      }
   ],
   "model": "openweight-large",
   "tools": [{
      "type": "search",
      "method": "semantic",
      "limit": 10,
      "offset": 0,
      "rff_k": 60,
      ...
   }]
...
}

Arguments are the same as for the POST /v1/search endpoint. The query is replaced by the last message in the request. This way of calling search replaces the deprecated search argument of the POST /v1/chat/completions endpoint, which will be removed in v0.5.0.
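
A client could assemble such a request body as sketched below. Only the fields shown in the example above and the collection_ids argument come from the release notes; `build_chat_request` is a hypothetical helper, integer collection IDs are an assumption, and sending the request (base URL, API key, HTTP client) is deliberately left out.

```python
import json


def build_chat_request(
    question: str,
    model: str,
    collection_ids: list[int],  # assumption: IDs are integers
    limit: int = 10,
) -> dict:
    """Build a /v1/chat/completions body that enables the built-in search tool."""
    return {
        "model": model,
        # The last message is used as the search query.
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "search",
                "method": "semantic",
                "limit": limit,
                "collection_ids": collection_ids,
            }
        ],
    }


if __name__ == "__main__":
    body = build_chat_request("What is the capital of France?", "openweight-large", [1, 2])
    print(json.dumps(body, indent=2))
```
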

Support vLLM rerank models (#719)
Add additional Prometheus metrics for inference (#768)

Removed

Rerank legacy arguments of /v1/rerank endpoint are removed (#705)
- Since rerank is aligned with the Cohere API standard, we removed the legacy arguments prompt and input. These arguments were replaced by the query and documents arguments.
- POST /v1/rerank endpoint is compatible with OpenWebUI integration.
Removed POST /v1/files and POST /v1/ocr-beta endpoints (#698)

These endpoints were deprecated in v0.3.4 and replaced by POST /v1/documents endpoint.
k parameter in POST /v1/search endpoint is removed, replaced by limit argument. (#694)
Proconnect draft support is removed (#693)

Deprecated

collection argument of POST /v1/documents endpoint is deprecated, replaced by collection_id, removed in v0.5.0. (#660)
GET /v1/chunks/{document}/{chunk} endpoint is deprecated, replaced by GET /v1/documents/{document_id}/chunks/{chunk_id}, removed in v0.5.0. (#660)
GET /v1/chunks endpoint is deprecated, replaced by GET /v1/documents/{document_id}/chunks, removed in v0.5.0. (#660)
POST /v1/parse-beta endpoint is deprecated, removed in v0.5.0. (#698)
collections argument in POST /v1/search endpoint is deprecated, replaced by collection_ids, removed in v0.5.0. (#700)
search argument in POST /v1/chat/completions endpoint is deprecated, replaced by built-in search tool in tools argument, removed in v0.5.0. (#708)
prompt argument in POST /v1/search endpoint is deprecated, replaced by query argument, removed in v0.5.0. (#708)

Bug fixes

Fixed an issue affecting streaming chat responses in models to ensure proper real-time output delivery (#692)
Version 0.4.0 introduced a streaming bug that has now been fixed. The handling of streaming and its related errors has been completely revised to use aiter_lines instead of aiter_raw, in order to ensure proper streaming formatting as reliably as possible.
Added proper threshold handling in the search module to improve result filtering behavior (#684)
Prometheus metrics now support gunicorn multi workers (#681)
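
Regarding the streaming fix (#692): line-based iteration matters for streamed responses because a network chunk can end in the middle of a `data:` line. The sketch below is not the project's code. It reassembles complete lines from arbitrary byte chunks, which is roughly the guarantee a line iterator such as `aiter_lines` provides over raw chunk iteration; `split_lines` is a hypothetical helper written for this illustration.

```python
from collections.abc import Iterable, Iterator


def split_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield complete text lines from byte chunks that may split a line anywhere."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently held in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.rstrip(b"\r").decode("utf-8")
    if buffer:
        # Trailing data that was not terminated by a newline.
        yield buffer.decode("utf-8")


if __name__ == "__main__":
    # The first event is split across two chunks, as can happen on the wire.
    raw = [b'data: {"a"', b': 1}\n\ndata: [DO', b'NE]\n']
    for line in split_lines(raw):
        print(repr(line))
```

Forwarding the raw chunks unchanged would hand the client a partial `data:` line, whereas the line-based view only ever yields whole lines.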

Refactoring

Refactored routers endpoints to better align with clean architecture principles, improving maintainability and code organization (#658, #707, #712)
Refactored create providers endpoints to improve maintainability and code organization (#696)
Improved formatting of configuration errors for clearer and more actionable feedback (#672)
Refactored app creation to clean architecture (#691)

Chore

Removed router registry helper (#697)
Updated issue templates (#711)
Deleted import_circular_diagram.md (#699)

Full Changelog: 0.4.0...0.4.1
