Releases: etalab-ia/OpenGateLLM
0.4.8
What's Changed
- chore: move admin schemas in dedicated admin schema folder by @leoguillaume in #928
- fix: circular import by @leoguillaume in #929
- refacto(keys): simplify decode api key logic by @leoguillaume in #930
- chore: rename user_with_role to authenticated_user by @leoguillaume in #931
- chore: remove useless fields of authenticated_user by @leoguillaume in #932
- chore: add npm minimum release age by @kaaloo in #907
- chore(deps-dev): bump tmp from 0.2.5 to 0.2.7 in /docs by @dependabot[bot] in #895
- chore(doc): update generated documentation and release versions by @github-actions[bot] in #916
- feat(users): add id suffix to user and organization attribut of user entity user creation endpoint by @leoguillaume in #934
- refacto(keys): refacto POST /v1/admin/keys endpoint toward clean architecture by @leoguillaume in #933
- refacto(auth): refactoring of /login toward clean architecture by @leoguillaume in #937
- fix(auth): post review changes by @leoguillaume in #938
- fix(postges): close session before calling LLMs in chat completion by @benjaminpilia in #940
Full Changelog: 0.4.7...0.4.8
0.4.7
Features
- Add a copy button when creating an API key on the keys page of the Playground UI (#896).
- Call the /metrics endpoint of the vLLM provider to set the health of models (#911).
Refactoring
- Refactoring of DELETE /v1/admin/users/{user_id} endpoint toward clean architecture (#898).
- Refactoring of POST /v1/rerank endpoint toward clean architecture (#905).
Bug fixes
- Fix search by email in user admin page of Playground UI (#909).
Chore
- Change default refresh time of ES index to 2s (#904).
Full Changelog: 0.4.6...0.4.7
0.4.6
Features
- Add the possibility to set storage limits for each role (#899).
- Add document chunk upload, limited to 20 MB per document (#902).
Refactoring
- Refactoring of GET /v1/admin/users endpoint toward clean architecture (#893).
Bug fixes
- Handle Mistral API non string content (#892).
Chore
- Update generated documentation and release versions (#891).
- Optimize CI/CD build (#900).
- Add a Docker ignore file and remove the ARM build, which was blocking the GitHub Actions CI/CD (#901).
Full Changelog: 0.4.5...0.4.6
0.4.5
Refactoring
- Split GetModelsUseCase into two use cases (#890).
- Refactoring of POST /v1/admin/users endpoint toward clean architecture (#867).
Bug Fixes
- Fix context injection into Langfuse (#889).
Full Changelog: 0.4.4...0.4.5
0.4.4
This release fixes several side effects introduced in version 0.4.3, particularly around model bootstrapping.
Features
- Support for srt, vtt, and verbose_json formats for the /v1/audio/transcriptions endpoint with the whisperx provider (#855, #859).
- Added a database constraint to prevent deletion of a user who owns routers or providers (mitigates known issues, see release notes for 0.4.3) (#856).
- New endpoint /health/models to check the health status of each model (#870). A more accurate method to detect overload on a model will be added in a future release. Currently, detection is based only on Little's law.
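For readers unfamiliar with Little's law, it states that the average number of requests in a system equals the arrival rate multiplied by the average time each request spends in it (L = λW). The sketch below shows how such an estimate could be turned into an overload signal. The function names and the `max_concurrency` threshold are hypothetical; the release notes do not describe how /health/models actually applies the law.

```python
def expected_in_flight(arrival_rate_per_s: float, mean_latency_s: float) -> float:
    """Little's law: average requests in the system = arrival rate * mean time in system."""
    if arrival_rate_per_s < 0 or mean_latency_s < 0:
        raise ValueError("rate and latency must be non-negative")
    return arrival_rate_per_s * mean_latency_s


def is_overloaded(arrival_rate_per_s: float, mean_latency_s: float, max_concurrency: float) -> bool:
    """Hypothetical rule: flag a model when the estimated load exceeds a configured capacity."""
    return expected_in_flight(arrival_rate_per_s, mean_latency_s) > max_concurrency


# 4 requests per second, each taking 2.5 s on average, keeps about 10 requests in flight.
estimated_load = expected_in_flight(4.0, 2.5)
```

With these numbers, a model sized for 8 concurrent requests would be flagged, while one sized for 16 would not.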
Bug Fixes
- Fixed various UI issues in the playground (#856, #860).
- Ignored CVE-2026-33845 in Trivy scans until 2026-08-01 (#857).
- Fixed workflow deployment chaining (#858).
- Fix base_url configuration argument for Langfuse (#868).
Refactoring
Chore
- Updated the documentation URL in the Albert API playground (#854, #862).
- Ignore some CVEs in the Debian Docker image for Trivy scans, see the .trivyignore file (#872, #873, #874).
Full Changelog: 0.4.3...0.4.4
0.4.3post1
This release fixes several side effects introduced in version 0.4.3, particularly around model bootstrapping.
Features
- Support for srt, vtt, and verbose_json formats for the /v1/audio/transcriptions endpoint with the whisperx provider (#855, #859).
- Added a database constraint to prevent deletion of a user who owns routers or providers (mitigates known issues, see release notes for 0.4.3) (#856).
Bug Fixes
- Fixed various UI issues in the playground (#856, #860).
- Ignored CVE-2026-33845 in Trivy scans until 2026-08-01 (#857).
- Fixed workflow deployment chaining (#858).
Chore
- Updated the documentation URL in the Albert API playground (#854).
Full Changelog: 0.4.3...0.4.3post1
0.4.3
This release introduces a major redesign of the OGL bootstrap process, with two significant changes.
- Configuration-defined models are only created during initial bootstrap
  Models defined in the configuration file are now created in the database only when the database is empty. If models already exist, the configuration is ignored.
  Installing routers and providers through the configuration file (models section) was incompatible with a dynamic model management approach. Since the database became the single source of truth, this led to confusing behaviors where it was difficult to determine when a router or provider defined in the configuration was actually applied.
  To simplify the process, the model-related configuration now acts solely as a bootstrap script for the initial API startup.
- Removal of the master user
  Previously, a configurable master user could be defined in the configuration file. This user had several responsibilities:
  - its password was used as the encryption key for API keys;
  - it owned models (routers/providers) created from the configuration;
  - it could create additional administrator accounts through the ADMIN permission.
  This behavior has now changed.
  The API key encryption key is now fully decoupled from any user account and must be configured through a new configuration entry: auth_secret_key.
  This improves security by separating responsibilities more clearly.
  Additionally, the master user behaved like a “ghost user”: it had no user_id, and its actions were not tracked in the database.
  To address this limitation, the API now automatically creates a bootstrap_admin user and a bootstrap_admin role with the ADMIN permission whenever no administrator exists in the database.
  The credentials are configured through:
  - auth_bootstrap_admin_email (default: admin)
  - auth_bootstrap_admin_password (default: changeme)
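The two bootstrap rules above can be summarized as a small decision function. This is only an illustrative sketch: `BootstrapPlan`, `plan_bootstrap`, and `uses_default_credentials` are hypothetical names, not OGL code, and "the database is empty" is interpreted here as "no models exist yet". The default values `admin` and `changeme` are the ones listed above.

```python
from dataclasses import dataclass

# Defaults as stated in these release notes.
DEFAULT_ADMIN_EMAIL = "admin"
DEFAULT_ADMIN_PASSWORD = "changeme"


@dataclass(frozen=True)
class BootstrapPlan:
    create_config_models: bool    # apply the models section of the configuration file
    create_bootstrap_admin: bool  # create the bootstrap_admin user and role


def plan_bootstrap(models_in_db: int, admins_in_db: int) -> BootstrapPlan:
    """Decide what the startup bootstrap should do (assumed reading of the rules above)."""
    return BootstrapPlan(
        create_config_models=(models_in_db == 0),
        create_bootstrap_admin=(admins_in_db == 0),
    )


def uses_default_credentials(email: str, password: str) -> bool:
    """True when either credential is still the publicly known default."""
    return email == DEFAULT_ADMIN_EMAIL or password == DEFAULT_ADMIN_PASSWORD
```

For example, `plan_bootstrap(models_in_db=3, admins_in_db=0)` skips the configuration-defined models but still creates the bootstrap administrator. A check like `uses_default_credentials` could be run at deployment time to refuse startup with the default values.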
⚠️ Known issues
If some routers or providers were originally created using the legacy master user, the bootstrap_admin account will still be created automatically by the Alembic migration, even if administrator users already exist.
- Default credentials security risk
  You must change the default bootstrap_admin credentials immediately.
  Default values:
  - username: admin
  - password: changeme
  Keeping these defaults creates a critical security vulnerability by exposing a publicly known administrator account.
- Deleting the bootstrap_admin user
  Deleting the bootstrap_admin user also deletes all associated routers and providers.
  This becomes especially problematic if the embeddings model used by the vector store is deleted. In that case, the API can no longer start because:
  - the required embeddings model no longer exists;
  - models cannot be recreated from the configuration file once other models already exist in the database.
Recovery procedure
- Remove the ElasticSearch dependency from the configuration file;
- Start the API;
- Recreate the embeddings model manually;
- Re-enable ElasticSearch in the configuration;
- Restart the API.
This issue will be resolved once the RAG component is extracted into a dedicated OGL project.
Features
- Initial support for Langfuse as an alternative to PostgreSQL for usage monitoring (#812).
- Add diarized transcription support with WhisperX (#832).
Deprecated
- Configuration file arguments auth_master_username and auth_master_key are now deprecated (#779).
  The encryption key is replaced by the new auth_secret_key argument. The master user is now created as a real user on the first API startup (or when the database contains no admin user). This user, now called bootstrap_admin, is configured via the new auth_bootstrap_admin_username and auth_bootstrap_admin_password arguments in the configuration file.
Bug fixes
- Fix Playground form selectors (#802, #828).
- Ensure that passwords are no longer than 72 characters in Playground (#809).
- Set default language code from "english" to "en" for audio transcription (#807, #819).
- Remove concrete URLs (e.g. v1/document/123/chunks) from Prometheus time series and replace them with the route pattern (#824).
- Reduce the default number of ES shards from 24 to 12 (#829).
- Close PDF files after reading (#833).
Refactoring
- Refactored the admin user creation bootstrap script toward clean architecture (#799, #827).
- Refactored /v1/admin/roles endpoints to better align with clean architecture principles, improving maintainability and code organization (#801, #817, #821, #808).
- Refactored BaseModelProvider class toward clean architecture (#796, #826, #822).
- Refactored the models creation bootstrap script toward clean architecture (#823).
Security
- Add Trivy and Semgrep to CI/CD pipeline (#793, #797).
- Pin all package versions (#830).
- Add the possibility to disable pages of the Playground (#835).
Full Changelog: 0.4.2...0.4.3
0.4.2
This release improves the developer experience with a new CLI (see the make command) and new documentation built with Astro Starlight. The new documentation is available here: https://docs.opengatellm.org/.
Features
- Add a new CLI with Rich (#713)
  The make create-user command is replaced by make create-admin, which creates an admin user to help you when you contribute to the OpenGateLLM codebase.
- Add verbose_json output format for the /v1/audio/transcriptions endpoint (#774)
Deprecated
- The carbon usage key in the output of model responses is replaced by impacts after the EcoLogits update. This key will be removed in v0.5.0 (#791)
Bug fixes
- Fix tool calling on Mistral on-premise providers (#773)
- Support legacy metadata (#777)
- Fix typo on GET /documents/{document_id}/chunks (#782)
- fix(collections): public filter of GET /v1/collections by @leoguillaume in #789
Refactoring
- New documentation with Astro Starlight (#762, #771, #770, #772, #776)
- Refactored provider endpoints to better align with clean architecture principles, improving maintainability and code organization (#718, #713, #767, #778, #783)
- Optimize integration tests duration (#780)
Security
New Contributors
- @github-actions[bot] made their first contribution in #771
Full Changelog: 0.4.1...0.4.2
0.4.1post2
Full Changelog: 0.4.1post1...0.4.1post2
0.4.1
This release aims to realign RAG features with market standards, especially the OpenAI API, to provide a more predictable experience for client integrations.
Concretely, it strengthens the end-to-end RAG pipeline: more flexible ingestion (documents with or without a file), fine-grained chunk management, more powerful search (metadata filters, sorting, targeting by collections/documents), and native use of search as a tool in /v1/chat/completions.
At the same time, this version simplifies and modernizes the API surface by removing several legacy elements, aligning parameters with market conventions, and preparing for the breaking changes announced for v0.5.0.
Features
- New endpoints to add chunks directly into a document (#660)
  - Add optional name argument to POST /v1/documents endpoint and make file argument optional.
    Now, you need to provide either name or file. If you provide both, name overrides the file name. If you don't provide file, the document is created without content; use POST /v1/documents/{document_id}/chunks to add content to it.
  - Add POST /v1/documents/{document_id}/chunks endpoint to add chunks to a document (empty or not).
    With this endpoint, you control the parsing and the chunking of the document. You can also add custom metadata for each chunk. The chunk ID of each chunk is determined by the order of the chunks in the request, offset by the number of chunks already in the document.
  - Removed chunker argument in POST /v1/documents endpoint.
    Now, by default, the document is split into chunks using RecursiveCharacterTextSplitter. To split a document into chunks using a different chunker, you can use the disable_chunking argument and directly provide the chunks by POST /v1/documents/{document_id}/chunks.
  - Add DELETE /v1/documents/{document_id}/chunks/{chunk_id} endpoint to delete a chunk from a document.
  - Add GET /v1/documents/{document_id}/chunks endpoint to get chunks of a document.
  - Add GET /v1/documents/{document_id}/chunks/{chunk_id} endpoint to get a chunk from a document.
- Add metadata filters in search endpoint (#700)
  - Add metadata_filters argument to POST /v1/search endpoint to filter results by metadata. The metadata_filters argument is a list of filters; each filter is a dictionary with the following keys:
    - key: the key of the metadata to filter by
    - type: the type of the filter (eq (equal to), sw (starts with), ew (ends with) or co (contains))
    - value: the value of the filter
    It can also be a compound filter with the following keys:
    - filters: the list of filters
    - operator: the operator to use for the compound filter (and (AND) or or (OR))
  - Replace collections argument by collection_ids argument in POST /v1/search endpoint to filter results by collection IDs.
    collections will be removed in v0.5.0. Now, you can pass an empty list to search in all your collections.
  - Add document_ids argument to POST /v1/search endpoint to filter results by document IDs.
    You can pass an empty list to search in all your documents.
- Add order_by and order_direction arguments to GET /v1/collections and GET /v1/documents endpoints to sort collections and documents by a field (#709)
- Add a built-in search tool to the POST /v1/chat/completions endpoint to search for chunks in your collections and documents (#708)
  - Pass this tool like:
    {
      "messages": [
        {
          "role": "user",
          "content": "What is the capital of France?"
        }
      ],
      "model": "openweight-large",
      "tools": [{
        "type": "search",
        "method": "semantic",
        "limit": 10,
        "offset": 0,
        "rff_k": 60,
        ...
      }]
      ...
    }
    Arguments are the same as for the POST /v1/search endpoint. The query is replaced by the last message in the request. This way of calling replaces the deprecated search argument in the POST /v1/chat/completions endpoint, which will be removed in v0.5.0.
- Support vLLM rerank models (#719)
- Add additional Prometheus metrics for inference (#768)
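As an illustration of the metadata filter structure described above, the following sketch builds a POST /v1/search request body in plain Python without sending it. The helper names are hypothetical, and the exact payload layout (for instance, whether a compound filter is passed on its own or wrapped in a list) is an assumption to verify against the API reference. Only the key names and allowed values come from these notes.

```python
from typing import Any, Optional

FILTER_TYPES = {"eq", "sw", "ew", "co"}  # equal to, starts with, ends with, contains
OPERATORS = {"and", "or"}


def metadata_filter(key: str, filter_type: str, value: Any) -> dict:
    """Build a simple filter with the key/type/value keys listed above."""
    if filter_type not in FILTER_TYPES:
        raise ValueError(f"unknown filter type: {filter_type!r}")
    return {"key": key, "type": filter_type, "value": value}


def compound_filter(operator: str, filters: list) -> dict:
    """Build a compound filter with the filters/operator keys listed above."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    if not filters:
        raise ValueError("a compound filter needs at least one filter")
    return {"filters": list(filters), "operator": operator}


def build_search_payload(query: str, collection_ids=(), document_ids=(),
                         metadata_filters: Optional[Any] = None, limit: int = 10) -> dict:
    """Assemble a request body; empty ID lists mean all collections or documents."""
    payload = {
        "query": query,
        "collection_ids": list(collection_ids),
        "document_ids": list(document_ids),
        "limit": limit,
    }
    if metadata_filters is not None:
        payload["metadata_filters"] = metadata_filters
    return payload


# Example: chunks whose "lang" metadata equals "fr" and whose "source" starts with "https://".
# The metadata keys "lang" and "source" are made up for this example.
filters = compound_filter("and", [
    metadata_filter("lang", "eq", "fr"),
    metadata_filter("source", "sw", "https://"),
])
payload = build_search_payload("capital of France", metadata_filters=[filters])
```

Validating the type and operator values on the client side gives earlier feedback than waiting for the API to reject the request.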
Removed
- Rerank legacy arguments of /v1/rerank endpoint are removed (#705)
  - Since rerank is aligned with the Cohere API standard, we removed the legacy arguments prompt and input. These arguments were replaced by the query and documents arguments.
  - POST /v1/rerank endpoint is compatible with OpenWebUI integration.
- Removed POST /v1/files and POST /v1/ocr-beta endpoints (#698)
  These endpoints were deprecated in v0.3.4 and replaced by POST /v1/documents endpoint.
- k parameter in POST /v1/search endpoint is removed, replaced by limit argument. (#694)
- Proconnect draft support is removed (#693)
Deprecated
- collection argument of POST /v1/documents endpoint is deprecated, replaced by collection_id, removed in v0.5.0. (#660)
- GET /v1/chunks/{document}/{chunk} endpoint is deprecated, replaced by GET /v1/documents/{document_id}/chunks/{chunk_id}, removed in v0.5.0. (#660)
- GET /v1/chunks endpoint is deprecated, replaced by GET /v1/documents/{document_id}/chunks, removed in v0.5.0. (#660)
- POST /v1/parse-beta endpoint is deprecated, removed in v0.5.0. (#698)
- collections argument in POST /v1/search endpoint is deprecated, replaced by collection_ids, removed in v0.5.0. (#700)
- search argument in POST /v1/chat/completions endpoint is deprecated, replaced by the built-in search tool in tools argument, removed in v0.5.0. (#708)
- prompt argument in POST /v1/search endpoint is deprecated, replaced by query argument, removed in v0.5.0. (#708)
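Client code that still sends the old POST /v1/search argument names can be updated mechanically. The sketch below renames the deprecated collections and prompt arguments, as well as the removed k parameter, to their replacements. `migrate_search_payload` is a hypothetical helper, and it assumes the values can be carried over unchanged, which these notes do not confirm.

```python
# Old name -> new name for POST /v1/search, as listed in these release notes.
SEARCH_RENAMES = {
    "collections": "collection_ids",
    "prompt": "query",
    "k": "limit",
}


def migrate_search_payload(payload: dict) -> dict:
    """Return a copy of a search request body using the new argument names."""
    for old, new in SEARCH_RENAMES.items():
        if old in payload and new in payload:
            # Refuse to guess which value should win.
            raise ValueError(f"both {old!r} and {new!r} are present")
    return {SEARCH_RENAMES.get(key, key): value for key, value in payload.items()}


legacy = {"prompt": "capital of France", "collections": [12], "k": 5}
migrated = migrate_search_payload(legacy)
# migrated == {"query": "capital of France", "collection_ids": [12], "limit": 5}
```

Raising on conflicting keys, rather than silently preferring one, keeps the migration from hiding a client bug.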
Bug fixes
- Fixed an issue affecting streaming chat responses in models to ensure proper real-time output delivery (#692)
  Version 0.4.0 introduced a streaming bug that has now been fixed. The handling of streaming and its related errors has been completely revised to use aiter_lines instead of aiter_raw, in order to ensure proper streaming formatting as reliably as possible.
- Added proper threshold handling in the search module to improve result filtering behavior (#684)
- Prometheus metrics now support multiple gunicorn workers (#681)
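To see why line-based iteration matters for streaming, note that raw network chunks can end in the middle of an event line, so forwarding them unchanged may produce malformed output. The sketch below is a plain-Python illustration of reassembling complete lines from arbitrary byte chunks. It is not the OGL implementation, nor that of the HTTP client library, which these notes do not name.

```python
from typing import Iterable, Iterator


def iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield complete text lines from byte chunks that may split lines arbitrarily."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently in the buffer; keep the remainder.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.rstrip(b"\r").decode("utf-8")
    if buffer:
        # Trailing data without a final newline.
        yield buffer.decode("utf-8")


# Two server-sent events, cut at awkward places by the transport.
raw_chunks = [b'data: {"a"', b': 1}\n\ndata: [DO', b'NE]\n\n']
lines = list(iter_lines(raw_chunks))
# lines == ['data: {"a": 1}', '', 'data: [DONE]', '']
```

Decoding only complete lines also avoids errors when a multi-byte UTF-8 character is split across two chunks.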
Refactoring
- Refactored routers endpoints to better align with clean architecture principles, improving maintainability and code organization (#658, #707, #712)
- Refactored create providers endpoints to improve maintainability and code organization (#696)
- Improved formatting of configuration errors for clearer and more actionable feedback (#672)
- Refactored app creation to clean architecture (#691)
Chore
- Removed router registry helper (#697)
- Updated issue templates (#711)
- Deleted import_circular_diagram.md (#699)
Full Changelog: 0.4.0...0.4.1