Releases · etalab-ia/OpenGateLLM

12 Feb 18:56

leoguillaume

0.4.0post1

c619776

v0.4.0post1 is a patch release with bug fixes and refactoring improvements.

Refactoring

Refactored routers to better align with clean architecture principles, improving maintainability and code organization (#658)
Improved formatting of configuration errors for clearer and more actionable feedback (#672)

Bug fixes

Fixed an issue affecting streaming chat responses in models to ensure proper real-time output delivery (#692)
Version 0.4.0 introduced a streaming bug, which is now fixed. Streaming and its error handling have been reworked to use aiter_lines instead of aiter_raw, so that streamed output is formatted as reliably as possible.
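To illustrate why line-based iteration helps, here is a minimal sketch in plain Python. It does not call httpx and is not the project's implementation. It only mimics the buffering a line iterator performs, assuming a server-sent-events style payload. The `iter_lines` helper and the sample chunks are hypothetical.

```python
from typing import Iterable, Iterator


def iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Reassemble complete lines from arbitrarily split byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently held in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode("utf-8").rstrip("\r")
    if buffer:
        # Flush a trailing line that was not newline-terminated.
        yield buffer.decode("utf-8")


# Hypothetical raw chunks: both events are cut in the middle by the transport.
raw_chunks = [b'data: {"delta": "Hel', b'lo"}\n\ndata: [DO', b'NE]\n\n']

# Blank lines separate events; drop them to keep only the data lines.
lines = [line for line in iter_lines(raw_chunks) if line]
print(lines)
```

Consuming raw chunks directly would hand the first event to the client as two broken fragments, whereas the line-based view always yields whole `data:` lines.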
Added proper threshold handling in the search module to improve result filtering behavior (#684)

Full Changelog: 0.4.0...0.4.0post1

09 Feb 17:45

leoguillaume

0.4.0

620de23

With the release of OpenGateLLM version 0.4.0 (previous version 0.3.7), we have decided to revise our approach based on Elasticsearch's recommended best practices. The main changes are as follows:

Deprecation of Qdrant support in favor of Elasticsearch
Consolidation of Elasticsearch indices into a single index
Convert document metadata into a single field with constraints

These major changes require a data migration of your vector store. We provide a migration script here to help you update your instance.

Introduction

Currently, we support two vector store technologies: Qdrant and Elasticsearch. A few months ago, we decided to focus on Elasticsearch for managing our document collections. We revisit the reasons for this choice below.

However, over the past few weeks, we have encountered scalability issues with Elasticsearch. These problems stem from how we implemented Elasticsearch in OpenGateLLM. To resolve these issues, we decided to revise our approach, which involves major changes and a data migration.

To this end, we detail the modifications we have made and provide a migration script to help you update your instance.

Why Elasticsearch over Qdrant?

In Retrieval-Augmented Generation (RAG), there are 3 classic search methods:

Lexical search with BM25 (a TF-IDF-based ranking function)
Semantic search with vector similarity
Hybrid search combining both (using the Reciprocal Rank Fusion (RRF) algorithm to combine results)
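As an illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion. It is not OpenGateLLM's implementation: the function name, the document IDs, and the default `k = 60` (a commonly used constant) are assumptions. Each document scores the sum of `1 / (k + rank)` over the result lists in which it appears, with ranks starting at 1.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the denominator stays positive even when k is 0.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from a lexical and a semantic search.
lexical = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([lexical, semantic])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear near the top of both lists (here `doc_b` and `doc_a`) end up ahead of documents returned by only one method.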

We sought to offer these 3 search methods to our users. Initially, we implemented Qdrant for semantic search due to its scalability.

However, when we set out to add lexical and hybrid search, we found that Qdrant does not natively support these methods. Its approach relies on deploying a model alongside the vector store.

Additionally, Elasticsearch excels at lexical search with BM25 and natively enables complex filtering on specific fields. For these reasons, Elasticsearch seems like a better solution for RAG search.

OpenGateLLM's goal is to support multiple vector store solutions to give you the choice of the technology that best suits your needs.

Major Changes

End of Qdrant Support

To focus on Elasticsearch, we have decided to deprecate Qdrant support. We made this decision after consulting the community: it turns out that, currently, no one has chosen Qdrant for their OpenGateLLM instance.

Additionally, with the OpenGateLLM team having limited resources, we cannot afford to maintain two vector store solutions at this time.

We do not rule out revisiting this decision in the future if the community requests it. Moreover, the goal remains to support multiple vector store solutions in the long term, once we have the necessary resources.

Consolidation of Elasticsearch Indices into a Single Index

Currently, OpenGateLLM creates an Elasticsearch index for each collection. This approach allows collections to be managed independently, but it does not scale well. By default, Elasticsearch limits the number of shards, and it creates at least one shard for each index. In this context, a growing number of indices can quickly become a performance bottleneck.

A good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards, but the further below this limit you can keep it the better. Source: How many shards should I have in my Elasticsearch cluster?
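The arithmetic behind this rule of thumb can be sketched as follows. The helper names, the single-node setup, and the one-primary-plus-one-replica layout per index are assumptions made for illustration, not values taken from OpenGateLLM.

```python
import math

SHARDS_PER_GB_HEAP = 20  # rule of thumb quoted above


def max_recommended_shards(heap_gb: float) -> int:
    """Upper bound on shards for a node with the given heap size."""
    return math.floor(heap_gb * SHARDS_PER_GB_HEAP)


def total_shards(index_count: int, primaries: int = 1, replicas: int = 1) -> int:
    """Shards created by index_count indices (primaries plus replica copies)."""
    return index_count * primaries * (1 + replicas)


budget = max_recommended_shards(30)            # node with a 30 GB heap
per_collection = total_shards(index_count=1000)  # one index per collection
single_index = total_shards(index_count=1)       # consolidated index

print(budget, per_collection, single_index)
print("per-collection layout fits:", per_collection <= budget)
print("single-index layout fits:", single_index <= budget)
```

Under these assumptions, a thousand collections with one index each already exceed the recommended budget of a 30 GB node, while a single consolidated index stays far below it.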

To solve this problem, we decided to consolidate all Elasticsearch indices into a single index. To migrate your data, we provide a migration script (see Migration Script).
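With a single index, a search is scoped to collections by filtering on a field rather than by choosing an index. The sketch below builds such a request body with the Elasticsearch query DSL. The field names `content` and `collection_id` and the helper name are assumptions for illustration and may differ from the actual index schema.

```python
import json


def build_search_body(query: str, collection_ids: list[int], size: int = 10) -> dict:
    """Build a lexical search restricted to the given collections."""
    if not collection_ids:
        raise ValueError("at least one collection ID is required")
    return {
        "size": size,
        "query": {
            "bool": {
                # Scored part of the query: full-text match on the chunk text.
                "must": [{"match": {"content": query}}],
                # Non-scored part: keep only chunks from the requested collections.
                "filter": [{"terms": {"collection_id": collection_ids}}],
            }
        },
    }


body = build_search_body("shard limits", [12, 34], size=5)
print(json.dumps(body, indent=2))
```

Placing the collection restriction in the `filter` clause keeps it out of relevance scoring, so it only narrows the set of candidate chunks.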

Convert document metadata into a single field with constraints

Currently, when creating a document, users can attach metadata to it. They are free to define any metadata keys they want, with values of the following types: int, str, float, datetime, or bool. Each metadata key is stored in a separate field in the Elasticsearch index. This dynamic addition of fields would quickly become problematic once indices are consolidated into a single index, because Elasticsearch is not designed to optimally support thousands of fields on one index. This risks creating performance and scalability issues.

To address this issue, we decided to convert the metadata into a single field of type flattened. This solution limits filtering on metadata, since all values are stored in a single field and interpreted as str (see the Elasticsearch documentation). Nevertheless, our tests have shown that the filtering capabilities of a flattened field seem sufficient for RAG search operations.
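As a rough sketch of what this looks like in practice, the mapping below declares a single flattened field, and the helper builds an exact-match filter on one metadata key. The field name `metadata`, the helper name, and the sample keys are assumptions for illustration. Because flattened values are indexed as keywords, the helper compares values as strings.

```python
from typing import Union

# Index mapping fragment: all metadata keys live under one flattened field,
# so adding new keys does not add new fields to the index mapping.
INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "metadata": {"type": "flattened"},
        }
    }
}


def metadata_term_filter(key: str, value: Union[str, int]) -> dict:
    """Exact-match filter on one metadata key of the flattened field."""
    # Values are matched as keywords, hence the explicit string conversion.
    return {"term": {f"metadata.{key}": str(value)}}


print(metadata_term_filter("author", "alice"))
print(metadata_term_filter("year", 2024))
```

The trade-off is that typed operations, such as numeric range comparisons, no longer behave as they would on a dedicated numeric field.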

Additionally, at the Pydantic level, we have added constraints on the types of data that can be stored in the metadata field.

From now on, the metadata field must comply with the following constraints:

from typing import Annotated

from pydantic import Field, StringConstraints

# Bounds applied to integer and float metadata values
MIN_NUMBER, MAX_NUMBER = -9999999999999999, 9999999999999999

MetadataStr = Annotated[str, StringConstraints(strip_whitespace=True, min_length=1, max_length=255)]
MetadataInt = Annotated[int, Field(ge=MIN_NUMBER, le=MAX_NUMBER)]
MetadataFloat = Annotated[float, Field(ge=MIN_NUMBER, le=MAX_NUMBER)]
MetadataList = Annotated[list[MetadataStr | MetadataInt | MetadataFloat | bool | None], Field(max_length=8)]

# A metadata dict holds between 1 and 8 keys
ChunkMetadata = Annotated[dict[MetadataStr, MetadataStr | MetadataInt | MetadataFloat | MetadataList | bool | None], Field(description="Extra metadata for the source", min_length=1, max_length=8)]

One possible solution would have been to define fields with the flattened type. However, this solution only partially solves the performance problem and limits filtering actions on these fields (they are then stored in a single field and interpreted as str).

To ensure the scalability of the Elasticsearch index, we decided to pre-define metadata for documents. This approach avoids overloading the index with metadata while maintaining type-based filtering capabilities.

Other Changes

Fixes

Fixed minor bugs in the Playground:
- User expiration date formatting
- Removal of the old collection ID type
- Sorting and filters on Router and Provider pages
- Removal of all roles and organizations for user creation
Fixed support for the language parameter for audio transcription models with vLLM and Albert API so it can be empty.
The collections parameter in the search endpoint is now correctly typed as list[int].
The rff_k property used in hybrid search now accepts values between 0 (included) and 16384 (included). This fix improves the readability of the endpoint and fixes a division by zero error.

Improvements

Improved code readability for form data request declarations.
The usage key is now returned in stream responses from /v1/chat/completions even if the stream does not end with the [DONE] token.
The collections parameter in the search endpoint now has a maximum length of 100 to avoid overwhelming the Elasticsearch index.
collection_id and document_id have been moved to the chunk level. Previously, they were part of the chunk's metadata field, which could have led users to believe these values were editable.

Migration Script

If you are running OpenGateLLM on an existing Elasticsearch instance, we invite you to use the migration script to migrate your data. Find the migration script in the GitHub repository.

Full Changelog: 0.3.7...0.4.0

fix(playground): expiration user date format when user creation by @leoguillaume in #663
fix(search): remove old collection ID type by @leoguillaume in #662
fix(playground): router and provider pages sort and filters by @leoguillaume in #664
fix(playground): remove all roles and all organizations for user creation by @leoguillaume in #666
fix(audio): fix request_format for Albert integration by @leoguillaume in #665
feat(data): consolidate elasticsearch indices into a single index by @leoguillaume in #667
doc(adr): elasticsearch scaling by @leoguillaume in #668
feat(elastiscearch): add healthcheck to migration script and complete release note by @leoguillaume in #669
feat(documents): change default metadata by @leoguillaume in #685
remove document name from es index by @leoguillaume in #686
feat(search): fix rff_k division by 0 plus tests by @tibo-pdn in #687
feat(chunks): change chunk schema by @leoguillaume in #688

Contributors

leoguillaume and tibo-pdn

19 Jan 09:52

leoguillaume

0.3.7

759ccfd

What's Changed

minor improvment in doc by @leoguillaume in #557
Update API Reference and API Swagger links by @moscaale in #600
Documentation queuing by @blanch0t in #536
feat(api): remove web search references (brave, duckduckgo) by @tibo-pdn in #601
chore(deps): bump qs and express in /docs by @dependabot[bot] in #610
Update feature_request.md by @leoguillaume in #607
fix(docs): make quickstart work directly and match documentation by @natoromano in #605
Clean archi - model endpoint by @moscaale in #522
feat(github): add PR template with most of the useful sections by @tibo-pdn in #606
feat(api): remove carbon footprint prefix in provider parameters by @tibo-pdn in #603
Correct typo in OCR tutorial documentation by @cyrillay in #629
fix(ocr-beta): forward_request for ocr-beta by @moscaale in #613
feat(collections): add desc filter on collections creation date by @tibo-pdn in #631
feat(rerank): change signature of v1rerank endpoint to cohere standard by @tibo-pdn in #611
feat(models): add request content to basemodelprovider by @leoguillaume in #640
608 core fix error when redis key of rate limit as no ttl by @tibo-pdn in #654

New Contributors

@natoromano made their first contribution in #605

Full Changelog: 0.3.6...0.3.7

Contributors

natoromano, cyrillay, and 5 other contributors

20 Dec 16:46

leoguillaume

0.3.6

c837ec0

What's Changed

fix(playground): fix app title wrap when too long by @tibo-pdn in #584
fix(playground): user creation when budget is empty by @leoguillaume in #586
fix(albert): increase playground timeout by @leoguillaume in #587
fix(models): disable router and provider pagination for interne funct… by @leoguillaume in #588
fix: order by router name in limits when displaying roles by @tibo-pdn in #589
fix(qdrant): /chunks offset issue for Qdrant database by @tibo-pdn in #595
fix(playground): pagination state shared between classes and components by @tibo-pdn in #591
fix(playground): never expirred key in playground by @leoguillaume in #599

Full Changelog: 0.3.5...0.3.6

Contributors

leoguillaume and tibo-pdn

17 Dec 14:38

leoguillaume

0.3.5post2

dc74f1f

What's Changed

fix(models): disable router and provider pagination for interne funct… by @leoguillaume in #588

Full Changelog: 0.3.5post1...0.3.5post2

Contributors

leoguillaume

17 Dec 13:06

leoguillaume

0.3.5post1

ce93719

What's Changed

fix(playground): fix app title wrap when too long by @tibo-pdn in #584
fix(playground): user creation when budget is empty by @leoguillaume in #586
fix(albert): increase playground timeout by @leoguillaume in #587

Full Changelog: 0.3.5...0.3.5post1

Contributors

leoguillaume and tibo-pdn

17 Dec 10:26

tibo-pdn

0.3.5

0119d94

What's Changed

556 support mistral ocr api by @leoguillaume in #559
Update issue templates by @leoguillaume in #569
feat(audio): add support Mistral API for audio transcription by @leoguillaume in #578
feat(playground): add pagination on router and provider pages by @tibo-pdn in #575
fix(models): conflict with name and aliases for router creation by @leoguillaume in #579
feat(playground): create unified header on the app by @tibo-pdn in #580
feat(playground): sort filter of user page by @leoguillaume in #583
feat(models): add update providers endpoint by @leoguillaume in #581

Full Changelog: 0.3.4...0.3.5

Contributors

leoguillaume and tibo-pdn

11 Dec 15:26

leoguillaume

0.3.4

51a6b54

What's Changed

minor improvment on playground by @leoguillaume in #552
fix: adding check collections for all /collections endpoints by @FaheemBEG in #504
plaground-fix by @leoguillaume in #560
fix header playground by @leoguillaume in #562
fix urls by @leoguillaume in #563
Feat/add unique constraint on organization by @tibo-pdn in #561

New Contributors

@tibo-pdn made their first contribution in #561

Full Changelog: 0.3.3...0.3.4

Contributors

leoguillaume, tibo-pdn, and FaheemBEG

05 Dec 11:11

leoguillaume

0.3.3

9a30913

What's Changed

fix role page by @leoguillaume in #545
hotfix: tei format request by @leoguillaume in #548
hotfix: hidden models by @leoguillaume in #549
hotfix: update user playground by @leoguillaume in #550
feat: add search email bar by @leoguillaume in #551

Full Changelog: 0.3.2...0.3.3

Contributors

leoguillaume

04 Dec 17:39

leoguillaume

0.3.2post3

12a98f8

What's Changed

hotfix: hidden models by @leoguillaume in #549

Full Changelog: 0.3.2post2...0.3.2post3

Contributors

leoguillaume
