Hi, We have the following use case: our RAG system queries documents from different projects (legal cases). Usually there are several hundred cases stored in one instance of our system. The users of our system have the option to define the embeddings model per project. The reason for this is that some projects are multi-lingual or that some projects require higher precision than others. The vectors can be of different dimensionality. I'm aware of the discussion #5511 and the suggestion to partition different vector dimensionality across different collections, and use payload based multi-tenancy on different models that share the same number of dimensions. To decide whether we bite the bullet and use a collection by vector model/dimensionality or whether we have a collection by project, I would like to get a better sense for the performance impact of having a large number of collections (several hundred). Does it make a difference that realistically only a small number of projects (maybe 10-20 max) are active at any given time? Many thanks for your views! |
Generally speaking, you have to partition it in some way. I'd definitely choose a collection per vector model/dimensionality and still use payload-based partitioning inside each collection. This way you create the lowest number of collections, which best matches our recommendations.

For context, in our cloud offering we limit instances to a maximum of 200 collections, and we do that for good reason. My guess is that partitioning by project would easily outgrow that limit.

If you can guarantee the number of projects will never be larger than 20 at any given time, then assigning a collection per project is a valid choice, and it might be a bit easier to implement. Still, the end goal should always be to use the smallest possible number of collections.
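To make the suggested layout concrete, here is a minimal sketch using the Python qdrant-client. The collection name, the `project_id` payload key, and the vector size are illustrative choices, not something prescribed in this thread:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# One collection per embedding model/dimensionality, not per project.
client.create_collection(
    collection_name="docs_multilingual_768",
    vectors_config=models.VectorParams(size=768, distance=models.Distance.COSINE),
)

# Index the tenant key so payload-based filtering stays fast at scale.
client.create_payload_index(
    collection_name="docs_multilingual_768",
    field_name="project_id",
    field_schema=models.PayloadSchemaType.KEYWORD,
)

# All projects that share this model live in the same collection,
# separated only by their payload.
client.upsert(
    collection_name="docs_multilingual_768",
    points=[
        models.PointStruct(
            id=1,
            vector=[0.0] * 768,  # embedding produced by the project's model
            payload={"project_id": "case-0042"},
        )
    ],
)

# Every query is scoped to a single project via a payload filter.
hits = client.search(
    collection_name="docs_multilingual_768",
    query_vector=[0.0] * 768,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="project_id",
                match=models.MatchValue(value="case-0042"),
            )
        ]
    ),
    limit=10,
)
```

Your application would then keep a small mapping from project to embedding model, and from model to collection, and route each query accordingly.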