[Feature Request] Support for multi-stage retriever and re-ranker in OpenSearch to use late interaction embedding models like ColBERT, ColPali, etc.

### Is your feature request related to a problem? Please describe

Late interaction models such as [ColBERT](https://arxiv.org/abs/2004.12832) and [ColPali](https://arxiv.org/pdf/2407.01449) produce multiple token-level vectors for a document or query, and multiple patch-level vectors for an image. In similarity search, these token/patch-level vectors should be scored with the MaxSim operation defined by the late interaction mechanism. See [this excellent blog post from Weaviate](https://weaviate.io/blog/late-interaction-overview) for an overview of late interaction models and MaxSim.
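
For reference, the MaxSim score can be written as follows. The symbols are chosen here for illustration: $q_i$ is the $i$-th query token vector, $d_j$ is the $j$-th document token (or image patch) vector, and the similarity is assumed to be a dot product, as in the ColBERT paper.

```latex
S(q, d) = \sum_{i=1}^{|q|} \max_{1 \le j \le |d|} \; q_i \cdot d_j
```

Each query token contributes the score of its best-matching document token, and these per-token maxima are summed.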

When used for search, these models provide better relevance because every query token interacts with every document token. This is not possible with ordinary single-vector search, where similarity is computed as Cosine(1 doc vector, 1 query vector). Late interaction models also make vector search interpretable: they show which token (for a document) or patch (for an image) caused the document to appear in the results, so those tokens/patches can be highlighted (semantic highlighting). For example:

<img width="927" alt="Image" src="https://github.com/user-attachments/assets/04c5262c-c40c-4a28-9695-37ba49819115" />

### Describe the solution you'd like

The workflow can be described in two ways:

Type 1. **Pooling** of query token vectors:

- During ingestion, the multiple vectors per document are stored using OpenSearch nested vector fields
- **Phase 1 (Retrieval)**: At query time, the query's token vectors are pooled into a single query vector (max/min pooling), which is used for ANN search against the document nested vector field.
- The unique documents fetched (de-duplication is already handled by the OpenSearch nested vector field) enter the ranking phase.
- **Phase 2 (Ranking)**: All token-level vectors of both the query and each candidate document are used to compute the query-document score with the MaxSim formula.
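
A minimal client-side sketch of the two pieces Type 1 needs, assuming dot-product similarity and max pooling. The function names (`max_pool`, `max_sim`, `rerank`) and the in-memory `candidates` mapping are hypothetical and only illustrate the requested behavior; they are not an existing OpenSearch API.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_pool(query_vectors: Sequence[Vector]) -> List[float]:
    """Phase 1 input: element-wise max over all query token vectors."""
    return [max(dims) for dims in zip(*query_vectors)]


def max_sim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """MaxSim: for each query token take its best document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def rerank(
    query_vectors: Sequence[Vector],
    candidates: Dict[str, Sequence[Vector]],
) -> List[Tuple[str, float]]:
    """Phase 2: score every phase 1 candidate with MaxSim, best first."""
    scored = [(doc_id, max_sim(query_vectors, vecs)) for doc_id, vecs in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: two query tokens, two candidate documents from phase 1.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "a": [[1.0, 0.0], [0.0, 1.0]],
    "b": [[0.5, 0.5]],
}
pooled_query = max_pool(query)       # single vector sent to ANN in phase 1
ranking = rerank(query, candidates)  # full token-level scoring in phase 2
```

In this sketch only the pooled vector is used for retrieval, while the ranking step goes back to the full set of query token vectors.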

Type 2. **No pooling** done on query token vectors:

- During ingestion, the multiple vectors per document are stored using OpenSearch nested vector fields
- **Phase 1 (Retrieval)**: At query time, each token vector of the query is used for a separate ANN search against the document nested vector field.
- The unique documents fetched (de-duplication is already handled by the OpenSearch nested vector field) enter the ranking phase. These documents are the union of all candidates returned by the ANN search for **every** query token vector.
- **Phase 2 (Ranking)**: All token-level vectors of both the query and each candidate document are used to compute the query-document score with the MaxSim formula.
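
The candidate-generation step of Type 2 could look roughly like the sketch below. It assumes dot-product similarity and uses a brute-force `top_k_docs` helper as a stand-in for a real ANN search on a nested vector field. Both helper names and the in-memory `index` are hypothetical.

```python
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_docs(query_vector: Vector, index: Dict[str, Sequence[Vector]], k: int) -> List[str]:
    """Brute-force stand-in for ANN on a nested field.

    Each document is scored by its best-matching token vector, so a document
    appears at most once in the result (mirrors nested-field de-duplication).
    """
    best = {
        doc_id: max(dot(query_vector, v) for v in vecs)
        for doc_id, vecs in index.items()
    }
    return sorted(best, key=lambda doc_id: best[doc_id], reverse=True)[:k]


def candidate_union(
    query_vectors: Sequence[Vector],
    index: Dict[str, Sequence[Vector]],
    k: int,
) -> Set[str]:
    """Phase 1 without pooling: one search per query token, union of the hits."""
    candidates: Set[str] = set()
    for q in query_vectors:
        candidates.update(top_k_docs(q, index, k))
    return candidates


# Toy index: document id -> token vectors.
index = {
    "a": [[1.0, 0.0]],
    "b": [[0.0, 1.0]],
    "c": [[0.1, 0.1]],
}
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = candidate_union(query, index, k=1)
```

The resulting set holds at most `k` documents per query token, so the size of the ranking phase grows with query length; the candidates would then be scored with MaxSim exactly as in Type 1.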

For comparison with a similar feature: Elasticsearch supports the Type 1 multi-phase retrieval through a new [rescorer retriever query type](https://www.elastic.co/guide/en/elasticsearch/reference/8.18/retriever.html#rescorer-retriever). MaxSim ranking is supported through the script_score functions maxSimDotProduct (MaxSim over the dot product of full-precision vectors) and maxSimInvHamming (MaxSim over the Hamming distance of bit vector dimensions), which operate on multiple vectors per document stored in a field type called [rank_vectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/rank-vectors.html#rank-vectors-scoring).




### Related component

Search

### Describe alternatives you've considered

- Client-side MaxSim scoring
- An OpenSearch hybrid query that runs ANN for up to 5 query token vectors against the document multi-vectors in the nested vector field

### Additional context

_No response_
