
[Feature Request] Support for multi-stage retriever and re-ranker in OpenSearch to use late interaction embedding models like ColBERT, ColPali, etc. #18091

Open
@prasadnu

Description

Is your feature request related to a problem? Please describe

Late interaction models such as ColBERT and ColPali produce multiple token-level vectors for a document or query, and multiple patch-level vectors for an image. In similarity search, these token-level and patch-level vectors should be scored with the MaxSim operation, following the late interaction mechanism. Weaviate has an excellent blog post on late interaction models, MaxSim, etc.
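To make the MaxSim operation concrete, here is a minimal sketch in plain Python. It assumes dot-product similarity between token vectors; the function names `dot` and `max_sim` and the toy vectors are illustrative and are not part of any OpenSearch API.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """MaxSim: for each query token vector, take the best similarity against
    any document token vector, then sum those maxima over the query tokens."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Toy data (hypothetical): 2 query token vectors, 3 document token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

score = max_sim(query, doc)
print(score)  # 1.0 (first query token) + 0.5 (second query token) = 1.5
```

With a single vector per query and per document, this reduces to ordinary single-vector similarity; the sum of per-token maxima is what distinguishes late interaction scoring.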

When used for search, these models provide better relevance because every query token interacts with every document token. This is not possible with ordinary vector search, which uses a single vector each for the document and the query and calculates similarity as Cosine(1 doc vector, 1 query vector). Late interaction models also make vector search interpretable: you can see which token (for a document) or patch (for an image) causes the document to appear in the results, and you can highlight those tokens or patches in your results (semantic highlighting). Examples:

Image
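The interpretability described above can be sketched as follows: for each query token, record which document token (or image patch) gave the maximum similarity. This is an illustrative sketch assuming dot-product similarity; `best_matches` and the toy vectors are hypothetical names and data, not an existing API.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def best_matches(
    query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]
) -> List[Tuple[int, float]]:
    """For each query token, return (index of best document token, similarity).
    The indices identify the tokens or patches that could be highlighted."""
    matches = []
    for q in query_vectors:
        sims = [dot(q, d) for d in doc_vectors]
        best = max(range(len(sims)), key=sims.__getitem__)
        matches.append((best, sims[best]))
    return matches


# Toy data (hypothetical): 2 query token vectors, 3 document token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

matches = best_matches(query, doc)
print(matches)  # [(1, 1.0), (0, 0.5)]
```

Summing the similarity values in `matches` gives the MaxSim score, so highlighting information comes as a by-product of scoring.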

Describe the solution you'd like

The workflow can be implemented in two ways:

Type 1. Pooling of query token vectors:

  • During ingestion, the multiple vectors per document are stored using OpenSearch nested vector fields.
  • Phase 1 (Retrieval): At query time, the multiple token vectors of the query are converted into a single query vector (max/min pooling), which is used for an ANN search against the document nested vector field.
  • The unique documents fetched (de-duplication is already handled by the OpenSearch nested vector field) enter the ranking phase.
  • Phase 2 (Ranking): In the ranking phase, all the token-level vectors of both the document and the query are used to calculate the score for the whole document-query pair using the MaxSim formula.
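The Type 1 steps above can be sketched client-side as follows. This is only an illustration under stated assumptions: a brute-force scan stands in for the ANN search over the nested vector field, max pooling is chosen for the query, similarity is the dot product, and all function names and the toy corpus are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def max_pool(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise max over the query token vectors (one pooled vector)."""
    return [max(column) for column in zip(*vectors)]


def retrieve(pooled: Vector, corpus: Dict[str, List[Vector]], k: int) -> List[str]:
    """Phase 1 stand-in for ANN on a nested field: a document is scored by its
    best-matching token vector, so each document appears at most once."""
    scored = {doc_id: max(dot(pooled, d) for d in vecs) for doc_id, vecs in corpus.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


def rerank(
    query_vectors: Sequence[Vector], candidates: Sequence[str], corpus: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Phase 2: score each candidate with MaxSim over all token vectors."""
    scored = [(doc_id, max_sim(query_vectors, corpus[doc_id])) for doc_id in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus (hypothetical): document id -> token vectors.
corpus = {
    "a": [[1.0, 0.0], [0.0, 1.0]],
    "b": [[0.9, 0.9]],
    "c": [[-1.0, 0.0]],
}
query = [[1.0, 0.0], [0.0, 1.0]]

candidates = retrieve(max_pool(query), corpus, k=2)  # "b" ranks first here
ranked = rerank(query, candidates, corpus)           # MaxSim moves "a" to the top
print(candidates, ranked)
```

In this toy case the pooled query favors document "b" during retrieval, while MaxSim re-ranking promotes "a", whose individual token vectors match each query token exactly.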

Type 2. No pooling done on query token vectors:

  • During ingestion, the multiple vectors per document are stored using OpenSearch nested vector fields.
  • Phase 1 (Retrieval): At query time, each token vector of the query is used for a separate ANN search against the document nested vector field.
  • The unique documents fetched (de-duplication is already handled by the OpenSearch nested vector field) enter the ranking phase. These documents are the union of all candidates returned by the ANN searches for the individual query token vectors.
  • Phase 2 (Ranking): In the ranking phase, all the token-level vectors of both the document and the query are used to calculate the score for the whole document-query pair using the MaxSim formula.
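The Type 2 steps above can be sketched in the same way. Again this is a hedged illustration: a brute-force scan stands in for each per-token ANN search, similarity is the dot product, and the function names and toy corpus are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def retrieve_per_token(
    query_vectors: Sequence[Vector], corpus: Dict[str, List[Vector]], k: int
) -> Set[str]:
    """Phase 1: run one top-k search per query token vector (brute force here,
    ANN in practice) and return the union of the candidate document ids."""
    candidates: Set[str] = set()
    for q in query_vectors:
        scored = {doc_id: max(dot(q, d) for d in vecs) for doc_id, vecs in corpus.items()}
        candidates.update(sorted(scored, key=scored.get, reverse=True)[:k])
    return candidates


def rerank(
    query_vectors: Sequence[Vector], candidates: Set[str], corpus: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Phase 2: score each candidate with MaxSim over all token vectors."""
    # Sort ids first so that ties are broken deterministically.
    scored = [(doc_id, max_sim(query_vectors, corpus[doc_id])) for doc_id in sorted(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus (hypothetical): document id -> token vectors.
corpus = {
    "a": [[1.0, 0.0]],
    "b": [[0.0, 1.0]],
    "c": [[0.6, 0.6]],
}
query = [[1.0, 0.0], [0.0, 1.0]]

candidates = retrieve_per_token(query, corpus, k=2)  # union over both query tokens
ranked = rerank(query, candidates, corpus)
print(candidates, ranked)
```

Compared with Type 1, this variant issues one search per query token, so the candidate set grows with query length, but no information is lost to pooling before retrieval.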

For comparison with a similar feature, Elasticsearch supports the Type 1 multi-phase retrieval through a new rescorer retriever query type. MaxSim ranking is supported through the script_score functions maxSimDotProduct (MaxSim over the dot product of full-precision vectors) and maxSimInvHamming (MaxSim over the Hamming distance of bit vectors), which operate on multiple vectors per document stored in a field type called rank_vectors.

Related component

Search

Describe alternatives you've considered

Client-side MaxSim scoring
OpenSearch hybrid query, which supports ANN search for up to 5 query token vectors against the document multi-vectors in a nested vector field
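As a rough sketch of the hybrid-query alternative, the snippet below builds a request body with one nested k-NN clause per query token vector, truncated to the 5-clause limit mentioned above. The nested path `token_vectors` and field `token_vectors.embedding` are hypothetical mapping names, and the helper `build_hybrid_query` is illustrative. Running such a query would also require a search pipeline configured for hybrid queries, which is not shown here.

```python
from typing import Any, Dict, List, Sequence

MAX_HYBRID_SUBQUERIES = 5  # limit referred to in the alternative above


def build_hybrid_query(
    query_vectors: Sequence[Sequence[float]],
    path: str = "token_vectors",             # hypothetical nested field path
    field: str = "token_vectors.embedding",  # hypothetical knn_vector field
    k: int = 10,
) -> Dict[str, Any]:
    """Build a hybrid query body with one nested k-NN clause per query token.
    Query token vectors beyond the sub-query limit are dropped, which is the
    main drawback of this workaround."""
    clauses: List[Dict[str, Any]] = [
        {
            "nested": {
                "path": path,
                "query": {"knn": {field: {"vector": list(vector), "k": k}}},
            }
        }
        for vector in query_vectors[:MAX_HYBRID_SUBQUERIES]
    ]
    return {"query": {"hybrid": {"queries": clauses}}}


# Toy query with 7 token vectors (hypothetical): only the first 5 are used.
query_vectors = [[0.1 * i, 0.2 * i] for i in range(1, 8)]
body = build_hybrid_query(query_vectors)
print(len(body["query"]["hybrid"]["queries"]))  # 5
```

The returned documents would still need client-side MaxSim scoring, since the hybrid query combines the per-token scores with its own normalization rather than with MaxSim.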

Additional context

No response
