Migrate QueryExpander, MultiQueryTextRetriever and MultiQueryEmbeddingRetriever from haystack-experimental to haystack #10086

@sjrl

Description

The Solution Engineering team has requested a set of components that can create multiple queries and run retrieval on all of them to improve retrieval results.

We already have these components in haystack-experimental: QueryExpander, MultiQueryEmbeddingRetriever and MultiQueryTextRetriever. An example workflow is:

QueryExpander --> MultiQueryTextRetriever --> Retrieved Documents

Additionally, here is a code example:

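from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever

# QueryExpander and MultiQueryEmbeddingRetriever currently come from haystack-experimental.
# document_store_with_embeddings is assumed to be a document store that already
# holds documents with embeddings.

# Generate 3 alternative phrasings and keep the original query as well
expander = QueryExpander(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"), n_expansions=3, include_original_query=True
)
in_memory_retriever = InMemoryEmbeddingRetriever(document_store=document_store_with_embeddings)
query_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
# Wrap the embedder and retriever so that every expanded query is embedded and retrieved
multiquery_retriever = MultiQueryEmbeddingRetriever(
    retriever=in_memory_retriever,
    query_embedder=query_embedder,
    max_workers=3
)

pipeline = Pipeline()
pipeline.add_component("query_expander", expander)
pipeline.add_component("multiquery_retriever", multiquery_retriever)
# Feed the expanded queries into the multi-query retriever
pipeline.connect("query_expander.queries", "multiquery_retriever.queries")

data = {
    "query_expander": {"query": "green energy sources"},
    "multiquery_retriever": {"retriever_kwargs": {"top_k": 3}}
}
results = pipeline.run(data=data, include_outputs_from={"query_expander", "multiquery_retriever"})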
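To make the idea behind the multi-query retrievers concrete, here is a minimal, library-free sketch of the pattern: run one retrieval per query in parallel, then merge the results. Everything in it (`Doc`, `multi_query_retrieve`, `toy_retrieve`, the toy corpus, and the "keep the best score per document id" merge rule) is a hypothetical stand-in chosen for illustration; the actual components may merge and rank results differently.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    id: str
    content: str
    score: float


def multi_query_retrieve(
    queries: List[str],
    retrieve: Callable[[str], List[Doc]],
    max_workers: int = 3,
) -> List[Doc]:
    """Run `retrieve` once per query in parallel, then deduplicate by id."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_query_results = list(pool.map(retrieve, queries))

    # Assumed merge rule: keep the highest-scoring copy of each document.
    best_by_id: Dict[str, Doc] = {}
    for docs in per_query_results:
        for doc in docs:
            current = best_by_id.get(doc.id)
            if current is None or doc.score > current.score:
                best_by_id[doc.id] = doc
    return sorted(best_by_id.values(), key=lambda d: d.score, reverse=True)


# Toy corpus and retriever that scores documents by word overlap with the query.
CORPUS = {
    "d1": "solar power is a renewable energy source",
    "d2": "wind turbines generate green electricity",
    "d3": "coal plants emit carbon dioxide",
}


def toy_retrieve(query: str) -> List[Doc]:
    terms = set(query.lower().split())
    hits = []
    for doc_id, text in CORPUS.items():
        overlap = len(terms & set(text.split()))
        if overlap:
            hits.append(Doc(id=doc_id, content=text, score=float(overlap)))
    return hits


# Stand-ins for the original query plus its expansions.
queries = ["green energy sources", "renewable power", "wind electricity"]
merged = multi_query_retrieve(queries, toy_retrieve)
```

In this toy run the expanded queries raise the scores of the two relevant documents above what the original query alone gives them, which is the effect the expansion step is meant to produce.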

Steps for Moving an Experiment to Haystack Core or Integrations

  • Make sure the latest Haystack release or an integration release contains the merged experiment
  • Update import statements in example cookbook, remove experimental tag from cookbook, etc.
  • Close discussion in haystack-experimental with move information
  • Remove pydocs
  • Move experiment from active experiments in the catalog in haystack-experimental README.md to adopted experiments
  • Remove example notebook from haystack-experimental if it exists
  • Remove dependencies needed for the experiment from pyproject.toml
  • Make sure to add a docs page for the new component(s)
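For the import-update step in the checklist above, a small helper like the following could list the places that still import from `haystack_experimental`. This is only a sketch: the function name `find_experimental_imports` is hypothetical, it only scans `.py` files by default (notebooks store code as JSON and would need separate handling), and it reports matches rather than rewriting them because the target import paths depend on where the components end up.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches lines such as "from haystack_experimental... import X"
# or "import haystack_experimental...".
EXPERIMENTAL_IMPORT = re.compile(r"^\s*(?:from|import)\s+haystack_experimental\b")


def find_experimental_imports(
    root: Path, suffixes: Tuple[str, ...] = (".py",)
) -> List[Tuple[Path, int, str]]:
    """Return (file, line number, line) for each haystack_experimental import under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if EXPERIMENTAL_IMPORT.match(line):
                hits.append((path, lineno, line.strip()))
    return hits
```

Calling `find_experimental_imports(Path("."))` from a repository root returns the remaining imports to review by hand.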

Metadata

Labels

P1: High priority, add to the next sprint
