Structured Search Pipeline

Query requirements in RAG apply not only to unstructured data that has been embedded and added to a vector database, but also to structured data sources, where semantic search doesn't really make sense.

Goal: Provide a pipeline interface that connects to a structured data source and generates database queries in real time from incoming user queries.

Implementation:
- Pseudo `Pipeline` without an embed or sink connector, just a data source.
- The data source connector is configured, and an initial pull from the database is done to examine the available fields and their types.
- `Search` generates a query using an LLM based on the fields available in the database.
- The `Pipeline` can be used as part of a `PipelineCollection` and supported by `smart_route` so that the model can decide when to use it.
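
To make the flow above concrete, here is a minimal sketch, assuming a SQLite data source. It does not use the real `Pipeline` or `Search` interfaces: `inspect_schema`, `build_prompt`, `structured_search`, and `fake_llm` are hypothetical names introduced for illustration, and `fake_llm` stands in for an actual LLM call.

```python
import sqlite3
from typing import Callable


def inspect_schema(conn: sqlite3.Connection, table: str) -> dict[str, str]:
    """Initial pull: map each column name to its declared type.

    `table` is assumed to come from trusted connector configuration.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in rows}


def build_prompt(table: str, schema: dict[str, str], question: str) -> str:
    """Describe the available fields so the LLM can write a query."""
    fields = ", ".join(f"{name} ({ctype})" for name, ctype in schema.items())
    return (
        f"Table `{table}` has fields: {fields}.\n"
        f"Write one SQL SELECT statement that answers: {question}"
    )


def structured_search(
    conn: sqlite3.Connection,
    table: str,
    question: str,
    generate_sql: Callable[[str], str],
) -> list[tuple]:
    """Generate a query from the schema, check it, then run it."""
    schema = inspect_schema(conn, table)
    sql = generate_sql(build_prompt(table, schema, question)).strip()
    # Very rough validation step: refuse anything that is not a SELECT.
    if not sql.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; returns a fixed query.
    return "SELECT name FROM orders WHERE amount > 100 ORDER BY name"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 250.0), ("bob", 40.0), ("carol", 120.0)],
)
results = structured_search(conn, "orders", "Who spent more than 100?", fake_llm)
```

Passing the generator in as a callable keeps the sketch independent of any particular LLM provider. A real validation step would likely need to be stricter than a prefix check.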

Alternative implementation:
- To reduce the latency of making 2-3 back-to-back LLM calls to generate a query and validate it, what if query generation were done pre-emptively and the results cached in a vector database?
- Using an LLM, we would try to predict the most likely sets of queries that one might run against the database, along with their permutations. (This might limit the complexity of the queries, but might cover roughly 80% of use cases.)
- At `search`, we would run a similarity search of the incoming query against the descriptions of the "cached" queries. We can then run the top-matching query against the database.
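
The cached variant could be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory `CACHED_QUERIES` list stands in for a vector database, and the descriptions and SQL in it are hypothetical examples of what an LLM might pre-generate.

```python
import math
import sqlite3
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical pre-generated (description, SQL) pairs, produced ahead of time.
CACHED_QUERIES = [
    (
        "total amount spent per customer",
        "SELECT name, SUM(amount) FROM orders GROUP BY name ORDER BY name",
    ),
    ("number of orders in the table", "SELECT COUNT(*) FROM orders"),
]


def search(conn: sqlite3.Connection, question: str) -> list[tuple]:
    """Pick the cached query whose description best matches, then run it."""
    q = embed(question)
    _description, sql = max(
        CACHED_QUERIES, key=lambda item: cosine(q, embed(item[0]))
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 250.0), ("bob", 40.0), ("alice", 10.0)],
)
count_rows = search(conn, "how many orders are in the table")
total_rows = search(conn, "total amount per customer")
```

No LLM call happens at search time here, which is the latency saving the alternative is after. The trade-off is that a question with no close cached description still gets the nearest match, so a similarity threshold with a fallback to live generation may be worth considering.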


Structured Search Pipeline #55
