Support all vector databases

### Prerequisites

- [x] Search the [current open issues](https://github.com/googleapis/genai-toolbox/issues)

### What are you trying to do that currently feels hard or impossible?

We need to ensure that all vector-enabled databases support `embeddedBy:`. Currently, the blog mentions only PostgreSQL as supported.

### Suggested Solution(s)

Add tests for all vector-enabled databases and ensure the vector is formatted and inserted correctly.

## **Databases Supporting Vectors**

According to each source's documentation, the following sources support vector storage or vector search:

* **AlloyDB for PostgreSQL:** Includes built-in support via the `google_ml_integration` and `pgvector` extensions for high-performance vector search.
* **Cloud SQL for PostgreSQL:** Supports vector storage and search through the `pgvector` extension.
* **PostgreSQL:** Standard PostgreSQL support via the `pgvector` extension.
* **Spanner:** Supports vector search using the `COSINE_DISTANCE`, `EUCLIDEAN_DISTANCE`, and `DOT_PRODUCT` functions, plus vector indexes for approximate search.
* **BigQuery:** Supports vector search via the `VECTOR_SEARCH` function and vector indexes.
* **Firestore:** Supports vector search and K-nearest neighbor (KNN) queries on document fields.
* **MongoDB:** Supports vector search through MongoDB Atlas Vector Search.
* **Elasticsearch:** Natively supports dense and sparse vector field types (`dense_vector`, `sparse_vector`).
* **Neo4j:** Supports vector indexing and search via Cypher procedures.
* **SingleStore:** Built-in vector data types and functions for dot product and Euclidean distance.
* **Redis / Valkey:** Redis supports vector search through the RediSearch module (HNSW and FLAT indexes); Valkey offers comparable functionality through its separate valkey-search module.
* **ClickHouse:** Supports vector search using specialized distance functions and experimental ANN (Approximate Nearest Neighbor) indexes.
* **Cassandra:** Supports vector data types (since v5.0) and SAI (Storage-Attached Indexing).

---

## **Requirements for Enabling Vectors**

To use vectors with these sources within the GenAI Toolbox, the following database-specific requirements must be met:

#### **1\. Database-Specific Requirements**

* **PostgreSQL-based (AlloyDB, Cloud SQL, Self-hosted):**
  * **Extension:** You must enable the pgvector extension by running `CREATE EXTENSION IF NOT EXISTS vector;` in your database.
  * **Column Type:** Use the `VECTOR(dimensions)` data type for your embedding columns.
* **Spanner:**
  * **Schema:** Define columns with the `ARRAY<FLOAT64>` or `ARRAY<FLOAT32>` type.
  * **Index:** For approximate nearest-neighbor search, create a vector index with `CREATE VECTOR INDEX`; the indexed column must declare a fixed `vector_length`. Exact search with the distance functions works without an index but scans every row.
* **BigQuery:**
  * **Index:** A vector index (`CREATE VECTOR INDEX`) on the embedding column (typically an `ARRAY<FLOAT64>`) is optional but recommended for large tables; without one, `VECTOR_SEARCH` falls back to brute-force search.
  * **Inputs:** The `VECTOR_SEARCH` function requires a base table and a query table (or a query that yields a single embedding).
* **Firestore:**
  * **Index:** You must create a vector index on the field containing the vector before running nearest-neighbor queries.
* **MongoDB:**
  * **Deployment:** Requires MongoDB Atlas (v6.0.11+ or v7.0.2+).
  * **Index:** A "Vector Search Index" must be defined in the Atlas UI or via API.
 

## Formats:
### **1\. PostgreSQL / AlloyDB / Cloud SQL (Postgres)**

* **Format:** **String**  
* **Syntax:** `'[0.1, 0.2, 0.3]'`
* **Notes:** Because the pgvector extension defines a custom vector type, literal values in SQL queries must be wrapped in single quotes and brackets to be cast correctly.
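
As a rough illustration of the string format above, the sketch below renders a list of numbers as pgvector text input. The function name `to_pgvector_literal` and the `items` table in the comment are hypothetical; this is not the Toolbox's implementation.

```python
import math
from typing import Sequence


def to_pgvector_literal(values: Sequence[float]) -> str:
    """Render numbers as pgvector text input, e.g. '[0.1,0.2,0.3]'."""
    if not values:
        # pgvector rejects vectors with zero dimensions.
        raise ValueError("vector must have at least one element")
    floats = [float(v) for v in values]
    if not all(math.isfinite(v) for v in floats):
        # pgvector rejects NaN and infinite elements.
        raise ValueError("vector elements must be finite")
    # repr() yields the shortest string that round-trips each float.
    return "[" + ",".join(repr(v) for v in floats) + "]"


literal = to_pgvector_literal([0.1, 0.2, 0.3])
# The literal is bound as a text parameter and cast on the server, e.g.:
#   SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT 5
```

Binding the string as a parameter, rather than interpolating it into the SQL text, leaves the quoting to the driver.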

### **2\. Google Cloud Spanner**

* **Format:** **Array of Doubles/Floats**  
* **Syntax:** `[0.1, 0.2, 0.3]`
* **Notes:** Spanner uses the native `ARRAY<FLOAT64>` or `ARRAY<FLOAT32>` type. In SQL queries, these are passed as standard arrays without quotes.

### **3\. BigQuery**

* **Format:** **Array of Floats**  
* **Syntax:** `[0.1, 0.2, 0.3]`
* **Notes:** BigQuery expects an `ARRAY<FLOAT64>`. When using `VECTOR_SEARCH`, the query vector is typically passed as a literal array or as an array-typed parameter.

### **4\. Google Cloud Firestore**

* **Format:** **VectorValue Object (via SDK)**  
* **Syntax:** `VectorValue([0.1, 0.2, 0.3])`
* **Notes:** Firestore does not use SQL. When using the GenAI Toolbox/SDKs, vectors are passed as a native array/list, which the library wraps into a `VectorValue` object for the document request.

### **5\. MongoDB Atlas**

* **Format:** **BSON Array**  
* **Syntax:** `[0.1, 0.2, 0.3]`
* **Notes:** Within an aggregation pipeline (`$vectorSearch`), the `queryVector` field is a standard JSON-style array of numbers.
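
To make the pipeline shape concrete, here is a minimal sketch that builds a `$vectorSearch` stage as a plain dictionary. The helper name, the index name `embedding_index`, and the field path `embedding` are assumptions for illustration only.

```python
from typing import Any, Dict, Sequence


def vector_search_stage(
    index: str,
    path: str,
    query_vector: Sequence[float],
    limit: int = 10,
    num_candidates: int = 100,
) -> Dict[str, Any]:
    """Build a $vectorSearch aggregation stage (hypothetical helper)."""
    if num_candidates < limit:
        # Atlas requires numCandidates to be at least as large as limit.
        raise ValueError("num_candidates must be >= limit")
    return {
        "$vectorSearch": {
            "index": index,
            "path": path,
            # A plain list of floats; the driver encodes it as a BSON array.
            "queryVector": [float(v) for v in query_vector],
            "numCandidates": num_candidates,
            "limit": limit,
        }
    }


pipeline = [
    vector_search_stage("embedding_index", "embedding", [0.1, 0.2, 0.3], limit=5),
    # Expose the similarity score alongside each matched document.
    {"$project": {"score": {"$meta": "vectorSearchScore"}}},
]
```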

### **6\. Redis / Valkey**

* **Format:** **Binary (Blob)**  
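* **Syntax:** A byte-array representation of float32/float64 values.
* **Notes:** Redis expects query vectors, and vectors stored in hashes, as raw binary data whose element type matches the index definition (for example `FLOAT32`). A standard list such as `[0.1, 0.2, ...]` must therefore be converted to bytes before it is sent. Some client libraries provide helpers for this, and the Toolbox tests should confirm that the conversion is applied.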
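
A minimal sketch of the list-to-binary conversion, assuming the index is declared with 32-bit floats and little-endian byte order. The helper names are hypothetical, and a real integration should match whatever type the index actually declares.

```python
import struct
from typing import List, Sequence


def pack_float32(values: Sequence[float]) -> bytes:
    """Pack numbers as consecutive little-endian 32-bit floats."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_float32(blob: bytes) -> List[float]:
    """Inverse of pack_float32, useful when asserting on stored values."""
    count, remainder = divmod(len(blob), 4)
    if remainder:
        raise ValueError("blob length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{count}f", blob))


blob = pack_float32([0.5, -1.0, 2.0])
# Each element occupies 4 bytes, so three elements give a 12-byte blob.
```

Values that are not exactly representable in float32 (such as 0.1) come back slightly different after a round trip, so tests should compare with a tolerance.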

### **7\. Neo4j**

* **Format:** **List of Floats**  
* **Syntax:** `[0.1, 0.2, 0.3]`
* **Notes:** In Cypher queries, vectors are treated as standard Neo4j Lists. For example: `CALL db.index.vector.queryNodes('index_name', 10, [0.1, 0.2, 0.3])`.
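
The call above can also be parameterized so that the vector travels as a query parameter instead of being spliced into the Cypher text. The sketch below only builds the query string and parameter map; the helper name is hypothetical and no driver is involved.

```python
from typing import Any, Dict, Sequence, Tuple


def vector_query(
    index_name: str, k: int, embedding: Sequence[float]
) -> Tuple[str, Dict[str, Any]]:
    """Return a parameterized Cypher query and its parameters (hypothetical helper)."""
    query = (
        "CALL db.index.vector.queryNodes($index, $k, $embedding) "
        "YIELD node, score "
        "RETURN node, score"
    )
    params = {
        "index": index_name,
        "k": int(k),
        # Passed as a plain list; the driver maps it to a Neo4j List of floats.
        "embedding": [float(v) for v in embedding],
    }
    return query, params


query, params = vector_query("index_name", 10, [0.1, 0.2, 0.3])
```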

### **8\. SingleStore**

* **Format:** **JSON Array or Binary**  
* **Syntax:** `JSON_ARRAY_PACK('[0.1, 0.2, 0.3]')`
* **Notes:** SingleStore often uses `JSON_ARRAY_PACK` to convert a string-formatted array into a high-performance binary "blob" for vector operations.
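
One way to produce the string argument for `JSON_ARRAY_PACK` is to serialize the list as JSON and bind it as a parameter. This is a sketch under that assumption; the helper name and the `items` table are hypothetical.

```python
import json
from typing import Sequence


def json_array_pack_arg(values: Sequence[float]) -> str:
    """Serialize numbers as the JSON array string that JSON_ARRAY_PACK expects."""
    # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
    return json.dumps([float(v) for v in values], allow_nan=False)


arg = json_array_pack_arg([0.1, 0.2, 0.3])

# The string is bound as a parameter and packed on the server side.
sql = (
    "SELECT id, DOT_PRODUCT(embedding, JSON_ARRAY_PACK(%s)) AS score "
    "FROM items ORDER BY score DESC LIMIT 5"
)
```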

### **9\. Elasticsearch**

* **Format:** **JSON Array**  
* **Syntax:** `[0.1, 0.2, 0.3]`
* **Notes:** The `dense_vector` field type accepts a standard JSON array of numbers.
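
For reference, a query against a `dense_vector` field can carry the same JSON array in a `knn` clause of the search request. The sketch below builds only the request body; the helper name and the field name `embedding` are assumptions.

```python
from typing import Any, Dict, Sequence


def knn_search_body(
    field: str,
    query_vector: Sequence[float],
    k: int = 10,
    num_candidates: int = 100,
) -> Dict[str, Any]:
    """Build a search request body with a top-level knn clause (hypothetical helper)."""
    if num_candidates < k:
        # Elasticsearch rejects num_candidates smaller than k.
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": field,
            # Serialized as a plain JSON array of numbers.
            "query_vector": [float(v) for v in query_vector],
            "k": k,
            "num_candidates": num_candidates,
        }
    }


body = knn_search_body("embedding", [0.1, 0.2, 0.3], k=5)
```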


### Alternatives Considered

_No response_

### Additional Details

_No response_
