Provisions a Vertex AI Vector Search (Matching Engine) index with an endpoint and deployed index, optionally with a GCS bucket for durable embedding storage.
Designed for RAG (Retrieval-Augmented Generation) workloads where teams need to store and retrieve vector embeddings for semantic search.
For the Go client library that connects to the infrastructure this module provisions, see the RAG library documentation.
- Creates a Vertex AI Vector Search index with configurable dimensions, distance measure, and Tree-AH algorithm parameters
- Deploys the index to a public endpoint with configurable machine type and replica count
- Optionally creates a GCS bucket for durable embedding storage (dual-write for re-embedding on model upgrades)
- IAM bindings for authorized service accounts (`roles/aiplatform.user` + GCS access)
- Consistent labeling with team, product, and custom labels

    module "build_failures_index" {
      source = "github.com/chainguard-dev/terraform-infra-reconcilers//modules/vertex-ai-vector-search"

      name       = "build-failures"
      project    = "my-gcp-project"
      region     = "us-central1"
      team       = "eng-sus-tools"
      dimensions = 3072 # gemini-embedding-001

      authorized_service_accounts = [
        google_service_account.rag_mcp.email,
      ]
    }

    # Use outputs in your MCP server CORPORA_CONFIG:
    #   module.build_failures_index.index_id
    #   module.build_failures_index.deployed_index_id
    #   module.build_failures_index.public_endpoint_domain_name

Each embedding model produces vectors with a specific dimensionality (e.g.,
3072 for gemini-embedding-001, 768 for text-embedding-005). Since a
Matching Engine index is configured with fixed dimensions, each corpus that
uses a different embedding model requires its own index.
Call this module once per corpus:

    module "build_failures" {
      source = "github.com/chainguard-dev/terraform-infra-reconcilers//modules/vertex-ai-vector-search"

      name       = "build-failures"
      dimensions = 3072 # gemini-embedding-001
      # ...
    }

    module "advisories" {
      source = "github.com/chainguard-dev/terraform-infra-reconcilers//modules/vertex-ai-vector-search"

      name       = "advisories"
      dimensions = 768 # text-embedding-005
      # ...
    }

If multiple corpora share the same model and dimensions, you can use a single index with Vertex AI restrict tokens to partition data by corpus at the application level.
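
For the shared-index case, the Terraform side could look like the sketch below: a single module call sized for the common model, with both applications authorized against it. The module name `shared_768`, the index name, and the two service account resources are hypothetical. The restrict tokens themselves are set by the application when it writes and queries datapoints, so nothing about them appears in the module inputs.

```hcl
# Sketch only: one index shared by several corpora that all use a
# 768-dimension embedding model. Names and service accounts are illustrative.
module "shared_768" {
  source = "github.com/chainguard-dev/terraform-infra-reconcilers//modules/vertex-ai-vector-search"

  name       = "shared-text-embedding-005"
  project    = "my-gcp-project"
  region     = "us-central1"
  team       = "eng-sus-tools"
  dimensions = 768 # text-embedding-005

  # Every application that reads or writes any corpus in the shared index
  # needs access (hypothetical service accounts).
  authorized_service_accounts = [
    google_service_account.advisories_app.email,
    google_service_account.docs_app.email,
  ]
}
```

The trade-off is that all corpora in a shared index also share its machine type, replica counts, and Tree-AH settings, so a corpus with very different load or recall needs may still be better served by its own module call.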
No requirements.
| Name | Version |
|---|---|
| google | n/a |
No modules.
| Name | Type |
|---|---|
| google_project_iam_member.aiplatform_user | resource |
| google_storage_bucket.embeddings | resource |
| google_storage_bucket_iam_member.gcs_writer | resource |
| google_vertex_ai_index.this | resource |
| google_vertex_ai_index_endpoint.this | resource |
| google_vertex_ai_index_endpoint_deployed_index.this | resource |
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| approximate_neighbors_count | Default number of approximate neighbors to return during search. | `number` | `150` | no |
| authorized_service_accounts | List of Google service account emails to grant roles/aiplatform.user and GCS access. | `list(string)` | `[]` | no |
| create_gcs_bucket | Create a GCS bucket for durable embedding storage. Set to false to bring your own bucket. | `bool` | `true` | no |
| deletion_protection | When true, prevents the GCS bucket from being destroyed with objects in it. | `bool` | `true` | no |
| description | Human-readable description for the index. | `string` | `""` | no |
| dimensions | Number of dimensions for embedding vectors. Must match the embedding model (e.g. 3072 for gemini-embedding-001, 768 for text-embedding-005). | `number` | n/a | yes |
| distance_measure_type | Distance measure for vector similarity. One of: COSINE_DISTANCE, SQUARED_L2_DISTANCE, L1_DISTANCE, DOT_PRODUCT_DISTANCE. | `string` | `"COSINE_DISTANCE"` | no |
| feature_norm_type | Feature normalization type. Use UNIT_L2_NORM with COSINE_DISTANCE for best results. | `string` | `"UNIT_L2_NORM"` | no |
| gcs_bucket_name | Name for the GCS bucket. Defaults to '{project}-{name}-embeddings' when create_gcs_bucket is true. When create_gcs_bucket is false, the caller is responsible for managing IAM on their own bucket. | `string` | `""` | no |
| gcs_lifecycle_age_days | Number of days before objects in the embeddings bucket are deleted. Set to 0 to disable lifecycle rules. | `number` | `0` | no |
| labels | Additional labels to apply to resources. | `map(string)` | `{}` | no |
| leaf_node_embedding_count | Number of embeddings per leaf node in the Tree-AH index. More embeddings per leaf = smaller index but slower search. | `number` | `1000` | no |
| leaf_nodes_to_search_percent | Percentage of leaf nodes to search (1-100). Higher = better recall, slower search. | `number` | `10` | no |
| machine_type | Machine type for serving the deployed index. | `string` | `"e2-standard-16"` | no |
| max_replica_count | Maximum number of replicas for the deployed index. | `number` | `1` | no |
| min_replica_count | Minimum number of replicas for the deployed index. | `number` | `1` | no |
| name | Base name for all resources (index, endpoint, bucket). Lowercase letters, numbers, and hyphens. | `string` | n/a | yes |
| product | Product label to apply to resources. | `string` | `"unknown"` | no |
| project | GCP project ID. | `string` | n/a | yes |
| region | GCP region for the index and endpoint. | `string` | n/a | yes |
| team | Team label to apply to resources (replaces deprecated 'squad'). | `string` | n/a | yes |
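
As a rough illustration of how the optional inputs above combine, the following sketch overrides a few defaults for a corpus that favors recall over latency and expires stored embeddings. Every value is an illustrative assumption, not a recommendation; only the input names come from the table.

```hcl
# Sketch only: overriding selected defaults. All values are illustrative.
module "advisories_tuned" {
  source = "github.com/chainguard-dev/terraform-infra-reconcilers//modules/vertex-ai-vector-search"

  name       = "advisories"
  project    = "my-gcp-project"
  region     = "us-central1"
  team       = "eng-sus-tools"
  product    = "rag"
  dimensions = 768 # text-embedding-005

  # Search more leaf nodes per query: better recall, slower search.
  leaf_nodes_to_search_percent = 20
  approximate_neighbors_count  = 200

  # Allow the deployed index to scale beyond a single replica.
  min_replica_count = 1
  max_replica_count = 3

  # Delete objects in the embeddings bucket after 90 days.
  gcs_lifecycle_age_days = 90

  labels = {
    environment = "staging"
  }
}
```
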
| Name | Description |
|---|---|
| deployed_index_id | ID of the deployed index within the endpoint. |
| gcs_bucket_name | Name of the GCS bucket for embedding storage. Empty if create_gcs_bucket is false. |
| index_endpoint_id | Fully-qualified resource name of the index endpoint. |
| index_id | Fully-qualified resource name of the Vertex AI index. |
| public_endpoint_domain_name | Public domain name for gRPC queries to the deployed index. |
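
One possible way to hand these outputs to an application is to collect them into a single JSON value, sketched below for the `build_failures_index` module from the usage example. The key names and the overall JSON shape are assumptions made for illustration; the actual `CORPORA_CONFIG` format is defined by the RAG library, not by this module.

```hcl
# Sketch only: gathers the module outputs into one JSON string.
# The key names below are hypothetical, not a documented schema.
locals {
  build_failures_corpus = {
    index_id          = module.build_failures_index.index_id
    index_endpoint_id = module.build_failures_index.index_endpoint_id
    deployed_index_id = module.build_failures_index.deployed_index_id
    endpoint_domain   = module.build_failures_index.public_endpoint_domain_name
    gcs_bucket        = module.build_failures_index.gcs_bucket_name
  }
}

output "corpora_config_json" {
  description = "Illustrative JSON blob for an application's corpus configuration."
  value = jsonencode({
    "build-failures" = local.build_failures_corpus
  })
}
```

Note that `gcs_bucket_name` is empty when `create_gcs_bucket` is false, so a consumer of this value should treat an empty bucket field as "bucket managed elsewhere".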