Support image ingestion and semantic vectorization as a first-class database source #2948

@Deeven-Seru

What are you trying to do that currently feels hard or impossible?

I am building AI agent workflows that need to search, cluster, and operate over a large collection of images using semantic similarity. Currently, genai-toolbox supports semantic search and vector operations for text and tabular data, but it has no image ingestion or image-to-vector pipeline. As a result, it is difficult or impossible for AI agents to retrieve or analyze images based on their content (visual similarity, embedding queries, multimodal prompts, etc.) within a unified toolbox setup.

Suggested Solution(s)

Introduce a new database source and/or toolset for image ingestion and vectorization.

  • Allow users to add (or point to) a directory/bucket of images
  • Use SOTA embedding models or vision APIs (e.g., Hugging Face, CLIP, Google Vision) to extract vector embeddings
  • Store embeddings in supported vector DBs (BigQuery, Neo4j, Elasticsearch, Cloud SQL, etc.)
  • Provide query, retrieval, and filtering tools for image-based semantic search
  • Leverage existing patterns from text/vector tool implementations for seamless integration
  • Expose standard toolbox/agent APIs for multimodal queries (text + image)
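
For illustration only, the ingest, embed, store, and query flow listed above might look roughly like the Python sketch below. Everything in it is hypothetical: `toy_embed` is a stand-in for a real vision model such as CLIP, `ImageIndex` is an in-memory placeholder for a vector DB, and none of these names exist in genai-toolbox today.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def toy_embed(image_bytes: bytes, dims: int = 8) -> Vector:
    """Hypothetical stand-in for a vision model: an L2-normalized byte histogram."""
    hist = [0.0] * dims
    for b in image_bytes:
        hist[b * dims // 256] += 1.0  # bucket each byte value into one of `dims` bins
    norm = math.sqrt(sum(x * x for x in hist)) or 1.0
    return [x / norm for x in hist]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class ImageIndex:
    """Hypothetical in-memory placeholder for a vector DB table of image embeddings."""
    embed: Callable[[bytes], Vector] = toy_embed
    rows: Dict[str, Vector] = field(default_factory=dict)

    def ingest(self, image_id: str, image_bytes: bytes) -> None:
        # Embed the image once at ingestion time and store only the vector.
        self.rows[image_id] = self.embed(image_bytes)

    def search(self, query_bytes: bytes, top_k: int = 3) -> List[Tuple[str, float]]:
        # Embed the query image and rank stored images by cosine similarity.
        q = self.embed(query_bytes)
        scored = [(image_id, cosine(q, v)) for image_id, v in self.rows.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Usage with synthetic byte strings standing in for image files.
index = ImageIndex()
index.ingest("dark.png", bytes([5, 10, 20] * 50))
index.ingest("bright.png", bytes([240, 250, 255] * 50))
results = index.search(bytes([8, 12, 18] * 50), top_k=1)
```

In a real integration, the `embed` callable would presumably wrap a model or vision API, and `rows` would be replaced by one of the vector stores the toolbox already supports; the ranking logic would then be pushed down to that store's similarity query.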

Alternatives Considered

Workarounds involve building a parallel image search stack outside of genai-toolbox (e.g., using standalone FAISS, Milvus, or cloud vision vector DBs), which leads to fragmented agent workflows, duplicate effort, and poor integration with toolbox tools, query language, and agent APIs.

Additional Details

Related PRs and issues: #2415 (Support all vector databases), PR #2909 (Cloud SQL Postgres vector tools), PR #2890 (BigQuery semantic search).

This feature would unlock:

  • Visual similarity search for AI agents
  • Multimodal (text+image) analytics
  • A stronger competitive position versus other open-source agent/AI stacks
  • Unified dev experience for vision + text + data AI tasks

Happy to send a PR 😀

Metadata

Labels

  • priority: p3 (Desirable enhancement or fix. May not be included in next release.)
  • type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)
