A production-ready text-to-image retrieval system designed for fashion product searches. The system enables semantic search across image databases using natural language queries or image inputs, powered by OpenAI's CLIP model and built on a scalable microservices architecture.
- Multimodal search supporting both text and image queries
- Semantic similarity matching using CLIP embeddings
- Scalable vector search with Milvus database
- Object storage with MinIO for image management
- Real-time monitoring and observability stack
- Containerized deployment with Docker Compose
- RESTful API and interactive web interface
The system consists of the following components:
- Model Serving: FastAPI service hosting the CLIP model for text and image encoding
- Application Service: Core API handling search requests and result retrieval
- Vector Database: Milvus for efficient similarity search on embeddings
- Object Storage: MinIO for scalable image storage
- Web Interface: Streamlit-based UI for interactive searches
- Observability Stack: Prometheus, Grafana, Elasticsearch, and Kibana for monitoring and logging
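At query time the pipeline reduces to two steps: the model-serving service encodes the query into a CLIP embedding, and Milvus ranks the stored image embeddings by cosine similarity. The ranking step can be sketched in plain Python (the toy 3-d vectors and item names below are hypothetical; real CLIP ViT-B/32 embeddings are 512-dimensional, and Milvus performs this search at scale with an HNSW index):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, gallery, k=5):
    """Rank (id, embedding) pairs by similarity to the query.

    Toy stand-in for the Milvus search call; items and vectors are
    hypothetical.
    """
    scored = [(item_id, cosine_similarity(query_emb, emb))
              for item_id, emb in gallery]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy 3-d "embeddings" for three fashion items
gallery = [("red_dress",  [0.9, 0.1, 0.0]),
           ("blue_jeans", [0.0, 0.8, 0.6]),
           ("red_shirt",  [0.8, 0.2, 0.1])]
query = [1.0, 0.1, 0.0]  # pretend this is the encoding of "red dress"
print(top_k(query, gallery, k=2))
```

Because the metric is cosine similarity, only the direction of the embedding matters, which is why the app and ETL services configure Milvus with `MILVUS_METRIC_TYPE=COSINE`.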
- Docker Engine 20.10+
- Docker Compose 2.0+
- 16GB RAM minimum (32GB recommended)
- GPU with CUDA support (for model serving)
- 50GB available disk space
All Python dependencies are managed through container images. Key packages include:
- FastAPI 0.112.2
- PyTorch 1.13.1
- Transformers 4.44.1
- Milvus 2.4.9
- MinIO
- Prometheus & Grafana
- Elasticsearch & Kibana
```bash
git clone <repository-url>
cd Text2Image-Retrieval
```

Navigate to the services directory and create the Docker network:

```bash
cd services
make create-network
```

This creates the `text_image_retrieval_network` network and updates the network subnet configuration in each service's `.env` file.
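The `NETWORK_SUBNET` values written into the `.env` files use CIDR notation. A quick way to sanity-check a value (the `172.20.0.0/16` below is hypothetical; substitute whatever `make create-network` assigned):

```python
import ipaddress

# Hypothetical subnet; use the value written to your service .env files.
subnet = ipaddress.ip_network("172.20.0.0/16")
print(subnet.with_prefixlen, subnet.num_addresses)
```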
Create the required storage volumes:

```bash
make setup-volumes
```

Then create and configure a `.env` file for each service with the following variables:
`services/milvus/.env`:

```
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=milvus-data
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/minio/.env`:

```
MINIO_ROOT_USER=<MINIO_USERNAME>
MINIO_ROOT_PASSWORD=<SYSTEM_PASSWORD>
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/model_serving/.env`:

```
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
NETWORK_SUBNET=<NETWORK_SUBNET>
MODEL_ID=openai/clip-vit-base-patch32
DEVICE=cuda
```

`services/app/.env`:

```
NO_PROXY_HOST=<SERVER_HOST>
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
MILVUS_HOST=<SERVER_HOST>
MILVUS_PORT=19530
MILVUS_COLLECTION=text_image_retrieval
MILVUS_METRIC_TYPE=COSINE
MILVUS_INDEX_TYPE=HNSW
MINIO_ENDPOINT=<SERVER_HOST>:9020
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=text-image-retrieval
MODEL_SERVING_HOST=<SERVER_HOST>
MODEL_SERVING_PORT=8000
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/streamlit/.env`:

```
NO_PROXY_HOST=<SERVER_HOST>
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/observability/.env`:

```
NETWORK_SUBNET=<NETWORK_SUBNET>
HTTPS_PROXY=<SERVER_PROXY_IF_EXIST>
DISCORD_WEBHOOK_URL=<DISCORD_WEBHOOK_URL>
ELASTIC_PASSWORD=<SYSTEM_PASSWORD>
KIBANA_ENCRYPTION_KEY=<GENERATED_KEY>
GF_SECURITY_ADMIN_USER=<GRAFANA_USERNAME>
GF_SECURITY_ADMIN_PASSWORD=<SYSTEM_PASSWORD>
```

(`KIBANA_ENCRYPTION_KEY` must be at least 32 characters long.)

`services/etl/.env`:

```
NO_PROXY_HOST=<SERVER_HOST>
MILVUS_HOST=<SERVER_HOST>
MILVUS_PORT=19530
MILVUS_COLLECTION=text_image_retrieval
MILVUS_METRIC_TYPE=COSINE
MILVUS_INDEX_TYPE=HNSW
MINIO_ENDPOINT=<SERVER_HOST>:9020
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=text-image-retrieval
MODEL_SERVING_HOST=<SERVER_HOST>
MODEL_SERVING_PORT=8000
```

Initialize storage, vector database, and model serving:
```bash
make setup
```

The system uses the GLAMI-1M test dataset by default. The next step downloads the dataset, generates embeddings, and loads them into Milvus:
```bash
make load-dataset
```

To use a custom dataset, modify the `--url` and `--folder_name` parameters in the `load-dataset` target of the Makefile, and adjust the ETL logic in `services/etl/utils.py` accordingly. Tune `--batch_size` to your GPU memory (maximum 1000).
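The batch-size ceiling above can be sketched as a simple chunking helper (a hypothetical illustration; the real logic lives in `services/etl/utils.py`):

```python
def batched(items, batch_size=1000):
    """Yield successive chunks of at most batch_size items.

    1000 mirrors the documented --batch_size ceiling; lower it if the GPU
    runs out of memory while embedding a chunk.
    """
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and 1000")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 2500 items split into chunks of at most 1000
sizes = [len(chunk) for chunk in batched(list(range(2500)))]
print(sizes)  # → [1000, 1000, 500]
```

Each chunk would then be sent to the model-serving service for embedding and inserted into Milvus in one call, which keeps GPU memory usage bounded.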
Start the application and observability stacks:

```bash
make up-server
make up-observability
```

Check that all containers are running:

```bash
docker ps
```

Once deployed, the following services are available:
| Service | URL | Description |
|---|---|---|
| Streamlit UI | http://<HOST>:8501 | Interactive search interface |
| API Documentation | http://<HOST>/docs | FastAPI Swagger UI |
| Model Serving API | http://<HOST>:8000/docs | CLIP model endpoints |
| MinIO Console | http://<HOST>:9021 | Object storage management |
| Milvus Attu | http://<HOST>:3000 | Vector database console |
| Prometheus | http://<HOST>:9090 | Metrics collection |
| Grafana | http://<HOST>:3030 | Metrics visualization |
| Kibana | http://<HOST>:5601/app/discover | Log analysis (username: elastic) |
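The search endpoint below takes the query as a URL parameter, so free-text queries must be percent-encoded (`red dress` becomes `red%20dress`). In Python, for example:

```python
from urllib.parse import quote

query = "red dress"
encoded = quote(query)
print(encoded)  # → red%20dress

# <HOST> is the same placeholder used throughout this README.
url = f"http://<HOST>/search-image?query={encoded}&top_k=5"
print(url)
```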
Text query:

```bash
curl -X POST "http://<HOST>/search-image?query=red%20dress&top_k=5"
```

Image query:

```bash
curl -X POST "http://<HOST>/search-image?top_k=5" \
  -F "file=@/path/to/image.jpg"
```

To stop all services:
```bash
make down
```

To stop only the application services:

```bash
make stop
```

For comprehensive setup instructions and system architecture details, refer to the documentation.
See LICENSE for details.
