tuiiitendinh/Text2Image-Retrieval
Text-to-Image Retrieval System

Overview

A production-ready text-to-image retrieval system designed for fashion product searches. The system enables semantic search across image databases using natural language queries or image inputs, powered by OpenAI's CLIP model and built on a scalable microservices architecture.

[System architecture diagram]

Features

  • Multimodal search supporting both text and image queries
  • Semantic similarity matching using CLIP embeddings
  • Scalable vector search with Milvus database
  • Object storage with MinIO for image management
  • Real-time monitoring and observability stack
  • Containerized deployment with Docker Compose
  • RESTful API and interactive web interface

Architecture

The system consists of the following components:

  • Model Serving: FastAPI service hosting the CLIP model for text and image encoding
  • Application Service: Core API handling search requests and result retrieval
  • Vector Database: Milvus for efficient similarity search on embeddings
  • Object Storage: MinIO for scalable image storage
  • Web Interface: Streamlit-based UI for interactive searches
  • Observability Stack: Prometheus, Grafana, Elasticsearch, and Kibana for monitoring and logging
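Similarity matching in this stack is cosine-based (see the MILVUS_METRIC_TYPE=COSINE setting below): a text query and an image are considered semantically close when their CLIP embeddings point in similar directions. A minimal NumPy sketch of that comparison, using toy vectors for illustration (Milvus performs this at scale over an HNSW index; real CLIP ViT-B/32 embeddings are 512-dimensional):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors: a.b / (|a||b|)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" standing in for CLIP outputs.
text_emb = np.array([0.1, 0.9, 0.0, 0.4])
image_emb = np.array([0.1, 0.8, 0.1, 0.5])

print(round(cosine_similarity(text_emb, image_emb), 3))
```

A score near 1.0 means the query and the image embed to nearly the same direction; retrieval returns the top_k images by this score.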

Requirements

System Requirements

  • Docker Engine 20.10+
  • Docker Compose 2.0+
  • 16GB RAM minimum (32GB recommended)
  • GPU with CUDA support (for model serving)
  • 50GB available disk space

Software Dependencies

All Python dependencies are managed through container images. Key packages include:

  • FastAPI 0.112.2
  • PyTorch 1.13.1
  • Transformers 4.44.1
  • Milvus 2.4.9
  • MinIO
  • Prometheus & Grafana
  • Elasticsearch & Kibana

Installation

1. Clone the Repository

git clone <repository-url>
cd Text2Image-Retrieval

2. Configure the Network

Navigate to the services directory and create the Docker network:

cd services
make create-network

This creates the text_image_retrieval_network Docker network and writes the matching NETWORK_SUBNET value into each service's .env file.

3. Setup Volume Directories

make setup-volumes

4. Configure Environment Variables

Create and configure .env files for each service with the following variables:

services/milvus/.env

MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=milvus-data
NETWORK_SUBNET=<NETWORK_SUBNET>

services/minio/.env

MINIO_ROOT_USER=<MINIO_USERNAME>
MINIO_ROOT_PASSWORD=<SYSTEM_PASSWORD>
NETWORK_SUBNET=<NETWORK_SUBNET>

services/model_serving/.env

ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
NETWORK_SUBNET=<NETWORK_SUBNET>
MODEL_ID=openai/clip-vit-base-patch32
DEVICE=cuda

services/app/.env

NO_PROXY_HOST=<SERVER_HOST>
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
MILVUS_HOST=<SERVER_HOST>
MILVUS_PORT=19530
MILVUS_COLLECTION=text_image_retrieval
MILVUS_METRIC_TYPE=COSINE
MILVUS_INDEX_TYPE=HNSW
MINIO_ENDPOINT=<SERVER_HOST>:9020
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=text-image-retrieval
MODEL_SERVING_HOST=<SERVER_HOST>
MODEL_SERVING_PORT=8000
NETWORK_SUBNET=<NETWORK_SUBNET>

services/streamlit/.env

NO_PROXY_HOST=<SERVER_HOST>
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
NETWORK_SUBNET=<NETWORK_SUBNET>

services/observability/.env

NETWORK_SUBNET=<NETWORK_SUBNET>
HTTPS_PROXY=<SERVER_PROXY_IF_EXIST>
DISCORD_WEBHOOK_URL=<DISCORD_WEBHOOK_URL>
ELASTIC_PASSWORD=<SYSTEM_PASSWORD>
KIBANA_ENCRYPTION_KEY=<GENERATED_KEY>
GF_SECURITY_ADMIN_USER=<GRAFANA_USERNAME>
GF_SECURITY_ADMIN_PASSWORD=<SYSTEM_PASSWORD>

services/etl/.env

NO_PROXY_HOST=<SERVER_HOST>
MILVUS_HOST=<SERVER_HOST>
MILVUS_PORT=19530
MILVUS_COLLECTION=text_image_retrieval
MILVUS_METRIC_TYPE=COSINE
MILVUS_INDEX_TYPE=HNSW
MINIO_ENDPOINT=<SERVER_HOST>:9020
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=text-image-retrieval
MODEL_SERVING_HOST=<SERVER_HOST>
MODEL_SERVING_PORT=8000
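The .env values above follow a common pattern: each service reads them from the environment at startup, falling back to defaults. A hedged sketch of how the Milvus settings might be assembled on the app side (variable names match the files above; the helper function itself is hypothetical, not the repo's actual loading code):

```python
import os

def load_milvus_config() -> dict:
    """Collect Milvus connection settings from the environment.

    Defaults mirror the sample .env values shown above.
    """
    return {
        "host": os.getenv("MILVUS_HOST", "localhost"),
        "port": int(os.getenv("MILVUS_PORT", "19530")),
        "collection": os.getenv("MILVUS_COLLECTION", "text_image_retrieval"),
        "metric_type": os.getenv("MILVUS_METRIC_TYPE", "COSINE"),
        "index_type": os.getenv("MILVUS_INDEX_TYPE", "HNSW"),
    }

print(load_milvus_config())
```

Keeping every setting in .env files (rather than hard-coded) is what lets the same images run across hosts: only the <SERVER_HOST> and <NETWORK_SUBNET> placeholders change.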

Running the System

1. Start Base Services

Initialize storage, vector database, and model serving:

make setup

2. Load Dataset

The system uses the GLAMI-1M test dataset by default. This step downloads the dataset, generates embeddings, and loads them into Milvus:

make load-dataset

To use a custom dataset, modify the --url and --folder_name parameters in the load-dataset target of the Makefile, and adjust the ETL logic in services/etl/utils.py accordingly. Adjust --batch_size based on your GPU memory (maximum 1000).
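The --batch_size cap matters because each batch of images is encoded in a single forward pass on the GPU. A minimal sketch of the chunking involved, assuming a plain list of image paths (illustrative only; the actual ETL logic lives in services/etl/utils.py):

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 1000  # upper bound noted above

def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive fixed-size chunks; the last chunk may be smaller."""
    batch_size = min(batch_size, MAX_BATCH_SIZE)  # enforce the cap
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

paths = [f"img_{i}.jpg" for i in range(2500)]
sizes = [len(b) for b in batched(paths, 1000)]
print(sizes)  # → [1000, 1000, 500]
```

If embedding runs out of GPU memory, lowering batch_size trades throughput for a smaller peak memory footprint.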

3. Start Application Services

make up-server

4. Start Observability Stack

make up-observability

5. Verify Deployment

Check that all containers are running:

docker ps

Accessing the System

Once deployed, the following services are available:

| Service           | URL                              | Description                     |
| ----------------- | -------------------------------- | ------------------------------- |
| Streamlit UI      | http://<HOST>:8501               | Interactive search interface    |
| API Documentation | http://<HOST>/docs               | FastAPI Swagger UI              |
| Model Serving API | http://<HOST>:8000/docs          | CLIP model endpoints            |
| MinIO Console     | http://<HOST>:9021               | Object storage management       |
| Milvus Attu       | http://<HOST>:3000               | Vector database console         |
| Prometheus        | http://<HOST>:9090               | Metrics collection              |
| Grafana           | http://<HOST>:3030               | Metrics visualization           |
| Kibana            | http://<HOST>:5601/app/discover  | Log analysis (username: elastic) |

API Usage

Search by Text

curl -X POST "http://<HOST>/search-image?query=red%20dress&top_k=5"

Search by Image

curl -X POST "http://<HOST>/search-image?top_k=5" \
  -F "file=@/path/to/image.jpg"
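The same requests can be issued from Python. A hedged sketch using only the standard library to build the query string seen in the curl examples (the /search-image path and the query and top_k parameters come from those examples; replace <HOST> with your server before sending):

```python
from urllib.parse import urlencode, quote

def search_url(host: str, query: str, top_k: int = 5) -> str:
    """Build the text-search URL used in the curl example above."""
    # quote_via=quote encodes spaces as %20, matching the curl example.
    params = urlencode({"query": query, "top_k": top_k}, quote_via=quote)
    return f"http://{host}/search-image?{params}"

print(search_url("<HOST>", "red dress"))
# → http://<HOST>/search-image?query=red%20dress&top_k=5
```

Actually sending the request (e.g. via requests.post) requires the application stack from the previous section to be up.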

Stopping the System

To stop all services:

make down

To stop only application services:

make stop

Documentation

For comprehensive setup instructions and system architecture details, refer to the documentation.

License

See LICENSE for details.
