A production-ready text-to-image retrieval system designed for fashion product searches. The system enables semantic search across image databases using natural language queries or image inputs, powered by OpenAI's CLIP model and built on a scalable microservices architecture.
- Multimodal search supporting both text and image queries
- Semantic similarity matching using CLIP embeddings
- Scalable vector search with Milvus database
- Object storage with MinIO for image management
- Real-time monitoring and observability stack
- Containerized deployment with Docker Compose
- RESTful API and interactive web interface
The system consists of the following components:
- Model Serving: FastAPI service hosting the CLIP model for text and image encoding
- Application Service: Core API handling search requests and result retrieval
- Vector Database: Milvus for efficient similarity search on embeddings
- Object Storage: MinIO for scalable image storage
- Web Interface: Streamlit-based UI for interactive searches
- Observability Stack: Prometheus, Grafana, Elasticsearch, and Kibana for monitoring and logging
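At query time the pipeline reduces to two steps: the model-serving service encodes the query into a CLIP embedding, and Milvus ranks the stored image embeddings by cosine similarity. The ranking step can be sketched in plain Python (the toy 3-d vectors and item names below are hypothetical; real CLIP ViT-B/32 embeddings are 512-dimensional, and Milvus performs this search at scale with an HNSW index):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, gallery, k=5):
    """Rank (id, embedding) pairs by similarity to the query.

    Toy stand-in for the Milvus search call; items and vectors are
    hypothetical.
    """
    scored = [(item_id, cosine_similarity(query_emb, emb))
              for item_id, emb in gallery]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy 3-d "embeddings" for three fashion items
gallery = [("red_dress",  [0.9, 0.1, 0.0]),
           ("blue_jeans", [0.0, 0.8, 0.6]),
           ("red_shirt",  [0.8, 0.2, 0.1])]
query = [1.0, 0.1, 0.0]  # pretend this is the encoding of "red dress"
print(top_k(query, gallery, k=2))
```

Because the metric is cosine similarity, only the direction of the embedding matters, which is why the app and ETL services configure Milvus with `MILVUS_METRIC_TYPE=COSINE`.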
- Docker Engine 20.10+
- Docker Compose 2.0+
- 16GB RAM minimum (32GB recommended)
- GPU with CUDA support (for model serving)
- 50GB available disk space
All Python dependencies are managed through container images. Key packages include:
- FastAPI 0.112.2
- PyTorch 1.13.1
- Transformers 4.44.1
- Milvus 2.4.9
- MinIO
- Prometheus & Grafana
- Elasticsearch & Kibana
```bash
git clone <repository-url>
cd Text2Image-Retrieval
```

Navigate to the services directory and create the Docker network:

```bash
cd services
make create-network
```

This creates the `text_image_retrieval_network` network and updates the network subnet configuration in each service's `.env` file.
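The `NETWORK_SUBNET` values written into the `.env` files use CIDR notation. A quick way to sanity-check a value (the `172.20.0.0/16` below is hypothetical; substitute whatever `make create-network` assigned):

```python
import ipaddress

# Hypothetical subnet; use the value written to your service .env files.
subnet = ipaddress.ip_network("172.20.0.0/16")
print(subnet.with_prefixlen, subnet.num_addresses)
```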
Create the required storage volumes:

```bash
make setup-volumes
```

Then create and configure a `.env` file for each service with the following variables:
`services/milvus/.env`:

```
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=milvus-data
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/minio/.env`:

```
MINIO_ROOT_USER=<MINIO_USERNAME>
MINIO_ROOT_PASSWORD=<SYSTEM_PASSWORD>
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/model_serving/.env`:

```
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
NETWORK_SUBNET=<NETWORK_SUBNET>
MODEL_ID=openai/clip-vit-base-patch32
DEVICE=cuda
```

`services/app/.env`:

```
NO_PROXY_HOST=<SERVER_HOST>
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
MILVUS_HOST=<SERVER_HOST>
MILVUS_PORT=19530
MILVUS_COLLECTION=text_image_retrieval
MILVUS_METRIC_TYPE=COSINE
MILVUS_INDEX_TYPE=HNSW
MINIO_ENDPOINT=<SERVER_HOST>:9020
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=text-image-retrieval
MODEL_SERVING_HOST=<SERVER_HOST>
MODEL_SERVING_PORT=8000
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/streamlit/.env`:

```
NO_PROXY_HOST=<SERVER_HOST>
ENTRY_PROXY=<SERVER_PROXY_IF_EXIST>
NETWORK_SUBNET=<NETWORK_SUBNET>
```

`services/observability/.env`:

```
NETWORK_SUBNET=<NETWORK_SUBNET>
HTTPS_PROXY=<SERVER_PROXY_IF_EXIST>
DISCORD_WEBHOOK_URL=<DISCORD_WEBHOOK_URL>
ELASTIC_PASSWORD=<SYSTEM_PASSWORD>
KIBANA_ENCRYPTION_KEY=<GENERATED_KEY>
GF_SECURITY_ADMIN_USER=<GRAFANA_USERNAME>
GF_SECURITY_ADMIN_PASSWORD=<SYSTEM_PASSWORD>
```

(`KIBANA_ENCRYPTION_KEY` must be at least 32 characters long.)

`services/etl/.env`:

```
NO_PROXY_HOST=<SERVER_HOST>
MILVUS_HOST=<SERVER_HOST>
MILVUS_PORT=19530
MILVUS_COLLECTION=text_image_retrieval
MILVUS_METRIC_TYPE=COSINE
MILVUS_INDEX_TYPE=HNSW
MINIO_ENDPOINT=<SERVER_HOST>:9020
MINIO_ACCESS_KEY_ID=<MINIO_USERNAME>
MINIO_SECRET_ACCESS_KEY=<SYSTEM_PASSWORD>
MINIO_BUCKET_NAME=text-image-retrieval
MODEL_SERVING_HOST=<SERVER_HOST>
MODEL_SERVING_PORT=8000
```

Initialize storage, vector database, and model serving:
```bash
make setup
```

The system uses the GLAMI-1M test dataset by default. The next step downloads the dataset, generates embeddings, and loads them into Milvus:
```bash
make load-dataset
```

To use a custom dataset, modify the `--url` and `--folder_name` parameters in the `load-dataset` target of the Makefile, and adjust the ETL logic in `services/etl/utils.py` accordingly. Tune `--batch_size` to your GPU memory (maximum 1000).
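The batch-size ceiling above can be sketched as a simple chunking helper (a hypothetical illustration; the real logic lives in `services/etl/utils.py`):

```python
def batched(items, batch_size=1000):
    """Yield successive chunks of at most batch_size items.

    1000 mirrors the documented --batch_size ceiling; lower it if the GPU
    runs out of memory while embedding a chunk.
    """
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and 1000")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 2500 items split into chunks of at most 1000
sizes = [len(chunk) for chunk in batched(list(range(2500)))]
print(sizes)  # → [1000, 1000, 500]
```

Each chunk would then be sent to the model-serving service for embedding and inserted into Milvus in one call, which keeps GPU memory usage bounded.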
Start the application and observability stacks:

```bash
make up-server
make up-observability
```

Check that all containers are running:

```bash
docker ps
```

Once deployed, the following services are available:
| Service | URL | Description |
|---|---|---|
| Streamlit UI | http://<HOST>:8501 | Interactive search interface |
| API Documentation | http://<HOST>/docs | FastAPI Swagger UI |
| Model Serving API | http://<HOST>:8000/docs | CLIP model endpoints |
| MinIO Console | http://<HOST>:9021 | Object storage management |
| Milvus Attu | http://<HOST>:3000 | Vector database console |
| Prometheus | http://<HOST>:9090 | Metrics collection |
| Grafana | http://<HOST>:3030 | Metrics visualization |
| Kibana | http://<HOST>:5601/app/discover | Log analysis (username: elastic) |
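The search endpoint below takes the query as a URL parameter, so free-text queries must be percent-encoded (`red dress` becomes `red%20dress`). In Python, for example:

```python
from urllib.parse import quote

query = "red dress"
encoded = quote(query)
print(encoded)  # → red%20dress

# <HOST> is the same placeholder used throughout this README.
url = f"http://<HOST>/search-image?query={encoded}&top_k=5"
print(url)
```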
Text query:

```bash
curl -X POST "http://<HOST>/search-image?query=red%20dress&top_k=5"
```

Image query:

```bash
curl -X POST "http://<HOST>/search-image?top_k=5" \
  -F "file=@/path/to/image.jpg"
```

To stop all services:
```bash
make down
```

To stop only the application services:

```bash
make stop
```

For comprehensive setup instructions and system architecture details, refer to the documentation.
See LICENSE for details.
