This Streamlit application finds groups of visually similar images in your collection using MediaPipe's Image Embedder and hierarchical clustering. Upload your images, and the application groups them automatically by visual similarity.
Key features:
- Support for up to 20 images (10MB each)
- Interactive similarity threshold adjustment
- Expandable group results view
- Powered by MediaPipe's MobileNet V3 model
Tech stack:
- Streamlit
- MediaPipe
- OpenCV
- Python
- SciPy (Hierarchical Clustering)
The application requires the MediaPipe MobileNet V3 model. Download it using:
powershell -Command "Invoke-WebRequest -Uri https://storage.googleapis.com/mediapipe-models/image_embedder/mobilenet_v3_small/float32/1/mobilenet_v3_small.tflite -OutFile models/mobilenet_v3_small.tflite"
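The PowerShell command above only works on Windows. On other platforms, a small Python script can fetch the same file. This is a minimal sketch using only the standard library; the `download_model` helper and its parameter names are hypothetical and not part of the repository.

```python
import urllib.request
from pathlib import Path

# Model URL taken from the PowerShell command above.
MODEL_URL = (
    "https://storage.googleapis.com/mediapipe-models/image_embedder/"
    "mobilenet_v3_small/float32/1/mobilenet_v3_small.tflite"
)


def download_model(dest="models/mobilenet_v3_small.tflite", url=MODEL_URL):
    """Download the model to dest unless it already exists; return the path.

    Hypothetical helper for illustration, not a function from the repository.
    """
    path = Path(dest)
    if path.exists():
        # Skip the network call when the file is already in place.
        return path
    path.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, path)
    return path


# Usage (performs a network request on first run):
# download_model()
```

Calling `download_model()` from the repository root should place the file at the same `models/` path the PowerShell command writes to.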
- Clone the repo
  git clone https://github.com/pedropcamellon/image-similarity-clustering-engine.git
  cd image-similarity-clustering-engine
- Install dependencies using uv
  uv sync
- Run the Streamlit app
  streamlit run src/app.py
- Open the application in your web browser
- Upload 2-20 images using the file uploader
- Adjust the similarity threshold if needed
- Click "Start Clustering"
- View the results in expandable group sections
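The upload constraints mentioned in the steps above (2 to 20 images, 10MB each) could be checked before clustering starts. The following is a sketch under stated assumptions: `validate_uploads` and the constant names are hypothetical, and it treats 10MB as 10 × 1024 × 1024 bytes, which may differ from how the application counts.

```python
MIN_IMAGES = 2
MAX_IMAGES = 20
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is read as 10 MiB


def validate_uploads(files):
    """Check (name, size_in_bytes) pairs against the upload limits.

    Returns a list of human-readable problems; an empty list means the
    batch is acceptable. Hypothetical helper, not taken from the app.
    """
    problems = []
    if len(files) < MIN_IMAGES:
        problems.append(f"need at least {MIN_IMAGES} images, got {len(files)}")
    if len(files) > MAX_IMAGES:
        problems.append(f"at most {MAX_IMAGES} images allowed, got {len(files)}")
    for name, size in files:
        if size > MAX_BYTES:
            problems.append(f"{name} exceeds the 10MB limit")
    return problems
```

In a Streamlit app, the pairs could be built from the uploaded file objects, which expose `name` and `size` attributes, and any returned problems shown to the user before the clustering step runs.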
- MediaPipe Image Embedding Pipeline
  - Uses MobileNet V3 for feature extraction
- Clustering Engine
  - Hierarchical clustering with customizable threshold
  - Cosine similarity measurements
  - Automatic group detection
- Caching strategy using @st.cache_data
- Efficient image processing with OpenCV
- Upload limits for optimal performance
  - Maximum 20 images
  - 10MB per image
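The clustering step described above (hierarchical clustering over cosine similarity, cut at a threshold) can be sketched with SciPy. This is an illustration rather than the application's actual code: the function name `cluster_embeddings`, the average linkage method, the default threshold of 0.8, and the conversion of the similarity threshold to a distance cutoff of `1 - threshold` are all assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_embeddings(embeddings, similarity_threshold=0.8):
    """Group embedding vectors by cosine similarity.

    Returns a list of groups, each a list of row indices. Hypothetical
    helper; zero-length vectors are not handled.
    """
    vectors = np.asarray(embeddings, dtype=float)
    if len(vectors) < 2:
        raise ValueError("need at least two embeddings to cluster")
    # Condensed pairwise cosine distances (1 - cosine similarity).
    distances = pdist(vectors, metric="cosine")
    # Average linkage is an assumption; the app may use another method.
    tree = linkage(distances, method="average")
    # Cut the tree at the distance matching the similarity threshold.
    labels = fcluster(tree, t=1.0 - similarity_threshold, criterion="distance")
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(int(label), []).append(index)
    return list(groups.values())
```

With this cutoff, raising the similarity threshold produces more and smaller groups, while lowering it merges images that are only loosely alike.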
This application is released under the GNU General Public License.


