duongve13112002/ImageRetrieval


Image Retrieval Web Application

Overview

Welcome to the Image Retrieval Web Application repository! This project is a web-based image retrieval system. It uses the Gradio library for the frontend and, for feature extraction, a combination of Convolutional Neural Networks (CNNs) and Transformers from the Hugging Face Transformers library.

Features

  • Web Interface: The application provides a user-friendly web interface powered by Gradio, allowing users to easily interact with the image retrieval system.

  • Feature Extraction: The backend utilizes state-of-the-art CNNs and Transformers to extract meaningful feature vectors from images. This enables efficient and accurate image retrieval.

  • Model Variety: The project supports various CNN architectures and Transformers, allowing users to choose the best model for their specific needs.

Models

  • In this project, we have implemented 8 models for experimentation. They are as follows:
    • VisionTransformer
    • BEiT
    • MobileViTV2
    • Bit
    • EfficientFormer
    • MobileNetV2
    • ResNet
    • EfficientNet

Datasets

Paris Buildings

  • The "Paris Buildings" dataset comprises a collection of high-resolution images capturing various architectural structures across Paris. The dataset is carefully annotated, providing information on building types, architectural styles, and geographic locations. Researchers and developers can leverage this dataset for tasks such as image classification, object detection, and scene understanding.

  • This dataset is divided into two parts. You can download them here:

  • The groundtruth can be downloaded here.

Oxford Buildings

  • The "Oxford Buildings" dataset focuses on architectural diversity within the city of Oxford. Similar to the Paris dataset, it includes annotated images to facilitate research in computer vision and related fields. This dataset is particularly suitable for tasks involving cross-city analysis or comparative studies between different architectural environments.

  • The images in the Oxford Buildings dataset can be found here.

  • The groundtruth can be downloaded here.

Feature vector comparison algorithm

Euclidean Distance

The Euclidean distance measures the straight-line distance between two points in Euclidean space. For vectors $X$ and $Y$ of dimension $n$, the Euclidean distance is calculated as follows:

$$ ED(X, Y) = \sqrt{\sum_{i=1}^{n} (X_i - Y_i)^2} $$

This method is suitable when both the magnitude and the direction of the feature vectors matter for similarity assessment. A smaller distance indicates more similar images.
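As a minimal sketch of the formula above, here is one way to compute it in plain Python. The function name `euclidean_distance` is illustrative and is not taken from the repository's notebook.

```python
import math


def euclidean_distance(x, y):
    """Return the Euclidean (L2) distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    # Sum the squared per-component differences, then take the square root.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Identical vectors give a distance of 0, and the value grows as the vectors move apart.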

Cosine Similarity

Cosine similarity calculates the cosine of the angle between two vectors in multi-dimensional space. For vectors $X$ and $Y$ of dimension $n$, the cosine similarity is computed as:

$$\text{cosine similarity}(X, Y) = \frac{X \cdot Y}{\|X\|_2 \cdot \|Y\|_2}$$

This method is effective when the direction of vectors is more important than their magnitudes. It is widely used in text mining, document analysis, and recommendation systems.
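The formula above can be sketched in plain Python as follows. The function name `cosine_similarity` is an illustrative choice, and raising an error for zero vectors is an assumption of this sketch rather than behavior confirmed by the repository.

```python
import math


def cosine_similarity(x, y):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_x == 0 or norm_y == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal vectors)
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # approximately 1.0 (same direction)
```

Unlike the Euclidean distance, scaling either vector by a positive constant leaves the result unchanged, and a larger value indicates a closer match.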

Video Demo

For a visual representation of the project in action, check out the Video Demo.

Installation

  1. Clone the repository:

    git clone https://github.com/duongve13112002/ImageRetrieval.git
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the notebook (detailed instructions on how to run it are provided here):

    jupyter notebook notebook.ipynb

    Visit http://localhost:7860 in your web browser to access the Image Retrieval Web Application.

Usage

  1. Upload Image: Use the web interface to upload an image for retrieval.

  2. Retrieve Similar Images: The system will extract features using the selected model and display similar images from the dataset.
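The retrieval step can be pictured as ranking stored feature vectors by their distance to the query's feature vector. The sketch below is a toy illustration under stated assumptions: the function `retrieve_top_k`, the file names, and the 3-dimensional vectors are all hypothetical, and the real application extracts much larger feature vectors with the selected model.

```python
import math


def retrieve_top_k(query, gallery, k=3):
    """Return the k gallery entries nearest to the query, closest first.

    `gallery` maps an image name to its feature vector (hypothetical layout).
    """
    scored = [(name, math.dist(query, vec)) for name, vec in gallery.items()]
    # Smaller Euclidean distance means a more similar image.
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Toy feature vectors standing in for model outputs (made-up values).
gallery = {
    "paris_001.jpg": [0.9, 0.1, 0.0],
    "paris_002.jpg": [0.8, 0.2, 0.1],
    "oxford_001.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

for name, distance in retrieve_top_k(query, gallery, k=2):
    print(f"{name}: {distance:.3f}")
```

To rank by cosine similarity instead, sort in descending order so that the highest similarity comes first.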

Contributing

Contributions are welcome! If you find any issues or have ideas for improvements, please create a GitHub issue or submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Thank you for using and contributing to the Image Retrieval Web Application!
