Developed a robust system for object recognition and reconstruction based on latent representations, with an exploration of noise injection effects.

Dherya27/Retrieval-based-Object-Recognition-and-Reconstruction-via-VAE-and-FAISS

Retrieval-based-Object-Recognition-and-Reconstruction-via-VAE-and-FAISS

Topics: Computer Vision, FAISS, Generative AI, CLIP, VAE, Image Reconstruction, Robotics Application, Cosine Similarity, Noise Injection

Project Overview

Retrieval-based methods and generative models have proven effective for image retrieval, object recognition, and image reconstruction. This project, "Retrieval-based Object Recognition and Reconstruction via VAE and FAISS," combines the two to build a system that recognizes and reconstructs objects from their latent representations.

The core of this project is the integration of Variational Autoencoders (VAEs) with Facebook AI Similarity Search (FAISS). A VAE is a generative model that learns to encode images into a continuous latent space that captures their essential features and structure. By decoding latent vectors, either unchanged or modified, a VAE can reconstruct existing images or generate new ones. FAISS is a library for efficient similarity search and clustering of high-dimensional vectors. It quickly finds the vectors most similar to a query within a large dataset, which makes it well suited to image retrieval.
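To make the retrieval idea concrete, here is a minimal sketch of top-k search by cosine similarity over latent vectors. It is a plain-Python stand-in, not FAISS itself and not this repository's code: it does by brute force what an exact similarity index computes far more efficiently. The function names and the 3-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_cosine(query, stored, k):
    """Return the k (index, similarity) pairs most similar to the query, best first."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(stored):
        v = normalize(vec)
        scored.append((idx, sum(a * b for a, b in zip(q, v))))
    # Highest cosine similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical latent vectors; a real VAE would typically use more dimensions.
latents = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
]

neighbors = top_k_cosine([1.0, 0.05, 0.0], latents, k=2)
for idx, similarity in neighbors:
    print(f"stored vector {idx}: cosine similarity {similarity:.4f}")
```

In a setup like the one described here, the indices returned by the search would select the stored latent vectors to pass to the VAE decoder.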

Table of Contents

  1. Introduction
  2. Architecture
  3. Installation
  4. Usage
  5. Results
  6. Limitation
  7. Future Work

Introduction

In this project, a VAE encodes each image into a latent vector, and the vectors are stored and indexed with FAISS. When a query image is provided, its latent representation is computed and FAISS retrieves the most similar latent vectors from the dataset. The VAE then decodes the retrieved vectors to reconstruct the corresponding images. The project also injects noise into the latent vectors at varying sigma values and decodes the results, which shows how reconstructions change with the level of noise and gives insight into the generative capabilities of the VAE.
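The noise-injection step can be sketched as follows. The source does not spell out the exact formulation, so this assumes that sigma scales additive Gaussian noise, z' = z + sigma * eps with eps drawn from a standard normal distribution. The function name, the latent vector, and the sigma values are hypothetical, and the decoder call is left out.

```python
import random


def inject_noise(latent, sigma, rng):
    """Return latent + sigma * eps, with eps ~ N(0, 1) drawn per dimension.

    Assumption: additive Gaussian noise; the project may use another scheme.
    """
    return [z + sigma * rng.gauss(0.0, 1.0) for z in latent]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = [0.5, -1.2, 0.3, 2.0]  # hypothetical latent vector of a retrieved image

# One perturbed copy of z per sigma; sigma = 0.0 leaves z unchanged.
variants = {sigma: inject_noise(z, sigma, rng) for sigma in (0.0, 0.1, 0.5, 1.0)}

for sigma, noisy in variants.items():
    shift = max(abs(a - b) for a, b in zip(noisy, z))
    print(f"sigma={sigma}: largest per-dimension shift {shift:.3f}")
```

Each perturbed vector would then be passed through the VAE decoder. Under this assumption, larger sigma values move the vector further from the original on average, so the reconstructions drift further from the query image.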

Potential applications include medical imaging, autonomous vehicles, robotics, and e-commerce.

Architecture

Installation and Usage

Results

Training Plot

Reconstruction Across Training Epochs

Query Image 1

Retrieval-based Reconstructed Images (k=5) from FAISS VectorDB

Reconstruction by Manipulating Latent Space with Noise Injection

Query Image 2

Retrieval-based Reconstructed Images (k=5) from FAISS VectorDB

Reconstruction by Manipulating Latent Space with Noise Injection
