Denoising Dirty Document Images

This repository contains the code and datasets for denoising dirty document images, along with a presentation (Denoiser_PPT.pdf) that gives an overview of the process. I built it as a project for a university course.

Project Overview

The dataset for this project comes from a Kaggle competition. Because of GitHub's file size restrictions, the three zip files contained in the main dataset archive are uploaded individually. The competition link can be found here.

The primary objective of this project is to build a convolutional autoencoder that removes noise from scanned document images, so that the documents are easier to read and interpret.
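
To make the idea concrete, here is a minimal sketch of a convolutional autoencoder trained to map a noisy image to its clean counterpart. It is not the architecture from DenoisingCAE.ipynb: the framework (PyTorch), the layer sizes, the MSE loss, and the names `DenoisingCAE` and `train_step` are all assumptions chosen for illustration, and the data is random stand-in data.

```python
import torch
from torch import nn


class DenoisingCAE(nn.Module):
    """Small convolutional autoencoder for single-channel (grayscale) images.

    Illustrative only: layer sizes are assumptions, not the notebook's values.
    Input height and width must be divisible by 4.
    """

    def __init__(self) -> None:
        super().__init__()
        # Encoder: two conv + max-pool stages, each halving height and width.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoder: two transposed convolutions, each doubling height and width.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=2, stride=2),
            nn.Sigmoid(),  # keeps outputs in [0, 1], like normalized pixels
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def train_step(model, optimizer, noisy, clean) -> float:
    """Run one gradient step that reconstructs `clean` from `noisy`."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(noisy), clean)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = DenoisingCAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Random stand-in batch; real batches would pair dirty and cleaned images.
    clean = torch.rand(4, 1, 64, 64)
    noisy = (clean + 0.1 * torch.randn_like(clean)).clamp(0.0, 1.0)
    print(f"loss after one step: {train_step(model, optimizer, noisy, clean):.4f}")
```

The key point is that the input is the dirty image and the target is the cleaned one, so the network learns to reconstruct the clean page rather than to copy its input.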

Repository Structure

/denoising-dirty-documents
│
├── train.zip               # Dirty images used for training
├── train_cleaned.zip       # Cleaned images used for training
├── test.zip                # Test images to make predictions on
├── DenoisingCAE.ipynb      # Jupyter notebook used for the process
├── Denoiser_PPT.pdf        # Presentation giving an overview of the process
└── README.md
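
As a rough sketch of how the two training archives could be lined up into (dirty, cleaned) pairs, the snippet below matches archive members by file name using only the standard library. It assumes the images are PNG files and that each dirty image has a cleaned image with the same file name; neither is stated above, and the helper names `image_names` and `pair_by_name` are hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def image_names(archive_path):
    """Map each PNG file name (e.g. '101.png') to its member path in the zip.

    Assumption: images are stored as .png files, possibly inside a folder.
    """
    with zipfile.ZipFile(archive_path) as archive:
        return {
            PurePosixPath(member).name: member
            for member in archive.namelist()
            if member.lower().endswith(".png")
        }


def pair_by_name(noisy, clean):
    """Return (noisy_member, clean_member) pairs for names found in both maps.

    Assumption: a dirty image and its cleaned version share a file name.
    """
    shared = sorted(noisy.keys() & clean.keys())
    return [(noisy[name], clean[name]) for name in shared]


# Example usage, run from the repository root:
#   pairs = pair_by_name(image_names("train.zip"), image_names("train_cleaned.zip"))
#   print(len(pairs), "training pairs")
```

Images that appear in only one archive are silently skipped, which makes a mismatch between the two archives easy to spot by comparing the pair count with the archive sizes.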

Acknowledgements

Special thanks to Kaggle for providing the dataset, and to the open-source community for the resources and libraries that made this project possible.

Repository: Ranjit2111/Denoising-Dirty-Image-Documents