Book Recommender System

Book Recommender System is a modular and extensible book recommendation engine, featuring a custom training pipeline, robust logging and exception handling, and an interactive Streamlit interface for real-time book recommendations. The project is fully deployed on AWS EC2, providing a scalable, production-ready environment for live usage and demonstrations. Designed with clean architecture and best practices, it supports both experimentation and real-world deployment.

Overview

This system recommends books based on collaborative filtering. It includes:

A multi-stage pipeline for training the recommender.
A Streamlit UI for training the model and getting recommendations.
Modular design for easy extension and experimentation.

All components (data ingestion, transformation, training, and inference) are wrapped with logging and exception handling for production readiness.

Features

Book Recommendations using Nearest Neighbors on user ratings.
Trainable Engine: Run an end-to-end pipeline, from data ingestion through model training.
Modular Design: Add or replace pipeline components independently.
Robust Logging and Error Management with centralized log tracking.
Interactive UI using Streamlit for recommendation queries and pipeline triggers.
Model Persistence with pickle-based artifact storage and reusability.

Architecture and Workflow

Book Recommendation Architecture

Model Training Pipeline Workflow

Pipeline Workflow

The training pipeline consists of the following steps:

Data Ingestion:
Downloads and ingests the dataset using a factory pattern. Only downloads if the file does not exist.
Data Validation / Preprocessing:
Validates and preprocesses the ingested data.
Data Transformation:
Transforms the validated data for model training.
Model Training:
Trains the recommendation model and saves the trained model artifact.
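The ingestion step can be sketched as follows. This is a minimal illustration, not the project's code: the names `FETCHERS`, `make_fetcher`, `fetch_inline`, `fetch_local`, and `ingest` are hypothetical, and a real fetcher would download the dataset archive over the network. It shows a small factory that picks a fetcher by kind, plus the "skip if the file already exists" check.

```python
import os
from typing import Callable, Dict


def fetch_inline(source: str) -> bytes:
    """Hypothetical fetcher: treats the source string itself as the payload."""
    return source.encode("utf-8")


def fetch_local(source: str) -> bytes:
    """Hypothetical fetcher: reads the payload from a local file path."""
    with open(source, "rb") as fh:
        return fh.read()


# Registry used by the factory; a real project might add an HTTP fetcher here.
FETCHERS: Dict[str, Callable[[str], bytes]] = {
    "inline": fetch_inline,
    "local": fetch_local,
}


def make_fetcher(kind: str) -> Callable[[str], bytes]:
    """Factory: return the fetcher registered for the given kind."""
    try:
        return FETCHERS[kind]
    except KeyError:
        raise ValueError(f"unknown ingestion source kind: {kind!r}") from None


def ingest(kind: str, source: str, target_path: str) -> bool:
    """Write the dataset to target_path. Return True only if it was fetched."""
    if os.path.exists(target_path):
        # File is already present, so the fetch is skipped.
        return False
    os.makedirs(os.path.dirname(target_path) or ".", exist_ok=True)
    data = make_fetcher(kind)(source)
    with open(target_path, "wb") as fh:
        fh.write(data)
    return True
```

Calling `ingest` twice with the same `target_path` fetches once and returns `False` the second time, which mirrors the "only downloads if the file does not exist" behavior described above.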

Each step is wrapped in exception handling and logs errors using the internal logging system.
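A rough sketch of that wrapping, using only the standard library, might look like the following. The `AppException` class here is a hypothetical stand-in (the project's own class in `recommender/exception/exception.py` may have a different constructor), and `run_pipeline` is an illustrative name rather than the project's orchestrator.

```python
import logging
from typing import Any, Callable, List, Tuple

logger = logging.getLogger("recommender_sketch")


class AppException(Exception):
    """Hypothetical stand-in for the project's AppException."""

    def __init__(self, stage: str, original: Exception):
        super().__init__(f"{stage} failed: {original}")
        self.stage = stage
        self.original = original


Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: List[Stage], initial: Any = None) -> Any:
    """Run stages in order, feeding each stage the previous stage's output."""
    result = initial
    for name, step in stages:
        logger.info("starting %s", name)
        try:
            result = step(result)
        except Exception as exc:
            # Log the traceback, then re-raise as one unified error type.
            logger.exception("%s raised an error", name)
            raise AppException(name, exc) from exc
        logger.info("finished %s", name)
    return result
```

Because every failure surfaces as a single exception type that carries the stage name, the caller (for example the Streamlit UI) needs only one `except` clause to report which step broke.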

AWS EC2 Deployment Guide

Deploying the Streamlit App on AWS EC2

1. Launch an EC2 Instance

Log in to your AWS Console.
Launch an Ubuntu-based EC2 instance.
Configure port 8501 to be open in the security group (for Streamlit access).

2. Connect to the EC2 Instance from inside the AWS console

3. Set Up Docker

# Update system and install dependencies
sudo apt-get update -y
sudo apt-get upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker

4. Deploy our project

# Clone your project (replace with your repo)
git clone https://github.com/RohitKrish46/book-recommender-system.git
cd book-recommender-system

# Build and run the Docker image (advisable to tag it with your own Docker user ID: docker build -t {username}/bookapp:latest .)
docker build -t rokrr/bookapp:latest .
docker run -d -p 8501:8501 rokrr/bookapp

5. Access the App

Open your browser and navigate to: http://<EC2_PUBLIC_IP>:8501

Additional commands (Optional)

1. Stop/Remove Containers

# Stop a running container
docker stop <container_id>
# Remove all stopped containers
docker rm $(docker ps -a -q)

2. Push/Pull Docker Image

docker login
docker push rokrr/bookapp:latest   # Push to registry
docker pull rokrr/bookapp:latest   # Pull latest image

Usage

Initial Setup Using UV (Recommended)

1. Install UV (if not already installed):

# Using pipx
pipx install uv

# Using curl
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or with pip
python -m pip install uv

2. Create a virtual environment:

uv venv <virtual-env-name>

3. Activate the environment:

# Linux/macOS
source <virtual-env-name>/bin/activate

# Windows
.\<virtual-env-name>\Scripts\Activate

4. Clone this repo

git clone https://github.com/RohitKrish46/book-recommender-system.git

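Then move the cloned project files into the directory that holds your uv-managed environment (or change into the cloned directory) so the commands below run from the project root.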

5. Install Dependencies:

uv pip install -r requirements.txt

Train the Recommendation Engine

python main.py

This triggers the entire training pipeline:

Ingests and validates data
Builds pivot tables
Trains Nearest Neighbors model
Saves trained artifacts for inference
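To make the pivot-table and nearest-neighbor steps concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the ratings are passed as `(user_id, book_title, rating)` tuples, and cosine similarity with a brute-force search stands in for whatever Nearest Neighbors configuration the project actually trains.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Rating = Tuple[str, str, float]  # (user_id, book_title, rating)


def build_pivot(ratings: List[Rating]) -> Dict[str, Dict[str, float]]:
    """Pivot ratings into {book: {user: rating}}; a missing user means no rating."""
    pivot: Dict[str, Dict[str, float]] = defaultdict(dict)
    for user, book, score in ratings:
        pivot[book][user] = score
    return dict(pivot)


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating rows keyed by user."""
    dot = sum(value * b[user] for user, value in a.items() if user in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_books(
    pivot: Dict[str, Dict[str, float]], title: str, k: int = 2
) -> List[str]:
    """Return the k books whose rating rows are most similar to the given title."""
    target = pivot[title]
    scored = [
        (cosine_similarity(target, row), other)
        for other, row in pivot.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]
```

Books rated similarly by the same users end up close together, so querying one title returns the titles that its readers also rated in a similar way. A brute-force search like this is fine for a toy dataset; a library implementation is the better choice at real scale.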

Run the Streamlit App

streamlit run app.py

Visit http://localhost:8501 to interact with the UI.

Streamlit UI Features:

Train Engine: Run training from the UI
Get Recommendations: Type a book name to see similar recommendations
Cover Images: Displays book covers alongside titles

Application Screenshots

Home page: A page with an introduction to the app
Train Recommender system: Click the button to freshly train the recommender system
Get similar recommendations: Choose a book you like
About this app

Internal Conventions

Logging:
Uses recommender.logger.log for consistent logging across modules.
Exception Handling:
All major operations are wrapped in try/except blocks and raise AppException for unified error management.
Configuration:
Uses an AppConfiguration object to manage paths for models and serialized objects.
Artifacts:
Trained models and serialized data (e.g., pivot tables, ratings) are loaded and saved using pickle.
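As an illustration of the artifact convention, the following sketch saves and reloads an object with `pickle`. The helper names `save_artifact` and `load_artifact` are hypothetical, and the real project resolves paths through its `AppConfiguration` object rather than taking them as plain strings.

```python
import os
import pickle
from typing import Any


def save_artifact(obj: Any, path: str) -> None:
    """Serialize obj to path, creating parent directories if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_artifact(path: str) -> Any:
    """Load a previously pickled object. Only unpickle files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Unpickling can execute arbitrary code, so artifacts should only be loaded from locations the application itself writes to.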

Folder Structure

book-recommender-system/
├── app.py                           # Streamlit app interface
├── main.py                          # Training pipeline trigger
├── recommender/
│   ├── components/                  # All modular pipeline steps
│   │   ├── data_ingestion.py
│   │   ├── data_validation.py
│   │   ├── data_transformation.py
│   │   └── model_training.py
│   ├── constants/
│   │   └── __init__.py              # constant configs
│   ├── entity/
│   │   └── config_entity.py         # Dataclass for configs
│   ├── exception/
│   │   └── exception.py             # AppException class
│   ├── logger/
│   │   └── log.py                   # AppLogger class
│   ├── pipelines/
│   │   └── training_pipeline.py     # Orchestrates all components
│   └── utils/
│       └── load_yaml.py             # AppConfiguration manager
├── artifacts/
│   ├── dataset/
│   │   ├── clean_data/              # Preprocessed data
│   │   ├── ingested_data/           # Extracted CSVs
│   │   ├── raw_data/                # Dataset's raw zip file
│   │   └── transformed_data/        # Pivot files
│   ├── serialized_objects/          # Pickle files
│   └── trained_model/               # Stores trained model
├── config/
│   └── config.yaml                  # Main configuration
├── templates/
│   └── book_names.pkl               # Book names for Streamlit
├── Dockerfile                       # Docker Image Config
├── requirements.txt
└── README.md

Technology Stack

Programming Language

Machine Learning

Data Manipulation & Visualization

Deployment & Web Framework

Package Manager

Future Improvements

Below are some enhancements planned for future versions:

Hybrid Recommendation Models: Combine collaborative filtering with content-based filtering using book metadata (genres, authors, descriptions, etc.) for improved personalization.
Incorporate NLP Models: Integrate transformer-based models like BERT to analyze book descriptions or user reviews for semantic recommendations.
CI/CD Integration: Set up automated testing and deployment workflows using GitHub Actions or similar tools.
Graph-based Recommendation: Explore knowledge graph embeddings or user-book interaction graphs to enhance recommendation diversity and explainability.
