Book Recommender System is a modular and extensible book recommendation engine, featuring a custom training pipeline, robust logging and exception handling, and an interactive Streamlit interface for real-time book recommendations. The project is fully deployed on AWS EC2, providing a scalable, production-ready environment for live usage and demonstrations. Designed with clean architecture and best practices, it supports both experimentation and real-world deployment.
This system recommends books based on collaborative filtering. It includes:
- A multi-stage pipeline for training the recommender.
- A Streamlit UI for training the model and getting recommendations.
- Modular design for easy extension and experimentation.
All components (data ingestion, transformation, training, and inference) are wrapped with logging and exception handling for production readiness.
- Book Recommendations using Nearest Neighbors on user ratings.
- Trainable Engine: Execute an end-to-end pipeline, from data ingestion to model training.
- Modular Design: Add or replace pipeline components independently.
- Robust Logging and Error Management with centralized log tracking.
- Interactive UI using Streamlit for recommendation queries and pipeline triggers.
- Model Persistence with pickle-based artifact storage and reusability.
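To make the Nearest Neighbors idea concrete, here is a minimal, dependency-free sketch. It is not the project's implementation: it swaps in a brute-force cosine-distance search in plain Python for whatever Nearest Neighbors library the project uses, and the `ratings` data, `cosine_distance`, and `recommend` are hypothetical names chosen for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def recommend(ratings, title, k=2):
    """Return the k titles whose rating vectors are closest to `title`."""
    target = ratings[title]
    distances = [
        (cosine_distance(target, vector), other)
        for other, vector in ratings.items()
        if other != title
    ]
    distances.sort()  # smallest distance first
    return [other for _, other in distances[:k]]


# Hypothetical toy data: one row per book, one column per user, 0 = unrated.
ratings = {
    "Book A": [5, 4, 0, 0],
    "Book B": [4, 5, 0, 1],
    "Book C": [0, 0, 5, 4],
    "Book D": [0, 1, 4, 5],
}

print(recommend(ratings, "Book A", k=2))
```

Books rated similarly by the same users end up close together, so "Book B" ranks first for "Book A" in this toy data.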
The training pipeline consists of the following steps:
- Data Ingestion: Downloads and ingests the dataset using a factory pattern. Only downloads if the file does not exist.
- Data Validation / Preprocessing: Validates and preprocesses the ingested data.
- Data Transformation: Transforms the validated data for model training.
- Model Training: Trains the recommendation model and saves the trained model artifact.
Each step is wrapped in exception handling and logs errors using the internal logging system.
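The staged flow above, including the skip-if-present download and the per-stage error handling, can be sketched roughly as follows. Everything here is an assumption made for illustration: `PipelineError` stands in for the project's own exception type, the stage functions are toy placeholders, and the "download" just writes a tiny CSV.

```python
import logging
import os
import tempfile

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("recommender_sketch")


class PipelineError(Exception):
    """Hypothetical stand-in for the project's unified exception type."""


def ingest(state):
    # Skip the download when the raw file is already on disk.
    path = state["raw_path"]
    if os.path.exists(path):
        logger.info("Raw file already present, skipping download: %s", path)
    else:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("user,book,rating\n1,Book A,5\n")  # placeholder for a real download
    return state


def validate(state):
    with open(state["raw_path"], encoding="utf-8") as handle:
        rows = [line.strip().split(",") for line in handle if line.strip()]
    state["rows"] = rows[1:]  # drop the header row
    return state


def transform(state):
    state["matrix"] = {(user, book): int(rating) for user, book, rating in state["rows"]}
    return state


def train(state):
    state["model"] = {"n_ratings": len(state["matrix"])}  # toy "model"
    return state


def run_pipeline(stages, state):
    """Run each stage in order; log and re-raise failures as PipelineError."""
    for stage in stages:
        try:
            logger.info("Starting stage: %s", stage.__name__)
            state = stage(state)
        except Exception as error:
            logger.exception("Stage failed: %s", stage.__name__)
            raise PipelineError(f"{stage.__name__} failed: {error}") from error
    return state


workdir = tempfile.mkdtemp()
result = run_pipeline(
    [ingest, validate, transform, train],
    {"raw_path": os.path.join(workdir, "ratings.csv")},
)
print(result["model"])
```

Keeping the stages as plain functions in a list means a stage can be replaced or reordered without touching the runner.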
1. Launch an EC2 Instance
- Log in to your AWS Console.
- Launch an Ubuntu-based EC2 instance.
- Open port 8501 in the instance's security group (for Streamlit access).
2. Connect to the EC2 Instance from inside the AWS console
3. Set Up Docker
# Update system and install dependencies
sudo apt-get update -y
sudo apt-get upgrade -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
4. Deploy the project
# Clone your project (replace with your repo)
git clone https://github.com/RohitKrish46/book-recommender-system.git
cd book-recommender-system
# Build and run the Docker image (advisable to tag it with your own Docker username: docker build -t {username}/bookapp:latest .)
docker build -t rokrr/bookapp:latest .
docker run -d -p 8501:8501 rokrr/bookapp
5. Access the App
Open your browser and navigate to:
http://<EC2_PUBLIC_IP>:8501
1. Stop/Remove Containers
docker stop <container_id>
docker rm $(docker ps -a -q)
2. Push/Pull Docker Image
docker login
docker push rokrr/bookapp:latest # Push to registry
docker pull rokrr/bookapp:latest # Pull latest image
1. Install uv (if not already installed) using one of the following methods:
# Using pipx
pipx install uv
# Using curl
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or with pip
python -m pip install uv
2. Create a virtual environment:
uv venv <virtual-env-name>
3. Activate the environment:
# Linux/macOS
source <virtual-env-name>/bin/activate
# Windows
.\<virtual-env-name>\Scripts\Activate
4. Clone this repo
git clone https://github.com/RohitKrish46/book-recommender-system.git
Then work from inside the cloned repository, with the uv-managed environment active.
5. Install Dependencies:
uv pip install -r requirements.txt
python main.py
This triggers the entire training pipeline:
- Ingests and validates data
- Builds pivot tables
- Trains Nearest Neighbors model
- Saves trained artifacts for inference
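As a rough illustration of the "builds pivot tables" step, the sketch below turns (user, title, rating) records into one rating vector per book. It uses only the standard library rather than whatever data library the project relies on, and `build_pivot` and the sample `triples` are hypothetical.

```python
from collections import defaultdict


def build_pivot(triples):
    """Pivot (user, title, rating) triples into {title: [rating per user]}."""
    users = sorted({user for user, _, _ in triples})
    column = {user: index for index, user in enumerate(users)}

    by_title = defaultdict(dict)
    for user, title, rating in triples:
        by_title[title][user] = rating

    pivot = {}
    for title, user_ratings in by_title.items():
        row = [0] * len(users)  # 0 marks "not rated"
        for user, rating in user_ratings.items():
            row[column[user]] = rating
        pivot[title] = row
    return users, pivot


# Hypothetical sample ratings.
triples = [
    ("u1", "Book A", 5),
    ("u2", "Book A", 4),
    ("u1", "Book B", 3),
    ("u3", "Book C", 2),
]
users, pivot = build_pivot(triples)
print(users)
print(pivot)
```

Each row of the resulting table is the vector a nearest-neighbor search would compare between books.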
streamlit run app.py
Visit http://localhost:8501 to interact with the UI.
Streamlit UI Features:
- Train Engine: Run training from the UI
- Get Recommendations: Type a book name to see similar recommendations
- Cover Images: Displays book covers alongside titles
- Home page: A page with an introduction to the app
- Train Recommender system: Click the button to freshly train the recommender system
- Get similar recommendations: Choose a book you like
- About this app
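A UI with these features could be wired up along the following lines. This is only a sketch and not the project's `app.py`: `suggest`, `render`, and the precomputed `catalog` mapping are hypothetical, and the training button merely prints a placeholder message.

```python
def suggest(catalog, title, limit=5):
    """Return up to `limit` precomputed neighbours for `title`, or [] if unknown."""
    return catalog.get(title, [])[:limit]


def render(catalog):
    # Imported lazily so the lookup logic above stays usable without Streamlit installed.
    import streamlit as st

    st.title("Book Recommender")

    if st.button("Train Recommender System"):
        st.write("Training would be triggered here.")  # placeholder, no real training

    selected = st.selectbox("Choose a book you like", sorted(catalog))
    if st.button("Show Recommendations"):
        for title in suggest(catalog, selected):
            st.write(title)


# Hypothetical precomputed neighbours, keyed by book title.
catalog = {
    "Book A": ["Book B", "Book D"],
    "Book B": ["Book A"],
}
print(suggest(catalog, "Book A", limit=1))
```

Saved as a script that ends with a call to `render(catalog)`, this could be launched with `streamlit run`.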
- Logging: Uses recommender.logger.log for consistent logging across modules.
- Exception Handling: All major operations are wrapped in try/except blocks and raise AppException for unified error management.
- Configuration: Uses an AppConfiguration object to manage paths for models and serialized objects.
- Artifacts: Trained models and serialized data (e.g., pivot tables, ratings) are loaded and saved using pickle.
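The exception-wrapping and pickle-based artifact handling described above might look roughly like this. The names are assumptions: `SketchAppException` is a hypothetical stand-in and does not reproduce the project's `AppException`, and `save_artifact` / `load_artifact` are illustrative helpers.

```python
import pickle
import sys
import tempfile
from pathlib import Path


class SketchAppException(Exception):
    """Hypothetical wrapper that records where the original error was raised."""

    def __init__(self, error):
        _, _, tb = sys.exc_info()
        location = "unknown location"
        if tb is not None:
            while tb.tb_next is not None:  # walk to the innermost frame
                tb = tb.tb_next
            location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        super().__init__(f"{type(error).__name__}: {error} (at {location})")


def save_artifact(obj, path):
    """Pickle `obj` to `path`, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(obj, handle)


def load_artifact(path):
    """Unpickle and return the object stored at `path`."""
    try:
        with Path(path).open("rb") as handle:
            return pickle.load(handle)
    except Exception as error:
        raise SketchAppException(error) from error


# Hypothetical artifact location inside a temporary directory.
artifact_path = Path(tempfile.mkdtemp()) / "serialized_objects" / "book_names.pkl"
save_artifact(["Book A", "Book B"], artifact_path)
print(load_artifact(artifact_path))
```

Because unpickling can execute arbitrary code, only artifacts produced by a trusted training run should be loaded this way.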
book-recommender-system/
├── app.py                          # Streamlit app interface
├── main.py                         # Training pipeline trigger
├── recommender/
│   ├── components/                 # All modular pipeline steps
│   │   ├── data_ingestion.py
│   │   ├── data_validation.py
│   │   ├── data_transformation.py
│   │   └── model_training.py
│   ├── constants/
│   │   └── __init__.py             # Constant configs
│   ├── entity/
│   │   └── config_entity.py        # Dataclass for configs
│   ├── exception/
│   │   └── exception.py            # AppException class
│   ├── logger/
│   │   └── log.py                  # AppLogger class
│   ├── pipelines/
│   │   └── training_pipeline.py    # Orchestrates all components
│   └── utils/
│       └── load_yaml.py            # AppConfiguration manager
├── artifacts/
│   ├── dataset/
│   │   ├── clean_data/             # Preprocessed data
│   │   ├── ingested_data/          # Extracted CSVs
│   │   ├── raw_data/               # Dataset's raw zip file
│   │   └── transformed_data/       # Pivot files
│   ├── serialized_objects/         # Pickle files
│   └── trained_model/              # Stores trained model
├── config/
│   └── config.yaml                 # Main configuration
├── templates/
│   └── book_names.pkl              # Book names for Streamlit
├── Dockerfile                      # Docker image config
├── requirements.txt
└── README.md
Programming Language
Machine Learning
Data Manipulation & Visualization
Deployment & Web Framework
Package Manager
Below are some enhancements planned for future versions:
- Hybrid Recommendation Models: Combine collaborative filtering with content-based filtering using book metadata (genres, authors, descriptions, etc.) for improved personalization.
- Incorporate NLP Models: Integrate transformer-based models like BERT to analyze book descriptions or user reviews for semantic recommendations.
- CI/CD Integration: Set up automated testing and deployment workflows using GitHub Actions or similar tools.
- Graph-based Recommendation: Explore knowledge graph embeddings or user-book interaction graphs to enhance recommendation diversity and explainability.