| 1 | -# deep_recommender |
| 1 | +# Movie Recommendation System |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This project implements a movie recommendation system as a multi-stage pipeline: an ETL step, a model training step, and an API that serves recommendations. The whole system is containerized with Docker and includes load tests written with Locust. |
| 6 | + |
| 7 | +## Features |
| 8 | + |
| 9 | +- **ETL Pipeline**: Extracts and transforms movie data, then saves it to S3-compatible storage (MinIO). |
| 10 | +- **Model Training**: Loads data from S3, trains the recommendation model, and saves the trained model back to S3. |
| 11 | +- **API Service**: A FastAPI-based service that loads the model from S3 and serves movie recommendations. |
| 12 | +- **Load Testing**: Uses Locust to test the API's performance under load. |
| 13 | + |
| 14 | +## Project Structure |
| 15 | + |
| 16 | +- `Dockerfile`: Configuration for building the application container. |
| 17 | +- `docker-compose.yml`: Defines services for ETL, model training, API, and MinIO. |
| 18 | +- `locustfile.py`: Script for load testing the API. |
| 19 | +- `src/`: Source code for ETL, training, and API. |
| 20 | +- `data/`: Initial data files used by the ETL process. |
| 21 | + |
| 22 | +## Installation and Setup |
| 23 | + |
| 24 | +1. **Clone the Repository**: |
| 25 | + ```bash |
| 26 | + git clone https://github.com/olawale0254/Movie_recommendation.git |
| 27 | + cd Movie_recommendation |
| 28 | + ``` |
| 29 | + |
| 30 | +2. **Environment Variables**: Create a `.env` file in the root directory with your MinIO credentials: |
| 31 | + ```bash |
| 32 | + S3_ACCESS_KEY=<your-access-key> |
| 33 | + S3_SECRET_KEY=<your-secret-key> |
| 34 | + ``` |
| 35 | +3. **Build and Run the Application**: |
| 36 | + ```bash |
| 37 | + docker-compose up --build |
| 38 | + ``` |
| 39 | +4. **Access the Services**: |
| 40 | + - API: `http://localhost:8000` |
| 41 | + - MinIO Console: `http://localhost:9001` |
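
After `docker-compose up --build`, the containers can take a while to start listening. The sketch below polls the two URLs listed above until they answer. It is not part of the repository: the helper names `is_up`, `wait_for`, and `report` are hypothetical, and it assumes that any HTTP response (including an error status) means the service is up.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable

# URLs taken from the service list above.
SERVICES = {
    "API": "http://localhost:8000",
    "MinIO Console": "http://localhost:9001",
}


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the server at `url` answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A 4xx/5xx status still means the server is listening.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply: not ready yet.
        return False


def wait_for(url: str, attempts: int = 30, delay: float = 2.0,
             probe: Callable[[str], bool] = is_up) -> bool:
    """Poll `url` until `probe` succeeds or `attempts` run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


def report() -> None:
    """Print one readiness line per service (hypothetical helper)."""
    for name, url in SERVICES.items():
        status = "ready" if wait_for(url) else "not reachable"
        print(f"{name}: {status} ({url})")
```

Calling `report()` from a Python shell once the stack is starting prints one line per service. The `probe` parameter exists so the retry loop can be exercised without a running server.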
| 42 | +## Pipeline Overview |
| 43 | +- **ETL Service**: Loads and processes movie data, then stores it in MinIO (S3-compatible storage). |
| 44 | +- **Trainer Service**: Retrieves the processed data from MinIO, trains the model, and saves the model back to MinIO. |
| 45 | +- **API Service**: Loads the trained model from MinIO and provides movie recommendations via RESTful endpoints. |
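
The handoff between the three services can be illustrated with a toy version of the pipeline. This is a minimal sketch, not the project's code: a plain dictionary stands in for the MinIO bucket, the object keys are hypothetical, and the "model" is a simple popularity score rather than whatever model `src/` actually trains.

```python
import json
from collections import Counter

# Hypothetical stand-in for a MinIO bucket: object key -> bytes.
bucket: dict[str, bytes] = {}

# Hypothetical object keys; the real ones live in the project's source.
DATA_KEY = "processed/ratings.json"
MODEL_KEY = "models/popularity.json"


def run_etl(raw_rows: list[dict]) -> None:
    """ETL stage: drop incomplete rows and store the cleaned data."""
    clean = [
        row for row in raw_rows
        if row.get("movie") and row.get("rating") is not None
    ]
    bucket[DATA_KEY] = json.dumps(clean).encode("utf-8")


def run_trainer() -> None:
    """Trainer stage: read the cleaned data, fit a toy model, store it."""
    rows = json.loads(bucket[DATA_KEY])
    # Toy "model": summed rating per movie, used as a popularity score.
    scores = Counter()
    for row in rows:
        scores[row["movie"]] += row["rating"]
    bucket[MODEL_KEY] = json.dumps(scores).encode("utf-8")


def recommend(top_n: int = 2) -> list[str]:
    """API stage: load the stored model and return the top-scoring movies."""
    scores = Counter(json.loads(bucket[MODEL_KEY]))
    return [movie for movie, _ in scores.most_common(top_n)]
```

Each stage communicates only through the shared store, so the stages can run as separate containers and be rerun independently, which is the property the service layout above relies on.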
| 46 | + |
| 47 | +## Load Testing |
| 48 | +Use Locust to test the API's performance: |
| 49 | +```bash |
| 50 | +locust -f locustfile.py --host=http://localhost:8000 |
| 51 | +``` |