Metta's Movie API is a Python-based project designed to extract movie data from an external API (TMDB), process and filter it, and then load it into a PostgreSQL database. It provides an API endpoint to access movie data and genres. The project utilizes Flask for the API server, PostgreSQL for database management, and Docker for containerization.
- Python
- Flask
- PostgreSQL
- Docker
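
The "process and filter" stage described in the overview could look roughly like the sketch below. It is an illustration only, not the project's actual `dataTransformation.py`: the field names (`id`, `title`, `genre_ids`, `vote_average`) assume TMDB-style records, the sample data is made up, and `build_genre_lookup` and `filter_movies` are hypothetical helper names.

```python
# Made-up sample of a TMDB-style genre list (assumed shape, not real API output).
GENRES = [{"id": 28, "name": "Action"}, {"id": 18, "name": "Drama"}]

# Made-up sample of TMDB-style movie records (assumed field names).
RAW_MOVIES = [
    {"id": 1, "title": "Alpha", "genre_ids": [28, 18], "vote_average": 7.4},
    {"id": 2, "title": "", "genre_ids": [18], "vote_average": 6.1},  # empty title
    {"id": 1, "title": "Alpha", "genre_ids": [28, 18], "vote_average": 7.4},  # duplicate id
    {"id": 3, "title": "Gamma", "genre_ids": [99], "vote_average": 5.0},  # unknown genre id
]


def build_genre_lookup(genres):
    """Map genre id to genre name (hypothetical helper)."""
    return {genre["id"]: genre["name"] for genre in genres}


def filter_movies(raw_movies, genre_lookup):
    """Drop records without an id or title, skip duplicate ids,
    and replace genre ids with genre names (hypothetical helper)."""
    seen_ids = set()
    rows = []
    for movie in raw_movies:
        movie_id = movie.get("id")
        title = (movie.get("title") or "").strip()
        if movie_id is None or not title or movie_id in seen_ids:
            continue
        seen_ids.add(movie_id)
        genre_names = [
            genre_lookup[genre_id]
            for genre_id in movie.get("genre_ids", [])
            if genre_id in genre_lookup
        ]
        rows.append(
            {
                "id": movie_id,
                "title": title,
                "genres": genre_names,
                "rating": movie.get("vote_average"),
            }
        )
    return rows


clean_rows = filter_movies(RAW_MOVIES, build_genre_lookup(GENRES))
```

Rows shaped like `clean_rows` are flat enough to insert into a PostgreSQL table in the load step; the actual table layout is defined by the project, not by this sketch.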
- Clone the repository:

    git clone [email protected]:MettaSurendhar/DataEngineeringProject.git

- Install dependencies:

    pip install -r requirements.txt

- Set up the PostgreSQL database using Docker:

    docker-compose up -d

- Create a .env file and add the required environment variables:

    API_KEY=<your_api_key>
    GENRE_LIST_API=<genre_list_api_endpoint>
    MOVIE_LIST_API=<movie_list_api_endpoint>
    DB_HOST=<database_host>
    DB_NAME=<database_name>
    DB_USER=<database_user>
    DB_PASSWORD=<database_password>
    DB_PORT=<database_port>
    ENGINE_PASSWORD=<engine_password>

- Run the Python scripts separately, in this order, to extract, filter, and load movie data into the database:

    python dataExtraction.py
    python dataTransformation.py
    python dataLoad.py

- Run the Flask application to start the API server:

    python app.py

- Alternatively, run the shell script to extract, filter, and load the data and start the Flask app in one step:

    bash ./entryPoint.sh
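
The script order in the steps above matters: each stage depends on the output of the one before it. As a rough Python analogue of that sequence (not a copy of `entryPoint.sh`, whose contents are not shown here), the sketch below runs the three scripts in order and stops at the first failure. `run_pipeline` and `main` are hypothetical names, and the `runner` parameter exists only so the logic can be exercised without running the real scripts.

```python
import subprocess
import sys

# Script order taken from the setup steps above.
PIPELINE_STEPS = ["dataExtraction.py", "dataTransformation.py", "dataLoad.py"]


def run_pipeline(steps=PIPELINE_STEPS, runner=subprocess.run):
    """Run each script with the current interpreter, in order.

    Stops at the first non-zero exit code and returns the list of
    scripts that completed successfully.
    """
    completed = []
    for script in steps:
        result = runner([sys.executable, script])
        if result.returncode != 0:
            break
        completed.append(script)
    return completed


def main():
    """Run the pipeline, then start the Flask app only if every step succeeded."""
    done = run_pipeline()
    if len(done) != len(PIPELINE_STEPS):
        sys.exit(f"Pipeline stopped early; completed steps: {done}")
    subprocess.run([sys.executable, "app.py"])
```

Calling `main()` from the repository root would mirror the manual steps; stopping early avoids starting the API server on top of a partially loaded database.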
- Fork the repository.
- Create a new branch (git checkout -b feature/your-feature-name).
- Make your changes.
- Test your changes thoroughly.
- Commit your changes (git commit -am 'Add new feature').
- Push to the branch (git push origin feature/your-feature-name).
- Create a new Pull Request.
- Ensure that the necessary environment variables are properly set up in the .env file for the project to function correctly.
- The project utilizes Docker for containerization, making it easy to set up the development environment.
- Review the requirements.txt file for all dependencies used in the project.
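
One way to catch a misconfigured .env file early is to check for the required variables before any script runs. The sketch below is a minimal example using only the standard library; the variable names come from the setup steps, while `missing_env_vars` is a hypothetical helper. It assumes the values from .env have already been loaded into the process environment, which the project may do differently.

```python
import os

# Variable names as listed in the .env setup step.
REQUIRED_ENV_VARS = [
    "API_KEY",
    "GENRE_LIST_API",
    "MOVIE_LIST_API",
    "DB_HOST",
    "DB_NAME",
    "DB_USER",
    "DB_PASSWORD",
    "DB_PORT",
    "ENGINE_PASSWORD",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to try out in isolation.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

A script could call `missing_env_vars()` at startup and exit with a clear message if the returned list is not empty, rather than failing later on a database or API call.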





