Overview
This repository contains a data analysis project that explores a dataset of movies collected from The Movie Database (TMDb). The dataset contains information on 10,000 movies, including user ratings and revenue. The aim of this exercise is to analyse the data and answer a set of questions that give insight into it.
Dataset
The dataset used for this analysis is sourced from Kaggle and contains comprehensive information about a wide range of movies. It includes attributes such as:
Title: The title of the movie.
Genre: The genre(s) associated with the movie.
Rating: The average rating given to the movie by users.
Budget: The budget allocated for making the movie.
Revenue: The revenue generated by the movie.
Runtime: The duration of the movie in minutes.
Release Year: The year in which the movie was released.
The dataset provides an opportunity to explore trends, patterns, and insights within the realm of movies, including popular genres, budget vs. revenue analysis, and more.
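As a rough illustration of the kind of questions listed above, here is a minimal pandas sketch for genre popularity and budget vs. revenue. It is not taken from the notebook. The column names (`original_title`, `genres`, `budget`, `revenue`), the pipe-separated genre format, and the treatment of zero budget or revenue as missing are all assumptions, and the sample rows are made-up placeholders rather than real data.

```python
import pandas as pd

# Hypothetical placeholder rows standing in for tmdb-movies.csv.
# Column names and the "A|B" genre format are assumptions, not confirmed by this README.
sample = pd.DataFrame({
    "original_title": ["Movie A", "Movie B", "Movie C"],
    "genres": ["Action|Adventure", "Drama", "Action|Drama"],
    "budget": [100, 50, 0],
    "revenue": [300, 40, 10],
})


def genre_counts(df):
    """Count how many movies list each genre (genres assumed pipe-separated)."""
    return df["genres"].dropna().str.split("|").explode().value_counts()


def add_profit(df):
    """Return a copy with a profit column, keeping only rows where
    budget and revenue are both positive (zeros assumed to mean 'unknown')."""
    known = df[(df["budget"] > 0) & (df["revenue"] > 0)].copy()
    known["profit"] = known["revenue"] - known["budget"]
    return known


if __name__ == "__main__":
    print(genre_counts(sample))
    print(add_profit(sample)[["original_title", "profit"]])
```

To try the same functions on the real file, load it with `pd.read_csv("tmdb-movies.csv")` and adjust the column names to match what the CSV actually contains.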
Files
tmdb-movies.csv: The main dataset file in CSV format.
Investigating the TMDB Database.ipynb: Jupyter Notebook containing the Python code for data analysis.
Investigating the TMDB Database.html: HTML version of the Jupyter Notebook for easier viewing.
README.md: This file, providing an overview of the project and instructions for use.
requirements.txt: Lists the Python libraries used in this project.
Dependencies
The analysis is conducted using the Python programming language and several libraries, including:
Pandas: For data manipulation and analysis.
Matplotlib: For data visualization.
Seaborn: For statistical data visualization.
NumPy: For numerical computing.
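Before opening the notebook, it can help to confirm that these libraries are importable. The snippet below is one possible check using only the standard library. The function name `missing_packages` is a hypothetical helper for illustration, and the package list simply mirrors the libraries named above.

```python
from importlib.util import find_spec

# Import names of the libraries listed in this README.
REQUIRED = ["pandas", "matplotlib", "seaborn", "numpy"]


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, installing from requirements.txt should resolve it.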
Usage
To run the analysis on your local machine, follow these steps:
Clone this repository to your local machine using the following command:
git clone https://github.com/Olatokunbo360/Investigate-a-Movie-Dataset.git
Navigate to the project directory:
cd Investigate-a-Movie-Dataset
Install the required dependencies. You can use the following command to install dependencies using pip:
pip install -r requirements.txt
Once the dependencies are installed, open the Jupyter Notebook Investigating the TMDB Database.ipynb to view the analysis and execute the code cells.
Follow the instructions provided in the notebook to explore the dataset, analyze trends, and visualize insights related to TMDB movies.
Contributing
Contributions to this project are welcome. If you have suggestions for improvements or new analyses, please feel free to open an issue or submit a pull request.
By Yusuf Sanni (yusufsanni2003@yahoo.co.uk)