Olatokunbo360/Investigate-a-Movie-Dataset

Overview

This repository contains a data analysis project that explores a dataset of about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue. The aim is to analyse the data and answer a set of questions that give insight into it.

Dataset

The dataset used for this analysis is sourced from Kaggle and contains comprehensive information about a wide range of movies. It includes attributes such as:

Title: The title of the movie.

Genre: The genre(s) associated with the movie.

Rating: The average rating given to the movie by users.

Budget: The budget allocated for making the movie.

Revenue: The revenue generated by the movie.

Runtime: The duration of the movie in minutes.

Release Year: The year in which the movie was released.

The dataset makes it possible to explore trends and patterns in movies, such as which genres are most popular and how budget relates to revenue.
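As a rough illustration of the kinds of questions listed above, here is a minimal pandas sketch. It runs on a made-up three-row sample rather than on tmdb-movies.csv, and the column names (`title`, `genres`, `rating`, `budget`, `revenue`, `runtime`, `release_year`) and the `|` separator between genres are assumptions for illustration; check the real file's header before adapting it. It is not taken from the project notebook.

```python
import io

import pandas as pd

# Hypothetical sample data. Column names and the "|" genre separator are
# assumptions for illustration, not the confirmed layout of tmdb-movies.csv.
SAMPLE_CSV = """title,genres,rating,budget,revenue,runtime,release_year
Movie A,Action|Adventure,6.5,100,300,120,2010
Movie B,Drama,7.2,20,50,95,2011
Movie C,Action|Drama,5.8,50,40,110,2011
"""

movies = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Budget vs. revenue: profit per movie.
movies["profit"] = movies["revenue"] - movies["budget"]

# Popular genres: split the multi-genre strings into one row per genre,
# then count how often each genre appears.
genre_counts = movies["genres"].str.split("|").explode().value_counts()

# Trend over time: mean user rating per release year.
mean_rating_by_year = movies.groupby("release_year")["rating"].mean()

print(movies[["title", "profit"]])
print(genre_counts)
print(mean_rating_by_year)
```

To try the same steps on the real data, replace `io.StringIO(SAMPLE_CSV)` with the path to tmdb-movies.csv and adjust the column names to match its header.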

Files

tmdb-movies.csv: The main dataset file in CSV format.

Investigating the TMDB Database.ipynb: Jupyter Notebook containing the Python code for data analysis.

Investigating the TMDB Database.html: HTML version of the Jupyter Notebook for easier viewing.

README.md: This file, providing an overview of the project and instructions for use.

requirements.txt: The list of Python libraries used for this project.

Dependencies

The analysis is conducted in Python and uses several libraries, including:

Pandas: For data manipulation and analysis.

Matplotlib: For data visualization.

Seaborn: For statistical data visualization.

NumPy: For numerical computing.
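To check whether these libraries are already available in your environment, a small standard-library script along these lines may help. The helper name `missing_packages` is hypothetical and not part of the project; the module names are the usual lowercase import names of the libraries listed above.

```python
import importlib.util

# Import names of the libraries listed above (lowercase module names).
REQUIRED = ["pandas", "matplotlib", "seaborn", "numpy"]


def missing_packages(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed libraries are available.")
```

If anything is reported as missing, installing from requirements.txt as described under Usage should resolve it.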

Usage

To run the analysis on your local machine, follow these steps:

Clone this repository to your local machine using the following command:

    git clone https://github.com/Olatokunbo360/Investigate-a-Movie-Dataset.git

Navigate to the project directory:

    cd Investigate-a-Movie-Dataset

Install the required dependencies. You can use the following command to install dependencies using pip:

    pip install -r requirements.txt

Once the dependencies are installed, open the Jupyter Notebook Investigating the TMDB Database.ipynb to view the analysis and execute the code cells.

Follow the instructions provided in the notebook to explore the dataset, analyze trends, and visualize insights related to TMDB movies.

Contributing

Contributions to this project are welcome. If you have suggestions for improvements or new analyses, please feel free to open an issue or submit a pull request.

By Yusuf Sanni (yusufsanni2003@yahoo.co.uk)
