Music Genre Prediction

Project Overview

This project explores the relationship between a song’s measurable attributes and its genre, using machine learning techniques to predict music genres based on audio features and popularity data. The motivation stems from the importance of music categorization for streaming platforms, recommendation systems, and the broader music industry.

Data

Source: Kaggle - Prediction of music genre
Observations: 50,005 songs
Features: Audio characteristics (danceability, energy, key, loudness, mode, speechiness, acousticness, instrumentalness, liveness, valence, tempo, duration), popularity, and genre (target)

Data Cleaning

Removed rows with missing or invalid values (e.g., nulls, non-numeric tempo, non-positive duration)
Dropped high-entropy and irrelevant columns (e.g., instance_id, obtained_date, track_name)
Reset dataset indices after cleaning
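
The cleaning steps above can be sketched with pandas. This is a minimal illustration rather than the notebook's actual code: the toy rows, the `?` placeholder for a bad tempo, the `-1` duration, and the exact column names (`duration`, `genre`) are assumptions made for the example.

```python
import pandas as pd

# Toy rows standing in for the Kaggle CSV; names and values are assumptions.
raw = pd.DataFrame({
    "instance_id": [1, 2, 3, 4, 5],
    "track_name": ["a", "b", "c", "d", "e"],
    "obtained_date": ["4-Apr"] * 5,
    "popularity": [27.0, 31.0, None, 45.0, 50.0],
    "tempo": ["100.9", "?", "120.0", "95.5", "128.1"],
    "duration": [210000, 180000, 200000, -1, 240000],
    "genre": ["Rock", "Jazz", "Rap", "Blues", "Rock"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning steps and return a re-indexed copy."""
    df = df.copy()
    # Coerce tempo to numeric; entries that cannot be parsed become NaN.
    df["tempo"] = pd.to_numeric(df["tempo"], errors="coerce")
    # Drop identifier-like columns that carry no genre signal.
    df = df.drop(columns=["instance_id", "obtained_date", "track_name"])
    # Remove rows with missing values, then rows with non-positive duration.
    df = df.dropna()
    df = df[df["duration"] > 0]
    # Reset the index so it is contiguous again after filtering.
    return df.reset_index(drop=True)


cleaned = clean(raw)
```

Coercing `tempo` before calling `dropna` lets a single step remove both true nulls and non-numeric entries.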

Exploratory Data Analysis (EDA)

Compared feature distributions across genres
Identified features whose distributions differ most across genres: popularity, acousticness, danceability, energy, instrumentalness, speechiness, valence
Identified features that differ little across genres and are therefore less useful for prediction: duration, liveness, loudness, tempo
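
One way to put a number on how strongly a feature differs across genres is the share of its variance explained by genre membership (the correlation ratio, eta squared). The project itself only reports comparing distributions, so this metric is a swapped-in illustration, and the helper name `genre_separation` and the toy data are hypothetical.

```python
import pandas as pd


def genre_separation(df, features, target="genre"):
    """Return eta squared per feature: between-genre SS / total SS (0 to 1)."""
    scores = {}
    for col in features:
        overall_mean = df[col].mean()
        total_ss = ((df[col] - overall_mean) ** 2).sum()
        grouped = df.groupby(target)[col]
        # Weight each genre's squared mean offset by the genre's row count.
        between_ss = (grouped.size() * (grouped.mean() - overall_mean) ** 2).sum()
        scores[col] = between_ss / total_ss if total_ss > 0 else 0.0
    # Highest values first: these features separate genres best.
    return pd.Series(scores).sort_values(ascending=False)


# Toy data: acousticness differs by genre, liveness does not.
toy = pd.DataFrame({
    "genre": ["A", "A", "B", "B"],
    "acousticness": [0.1, 0.2, 0.8, 0.9],
    "liveness": [0.1, 0.3, 0.1, 0.3],
})
ranking = genre_separation(toy, ["acousticness", "liveness"])
```

A value near 1 means genre accounts for most of the feature's spread, while a value near 0 means the genres overlap almost completely on that feature.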

Modeling

Built and evaluated three classification models:
- Logistic Regression
- Decision Tree
- Gaussian Naive Bayes
Used different feature sets to assess impact on prediction accuracy
Best model: Logistic Regression (F1-score: 0.50)
Found that including 'popularity' and high-variance features improved accuracy
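
The comparison described above, three classifiers evaluated on different feature sets, can be sketched with scikit-learn. Everything specific here is an assumption for illustration: the data is synthetic, the feature-set names and column indices are hypothetical, and macro averaging is assumed because the README does not say how its F1-score was averaged.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned dataset: 3 genres, 4 numeric features.
rng = np.random.default_rng(0)
n_per_class = 200
centers = np.array([[0, 0, 0, 0], [1.5, 0, 1, 0], [0, 1.5, 0, 1]])
X = np.vstack([rng.normal(c, 1.0, size=(n_per_class, 4)) for c in centers])
y = np.repeat(["genre_a", "genre_b", "genre_c"], n_per_class)

# Hypothetical feature sets, given as column indices into X.
feature_sets = {"all_features": [0, 1, 2, 3], "subset": [0, 1]}


def make_models():
    """Build fresh, unfitted instances of the three classifiers."""
    return {
        # Scaling helps the logistic regression solver converge.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "gaussian_nb": GaussianNB(),
    }


results = {}
for set_name, cols in feature_sets.items():
    X_train, X_test, y_train, y_test = train_test_split(
        X[:, cols], y, test_size=0.25, random_state=0, stratify=y
    )
    for model_name, model in make_models().items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        # Macro F1 weights every genre equally (an assumed choice).
        results[(set_name, model_name)] = f1_score(y_test, pred, average="macro")

for key, score in sorted(results.items()):
    print(key, round(score, 3))
```

Holding the split seed fixed across feature sets keeps the comparison fair, since every model sees the same train and test rows.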

Conclusions

Audio features and popularity can be used to predict music genre, but accuracy is limited by feature overlap and genre ambiguity
Logistic Regression outperformed other models in this study
Further improvements could include better parameter tuning, feature engineering, and deeper exploration of feature relationships

Team Members

Caitlyn Li
Aammya Sapra
Jackie Wang
Mishari Almusallam
Yiting Wang

License

This project is made available to the public as per the permissions specified in the notebook.

