A lightweight content-based music recommendation app built with Python.
It uses TF-IDF vectorization and cosine similarity on song lyrics to suggest similar tracks.
🚀 Deployed with Streamlit for an intuitive, interactive experience.
🎧 Get instant music recommendations with Spotify album covers integrated.
- 🎼 Content-based Recommendations using TF-IDF vectorization on lyrics/text features + Cosine Similarity to suggest similar tracks.
- ⚡ Lightweight & Git-friendly: avoids committing large files (large artifacts are computed dynamically at runtime)
- 🎨 Streamlit UI: clean design with Spotify album covers
- 🔍 Interactive Search: choose a song from a dropdown
├── preprocess.py # Cleans dataset, builds TF-IDF & cosine similarity
├── recommend.py # Loads data & computes song recommendations
├── main.py # Streamlit app interface (UI + Spotify album covers)
├── requirements.txt # Python dependencies
├── README.md # Documentation
This project uses the Spotify Million Song Dataset (Lyrics Data) from Kaggle.
- The dataset contains song lyrics and metadata (artist, track, link).
- In this project:
  - The text (lyrics) column is cleaned and preprocessed with NLTK.
  - A TF-IDF matrix is built from the cleaned text.
  - Cosine similarity is calculated to recommend the top-N similar songs.
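To make the TF-IDF and cosine-similarity idea concrete, here is a minimal, dependency-free sketch. It is not the project's code (which relies on scikit-learn); the function names `tfidf_vectors`, `cosine`, and `recommend` and the toy lyrics are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised TF-IDF dict per whitespace-tokenised document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Smoothed IDF, the same form scikit-learn uses by default.
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def recommend(index, vectors, top_n=5):
    """Indices of the top_n most similar documents, excluding the query itself."""
    scores = [
        (cosine(vectors[index], other), i)
        for i, other in enumerate(vectors)
        if i != index
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scores[:top_n]]


# Toy lyrics, invented for illustration only.
lyrics = [
    "love me tender love me true",
    "tender love is all we need",
    "highway engine roaring through the night",
]
vectors = tfidf_vectors(lyrics)
print(recommend(0, vectors, top_n=2))  # [1, 2]
```

The second toy song shares "love" and "tender" with the first, so it ranks ahead of the third, which has no terms in common.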
Before building the recommender, the raw dataset required preprocessing:
- 🔤 Removed links, numbers & special characters
- ✂️ Converted text to lowercase, tokenized, and removed stopwords
- 📐 Built a TF-IDF matrix (max 5000 features)
- 🔗 Computed cosine similarity between song vectors
👉 For detailed preprocessing and exploratory data analysis, see the notebook:
Music_Recommendation_System.ipynb
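The cleaning steps listed above could look roughly like the sketch below. It uses only the standard library: the small `STOPWORDS` set stands in for NLTK's English stopword list, whitespace splitting stands in for whatever tokenizer the project actually uses, and `clean_text` and `build_vocabulary` are hypothetical names. The vocabulary cap assumes the 5000-feature limit keeps the most frequent terms across the corpus.

```python
import re
from collections import Counter

# Tiny stand-in stopword list (assumption); NLTK's English list is much longer.
STOPWORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "me", "you", "it"}


def clean_text(text):
    """Strip links, digits and punctuation, lowercase, and drop stopwords."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"[^A-Za-z\s]", " ", text)  # remove numbers and special characters
    tokens = text.lower().split()  # lowercase, then whitespace tokenisation
    return " ".join(t for t in tokens if t not in STOPWORDS)


def build_vocabulary(cleaned_docs, max_features=5000):
    """Keep only the max_features most frequent terms across the corpus."""
    counts = Counter(token for doc in cleaned_docs for token in doc.split())
    return [term for term, _ in counts.most_common(max_features)]


sample = "Hold me 2nite!! Listen at https://example.com/track, the night is young..."
print(clean_text(sample))  # hold nite listen at night young
```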
- Clone the Repository
  git clone https://github.com/Sonalikasingh17/Music_Recommender_System.git
  cd Music_Recommender_System
- Install Dependencies
  pip install -r requirements.txt
- Download Dataset
  Download the dataset from Kaggle:
- Run Preprocessing
  python preprocess.py
  - This script cleans the dataset and creates these files in the project folder:
    - df_cleaned.pkl
    - tfidf_matrix.pkl
    - cosine_sim.pkl
- Launch the Streamlit app
  streamlit run main.py
  Open the local URL in your browser to explore the app.
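The preprocessing step writes .pkl files that the app later reads back. The sketch below shows one way such a round trip can work with the standard `pickle` module. Whether the project uses `pickle` directly or another serializer is an assumption based on the file extension, and `save_artifact`, `load_artifact`, and the toy matrix are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_artifact(obj, path):
    """Serialise obj to path with pickle."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_artifact(path):
    """Load a pickled object; only unpickle files you generated yourself."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Toy stand-in for the similarity matrix; the real one comes from preprocess.py.
toy_similarity = [[1.0, 0.4], [0.4, 1.0]]

# Write to a temporary folder so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    artifact_path = Path(tmp) / "cosine_sim.pkl"
    save_artifact(toy_similarity, artifact_path)
    restored = load_artifact(artifact_path)

print(restored == toy_similarity)  # True
```

Because the files are regenerated locally by running the preprocessing script, they can stay out of version control.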
- Autocomplete Search Bar: Enhance the select box with st.selectbox(..., help="Type to search...") or use st.text_input() + fuzzy matching for better UX.
- Custom Styling: Modify CSS in main.py to refine recommendation cards (colors, fonts, spacing).
- Data Customization: Adapt the df['text'] preprocessing logic to work with lyrics, metadata, or genre features.
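For the fuzzy-matching idea mentioned above, the standard library's `difflib` is one possible starting point. This is a sketch, not part of the project: `suggest_titles`, its default `limit` and `cutoff`, and the example titles are assumptions. In the app, the query would come from something like st.text_input() and the titles from the dataset.

```python
from difflib import get_close_matches


def suggest_titles(query, titles, limit=5, cutoff=0.5):
    """Return up to `limit` titles that loosely match the typed query."""
    # Compare in lowercase so matching is case-insensitive, then map back.
    lowered = {title.lower(): title for title in titles}
    hits = get_close_matches(query.lower(), list(lowered), n=limit, cutoff=cutoff)
    return [lowered[hit] for hit in hits]


# A few example titles for illustration only.
titles = ["Yesterday", "Yellow Submarine", "Let It Be", "Hey Jude"]
print(suggest_titles("yesterdy", titles))
```

A lower `cutoff` tolerates more typos at the cost of noisier suggestions, so it is worth tuning against real song titles.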
This project was built with ❤️ using scikit-learn, pandas, NLTK, and Streamlit,
inspired by common TF-IDF + cosine similarity recommender patterns for content-based filtering.
Special thanks to the Spotify Million Song Dataset.
✨ Enjoy exploring music recommendations with your interactive, lightweight app! ✨
