bonnilee/README.md

Data Scientist | Python • Machine Learning • SQL • Data Analysis • Visualization

Hi! I’m an aspiring Data Scientist with strong skills in Python, SQL, and end-to-end project development. I specialize in turning raw data into insights through data cleaning, exploratory analysis, predictive modeling, and visualization. I enjoy solving meaningful problems, building analytical tools, and applying data-driven thinking to real-world challenges.

Actively seeking opportunities as a Data Scientist, Machine Learning Engineer, or Data Analyst.


🛠 Technical Skills

Languages

  • Python
  • SQL

Data Science & Machine Learning

  • Pandas • NumPy • Scikit-learn • SciPy
  • Data Cleaning & Wrangling
  • Regression, Classification, Clustering
  • Time Series Analysis
  • Feature Engineering
  • Model Evaluation & Cross-Validation

Data Visualization

  • Matplotlib
  • Seaborn
  • Plotly

Databases & Tools

  • SQLite
  • Jupyter Notebook • Google Colab
  • Git & GitHub
  • VS Code
  • Flask

Concepts

  • RFM Analysis
  • Customer Segmentation
  • Window Functions
  • CTEs & SQL Analytics
  • EDA & Feature Analysis
  • Dashboarding / Reporting

📁 Featured Projects

1️⃣ SQL Customer Segmentation Project

Tech: SQL, SQLite, SQLite CLI, VS Code
Concepts: Customer segmentation, RFM analysis, window functions, conditional logic

This project demonstrates customer analytics using SQL. It performs customer segmentation based on purchasing behavior, implements RFM scoring, ranks high-value customers, and analyzes spending trends using moving averages.

Advanced SQL techniques used include CTEs, window functions, and conditional statements.
Insights from this project can be used for targeted marketing, loyalty programs, and personalized customer experiences.
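
As a rough illustration of the techniques named above, here is a minimal sketch of RFM scoring, ranking, and a moving average, run through Python's built-in `sqlite3` module. The `orders` table, its columns, the toy rows, the two-bucket `NTILE` scoring, and the "high value" threshold are all assumptions made for this example and are not taken from the repository. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical schema and toy rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL);
INSERT INTO orders VALUES
  (1, '2024-01-05', 120.0),
  (1, '2024-03-01',  80.0),
  (2, '2023-11-20',  40.0),
  (3, '2024-02-14', 300.0),
  (3, '2024-02-28', 150.0),
  (3, '2024-03-10',  50.0),
  (4, '2023-09-02',  25.0);
""")

# CTE 1 aggregates recency, frequency, and monetary value per customer.
# CTE 2 turns each measure into a score with NTILE (2 buckets here, assumed).
# The final SELECT labels a segment with CASE and ranks customers by spend.
RFM_QUERY = """
WITH rfm AS (
    SELECT customer_id,
           MAX(order_date) AS last_order,
           COUNT(*)        AS frequency,
           SUM(amount)     AS monetary
    FROM orders
    GROUP BY customer_id
),
scored AS (
    SELECT customer_id, monetary,
           NTILE(2) OVER (ORDER BY last_order) AS r_score,
           NTILE(2) OVER (ORDER BY frequency)  AS f_score,
           NTILE(2) OVER (ORDER BY monetary)   AS m_score
    FROM rfm
)
SELECT customer_id, r_score, f_score, m_score,
       CASE WHEN r_score + f_score + m_score >= 5 THEN 'high value'
            ELSE 'standard' END AS segment,
       RANK() OVER (ORDER BY monetary DESC) AS spend_rank
FROM scored
ORDER BY spend_rank;
"""

# Three-order moving average of spend per customer.
MOVING_AVG_QUERY = """
SELECT customer_id, order_date,
       AVG(amount) OVER (
           PARTITION BY customer_id
           ORDER BY order_date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS moving_avg
FROM orders
ORDER BY customer_id, order_date;
"""

rfm_rows = conn.execute(RFM_QUERY).fetchall()
moving_rows = conn.execute(MOVING_AVG_QUERY).fetchall()

for row in rfm_rows:
    print(row)
```

Storing dates as ISO `YYYY-MM-DD` text keeps `MAX(order_date)` and `ORDER BY order_date` chronologically correct in SQLite, which has no dedicated date type.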

🔗 GitHub Repo: www.github.com/bonnilee/sql-customer-segmentation


2️⃣ PDF Word Search Flask App

Tech: Python, Flask, pdfminer.six, NLTK, Matplotlib, Pandas, HTML/CSS (Jinja)
Concepts: Text extraction, document parsing, NLP preprocessing, visualization

This web application allows users to upload a PDF and search for word occurrences by section. It automatically detects section headers using font differences, cleans extracted text, and generates charts showing word frequency per section.

Features:

  • Upload and parse PDFs
  • Detect and separate sections based on headers
  • Clean text with NLTK stopwords
  • Search for a specific keyword
  • Visualize occurrences in a chart
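
The cleaning and per-section counting steps could look roughly like the sketch below. It assumes the PDF has already been split into a mapping of section title to text, so the header detection with pdfminer.six is not shown. The function names and the small hardcoded stopword set are hypothetical stand-ins; the app itself uses NLTK's stopword list.

```python
import re
from collections import Counter

# Small stand-in stopword set (assumption); the app uses NLTK's English stopwords.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for", "we"}


def clean_tokens(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tok for tok in tokens if tok not in STOPWORDS]


def keyword_counts_by_section(sections, keyword):
    """Return {section title: number of times keyword occurs in that section}."""
    keyword = keyword.lower()
    return {
        title: Counter(clean_tokens(body))[keyword]
        for title, body in sections.items()
    }


# Toy input standing in for text extracted from a PDF.
sections = {
    "Introduction": "The model is trained on data. Data quality matters.",
    "Methods": "We split the data and fit a model to the training data set.",
    "Conclusion": "Results depend on the model.",
}

counts = keyword_counts_by_section(sections, "data")
print(counts)
```

The resulting dictionary maps directly onto a bar chart with one bar per section, for example via Matplotlib's `plt.bar(counts.keys(), counts.values())`.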

🔗 GitHub Repo: www.github.com/bonnilee/WordCounter


3️⃣ Most Streamed Spotify Songs 2023 Analysis

Tech: Python, Pandas, Matplotlib, Seaborn
Dataset: Most Streamed Spotify Songs 2023

A full exploratory data analysis (EDA) project exploring trends in popular songs across streaming platforms.

Highlights:

  • Cleaned and transformed a complex dataset (streams, release dates, audio features)
  • Analyzed top artists by song count and total streams
  • Explored audio feature distributions (danceability, energy, valence, speechiness, BPM)
  • Conducted correlation and heatmap analysis
  • Built monthly and yearly release trend visualizations
  • Examined platform presence vs. stream count
  • Found that songs available on 5–6 platforms tend to have higher stream counts

Key Insights:

  • Top artists include The Weeknd, Taylor Swift, Ed Sheeran, Harry Styles, and more.
  • Most songs were released between 2019 and 2023, with peaks in 2021 and 2022.
  • High danceability and moderate energy dominate popular songs.
  • More platform presence generally correlates with higher stream counts.
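
The platform-presence insight rests on a correlation check. A minimal, standard-library sketch of that calculation is below. The numbers are made-up toy values, not figures from the dataset, and the variable names are hypothetical. The project itself uses Pandas, where `DataFrame.corr()` computes the same Pearson coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up toy values for illustration only (NOT from the Spotify dataset).
platform_counts = [1, 2, 3, 4, 5, 6]
streams_millions = [120, 150, 310, 290, 480, 520]

r = pearson(platform_counts, streams_millions)
print(f"Pearson r = {r:.2f}")
```

A coefficient near 1 indicates a strong positive linear association. It does not show that adding platforms causes more streams, since popular songs may simply get picked up by more platforms.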

🔗 GitHub Repo: www.github.com/bonnilee/spotify


🔗 Connect With Me


🎯 Job Interests

  • Data Scientist
  • Machine Learning Engineer
  • Data Analyst

Open to remote or U.S.-based roles (full-time or contract).


Thanks for visiting my profile! 🚀
Feel free to explore my repositories or reach out!
