I'm passionate about using data to uncover insights and solve complex problems. With a background in Business Management and expertise in advanced data analytics and machine learning, I do my best work in collaborative settings where I can apply my analytical skills to real-world problems.
Project 1: Classification of TikTok videos
- Overview: TikTok is exploring the application of machine learning to extract claims or propositions from videos and comments on its platform. Using statsmodels and scikit-learn, this project classifies each data point as either a claim or an opinion. The aim is to improve triage for human review and make content moderation on TikTok more efficient.
- Technologies Used: Python, Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, XGBoost
Project 2: Classification of Waze data
- Overview: The Waze data team is working on a data analytics project focused on reducing monthly user churn on the Waze app to drive overall growth. By leveraging decision tree, random forest, and XGBoost algorithms, the team aims to predict user churn and implement targeted strategies to retain users.
- Technologies Used: Python, Scikit-learn, Pandas, NumPy, Matplotlib, XGBoost
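
As a rough illustration of the approach described above (not the project's actual code), the sketch below fits a decision tree and a random forest with scikit-learn and compares their recall on a held-out split. The data is synthetic, generated with `make_classification`, and the function name `compare_churn_models`, the feature count, and the hyperparameters are all assumptions made for the example. XGBoost is left out only to keep the sketch dependency-light.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_churn_models(random_state=42):
    """Fit two tree-based classifiers on synthetic data; return test recall per model."""
    # Synthetic stand-in for user features (hypothetical, not Waze data).
    # Class 1 plays the role of "churned" and is the minority class.
    X, y = make_classification(
        n_samples=1000,
        n_features=8,
        n_informative=4,
        weights=[0.8, 0.2],
        random_state=random_state,
    )
    # Stratify so the train and test sets keep the same churn proportion.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    models = {
        "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=random_state),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=random_state),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Recall on the churn class: the share of actual churners the model catches.
        scores[name] = recall_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, recall in compare_churn_models().items():
        print(f"{name}: recall = {recall:.3f}")
```

Recall is used here on the assumption that missing a user who is about to churn costs more than flagging one who would have stayed. A different metric may suit other retention strategies better.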
Project 3: Automatidata
- Overview: The New York City Taxi & Limousine Commission (NYC TLC) has engaged the Automatidata data team to develop a machine learning model that predicts whether a rider in a NYC TLC taxi cab will be a generous tipper. The project also uses multiple regression to forecast taxi fares. These models serve as components of a suite of models designed to optimize revenue for the NYC TLC and its drivers.
- Technologies Used: Python, Scikit-learn, Pandas, NumPy, Matplotlib, XGBoost
- Programming Languages: Python, R, SQL
- Data Analysis: Pandas, NumPy, SciPy, Statsmodels, Scikit-learn, Keras
- Machine Learning Models: Regression (linear, logistic), Naive Bayes, Decision Trees, Random Forest, AdaBoost, XGBoost
- Data Visualization: Matplotlib, Seaborn, Plotly, Tableau Software
- Database Management: SQL, MySQL, PostgreSQL
- Tools & Platforms: Jupyter Notebook, Git
I regularly share insights, tutorials, and best practices on data analysis and data science on Medium and through contributions to leading publications.
- Advanced Data Analytics Professional Certificate, Google, 2024
- Data Analytics Professional Certificate, Google, 2022
- The Complete SQL Bootcamp: Go from Zero to Hero, Udemy, 2022
- Bachelor of Science in Business Management and Work Organizations, Unikin, 2016
Feel free to reach out to me via LinkedIn for collaboration opportunities, job inquiries, or just to say hello!
I am eager to collaborate on projects in data analysis, data science, machine learning, and data visualization. Whether you have a new project idea or need help with ongoing work, I am available to contribute.