HR Attrition Analysis — Predicting & Preventing Employee Turnover
This is a quick predictive analysis built with a few machine learning algorithms. Using statistical methods, the models predict attrition with a measurable degree of confidence and surface what drives it in a workforce. It's not a perfect solution and there is much to improve, but it's a starting point for where #peopleAnalytics can get to one day.
Business Problem
"Why are employees leaving? Can we predict attrition early and target retention efforts?"
Data Science Approach
- Convert to binary classification: Attrition = Yes/No
- Three models for robustness & insight:
  - Logistic Regression: Interpretable odds ratios (HR-friendly)
  - Random Forest: Captures non-linear patterns (e.g., JobSatisfaction × OverTime)
  - XGBoost + SHAP: State-of-the-art prediction + explainable AI
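The first two steps of this approach can be illustrated with a small, dependency-free sketch. It is not the repository's code: the toy records, the `fit_logistic` helper, and the learning-rate settings are all assumptions made for illustration. It encodes Attrition as 0/1, fits a one-feature logistic regression by plain gradient descent, and converts the OverTime coefficient into an odds ratio, the quantity that makes this model easy to explain to HR.

```python
import math

# Hypothetical toy records (not taken from the project's dataset).
records = (
    [{"OverTime": "Yes", "Attrition": "Yes"}] * 6
    + [{"OverTime": "Yes", "Attrition": "No"}] * 4
    + [{"OverTime": "No", "Attrition": "Yes"}] * 2
    + [{"OverTime": "No", "Attrition": "No"}] * 8
)


def to_binary(value):
    """Map the Yes/No labels to 1/0."""
    return 1 if value == "Yes" else 0


x = [to_binary(r["OverTime"]) for r in records]
y = [to_binary(r["Attrition"]) for r in records]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, lr=0.5, steps=5000):
    """Fit intercept and slope by batch gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        errors = [sigmoid(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        b0 -= lr * sum(errors) / n
        b1 -= lr * sum(e * xi for e, xi in zip(errors, x)) / n
    return b0, b1


intercept, coef = fit_logistic(x, y)
odds_ratio = math.exp(coef)  # odds of leaving with overtime vs. without
print(f"OverTime odds ratio: {odds_ratio:.2f}")
```

In this toy data the odds of leaving are 6/4 = 1.5 with overtime and 2/8 = 0.25 without, so the fitted odds ratio comes out at about 6: overtime multiplies the odds of attrition roughly sixfold.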
Key Insights (From Results)
- Model performance:
  - Random Forest AUC: 0.811 - Strong predictive power
  - Logistic Regression AUC: 0.805 - Strong predictive power
  - XGBoost AUC: 0.779 - Slightly lower, but more interpretable via SHAP
- Top 3 global drivers of attrition (SHAP):
  - OverTime: Highest impact (mean |SHAP| = 0.62)
  - MonthlyIncome: Second highest (mean |SHAP| = 0.44)
  - StockOptionLevel: Third (mean |SHAP| = 0.38)
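For readers unfamiliar with the mean |SHAP| figures: a global importance score is typically obtained by averaging the absolute per-employee SHAP values of each feature. The sketch below shows only that aggregation step on a made-up matrix. The numbers and the `mean_abs_shap` helper are hypothetical; in practice the per-row values would come from a SHAP explainer rather than being typed in by hand.

```python
# Hypothetical per-employee SHAP values (rows = employees, columns = features).
feature_names = ["OverTime", "MonthlyIncome", "StockOptionLevel"]
shap_values = [
    [0.9, -0.5, 0.2],
    [-0.6, 0.4, -0.4],
    [0.7, -0.3, 0.3],
    [-0.2, 0.6, -0.5],
]


def mean_abs_shap(matrix, names):
    """Return (feature, mean |SHAP|) pairs sorted from most to least important."""
    n_rows = len(matrix)
    scores = {
        name: sum(abs(row[j]) for row in matrix) / n_rows
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = mean_abs_shap(shap_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Taking absolute values before averaging matters: a feature that raises risk for some employees and lowers it for others would otherwise cancel itself out and look unimportant.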
Actionable insight: Employees who work overtime and have a low income or a low stock option level are at the highest risk.
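Targeting retention efforts this way depends on how well a model ranks employees by risk, which is what the AUC figures reported under model performance measure: the probability that a randomly chosen leaver receives a higher score than a randomly chosen stayer. A minimal sketch of that definition follows, using invented labels and scores. The `pairwise_auc` helper is hypothetical and is meant for intuition, not as a replacement for a library implementation.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (leaver, stayer) pairs ranked correctly; ties count half."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = left the company, 0 = stayed.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = pairwise_auc(labels, scores)
print(f"AUC: {auc:.3f}")
```

Here 9 of the 12 leaver/stayer pairs are ordered correctly, giving 0.75. Read the same way, an AUC of 0.811 means that roughly 81% of such pairs are ranked correctly.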
Even so, these insights are a somewhat oversimplified version of what actually drives attrition. It is very important to ask ourselves what we are doing differently. Everyone wants that non plus ultra performance from their employees, but no one wants to go 'the extra mile' when it comes to fighting for a 3%-5% increase in compensation for some of the top talent. And employers still wonder "why people leave this awesome company". Well, it's more complicated than we all initially think.
In Mexico, where I live and where people work some of the longest hours in the world, overtime is the norm. But when did employers forget that this was the extra mile? When was it forgotten that this needs to be compensated? Employees never forget, and that is what drives turnover: moving the goalposts over and over and over.
Managing employees is not easy, but we have the tools to make it easier and to bring a much better balance to our lives.
How to Run (100% Reproducible on Windows)
Prerequisites
- Windows 10/11
- Anaconda 2023.09 or later (includes Python 3.11.5)
→ Download Anaconda (64-bit)
- Clone this repo
Open Anaconda Prompt (search in Start menu) and run:

    git clone https://github.com/albertogoga/hr-attrition-analysis.git
    cd hr-attrition-analysis