Employee Attrition and Performance Prediction

This README covers the Data Preprocessing and EDA stage and the Machine Learning & Deep Learning Models stage of the Employee Attrition & Performance project.

📊 Overview

This project analyzes employee data to uncover patterns behind attrition and predict performance scores using machine learning (Random Forest & Linear Regression) and deep learning (Neural Network). It includes data preprocessing, exploratory analysis, statistical testing, classification, regression, and model evaluation.

📁 Dataset Description

File: employee_data.csv
Shape: 100 rows × 8 columns
Columns:

EmployeeID: Unique identifier
Name: Employee name
Age: Employee age
Department: Department name (e.g., HR, Sales)
Salary: Annual salary (numeric)
YearsAtCompany: Total years spent at the company
PerformanceScore: Annual performance score (out of 100)
Attrition: Whether the employee left the company (Yes/No)

✅ Step 1: Data Cleaning & Preprocessing

Missing Values: Filled missing values in numeric columns with the column mean
Duplicates: Removed duplicate records
Inconsistent Entries: Cleaned and standardized Department values
Descriptive Stats: Summary generated using df.describe()
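
The cleaning steps above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the function name clean_employee_data and the DEPARTMENT_ALIASES mapping are hypothetical, and EmployeeID is excluded from the mean fill on the assumption that it is an identifier rather than a measurement.

```python
import numpy as np
import pandas as pd

# Hypothetical alias table: lower-cased raw label -> canonical Department name.
DEPARTMENT_ALIASES = {"hr": "HR", "human resources": "HR"}


def clean_employee_data(df: pd.DataFrame) -> pd.DataFrame:
    """Standardize Department, mean-fill numeric gaps, drop duplicate rows."""
    out = df.copy()

    # Standardize Department labels: trim whitespace, ignore case, apply aliases.
    dept = out["Department"].str.strip().str.lower()
    out["Department"] = dept.map(
        lambda d: DEPARTMENT_ALIASES.get(d, d.title()), na_action="ignore"
    )

    # Fill missing numeric values with the column mean (EmployeeID excluded by assumption).
    numeric_cols = out.select_dtypes(include="number").columns.difference(["EmployeeID"])
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())

    # Remove exact duplicate records and renumber the rows.
    return out.drop_duplicates().reset_index(drop=True)
```

Calling clean_employee_data(df).describe() afterwards would give the descriptive summary on the cleaned frame.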

📊 Step 2: Exploratory Data Analysis (EDA)

Visualizations:

Pairplot: To observe attrition patterns across features
Correlation Heatmap: To assess feature relationships
Boxplot: To detect outliers in Age, Salary, YearsAtCompany
Share of Attrition Cases by Department (the figures sum to 100%):
- Sales: 35.89%
- Engineering: 30.77%
- HR: 23.08%
- Marketing: 10.26%
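
Because the four percentages above sum to 100%, they read as each department's share of all leavers, which is a different quantity from the attrition rate inside a department. The sketch below computes both so they can be compared side by side. The function name attrition_by_department is hypothetical, and Attrition is assumed to hold the strings "Yes" and "No" as described in the dataset section.

```python
import pandas as pd


def attrition_by_department(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-department attrition rate and share of all attrition cases."""
    left = df["Attrition"].eq("Yes")
    by_dept = left.groupby(df["Department"])
    return pd.DataFrame({
        # rate: P(Attrition = Yes | Department), does not sum to 1 across rows
        "rate": by_dept.mean(),
        # share: P(Department | Attrition = Yes), sums to 1 across rows
        "share": by_dept.sum() / left.sum(),
    })
```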

Statistical Tests:

Bayesian Inference:
- P(Attrition | PerformanceScore) ≈ 0.395
ANOVA (F-test):
- Significant difference in performance scores across departments (p-value ≈ 2.56e-12)
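
A rough sketch of how the two tests could be run. Two assumptions are worth flagging: scipy is not in the tools list below, so using scipy.stats.f_oneway for the F-test is a guess at the method, and the README does not say which PerformanceScore event the Bayesian figure conditions on, so the threshold argument here is purely hypothetical. Both function names are illustrative.

```python
import pandas as pd
from scipy.stats import f_oneway


def anova_by_department(df, value_col="PerformanceScore", group_col="Department"):
    """One-way ANOVA: do mean scores differ across departments?"""
    groups = [g[value_col].to_numpy() for _, g in df.groupby(group_col)]
    f_stat, p_value = f_oneway(*groups)
    return float(f_stat), float(p_value)


def attrition_given_low_score(df, threshold):
    """P(Attrition | score < threshold) via Bayes' rule on empirical frequencies.

    The threshold is a hypothetical choice; the result is undefined when no
    employee scores below it.
    """
    left = df["Attrition"].eq("Yes")
    low = df["PerformanceScore"].lt(threshold)
    p_left = left.mean()                 # prior P(Attrition)
    p_low = low.mean()                   # evidence P(score < threshold)
    p_low_given_left = low[left].mean()  # likelihood P(score < threshold | Attrition)
    return float(p_low_given_left * p_left / p_low)
```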

🤖 Step 3: Machine Learning Models

Random Forest Classifier

Target: Attrition
Accuracy: 70%
Precision/Recall (Class 1): 57%
Confusion Matrix: Visualized using Seaborn heatmap
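
The classifier setup could look roughly like the sketch below. The feature list, the 80/20 stratified split, n_estimators=200, and the seed are all assumptions; the README only states the target and the resulting metrics. Class 1 is taken to mean Attrition = "Yes".

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

# Assumed feature set; EmployeeID and Name are dropped as identifiers.
FEATURES = ["Age", "Department", "Salary", "YearsAtCompany", "PerformanceScore"]


def train_attrition_classifier(df: pd.DataFrame, seed: int = 42) -> dict:
    """Fit a Random Forest on Attrition and report held-out metrics."""
    # One-hot encode Department; the other features are already numeric.
    X = pd.get_dummies(df[FEATURES], columns=["Department"], dtype=float)
    y = df["Attrition"].eq("Yes").astype(int)  # 1 = left the company

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)

    return {
        "model": model,
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "confusion_matrix": confusion_matrix(y_test, pred, labels=[0, 1]),
    }
```

The returned confusion_matrix array can be passed to seaborn.heatmap(..., annot=True) to reproduce the visualization mentioned above.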

Linear Regression

Target: PerformanceScore
R² Score: 0.74
Cross-Validated R² (5-fold): 0.73
MAE: 0.36
Visuals:
- Predicted vs Actual
- Residuals Plot
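
A minimal sketch of how the three regression metrics might be produced. The feature list, the 80/20 split, and the function name evaluate_performance_regression are assumptions; only the target, the 5-fold cross-validation, and the metric names come from the README.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Assumed predictors for PerformanceScore.
REG_FEATURES = ["Age", "Department", "Salary", "YearsAtCompany"]


def evaluate_performance_regression(df: pd.DataFrame, seed: int = 42) -> dict:
    """Fit a linear model for PerformanceScore; return hold-out and CV metrics."""
    X = pd.get_dummies(
        df[REG_FEATURES], columns=["Department"], drop_first=True, dtype=float
    )
    y = df["PerformanceScore"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    pred = model.predict(X_test)

    # 5-fold cross-validated R² on the full data, averaged over folds.
    cv_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()

    return {
        "r2": r2_score(y_test, pred),
        "cv_r2": float(cv_r2),
        "mae": mean_absolute_error(y_test, pred),
        "residuals": y_test - pred,  # input for the residuals plot
    }
```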

🧠 Step 4: Deep Learning - Neural Network (Keras)

Architecture:
- Dense(64, ReLU)
- Dense(32, ReLU)
- Dense(1, Linear)
Loss: MSE
Final Validation MAE: ~0.33
Test MAE: ~0.49
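
The architecture listed above maps onto a Keras Sequential model roughly as follows. The Adam optimizer, the MAE metric name, and the function name build_performance_regressor are assumptions; the README only specifies the layers and the MSE loss.

```python
from tensorflow import keras


def build_performance_regressor(n_features: int) -> keras.Model:
    """Dense(64, ReLU) -> Dense(32, ReLU) -> Dense(1, linear), trained on MSE."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="linear"),  # single regression output
    ])
    # Optimizer choice is an assumption; MAE is tracked to match the reported metrics.
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```

Training would then be a call such as model.fit(X_train, y_train, validation_split=0.2, epochs=...), with the epoch count and validation split chosen to suit the data.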

📈 Trend Analysis

Performance vs Tenure: Performance generally increases with YearsAtCompany
Performance by Department & Attrition: Compares performance scores of employees who left with those who stayed, within each department
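
Both trend views reduce to simple aggregations. The sketch below is one way to compute them, assuming the column names from the dataset description; the function name performance_trends is hypothetical.

```python
import pandas as pd


def performance_trends(df: pd.DataFrame):
    """Return mean score by tenure, mean score by department and attrition,
    and the Pearson correlation between tenure and score."""
    # Average PerformanceScore for each value of YearsAtCompany.
    by_tenure = df.groupby("YearsAtCompany")["PerformanceScore"].mean()

    # Rows: Department, columns: Attrition (No/Yes), cells: mean PerformanceScore.
    by_dept_attrition = df.pivot_table(
        index="Department",
        columns="Attrition",
        values="PerformanceScore",
        aggfunc="mean",
    )

    # A positive value is consistent with performance rising with tenure.
    tenure_corr = df["YearsAtCompany"].corr(df["PerformanceScore"])
    return by_tenure, by_dept_attrition, float(tenure_corr)
```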

🔧 Tools & Libraries Used

Python Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, tensorflow, keras
ML Models: Random Forest, Linear Regression
DL Model: Keras Sequential Neural Network

📌 Future Improvements

Hyperparameter tuning (Random Forest & Neural Net)
Feature engineering with interaction terms
Deployment using Streamlit or Flask
SHAP/LIME for model explainability
