Overview

This project demonstrates how supervised and unsupervised machine learning techniques can be combined to analyze behavioural data, predict outcomes, classify risk, and detect anomalies. The system models normal behavioural patterns, identifies deviations, and segments users into meaningful groups.

Objectives

* Predict expected behavioural outcomes (regression)
* Classify observations into risk categories (classification)
* Detect abnormal behaviour using deviation scoring
* Discover hidden behavioural segments (clustering)

Techniques Used

* Supervised Learning: KNN Regressor (behaviour prediction), Logistic Regression (risk classification)
* Unsupervised Learning: KMeans Clustering (behavioural segmentation)
* Evaluation Metrics: R² Score, Mean Squared Error (MSE), Accuracy, Precision, Recall, F1-score
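
The techniques and metrics listed above can be sketched end to end with scikit-learn. This is a minimal illustration and not the project's actual code: the data is synthetic, the four numeric features, the `outcome` and `risk` targets, and hyperparameters such as `n_neighbors=5` are assumptions, and `n_clusters=5` is chosen only to mirror the five segments reported under Key Results.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, mean_squared_error,
                             precision_score, r2_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the behavioural data (assumption: 4 numeric features).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
outcome = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.5, size=500)
risk = (X[:, 0] + X[:, 2] + rng.normal(scale=1.0, size=500) > 0).astype(int)

X_tr, X_te, out_tr, out_te, risk_tr, risk_te = train_test_split(
    X, outcome, risk, test_size=0.2, random_state=0
)

# Supervised: KNN regression for the behavioural outcome.
# KNN is distance-based, so scaling is kept inside the pipeline.
knn = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
knn.fit(X_tr, out_tr)
out_pred = knn.predict(X_te)
reg_metrics = {
    "r2": r2_score(out_te, out_pred),
    "mse": mean_squared_error(out_te, out_pred),
}

# Supervised: logistic regression for the binary risk label.
logreg = make_pipeline(StandardScaler(), LogisticRegression())
logreg.fit(X_tr, risk_tr)
risk_pred = logreg.predict(X_te)
clf_metrics = {
    "accuracy": accuracy_score(risk_te, risk_pred),
    "precision": precision_score(risk_te, risk_pred),
    "recall": recall_score(risk_te, risk_pred),
    "f1": f1_score(risk_te, risk_pred),
}

# Unsupervised: KMeans segmentation on the scaled features (no labels used).
scaler = StandardScaler().fit(X_tr)
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
kmeans.fit(scaler.transform(X_tr))
segments = kmeans.predict(scaler.transform(X_te))

print(reg_metrics)
print(clf_metrics)
print("segment sizes:", np.bincount(segments, minlength=5))
```

The numbers printed by this sketch come from synthetic data and are not comparable to the results reported for the project.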

Workflow

* Data loading and preprocessing
* Encoding categorical variables
* Feature scaling (standardization)
* Train–test split
* Model training and comparison
* Anomaly score calculation
* Cluster-based behavioural segmentation
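
The preprocessing steps in this workflow might look like the sketch below. The column names (`sessions_per_week`, `avg_spend`, `channel`, `outcome`) and the generated data are hypothetical placeholders, since the project's dataset schema is not described here. The sketch splits the data before fitting the scaler and encoder, which is one common way to keep test-set statistics out of training; the project's actual ordering may differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical behavioural dataset; replace with the real data source.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "sessions_per_week": rng.poisson(5, size=n),
    "avg_spend": rng.gamma(2.0, 50.0, size=n),
    "channel": rng.choice(["web", "mobile", "store"], size=n),
    "outcome": rng.normal(size=n),
})

X = df.drop(columns="outcome")
y = df["outcome"]

# Split first so that scaling and encoding are learned from training rows only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

preprocess = ColumnTransformer([
    # Standardize numeric features to zero mean and unit variance.
    ("num", StandardScaler(), ["sessions_per_week", "avg_spend"]),
    # One-hot encode the categorical feature; unseen categories become all zeros.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["channel"]),
])

X_train_prepared = preprocess.fit_transform(X_train)  # fit on train only
X_test_prepared = preprocess.transform(X_test)        # reuse train statistics

print(X_train_prepared.shape, X_test_prepared.shape)
```

The prepared matrices can then be passed to the regression, classification, and clustering models.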

Key Results

* KNN Regression achieved moderate predictive performance (R² ≈ 0.41)
* Logistic Regression achieved ~71% classification accuracy
* Anomaly detection identified high-deviation behavioural cases
* KMeans revealed 5 distinct behavioural segments
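
The exact deviation score used for anomaly detection is not specified here. One plausible reading, sketched below as an assumption, is to treat the KNN regressor's prediction as the "expected" behaviour and score each observation by how far its actual value deviates from it, using a robust z-score (median and MAD). The synthetic data, the injected outliers, and the `THRESHOLD` value are all illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic behavioural data with five injected abnormal observations.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=400)
y[:5] += 6.0

# Out-of-fold predictions: each row is predicted by a model that never saw it.
model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=10))
expected = cross_val_predict(model, X, y, cv=5)
residual = y - expected

# Robust z-score: median and MAD are less distorted by the outliers themselves.
median = np.median(residual)
mad = np.median(np.abs(residual - median))
anomaly_score = np.abs(residual - median) / (1.4826 * mad)

THRESHOLD = 3.0  # hypothetical cut-off; tune for the real data
flagged = anomaly_score > THRESHOLD
print("flagged rows:", np.flatnonzero(flagged))
```

Using out-of-fold predictions keeps an observation from influencing its own expected value, which would otherwise shrink its deviation.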

Key Concepts Covered

* Supervised vs Unsupervised Learning
* Feature Scaling
* Overfitting and Model Comparison
* Probability-based Risk Scoring
* Data Leakage Awareness
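
Two of these concepts, probability-based risk scoring and leakage awareness, can be illustrated together. The sketch below is an assumption about how they might be applied, not the project's code: the data is synthetic, and the band cut-offs (0.33 and 0.66) are hypothetical. Wrapping the scaler and classifier in one pipeline means each cross-validation fold refits the scaler on its own training portion.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features and a binary risk label (illustrative only).
rng = np.random.default_rng(2)
X = rng.normal(size=(600, 4))
y = (X[:, 0] - X[:, 2] + rng.normal(scale=1.0, size=600) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaler lives inside the pipeline, so no fold sees statistics from held-out rows.
clf = make_pipeline(StandardScaler(), LogisticRegression())
cv_acc = cross_val_score(clf, X_train, y_train, cv=5, scoring="accuracy")

# Probability of the positive (high-risk) class for each test observation.
clf.fit(X_train, y_train)
risk_prob = clf.predict_proba(X_test)[:, 1]

# Map probabilities to bands using hypothetical cut-offs.
band_names = np.array(["low", "medium", "high"])
risk_band = band_names[np.digitize(risk_prob, [0.33, 0.66])]

print("CV accuracy per fold:", np.round(cv_acc, 3))
print("first five risk bands:", risk_band[:5])
```

Reporting a probability instead of a hard label lets downstream users choose their own threshold for what counts as high risk.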

Applications

This framework can be extended to:

* Financial risk modeling
* Fraud detection
* Customer behaviour analytics
* Anomaly detection systems

Tech Stack

Python, Pandas, Scikit-learn, NumPy, Matplotlib