An unsupervised machine learning project that detects anomalies and outliers in data using the Local Outlier Factor (LOF) algorithm, with clear visualizations of normal points and detected anomalies.
This project demonstrates anomaly detection using Local Outlier Factor (LOF), a density-based algorithm that identifies outliers by comparing the local density of each data point with the local densities of its nearest neighbors. Points whose local density is substantially lower than that of their neighbors are marked as anomalies.
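The density comparison can be stated more precisely. The formulas below follow the standard LOF definition rather than anything specific to the notebook; the symbols are assumptions introduced here: $k$ is the number of neighbors, $d(p, o)$ is the Euclidean distance, $N_k(p)$ is the set of $k$ nearest neighbors of $p$, and $k\text{-distance}(o)$ is the distance from $o$ to its $k$-th nearest neighbor.

```latex
\begin{aligned}
\operatorname{reach\text{-}dist}_k(p, o) &= \max\{\, k\text{-distance}(o),\; d(p, o) \,\} \\
\operatorname{lrd}_k(p) &= \left( \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \operatorname{reach\text{-}dist}_k(p, o) \right)^{-1} \\
\operatorname{LOF}_k(p) &= \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{\operatorname{lrd}_k(o)}{\operatorname{lrd}_k(p)}
\end{aligned}
```

A value of $\operatorname{LOF}_k(p)$ close to 1 means $p$ sits in a region about as dense as its neighbors; values well above 1 mean $p$ is in a sparser region and is a candidate anomaly.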
The project includes visual analysis of:
- Original dataset
- Data points with anomalies highlighted
- Isolated anomaly points
- Local_Outlier_Factor.ipynb — Main project notebook implementing LOF
- anomalies_marked.png — All data points with anomalies highlighted
- anomalies_only.png — Only detected anomaly points
- original_data.png — Original dataset used for training
- README.md — Project documentation
- Python
- NumPy
- Pandas
- Matplotlib
- scikit-learn
- Jupyter Notebook
- Algorithm: Local Outlier Factor (LOF)
- Learning Type: Unsupervised Learning
- Distance Metric: Euclidean Distance
- Purpose: Anomaly and Outlier Detection
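The settings listed above map onto scikit-learn's `LocalOutlierFactor` roughly as follows. This is a minimal sketch, not the notebook's exact code: the helper name `detect_anomalies`, the synthetic data, and `n_neighbors=20` (the library default) are assumptions made for illustration.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor


def detect_anomalies(X, n_neighbors=20):
    """Fit LOF on X and return (labels, scores).

    labels: 1 for inliers, -1 for outliers.
    scores: LOF values (about 1 for inliers, larger means more anomalous).
    """
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, metric="euclidean")
    labels = lof.fit_predict(X)
    # scikit-learn stores the negated LOF, so flip the sign back.
    scores = -lof.negative_outlier_factor_
    return labels, scores


# Hypothetical synthetic data: one dense cluster plus five far-away points.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
planted = np.array(
    [[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0], [10.0, 0.0]]
)
X = np.vstack([cluster, planted])

labels, scores = detect_anomalies(X)
print("points flagged as anomalies:", int((labels == -1).sum()))
print("LOF of planted points:", np.round(scores[-5:], 2))
```

No labels are passed to `fit_predict`, which is what makes the approach unsupervised: the model only sees the feature array.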
Visualization of the dataset before anomaly detection.
All data points plotted with detected anomalies marked in a different color.
Only the anomaly points extracted and visualized.
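The three plots could be produced along these lines. This is a sketch under assumptions: the helper name `save_plots`, the colors, and the axis labels are hypothetical, and the data is assumed to be a two-column NumPy array with LOF labels of -1 for anomalies and 1 for normal points. Only the output file names come from this repository.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np


def save_plots(X, labels, out_dir="."):
    """Save the three scatter plots; labels use -1 for anomalies, 1 for normal."""
    out_dir = Path(out_dir)
    is_anomaly = labels == -1
    normal, anomalies = X[~is_anomaly], X[is_anomaly]

    # File name -> list of (points, color, legend label) layers to draw.
    figures = {
        "original_data.png": [(X, "tab:blue", "data")],
        "anomalies_marked.png": [
            (normal, "tab:blue", "normal"),
            (anomalies, "tab:red", "anomaly"),
        ],
        "anomalies_only.png": [(anomalies, "tab:red", "anomaly")],
    }
    paths = []
    for name, layers in figures.items():
        fig, ax = plt.subplots(figsize=(5, 4))
        for points, color, label in layers:
            ax.scatter(points[:, 0], points[:, 1], s=15, c=color, label=label)
        ax.set_xlabel("feature 1")
        ax.set_ylabel("feature 2")
        ax.legend()
        fig.savefig(out_dir / name, dpi=150)
        plt.close(fig)  # free the figure so repeated runs do not accumulate
        paths.append(out_dir / name)
    return paths
```

Given a feature array `X` and the labels returned by `LocalOutlierFactor.fit_predict`, calling `save_plots(X, labels)` writes the three images into the current directory.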
- Clone the repository
git clone https://github.com/btboilerplate/Anomaly-detection-using-LocalOutlierFactor.git
- Install required libraries
pip install numpy pandas matplotlib scikit-learn
- Open Local_Outlier_Factor.ipynb and run all cells sequentially
- LOF flags points whose local density is much lower than that of their neighbors
- In the plots, detected anomalies are visually distinct from normal data points
- Because densities are compared locally, LOF can handle non-linear shapes and clusters of differing density
- Does not require labeled data
- Tune the number of neighbors (`n_neighbors`) to adjust sensitivity
- Compare with Isolation Forest and DBSCAN
- Apply to real-world datasets
- Add anomaly score visualization
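As a starting point for the first two items, the sketch below sweeps `n_neighbors` and measures how much the LOF anomalies overlap with those from scikit-learn's `IsolationForest`. The function names, the neighbor grid, the `contamination=0.05` setting, and the synthetic data are all assumptions for illustration and are not taken from the notebook.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def sweep_neighbors(X, neighbor_grid=(5, 10, 20, 35), contamination=0.05):
    """Return {n_neighbors: boolean anomaly mask} for each grid value."""
    masks = {}
    for k in neighbor_grid:
        lof = LocalOutlierFactor(n_neighbors=k, contamination=contamination)
        masks[k] = lof.fit_predict(X) == -1
    return masks


def agreement_with_isolation_forest(X, lof_mask, contamination=0.05, seed=0):
    """Jaccard overlap between LOF anomalies and Isolation Forest anomalies."""
    forest = IsolationForest(contamination=contamination, random_state=seed)
    forest_mask = forest.fit_predict(X) == -1
    union = np.logical_or(lof_mask, forest_mask).sum()
    if union == 0:
        return 1.0  # both detectors flagged nothing
    return np.logical_and(lof_mask, forest_mask).sum() / union


# Hypothetical data: a dense cluster with a few scattered points.
rng = np.random.default_rng(0)
X = np.vstack(
    [rng.normal(0, 0.5, size=(190, 2)), rng.uniform(-6, 6, size=(10, 2))]
)

for k, mask in sweep_neighbors(X).items():
    overlap = agreement_with_isolation_forest(X, mask)
    print(f"n_neighbors={k:>2}  flagged={mask.sum():>2}  overlap={overlap:.2f}")
```

An overlap near 1 means both detectors flag nearly the same points. Low overlap at a particular `n_neighbors` suggests that setting is picking up structure the forest does not see, which is worth inspecting visually before trusting either result.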


