shazibid/ARITY-BTT-PROJECT-1


Classification of Vehicle Turns from Telematic Data


👥 Arity 1B Team Members

Name | GitHub Handle | Contribution
Shazi Bidarian | @shazibid | Data exploration, visualization, project coordination, state 1 + 2 unsupervised modeling
Connie Yang | @connieyyy | Data exploration, project coordination, state 0 unsupervised modeling, documentation
Kelly Pham | @kllyph | Data preprocessing, supervised learning with random forest
Jaewon Kim | @CanDoJaewon | Data exploration, label matching

🎯 Project Highlights

  • Developed unsupervised clustering models with K-Means, DBSCAN, and HDBSCAN to group vehicle turns by type.
  • Achieved 96% accuracy with a random forest model, showing that the modeling results can be consistently replicated for Arity.

👩🏽‍💻 Setup and Installation

  1. Install the following prerequisites:
    a. Python 3.8+
    b. Git

  2. Clone the repo

  3. Set up virtual environment

  4. Install dependencies

  5. Verify installation

  6. Open in VS Code and run notebooks:
    a. Open the workspace in VS Code
    b. Select the Python kernel (.venv environment)
    c. Start with raw.ipynb to understand the data, then explore state0, state1/, and state2/ for analysis by driving state
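The "Verify installation" step could be done with a short script like the one below. This is a minimal sketch, not the project's own check: the package list is an assumption based on the tools named in this README (matplotlib, seaborn, and scikit-learn for the clustering and random forest models), and the repository's actual dependency file may list different packages.

```python
import importlib.util
import sys

# Assumed package list; the repository's actual dependencies may differ.
ASSUMED_PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn"]


def check_environment(packages=ASSUMED_PACKAGES, min_python=(3, 8)):
    """Report whether Python meets the minimum version and which packages are importable."""
    report = {"python_ok": sys.version_info >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'OK' if ok else 'MISSING'}")
```

Run it from inside the activated virtual environment so that it inspects the same interpreter the notebooks will use.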

Project Structure

  • Data/
    • Raw/ → untouched iOS & Android data
    • Processed/ → cleaned + split datasets
  • Notebooks/ → EDA + experiments
  • SRC/ → finalized scripts (data cleaning, modeling)
  • Results/ → plots, metrics, reports
  • README.md → project overview + instructions

🏗️ Project Overview

  • Arity collects driving data from users with their consent, and safe drivers receive lower rates on their insurance policies.
  • The goal is to use AI/ML on telematics data to classify vehicle turning behaviors into different types of turns.
  • Clustering models distinguish the different types of turns.
  • A supervised model is then trained to classify vehicle turns.

📊 Data Exploration

  • 20 MB dataset in dictionary-structured and CSV formats
  • Plotted data points with matplotlib and seaborn
  • Removed outliers with the interquartile range (IQR) method
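The interquartile outlier filter mentioned above can be sketched as follows. The column names (`yaw_rate`, `speed`), the sample values, and the conventional multiplier `k=1.5` are illustrative assumptions, not the project's actual schema or settings.

```python
import pandas as pd


def remove_iqr_outliers(df, columns, k=1.5):
    """Keep rows whose values in every listed column lie within [Q1 - k*IQR, Q3 + k*IQR]."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        # A row is dropped if it falls outside the fences in any listed column
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]


# Hypothetical sample: one extreme yaw-rate reading among typical values
sample = pd.DataFrame({
    "yaw_rate": [0.1, 0.2, 0.15, 0.18, 0.12, 9.0],
    "speed": [10.0, 12.0, 11.0, 13.0, 12.5, 11.5],
})
cleaned = remove_iqr_outliers(sample, ["yaw_rate", "speed"])
print(len(sample), "->", len(cleaned))
```

Filtering on the combined mask removes a row once, even if it is an outlier in several columns.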

🧠 Model Development

  • Models used: K-Means, DBSCAN, and HDBSCAN for unsupervised clustering, and Random Forest for supervised classification
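As a rough illustration of the clustering step, the sketch below runs K-Means and DBSCAN from scikit-learn on synthetic data. The two features (heading change and mean speed), the cluster count, and the `eps` and `min_samples` values are all assumptions made for the example, not the project's settings. HDBSCAN is left out to keep the sketch short; recent scikit-learn releases (1.3 and later) provide `sklearn.cluster.HDBSCAN`, which could be swapped in the same way.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for turn features (hypothetical: heading change in degrees, mean speed)
gentle = rng.normal(loc=[20.0, 15.0], scale=[3.0, 1.5], size=(100, 2))
sharp = rng.normal(loc=[90.0, 5.0], scale=[3.0, 1.5], size=(100, 2))

# Scale features so neither dominates the distance computations
X = StandardScaler().fit_transform(np.vstack([gentle, sharp]))

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

print("K-Means cluster sizes:", np.bincount(kmeans_labels))
# DBSCAN marks noise points with the label -1, so exclude it from the count
print("DBSCAN clusters found:", len(set(dbscan_labels) - {-1}))
```

K-Means needs the number of clusters up front, while DBSCAN infers it from density and can flag noise points, which is one reason to compare both on the same data.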

📈 Results & Key Findings

  • The random forest model achieved a 96% accuracy score.
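An accuracy score like the one reported above is typically computed on a held-out test split. The sketch below shows that evaluation pattern with scikit-learn on synthetic data; the features, labels, split ratio, and hyperparameters are assumptions, and the printed number is not the project's 96% result.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic features and labels standing in for turn features and their turn-type labels
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 3)),
    rng.normal(4.0, 1.0, size=(150, 3)),
])
y = np.array([0] * 150 + [1] * 150)

# Stratify so both classes appear in the same proportion in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2%}")
```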

🚀 Next Steps

  • We want to focus on optimizing the supervised model to ensure it generalizes well and is ready for deployment. This includes applying techniques such as grid search or randomized search to systematically explore hyperparameter combinations, using cross‑validation, and tuning parameters like tree depth, minimum samples per leaf, and the number of estimators to balance accuracy with robustness.
  • We can also compare Random Forest with other ensemble methods such as Gradient Boosting, and analyze feature importance to understand which inputs drive cluster predictions most strongly.
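The tuning and feature-importance ideas above could be prototyped along these lines. This is a sketch under assumptions: the data comes from scikit-learn's `make_classification` rather than the project's dataset, and the parameter grid is a small illustrative one, not a recommended search space.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the labeled turn dataset
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_redundant=0, random_state=0
)

# Illustrative grid over the parameters named above
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 5],
}

# 5-fold cross-validation scores every combination in the grid
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=5, scoring="accuracy"
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best mean CV accuracy: {search.best_score_:.3f}")

# Rank features by impurity-based importance from the refit best model
importances = search.best_estimator_.feature_importances_
for idx in np.argsort(importances)[::-1]:
    print(f"feature_{idx}: {importances[idx]:.3f}")
```

For larger search spaces, `RandomizedSearchCV` from the same module samples a fixed number of combinations instead of trying all of them. A `GradientBoostingClassifier` could be passed through the same search to compare ensemble methods.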

📝 License

This project is licensed under the MIT License.


📄 References

Presentation Slides


🙏 Acknowledgements

Thank you to our advisor, Francesco De Bernardis, and our coach, Matt Brems, who supported our project.

About

Project for company Arity through Break Through Tech.
