Repository of the ML Classification project "TO GRANT OR NOT TO GRANT: DECIDING ON COMPENSATION BENEFITS", MDSAA, Nova IMS.

apocalypsine/MLproject_G20

"TO GRANT OR NOT TO GRANT: DECIDING ON COMPENSATION BENEFITS", a Machine Learning Project repository

This repository contains the machine learning project "TO GRANT OR NOT TO GRANT: DECIDING ON COMPENSATION BENEFITS", developed in December 2024 as part of a machine learning course in the Master’s program in Data Science and Advanced Analytics at Nova IMS. The project uses data from the New York Workers’ Compensation Board (WCB), provided by the course teacher and covering claims made between 2020 and 2022. It addresses the challenges the WCB faces in processing and categorizing claims efficiently. The main objective is to build and optimize machine learning models that automate the classification of injury types from claims.

Project goals

  1. Multiclass Classification Benchmarking: Develop a classification model to predict the Claim Injury Type for workers' compensation claims assembled between 2020 and 2022. This involves implementing a model evaluation strategy and identifying the model with the best generalization performance.
  2. Model Optimization: Refine the selected models to improve their predictive performance through hyperparameter tuning and adjustments to preprocessing and feature selection.
  3. Additional Insights: Creative exploration of the data.
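
As a rough illustration of what a model evaluation strategy for a multiclass target can look like, the sketch below uses only the Python standard library. It is not the code from the project notebook. The choice of stratified folds and macro-averaged F1, the function names, and the toy labels "A", "B", "C" are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds):
    """Assign sample indices to folds so each fold keeps roughly the same class mix."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_baseline_scores(labels, n_folds=3):
    """Score a 'always predict the most common training class' baseline on each fold."""
    scores = []
    for held_out in stratified_folds(labels, n_folds):
        held = set(held_out)
        train = [labels[i] for i in range(len(labels)) if i not in held]
        majority = Counter(train).most_common(1)[0][0]
        y_true = [labels[i] for i in held_out]
        scores.append(macro_f1(y_true, [majority] * len(y_true)))
    return scores


# Hypothetical, imbalanced toy labels standing in for Claim Injury Type values.
toy_labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 3
print(majority_baseline_scores(toy_labels))
```

On imbalanced labels, the majority-class baseline scores poorly under macro F1 because it never predicts the rarer classes. This makes it a useful floor when comparing candidate models.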

Data and Files

  • final_submission.csv contains the preprocessed dataset used for model training.
  • Group_20_notebook.ipynb contains the full project analysis notebook, including data exploration, preprocessing, model comparison, feature importance, and results.
  • Group_20_Report.pdf is the report detailing the methodology, steps taken, and conclusions on the project notebook.

How to Run the Project

  1. Clone the repository to your local machine.
  2. Install the required dependencies by running pip install -r requirements.txt.
  3. Open the Jupyter notebook Group_20_notebook.ipynb to explore the analysis and results.
  4. Optionally, you can run the Python scripts in src/ to see the individual steps.

Contributors

This project was developed by Group_20 as part of a machine learning course at Nova IMS. Team members:

  • Duarte Nunes 20240564
  • Mariana Gomes 20211689
  • Pedro Gaspar 20240112
  • Rodrigo Nascimento 20240565
  • Yasmine Boubezari 20230775

Kaggle competition link

https://www.kaggle.com/competitions/to-grant-or-not-to-grant/
