LKEthridge/Integrated_Project_2

Integrated_Project_2

This was an Integrated skill project for TripleTen. 👩🏽‍💻

This project developed a machine learning solution for predicting gold recovery at the rougher and final stages of ore processing, using datasets with over 80 parameters. A Multi-Output Random Forest Regression model gave the most accurate predictions during training, with Linear Regression as a viable, less computationally intensive alternative. Although the models underperformed constant benchmarks on the test set, they demonstrate the potential for data-driven optimization of industrial processes.
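The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the two target columns merely stand in for rougher and final recovery, and the hyperparameters, variable names, and the use of the training mean as the constant benchmark are all assumptions. On this toy data the model beats the constant prediction, which is the opposite of what the project observed on its real test set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-in data (assumption): 200 rows, 5 features, 2 targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.column_stack([
    2.0 * X[:, 0] + rng.normal(scale=0.1, size=200),      # stand-in for rougher recovery
    X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=200),  # stand-in for final recovery
])

# Simple ordered split; the real project used its own train/test files.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# One random forest is fitted per target column.
model = MultiOutputRegressor(
    RandomForestRegressor(n_estimators=50, random_state=0)
)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Constant benchmark (assumed form): always predict each target's training mean.
baseline = np.tile(y_train.mean(axis=0), (len(y_test), 1))

model_mae = mean_absolute_error(y_test, pred)
baseline_mae = mean_absolute_error(y_test, baseline)
print(f"model MAE: {model_mae:.3f}, constant MAE: {baseline_mae:.3f}")
```

Comparing against a constant prediction like this is a quick sanity check: a model that cannot beat it on held-out data is not yet adding predictive value there.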

Skills Highlighted

🐍 Python 👩🏽‍💻 Data Science 🤖 Machine Learning 🧪 Scikit Learn ❌ Cross Validation 🐼 pandas 📊 Data Analytics 👀 Supervised Learning ⚙️ Feature Engineering 💯 Model Evaluation 🕵🏽‍♀️ Anomaly Detection 🧼 Data Cleaning and Preprocessing

Installation & Usage

  • This project uses pandas, numpy, matplotlib.pyplot, seaborn, and shuffle, along with the following scikit-learn components: RandomForestRegressor, MultiOutputRegressor, LinearRegression, mean_squared_error, mean_absolute_error, make_scorer, StandardScaler, SimpleImputer, cross_val_score, KFold, and RandomizedSearchCV. It requires Python 3.9.6. One additional file, containing the full, unsplit test set, could not be uploaded because of upload limitations.
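As a rough illustration of how several of the listed components fit together, here is a minimal sketch of cross-validated evaluation. It is not taken from the project: the data is synthetic, the imputation strategy and fold count are assumptions, and `make_pipeline` is not in the list above (it is used here only so that imputation and scaling are refitted inside each fold).

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 120 rows, 4 features, one target.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 4))
y = X @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.2, size=120)

# Knock out about 5% of the feature values to mimic missing readings.
X[rng.random(X.shape) < 0.05] = np.nan

# Impute, scale, then fit; each step is refitted on the training folds only.
pipeline = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LinearRegression(),
)

# greater_is_better=False makes the scorer return negated MAE,
# because scikit-learn treats higher scores as better.
mae_scorer = make_scorer(mean_absolute_error, greater_is_better=False)

cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(pipeline, X, y, scoring=mae_scorer, cv=cv)

mean_mae = -scores.mean()  # flip the sign back to a positive MAE
print(f"mean cross-validated MAE: {mean_mae:.3f}")
```

The same `pipeline`, `cv`, and `mae_scorer` objects could be passed to `RandomizedSearchCV` when tuning a model's hyperparameters.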