| Title | Author | Date |
|---|---|---|
| Day 14 of 30 Days of ML Code Event | Rajendrasinh Parmar | August 15, 2021 |
In this tutorial, you will learn how to build and optimize models with gradient boosting. This method dominates many Kaggle competitions and achieves state-of-the-art results on a variety of datasets.
We will use the xgboost library for this lesson. scikit-learn also provides a gradient boosting implementation, but xgboost has some technical advantages in speed and model performance, so we will use it instead.
Follow the notebook XGBoost for lesson details.
As part of the exercise for the sixth lesson, we predicted house prices using the xgboost regressor (gradient boosting).
The associated exercise for the sixth lesson is provided in Exercise: XGBoost.
In this tutorial, you will learn what data leakage is and how to prevent it. If you don't know how to prevent it, leakage will come up frequently, and it will ruin your models in subtle and dangerous ways. So, this is one of the most important concepts for practicing data scientists.
There are mainly two kinds of data leakage:
- target leakage
- train-test contamination
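Train-test contamination, the second kind above, typically happens when preprocessing (imputation, scaling) is fit on the full dataset before splitting. A common safeguard, sketched below with a scikit-learn Pipeline and synthetic data (both my illustration, not code from the lesson), is to bundle preprocessing with the model so it is fit only on the training folds during cross-validation.

```python
# Minimal sketch: prevent train-test contamination by putting preprocessing
# inside a Pipeline, so imputation and scaling are fit on training folds only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan  # inject missing values

# The pipeline is refit from scratch on each training fold, so validation
# rows never influence the imputer or scaler statistics.
pipeline = make_pipeline(SimpleImputer(), StandardScaler(), LogisticRegression())
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
print("CV accuracy:", scores.mean())
```

Target leakage, the first kind, is harder to automate away: it requires checking whether any feature would be unavailable (or updated) after the target is determined, which is what the lesson's examples illustrate.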
Follow the notebook Data Leakage for lesson details.
As part of the exercise for the seventh lesson, we learned about data leakage and how to identify it through different examples.
The associated exercise for the seventh lesson is provided in Exercise: Data Leakage.