Motivation
Currently, hyperparameter optimization for CatBoost requires many trials because all parameters are optimized simultaneously. Optuna's LightGBM integration uses a stepwise optimization strategy that reduces the search space at each stage, leading to faster convergence with fewer trials. I use the LightGBM integration in my company's AutoML solution, and it currently shows the best results while using the least time.
For CatBoost, the lack of a similar stepwise method results in longer optimization times and higher computational costs. A dedicated stepwise procedure that tunes CatBoost-specific parameters (such as `depth`, `l2_leaf_reg`, and `bagging_temperature`) in sensible stages would make optimization noticeably faster and cheaper.
Description
Implement a stepwise hyperparameter optimization strategy for CatBoost, inspired by Optuna's existing LightGBM integration. The key steps would include:
- Parameter Grouping and Ordering:
  - Define a logical order for tuning parameters, prioritizing those with the highest impact first. Example stages (suggest yours if you have ideas; a rough sketch of the stage loop follows this list):
    - Stage 1: foundational parameters (`learning_rate` and `depth`).
    - Stage 2: regularization (`l2_leaf_reg`, `random_strength`).
    - Stage 3: stochastic parameters (`bagging_temperature`, `subsample`).
    - Stage 4: repeat for the foundational parameters (`learning_rate` and `depth`).
    - Stage 5: categorical features (e.g., `one_hot_max_size`).
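To make the staged idea concrete, here is a minimal sketch of the stage loop built from plain Optuna studies. The stage list is abbreviated, the search ranges are illustrative assumptions rather than a proposed final design, and `train_pool`/`valid_pool` are assumed to be `catboost.Pool` objects:

```python
import optuna
from catboost import CatBoostClassifier

# Illustrative stages: each entry maps a stage name to a function that
# suggests only that stage's parameters. All ranges are assumptions.
STAGES = [
    ("foundational", lambda t: {
        "learning_rate": t.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "depth": t.suggest_int("depth", 4, 10),
    }),
    ("regularization", lambda t: {
        "l2_leaf_reg": t.suggest_float("l2_leaf_reg", 1e-2, 10.0, log=True),
        "random_strength": t.suggest_float("random_strength", 1e-2, 10.0, log=True),
    }),
    ("stochastic", lambda t: {
        "bagging_temperature": t.suggest_float("bagging_temperature", 0.0, 10.0),
    }),
]

def tune_stepwise(train_pool, valid_pool, n_trials_per_stage=20):
    best_params = {}  # parameters frozen by earlier stages
    for stage_name, suggest in STAGES:
        def objective(trial, suggest=suggest):
            # Only the current stage's parameters are searched; the rest stay fixed.
            params = {**best_params, **suggest(trial)}
            model = CatBoostClassifier(iterations=500, verbose=0, **params)
            model.fit(train_pool, eval_set=valid_pool)
            return model.get_best_score()["validation"]["Logloss"]

        study = optuna.create_study(direction="minimize")
        study.optimize(objective, n_trials=n_trials_per_stage)
        best_params.update(study.best_params)  # freeze this stage's winners
    return best_params
```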
- Pruning Integration: reuse the already implemented `CatBoostPruningCallback`; a minimal example of wiring it into an objective is shown below.
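For reference, `optuna.integration.CatBoostPruningCallback` already exists; here is a minimal sketch of how a per-trial objective can use it (the dataset and search space are only illustrative):

```python
import optuna
from optuna.integration import CatBoostPruningCallback
from catboost import CatBoostClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.25, random_state=0)

def objective(trial):
    params = {
        "depth": trial.suggest_int("depth", 4, 10),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
    }
    model = CatBoostClassifier(iterations=500, eval_metric="Logloss", verbose=0, **params)
    # The callback reports the eval metric to Optuna after each boosting
    # iteration, so unpromising trials are stopped early.
    pruning_callback = CatBoostPruningCallback(trial, "Logloss")
    model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], callbacks=[pruning_callback])
    pruning_callback.check_pruned()  # raises optuna.TrialPruned if CatBoost was stopped
    return model.get_best_score()["validation"]["Logloss"]

study = optuna.create_study(direction="minimize", pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=50)
```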
- API Design:
  - Introduce a class `optuna.integration.CatBoostTuner` (and a `CatBoostTunerCV` counterpart) to automate the process.
  - Example usage:

```python
tuner = CatBoostTuner()
tuner_output = tuner.run()
```
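A fuller interface could mirror `optuna.integration.lightgbm.LightGBMTuner`. Everything below, including the constructor arguments and attribute names, is a hypothetical sketch for discussion, not an existing API:

```python
# Hypothetical API sketch modeled on LightGBMTuner; nothing here exists yet.
tuner = optuna.integration.CatBoostTuner(
    params={"loss_function": "Logloss"},  # fixed, non-tuned parameters
    train_set=train_pool,                 # catboost.Pool with the training data
    eval_set=valid_pool,                  # validation data for scoring/pruning
    n_trials_per_stage=20,
)
tuner.run()                # executes the stepwise stages in order
print(tuner.best_params)   # merged best parameters across all stages
print(tuner.best_score)
```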
Benefits:
- Reduced computational resources and time.
- Better alignment with CatBoost's parameter interdependencies (e.g., tuning `depth` before leaf regularization).
Alternatives (optional)
No response
Additional context (optional)
No response