Benchmark reports logloss, but benchmark does not tell AutoML systems to optimize for logloss

I noticed that you're reporting logloss as the metric to evaluate systems, but you're not passing this information to any of the AutoML systems. Both auto-sklearn and H2O AutoML (maybe MLJar too?) can optimize for, and choose a leader model based on, the metric you want to evaluate, so the benchmark should specify it explicitly.
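
To illustrate why this matters, here is a minimal sketch in plain Python with made-up predictions. The data, the model names, and the helper functions are all hypothetical and are not taken from the benchmark. It shows that a leader picked by AUC is not necessarily the model with the best logloss.

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Mean binary cross-entropy; lower is better."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def auc(y_true, probs):
    """Fraction of (positive, negative) pairs ranked correctly; higher is better."""
    pos = [p for y, p in zip(y_true, probs) if y == 1]
    neg = [p for y, p in zip(y_true, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (illustrative only).
y_true = [1, 1, 0, 0]
preds = {
    "model_a": [0.51, 0.52, 0.49, 0.48],  # perfect ranking, barely confident
    "model_b": [0.95, 0.40, 0.45, 0.05],  # one ranking mistake, otherwise confident
}

leader_by_auc = max(preds, key=lambda name: auc(y_true, preds[name]))
leader_by_logloss = min(preds, key=lambda name: log_loss(y_true, preds[name]))

print("leader by AUC:", leader_by_auc)          # model_a
print("leader by logloss:", leader_by_logloss)  # model_b
```

A system that selects its leader by AUC would return `model_a` here, and would then be scored on logloss, where `model_b` does better.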

- H2O AutoML has two parameters that should be set when evaluating on a non-default metric: `stopping_metric` and `sort_metric`. Both should be set to `"logloss"`. More info [here](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html#optional-miscellaneous-parameters). By default, H2O optimizes for AUC on binary classification problems unless you change it to logloss.
- Auto-sklearn also has a `metric` argument which should be used and set to `"logloss"`.  More info [here](https://automl.github.io/auto-sklearn/master/api.html#api).
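
As a rough sketch of what the two settings above could look like in code (not the benchmark's actual code): the helper names, the time budget, and the seed are hypothetical. For auto-sklearn, this assumes a release where `metric` is a constructor argument that takes a scorer object such as `autosklearn.metrics.log_loss`; older releases may expect it elsewhere, so check the version in use. Both libraries are imported lazily so the snippet loads without them installed.

```python
# Metric the benchmark reports; both systems should be told to optimize it.
BENCHMARK_METRIC = "logloss"


def h2o_automl_kwargs(max_runtime_secs=3600, seed=1):
    """Keyword arguments for H2OAutoML (budget and seed are illustrative)."""
    return {
        "max_runtime_secs": max_runtime_secs,
        "seed": seed,
        "stopping_metric": BENCHMARK_METRIC,  # metric used for early stopping
        "sort_metric": BENCHMARK_METRIC,      # metric used to rank the leaderboard
    }


def make_h2o_automl(**overrides):
    """Build an H2OAutoML object; requires the h2o package."""
    from h2o.automl import H2OAutoML

    return H2OAutoML(**{**h2o_automl_kwargs(), **overrides})


def make_autosklearn_classifier(time_left_for_this_task=3600):
    """Build an AutoSklearnClassifier that optimizes logloss; requires auto-sklearn."""
    import autosklearn.classification
    import autosklearn.metrics

    return autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=time_left_for_this_task,
        metric=autosklearn.metrics.log_loss,  # assumption: scorer object, not a string
    )
```

Keeping the metric in a single constant makes it harder for the reported metric and the optimized metric to drift apart.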
