Scorers might need to know about training and testing data

This is not a PR because I haven't written any code yet. It's more of a very loose RFC.

I think scorers might need to be able to distinguish between training and test data.
I think there were more cases, but there are two obvious ones:
- R^2 is currently computed using the test-set mean. That seems really odd, and it breaks for LOO, where each test set is a single sample.
- When doing cross-validation, the set of classes present in each fold can change, which can affect things like macro-F1 in weird ways and can also lead to errors in LOO (https://github.com/scikit-learn/scikit-learn/issues/4546).
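To make the R^2 point concrete, here is a rough pure-Python sketch. It is not scikit-learn's implementation; the function name `r2_with_reference` and the `reference_mean` parameter are made up for illustration. It computes R^2 against either the test-set mean (the default) or a mean passed in from the training data.

```python
import math


def r2_with_reference(y_true, y_pred, reference_mean=None):
    """R^2 with a configurable baseline mean (hypothetical helper).

    If reference_mean is None, the baseline is the mean of y_true,
    i.e. the test-set mean.
    """
    if reference_mean is None:
        reference_mean = sum(y_true) / len(y_true)
    # Residual sum of squares of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Sum of squares of the constant baseline predictor.
    ss_tot = sum((t - reference_mean) ** 2 for t in y_true)
    if ss_tot == 0.0:
        # The baseline is perfect on this data, so the ratio is undefined.
        return math.nan
    return 1.0 - ss_res / ss_tot


# Toy LOO-style split: four training targets, one held-out target.
y_train = [1.0, 2.0, 3.0, 4.0]
y_test = [5.0]
y_pred = [4.5]

train_mean = sum(y_train) / len(y_train)

# Test-set mean as baseline: the single sample equals its own mean.
r2_test_mean = r2_with_reference(y_test, y_pred)
# Training mean as baseline: the score is well defined.
r2_train_mean = r2_with_reference(y_test, y_pred, reference_mean=train_mean)

print(r2_test_mean, r2_train_mean)
```

With a single held-out sample, the test-set mean equals the sample itself, so the denominator is zero and the score is undefined. Using the training mean gives a finite value here (about 0.96), but a scorer can only do that if it is told about the training data.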

I'm not sure if this is a good enough case yet, but I wanted somewhere to take a note ;)

