This is not a PR because I haven't written any code yet. It's more of a very loose RFC.
I think scorers might need to be able to distinguish between training and test data.
I think there were more cases but there are two obvious ones:
- R^2 is currently computed using the test-set mean. That seems really odd, and it breaks for LOO, where each test set is a single sample and so has zero variance around its own mean.
- When doing cross-validation, the set of classes present in a fold can change, which can affect things like macro-F1 in weird ways, and can also lead to errors in LOO (scikit-learn/scikit-learn#4546).
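To make the two cases concrete, here is a minimal pure-Python sketch. It is not scikit-learn code: `r2` and `macro_f1` are hypothetical helpers written from the textbook definitions, the data are made up, and scoring an absent class as 0 is an assumption of the sketch rather than a statement about what the library does.

```python
def r2(y_true, y_pred, baseline_mean):
    """R^2 relative to a constant predictor that always outputs baseline_mean."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - baseline_mean) ** 2 for t in y_true)
    if ss_tot == 0.0:
        # Undefined: the baseline already fits y_true perfectly.
        return float("nan")
    return 1.0 - ss_res / ss_tot


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over `labels`.

    Assumption of this sketch: a class with no true and no predicted
    samples gets an F1 of 0.
    """
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Case 1: R^2 on a single held-out sample, as in LOO.
y_train = [1.0, 2.0, 3.0, 4.0]
y_test = [5.0]
y_pred = [4.5]

test_mean = sum(y_test) / len(y_test)
train_mean = sum(y_train) / len(y_train)

# With the test-set mean, ss_tot is 0 and the score is undefined.
r2_test_baseline = r2(y_test, y_pred, test_mean)
# With the training-set mean, the score is well defined.
r2_train_baseline = r2(y_test, y_pred, train_mean)

# Case 2: macro-F1 on a fold that happens to miss class 2.
fold_true = [0, 0, 1, 1]
fold_pred = [0, 0, 1, 1]  # perfect predictions on this fold

f1_fold_labels = macro_f1(fold_true, fold_pred, labels=[0, 1])
f1_all_labels = macro_f1(fold_true, fold_pred, labels=[0, 1, 2])

print(r2_test_baseline, r2_train_baseline)
print(f1_fold_labels, f1_all_labels)
```

In both cases the score depends on information that only the training data (or the full dataset) can supply: the baseline mean for R^2, and the label set for macro-F1.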
I'm not sure if this is a good enough case yet, but I wanted somewhere to take a note ;)