feat: Add a threshold parameter to the confusion_matrix display
#2177
base: main
Conversation
glemaitre
left a comment
This is a first pass on the API design. I'm overlooking the tests since the changes will have an impact; we can iterate on the API before taking care of the tests.
Hi @glemaitre, thanks for the review. I have implemented the requested changes. I will wait until we are satisfied with the API to update the tests and the doc example, so until then, the CI will be red.
@glemaitre I made the changes suggested orally.

Closes #2112.
With this PR, for binary classification, calling `report.metrics.confusion_matrix(threshold=True)` will compute and store the confusion matrices for all thresholds of the decision function of the classifier. They can then be plotted with `.plot(threshold_value=x)` and accessed via `.frame()`. This makes use of the new scikit-learn function `confusion_matrix_at_thresholds`, available in 1.8 and back-ported for earlier versions.

The storage structure extends what we converged towards in #2165: a long-format dataframe with one cell of one matrix per row. Columns are the raw count, all three possible normalized values, the threshold value, and the true and predicted labels.
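To make the storage structure concrete, here is a minimal numpy/pandas sketch of the long format described above: one matrix cell per row, with the raw count, the three normalizations, the threshold, and the true/predicted labels as columns. The function name and column names are illustrative, not the actual skore or scikit-learn implementation.

```python
import numpy as np
import pandas as pd

def confusion_matrix_long_format(y_true, y_score, thresholds):
    """Sketch of the long-format storage: one binary confusion-matrix
    cell per row, for each decision threshold. Illustrative only."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    rows = []
    for t in thresholds:
        y_pred = (y_score >= t).astype(int)
        for true_label in (0, 1):
            for pred_label in (0, 1):
                count = int(np.sum((y_true == true_label) & (y_pred == pred_label)))
                n_true = int(np.sum(y_true == true_label))
                n_pred = int(np.sum(y_pred == pred_label))
                rows.append({
                    "threshold": t,
                    "true_label": true_label,
                    "predicted_label": pred_label,
                    "count": count,
                    # The three possible normalizations of the raw count.
                    "normalized_by_true": count / n_true if n_true else np.nan,
                    "normalized_by_pred": count / n_pred if n_pred else np.nan,
                    "normalized_by_all": count / n,
                })
    return pd.DataFrame(rows)

# Toy scores; a single threshold yields the 4 rows of one 2x2 matrix.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
df = confusion_matrix_long_format(y_true, y_score, thresholds=[0.5])
```

Filtering this frame on a single `threshold` value and pivoting on the label columns recovers the usual 2x2 matrix.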
The default threshold value is `0.5`. We could add an `"auto"` option to select the "best" threshold if we find a satisfactory universal metric to define what "best" means (balanced accuracy may not be desirable, for instance, as argued here).
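A small sketch of why the threshold matters: with the default of 0.5, the matrix matches what thresholding predicted probabilities at 0.5 (i.e. ordinary `predict`) would give, while lowering it trades false negatives for false positives. The data and helper below are illustrative, not part of the PR.

```python
import numpy as np

# Hypothetical predicted probabilities for the positive class.
y_true = np.array([0, 0, 0, 1, 1, 1])
y_proba = np.array([0.2, 0.45, 0.6, 0.4, 0.55, 0.9])

def counts_at(threshold):
    """Return (tn, fp, fn, tp) for a given decision threshold."""
    y_pred = (y_proba >= threshold).astype(int)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return tn, fp, fn, tp

# Default threshold: one false positive, one false negative.
default = counts_at(0.5)   # (tn, fp, fn, tp) = (2, 1, 1, 2)
# A more lenient threshold removes the false negative
# at the cost of an extra false positive.
lenient = counts_at(0.3)   # (tn, fp, fn, tp) = (1, 2, 0, 3)
```

This trade-off is exactly what a satisfactory "auto" criterion would have to arbitrate, and why no single metric (such as balanced accuracy) is obviously right for every use case.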