Hi,
I just came across an interesting corner case: some bins have samples of the same probability.
The code below will reproduce the error.
import calibration as cal

# Two-class probabilities for 12 samples (each row sums to 1).
model_probs = [[0.5507, 0.4493],
               [0.8764, 0.1236],
               [0.1822, 0.8178],
               [0.3814, 0.6186],
               [0.9725, 0.0275],
               [0.281, 0.719],
               [0.8817, 0.1183],
               [0.8193, 0.1807],
               [0.4806, 0.5194],
               [0.9415, 0.0585],
               [0.4648, 0.5352],
               [0.9561, 0.0439]]
labels = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

calibrator = cal.PlattBinnerMarginalCalibrator(len(labels), num_bins=4)
calibrator.train_calibration(model_probs, labels)
# Inspect the fitted bin edges (private attribute, used here only for debugging).
print(calibrator._bins)
The first row of calibrator._bins has shape (3,) instead of the expected (4,).
We looked into the cause and found that the last two bins contain samples with identical probabilities.
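
To illustrate how this could happen, here is a minimal sketch. It assumes that bin edges are taken as midpoints between equal-sized chunks of the sorted probabilities and that duplicate edges are then dropped. The helper equal_mass_bin_edges is hypothetical and is not the library's code, and the input values are made up for illustration.

```python
def equal_mass_bin_edges(probs, num_bins):
    """Hypothetical helper: upper bin edges with roughly equal counts per bin."""
    sorted_probs = sorted(probs)
    num_bins = min(num_bins, len(sorted_probs))
    # Split the sorted values into num_bins contiguous chunks.
    size, extra = divmod(len(sorted_probs), num_bins)
    chunks, start = [], 0
    for i in range(num_bins):
        end = start + size + (1 if i < extra else 0)
        chunks.append(sorted_probs[start:end])
        start = end
    # Each interior edge is the midpoint between neighbouring chunks.
    edges = [(chunks[i][-1] + chunks[i + 1][0]) / 2.0
             for i in range(num_bins - 1)]
    edges.append(1.0)
    # Assumed deduplication step: tied values make edges coincide here.
    return sorted(set(edges))


# Made-up inputs: 12 distinct values versus 12 values whose top half is tied.
distinct = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40,
            0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
tied = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40,
        1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

print(len(equal_mass_bin_edges(distinct, 4)))  # 4 edges
print(len(equal_mass_bin_edges(tied, 4)))      # 3 edges: two edges coincide at 1.0
```

In this sketch the midpoint between the last two chunks equals the final edge, so deduplication leaves one edge fewer than requested.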

We are wondering whether, in such a case, an error should be raised or a small amount of noise should be added to the probabilities to break the ties.
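
Both options could look roughly like the sketch below. The names require_num_bins and add_tiebreak_noise, the noise scale, and the seed are all hypothetical choices for illustration; nothing here is part of the library.

```python
import random


def require_num_bins(edges, expected):
    """Hypothetical check: fail loudly when fewer bins than requested were produced."""
    if len(edges) != expected:
        raise ValueError(
            f"expected {expected} bins but got {len(edges)}; "
            "the input may contain tied probabilities"
        )


def add_tiebreak_noise(probs, scale=1e-6, seed=0):
    """Hypothetical tie-breaker: add small uniform noise, then clip to [0, 1].

    Caveat: clipping can recreate ties for values sitting exactly at 0 or 1.
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.uniform(-scale, scale))) for p in probs]


# Option 1: raise an error when the bin count is off.
try:
    require_num_bins([0.175, 0.7, 1.0], expected=4)
except ValueError as err:
    print(err)

# Option 2: perturb tied values so they become distinct.
jittered = add_tiebreak_noise([0.5, 0.5, 0.5])
print(len(set(jittered)))
```

Raising an error keeps the behaviour explicit, while adding noise changes the data slightly and makes results depend on the seed, so which one is preferable probably depends on how the calibrator is meant to be used.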