Hi!
First of all, thanks for the excellent package, and in particular also for still actively maintaining it! :-)
I have some questions regarding the bootstrapping-based uncertainty quantification. When I call `get_calibration_error_uncertainties`, it calls `bootstrap_uncertainty` with the functional `get_calibration_error(probs, labels, p, debias=False, mode=mode)`.
`bootstrap_uncertainty` then roughly does this:

```python
plugin = functional(data)
bootstrap_estimates = []
for _ in range(num_samples):
    bootstrap_estimates.append(functional(resample(data)))
return (2 * plugin - np.percentile(bootstrap_estimates, 100 - alpha / 2.0),
        2 * plugin - np.percentile(bootstrap_estimates, 50),
        2 * plugin - np.percentile(bootstrap_estimates, alpha / 2.0))
```
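For reference, here is a minimal, self-contained sketch (plain NumPy, with a toy statistic standing in for the calibration error, so none of the package's internals) of what I understand the `2*plugin - percentile` construction above to be, i.e. a reverse-percentile / pivotal bootstrap interval:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(size=200)   # toy dataset
functional = np.mean               # toy statistic in place of the calibration error

# Plug-in estimate on the full data, plus bootstrap estimates on resamples.
plugin = functional(data)
boot = np.array([
    functional(rng.choice(data, size=data.size, replace=True))
    for _ in range(1000)
])

alpha = 10.0  # for a 90% interval
# Reverse-percentile ("pivotal") interval: reflect the bootstrap quantiles
# around the plug-in estimate instead of using them directly.
lower = 2 * plugin - np.percentile(boot, 100 - alpha / 2.0)
median = 2 * plugin - np.percentile(boot, 50)
upper = 2 * plugin - np.percentile(boot, alpha / 2.0)
print(lower, median, upper)
```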
Questions:

- Why is `debias=False` in the call to `get_calibration_error`? I would like UQ for the unbiased (L2) error estimate. (See the usage sketch after this list for how I'm calling it.)
- How/why is `2*plugin - median(bootstrap_estimates)` a good estimate of the median? And similarly for the lower/upper quantiles?
- In `get_calibration_error_uncertainties`, it says "When p is not 2 (e.g. for the ECE where p = 1), [the median] can be used as a debiased estimate as well." Why would that be true / what exactly do you mean by it?
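For context, this is roughly how I am calling things (a sketch with toy data; the `calibration` import name and keyword names are just how it looks on my install, so adjust as needed):

```python
import numpy as np
import calibration as cal  # the uncertainty-calibration package; import name may differ

# Toy, perfectly calibrated binary data in place of real model outputs.
probs = np.random.uniform(size=500)
labels = (np.random.uniform(size=500) < probs).astype(int)

# Debiased point estimate of the L2 calibration error.
err = cal.get_calibration_error(probs, labels, p=2, debias=True)

# Bootstrap uncertainty, which internally uses debias=False (question 1).
lower, mid, upper = cal.get_calibration_error_uncertainties(probs, labels, p=2)
print(err, (lower, mid, upper))
```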
I guess what I am really asking is: what's the reasoning behind the approach you chose, and is it described somewhere? :-)