Hi!
First of all, thanks for the excellent package, and in particular also for still actively maintaining it! :-)
I have some questions regarding the bootstrapping-based uncertainty quantification. When I call `get_calibration_error_uncertainties`, it calls `bootstrap_uncertainty` with the functional `get_calibration_error(probs, labels, p, debias=False, mode=mode)`.
`bootstrap_uncertainty` then roughly does this:

```python
plugin = functional(data)
bootstrap_estimates = []
for _ in range(num_samples):
    bootstrap_estimates.append(functional(resample(data)))
return (2 * plugin - np.percentile(bootstrap_estimates, 100 - alpha / 2.0),
        2 * plugin - np.percentile(bootstrap_estimates, 50),
        2 * plugin - np.percentile(bootstrap_estimates, alpha / 2.0))
```
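For reference, here is a minimal, self-contained sketch (plain NumPy, with a toy statistic standing in for the calibration error, so none of the package's internals) of what I understand the `2*plugin - percentile` construction above to be, i.e. a reverse-percentile / pivotal bootstrap interval:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(size=200)   # toy dataset
functional = np.mean               # toy statistic in place of the calibration error

# Plug-in estimate on the full data, plus bootstrap estimates on resamples.
plugin = functional(data)
boot = np.array([
    functional(rng.choice(data, size=data.size, replace=True))
    for _ in range(1000)
])

alpha = 10.0  # for a 90% interval
# Reverse-percentile ("pivotal") interval: reflect the bootstrap quantiles
# around the plug-in estimate instead of using them directly.
lower = 2 * plugin - np.percentile(boot, 100 - alpha / 2.0)
median = 2 * plugin - np.percentile(boot, 50)
upper = 2 * plugin - np.percentile(boot, alpha / 2.0)
print(lower, median, upper)
```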
Questions:

- Why is `debias=False` in the call to `get_calibration_error`? I would like UQ for the unbiased (L2) error estimate. (See the usage sketch after this list for how I'm calling it.)
- How/why is `2*plugin - median(bootstrap_estimates)` a good estimate of the median? And similarly for the lower/upper quantiles?
- In `get_calibration_error_uncertainties`, it says "When p is not 2 (e.g. for the ECE where p = 1), [the median] can be used as a debiased estimate as well." Why would that be true / what exactly do you mean by it?
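For context, this is roughly how I am calling things (a sketch with toy data; the `calibration` import name and keyword names are just how it looks on my install, so adjust as needed):

```python
import numpy as np
import calibration as cal  # the uncertainty-calibration package; import name may differ

# Toy, perfectly calibrated binary data in place of real model outputs.
probs = np.random.uniform(size=500)
labels = (np.random.uniform(size=500) < probs).astype(int)

# Debiased point estimate of the L2 calibration error.
err = cal.get_calibration_error(probs, labels, p=2, debias=True)

# Bootstrap uncertainty, which internally uses debias=False (question 1).
lower, mid, upper = cal.get_calibration_error_uncertainties(probs, labels, p=2)
print(err, (lower, mid, upper))
```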
I guess what I am really asking is: what's the reasoning behind the approach you chose, and is it described somewhere? :-)