Bootstrap uncertainty details #13

@e-pet

Hi!

First of all, thanks for the excellent package, and in particular for still actively maintaining it! :-)

I have some questions regarding the bootstrap-based uncertainty quantification. When I call get_calibration_error_uncertainties, it calls bootstrap_uncertainty, passing a functional that evaluates get_calibration_error(probs, labels, p, debias=False, mode=mode).
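
That is, if I read the code correctly, the functional being bootstrapped is essentially the following (my paraphrase, not the literal source; in particular I am assuming data is a list of (prob, label) pairs):

    # My paraphrase of the functional handed to bootstrap_uncertainty -- not the literal source.
    # Assumes each element of data is a (prob, label) pair.
    functional = lambda data: get_calibration_error(
        [prob for prob, _ in data],
        [label for _, label in data],
        p, debias=False, mode=mode)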

bootstrap_uncertainty will then roughly do this:

    # Paraphrased: resample(data) draws a bootstrap sample of the same size, with replacement.
    plugin = functional(data)
    bootstrap_estimates = []
    for _ in range(num_samples):
        bootstrap_estimates.append(functional(resample(data)))
    # Each returned value reflects a bootstrap percentile around the plug-in estimate
    # (with alpha on a 0-100 scale, these are the 100 - alpha/2, 50, and alpha/2 percentiles).
    return (2 * plugin - np.percentile(bootstrap_estimates, 100 - alpha / 2.0),
            2 * plugin - np.percentile(bootstrap_estimates, 50),
            2 * plugin - np.percentile(bootstrap_estimates, alpha / 2.0))
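
For reference, my own call site is essentially just the public API on toy binary data (random numbers only to fix notation; I am assuming the usual import calibration from the PyPI package uncertainty-calibration):

    import numpy as np
    import calibration  # pip install uncertainty-calibration

    rng = np.random.default_rng(0)
    probs = rng.uniform(size=1000)                          # predicted probability of class 1
    labels = (rng.uniform(size=1000) < probs).astype(int)   # toy labels, roughly calibrated by construction

    # Returns [lower, mid, upper] for the calibration error.
    lower, mid, upper = calibration.get_calibration_error_uncertainties(probs, labels)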

Questions:

  1. Why is debias=False in the call to get_calibration_error? I would like uncertainty quantification for the debiased (L2) error estimate.
  2. How/why is "2*plugin - median(bootstrap_estimates)" a good estimate of the median? And similarly for the lower/upper quantiles? (See the toy sketch after this list for the construction I mean.)
  3. The documentation of get_calibration_error_uncertainties says: "When p is not 2 (e.g. for the ECE where p = 1), [the median] can be used as a debiased estimate as well." Why would that be true, and what exactly do you mean by it?
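
To make sure I am parsing the construction in question 2 correctly, here is a self-contained toy version of it, with the sample mean standing in for the calibration error (my own sketch, not code from the package):

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.exponential(size=200)   # deliberately skewed toy data
    functional = np.mean               # stand-in for the calibration-error functional
    alpha, num_samples = 10.0, 1000    # alpha on a 0-100 scale, matching the snippet above

    plugin = functional(data)
    boot = [functional(rng.choice(data, size=data.size, replace=True))
            for _ in range(num_samples)]

    # The triple I am asking about: each entry reflects a bootstrap percentile around the plug-in value.
    lower = 2 * plugin - np.percentile(boot, 100 - alpha / 2.0)
    mid = 2 * plugin - np.percentile(boot, 50)
    upper = 2 * plugin - np.percentile(boot, alpha / 2.0)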

I guess what I am really asking is: what's the reasoning behind the approach you chose, and is it described somewhere? :-)
