Description
Dividing a weighted histogram by a large integer scalar can produce negative, inf, or nan variances. This happens only when the divisor is an integer; dividing by the equivalent float gives the correct result.
Code to reproduce:
import boost_histogram as bh
hist = bh.Histogram(bh.axis.Regular(2,0,1), storage=bh.storage.Weight())
x = [0,1]
weight = [10.0,10.0]
hist.fill(x, weight=weight)
print(hist.values(), hist.variances())
#>>> [10. 0.] [100. 0.]
hist_2 = hist / (123456789)
print(hist_2.values(), hist_2.variances())
#>>> [8.10000007e-08 0.00000000e+00] [-5.68861947e-08 -0.00000000e+00]
hist_3 = hist / float(123456789)
print(hist_3.values(), hist_3.variances())
#>>> [8.10000007e-08 0.00000000e+00] [6.56100012e-15 0.00000000e+00]
Observed behavior:
The variance in hist_2 turns negative. Dividing by 2**N with N>15 results in an [inf, nan] variance.
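A possible cause, stated here as an assumption rather than something confirmed in the boost-histogram source: the variance is divided by the square of the scalar, and that square is computed in 32-bit integer arithmetic (NumPy 1.x uses a 32-bit default integer on Windows). The sketch below emulates the wrap-around in plain Python; `wrap_int32` is a hypothetical helper written for illustration, not a library function.

```python
def wrap_int32(n: int) -> int:
    """Emulate two's-complement wrap-around of a signed 32-bit integer."""
    return (n + 2**31) % 2**32 - 2**31


variance = 100.0  # variance of the filled bin before the division

# Squaring the divisor overflows int32 and wraps to a negative number.
wrapped = wrap_int32(123456789**2)
print(wrapped)             # -1757895751
print(variance / wrapped)  # about -5.6886e-08, the negative variance in hist_2

# For 2**N with N > 15 the square is a multiple of 2**32 and wraps to 0,
# so the filled bin would become 100/0 (inf) and the empty bin 0/0 (nan).
print(wrap_int32((2**16) ** 2))  # 0

# Float arithmetic does not overflow here and gives the hist_3 result.
print(variance / float(123456789) ** 2)  # about 6.561e-15
```

If this assumption holds, it would also explain why the float divisor behaves correctly and why the threshold for powers of two sits at N = 16.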
Expected behavior:
After dividing by 123456789, the variance for a weight of 10 should be the one from hist_3, i.e. 100 / 123456789**2 ≈ 6.561e-15.
Workaround:
Cast the scalar to a float (as is done for hist_3).
IMHO this should happen automatically or a warning should be given to the user.
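Until then, the cast can be wrapped in a small helper on the user side. This is only a sketch: `divide_by_scalar` is a hypothetical name, not part of boost-histogram, and it relies on nothing more than the object supporting `/`.

```python
import numbers


def divide_by_scalar(hist, scalar):
    """Divide hist by scalar, promoting integer scalars to float first.

    Hypothetical user-side helper; works for any object that supports
    true division, e.g. a bh.Histogram with Weight storage.
    """
    if isinstance(scalar, numbers.Integral):
        # Avoid any integer arithmetic on the scalar by converting up front.
        scalar = float(int(scalar))
    return hist / scalar
```

With the histogram from the reproduction above, `divide_by_scalar(hist, 123456789)` is then equivalent to `hist / float(123456789)`.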
Version:
- Windows 10, 64 bit, AMD Ryzen 3800X
- Python 3.12.7
- boost-histogram 1.5.0
- numpy 1.26.4