[BUG] Variances in Weighted Histogram can become negative (or nan) if scaled by too large integer #964

Open
@Superharz

Dividing a weighted histogram by a large integer scalar can produce negative (or nan) variances. This happens only when the scalar is an integer; dividing by the same value as a float gives the correct result.

Code to reproduce:

import boost_histogram as bh

# Two-bin weighted histogram on [0, 1)
hist = bh.Histogram(bh.axis.Regular(2, 0, 1), storage=bh.storage.Weight())
x = [0, 1]  # x = 1 is the exclusive upper edge, so it lands in the overflow bin
weight = [10.0, 10.0]
hist.fill(x, weight=weight)

print(hist.values(), hist.variances())
#>>> [10.  0.] [100.   0.]

# Integer divisor: the variance comes out negative
hist_2 = hist / 123456789
print(hist_2.values(), hist_2.variances())
#>>> [8.10000007e-08 0.00000000e+00] [-5.68861947e-08 -0.00000000e+00]

# Float divisor: the variance is correct
hist_3 = hist / float(123456789)
print(hist_3.values(), hist_3.variances())
#>>> [8.10000007e-08 0.00000000e+00] [6.56100012e-15 0.00000000e+00]

Observed behavior:

The variance in hist_2 turns negative.

Dividing by the integer 2**N with N>15 results in variances of [inf, nan].
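
Both symptoms look consistent with the scalar being squared in 32-bit integer arithmetic before it is applied to the variance. That is only a guess (I have not checked the boost-histogram source), but NumPy 1.x on Windows does use a 32-bit default integer. The sketch below emulates the wraparound in pure Python; `wrap_int32` is a hypothetical helper written for this illustration, not a library function.

```python
def wrap_int32(value: int) -> int:
    """Return the value a signed 32-bit integer would hold after overflow."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


variance = 100.0    # variance of the filled bin in the report
scalar = 123456789  # integer divisor from the report

exact_square = scalar ** 2                 # arbitrary-precision Python int
wrapped_square = wrap_int32(exact_square)  # assumed 32-bit result

print(variance / exact_square)    # close to the hist_3 variance
print(variance / wrapped_square)  # negative, close to the hist_2 variance

# Assumed explanation for the 2**N, N > 15 case: the square wraps to zero,
# so 100 / 0 would give inf and 0 / 0 would give nan in floating point.
print(wrap_int32((2 ** 16) ** 2))
```

If this is the mechanism, any integer whose square does not fit in 32 bits would be affected, which would explain why N = 15 still works (2**30 fits) and N = 16 does not.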

Expected behavior:

The variance for a weight of 10 after dividing by 123456789 should match hist_3, i.e. 100 / 123456789**2 ≈ 6.561e-15.

Workaround:

Cast the scalar to a float (which happens for hist_3).

IMHO this should happen automatically or a warning should be given to the user.
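
Until then, the cast can be wrapped in a small helper on the user side. This is a minimal sketch; `as_float_scalar` is a hypothetical name, not part of boost-histogram.

```python
import numbers


def as_float_scalar(value):
    """Return integer scalars as float; pass everything else through unchanged."""
    if isinstance(value, numbers.Integral) and not isinstance(value, bool):
        return float(value)
    return value


divisor = as_float_scalar(123456789)
print(type(divisor).__name__, divisor)

# Intended use with the reproducer above (requires boost-histogram):
#   hist_2 = hist / as_float_scalar(123456789)
```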

Version:

  • Windows 10, 64 bit, AMD Ryzen 3800X
  • Python 3.12.7
  • boost-histogram 1.5.0
  • numpy 1.26.4
