ariostas (Member) commented Dec 3, 2025

This PR adds some flexibility to the hist profiles that can be written to file. The old code assumed ROOT histograms that had been converted to hist with to_hist, which adds some metadata; if that metadata was not present, it simply didn't work. Someone should double-check that I'm passing the right data into to_TProfile. Closes #1531.

ariostas (Member, Author) commented Dec 3, 2025

@pfackeldey could you take a look at this since you're good with histograms?

ianna mentioned this pull request Dec 4, 2025

wiso (Contributor) commented Dec 5, 2025

Thanks, it seems to work, although I had to simplify my original code. For example, it doesn't work with storage=hist.storage.WeightedMean() (#1533), and ND profiles are not supported (#1534).

By the way, I get a warning

FutureWarning: .metadata was not set, returning None instead of Attribute error, boost-histogram 1.7+ will error.
  if obj.metadata is not None and "fSumw2" in obj.metadata.keys():

ariostas (Member, Author) commented Dec 5, 2025

Thank you, @wiso! I fixed the warning and the WeightedMean storage. I'll follow up on the ND profiles in a separate PR.
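
As an illustration of the kind of guard that avoids this class of problem, here is a minimal sketch of a metadata lookup that tolerates objects with no `metadata` attribute or with `metadata` set to None. It is an assumption for illustration only, not necessarily how the PR fixed the warning: the helper name `read_fsumw2` is hypothetical, and the objects below are plain stand-ins rather than real hist objects. Whether this pattern silences the FutureWarning on older boost-histogram versions is not verified here.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def read_fsumw2(obj):
    """Return obj.metadata["fSumw2"] if available, else None (hypothetical helper)."""
    # getattr with a default covers objects that raise AttributeError on .metadata
    metadata = getattr(obj, "metadata", None)
    # Only index into metadata when it is actually a mapping containing the key
    if isinstance(metadata, Mapping) and "fSumw2" in metadata:
        return metadata["fSumw2"]
    return None


# Stand-in objects for illustration (not real hist histograms)
with_meta = SimpleNamespace(metadata={"fSumw2": [1.0, 2.0]})
none_meta = SimpleNamespace(metadata=None)
without_meta = SimpleNamespace()

print(read_fsumw2(with_meta))     # [1.0, 2.0]
print(read_fsumw2(none_meta))     # None
print(read_fsumw2(without_meta))  # None
```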

ianna added the next-release label (Required for the next release) Dec 11, 2025

ianna (Member) left a comment

@ariostas - looks great! Thanks. Please merge it if you are done with it. Thanks!

ariostas requested a review from Copilot December 15, 2025 16:56
ariostas linked an issue Dec 15, 2025 that may be closed by this pull request
ariostas merged commit d855770 into main Dec 15, 2025 (33 of 35 checks passed)
ariostas deleted the ariostas/fix_writing_tprofiles branch December 15, 2025 18:51

wiso (Contributor) commented Dec 19, 2025

I think this should be reopened: the values I get when reading the written object back are not correct.

Test case

import uproot
import hist
import numpy as np

axis_x = hist.axis.IntCategory(range(10))
h = hist.Hist(axis_x, storage=hist.storage.Mean())

sample0 = np.array([10, 20, 30, 10])
sample1 = np.array([10, 20, 10, 10, 0])
h.fill([0, 0, 0, 0], sample=sample0)
h.fill([1, 1, 1, 1, 1], sample=sample1)

expected_count0 = len(sample0)
expected_count1 = len(sample1)
expected_mean0 = np.mean(sample0)
expected_mean1 = np.mean(sample1)
# Variance of the mean: np.var (ddof=0) / (n - 1) equals the sample variance (ddof=1) / n
expected_variance0 = np.var(sample0) / (len(sample0) - 1)
expected_variance1 = np.var(sample1) / (len(sample1) - 1)

hist_count0 = h[0].count
hist_count1 = h[1].count
hist_mean0 = h[0].value
hist_mean1 = h[1].value
hist_variance0 = h.variances()[0]
hist_variance1 = h.variances()[1]
hist_bintype = type(h[0])



with uproot.recreate("test.root") as f:
    f['h'] = h

with uproot.open("test.root") as f:
    h = f['h'].to_hist()

    # FAILS with AttributeError: 'boost_histogram._core.accumulators.WeightedMean' object has no attribute 'count'
    # uproot_count0 = h[0j].count
    # uproot_count1 = h[1j].count

    uproot_mean0 = h[0].value
    uproot_mean1 = h[1].value
    uproot_variance0 = h.variances()[0]
    uproot_variance1 = h.variances()[1]
    uproot_bintype = type(h[0])

print(f"count0: {expected_count0:.1f}  {hist_count0:.1f}")
print(f"count1: {expected_count1:.1f}  {hist_count1:.1f}")
print(f"mean0: {expected_mean0:.1f}  {hist_mean0:.1f} {uproot_mean0:.1f}")
print(f"mean1: {expected_mean1:.1f}  {hist_mean1:.1f} {uproot_mean1:.1f}")
print(f"var0: {expected_variance0:.1f}  {hist_variance0:.1f} {uproot_variance0:.1f}")
print(f"var1: {expected_variance1:.1f}  {hist_variance1:.1f} {uproot_variance1:.1f}")
print(f"bin-type {hist_bintype}  {uproot_bintype}")

with output

count0: 4.0  4.0
count1: 5.0  5.0
mean0: 17.5  17.5 17.5
mean1: 10.0  10.0 10.0
var0: 22.9  22.9 1.2
var1: 10.0  10.0 0.2
bin-type <class 'boost_histogram.accumulators.Mean'>  <class 'boost_histogram.accumulators.WeightedMean'>

I guess the main problem is that, when the histogram is read back with uproot, the bin type is WeightedMean instead of Mean and the variances are wrong.
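
For reference, the "expected" variances in the test case are the variance of the mean: the unbiased sample variance divided by the number of entries. The NumPy-only sketch below recomputes them independently of hist and uproot and checks that the two equivalent forms agree. The helper name `variance_of_mean` is hypothetical and introduced only for this illustration.

```python
import numpy as np


def variance_of_mean(sample):
    """Unbiased sample variance (ddof=1) divided by n (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    return np.var(sample, ddof=1) / sample.size


sample0 = np.array([10, 20, 30, 10])
sample1 = np.array([10, 20, 10, 10, 0])

for sample in (sample0, sample1):
    # Form used in the test case: population variance (ddof=0) / (n - 1)
    test_case_form = np.var(sample) / (len(sample) - 1)
    assert np.isclose(variance_of_mean(sample), test_case_form)
    print(f"{variance_of_mean(sample):.1f}")

# Prints 22.9 and 10.0, matching the expected and hist columns in the output above
```

The values read back through uproot (1.2 and 0.2) match neither form, which is the discrepancy being reported.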


Development

Successfully merging this pull request may close these issues.

Writing profiles with storage=WeightedMean()
Writing profiles
