Description
Check duplicate issues.
- Checked for duplicates
Goal
I am trying to store an edm4hep file using RDataFrame::Snapshot after applying some cuts.
For columns (branches?) of type edm4hep::ReconstructedParticleData this seems to just work.
However, for e.g. the corresponding clusters (_CollectionName_clusters) of type podio::ObjectID, I get the error:
Error in <TTree::Branch>: The class requested (ROOT::VecOps::RVec<podio::ObjectID>) for the branch "_CollectionName_clusters" is an instance of an stl collection and does not have a compiled CollectionProxy. Please generate the dictionary for this collection (ROOT::VecOps::RVec<podio::ObjectID>) to avoid to write corrupted data.
RDataFrame::Run: event loop was interrupted
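A possible workaround, not verified against this release, is to follow the hint in the error message and generate the missing dictionary at runtime before calling Snapshot. The sketch below uses ROOT's TInterpreter::GenerateDictionary; the helper names are hypothetical, and the header path podio/ObjectID.h is an assumption about where the stack installs the class definition.

```python
def rvec_class_name(element_type):
    """Build the RVec class name that TTree::Branch asks a dictionary for."""
    return f"ROOT::VecOps::RVec<{element_type}>"


def generate_rvec_dictionary(element_type, header):
    """Ask ROOT to build and load a dictionary for RVec<element_type>.

    `header` must declare `element_type` (assumption: it is on the include path).
    ROOT is imported lazily so the name helper works without a ROOT installation.
    """
    import ROOT

    # Includes are passed as a ';'-separated list; RVec.hxx declares the container.
    ROOT.gInterpreter.GenerateDictionary(
        rvec_class_name(element_type), f"ROOT/RVec.hxx;{header}"
    )


# Intended use inside the key4hep environment, before the failing Snapshot call:
#   generate_rvec_dictionary("podio::ObjectID", "podio/ObjectID.h")
#   df.Range(10).Snapshot("events", "snapshot-test.root", ["_BCalRecoParticle_clusters"])
```

Whether this is enough to make the Snapshot succeed is an open question; it only addresses the missing CollectionProxy that the error message complains about.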
Operating System and Version
Alma Linux 9
compiler
necessary?
The version of the key4hep stack
source /cvmfs/sw.hsf.org/key4hep/setup.sh -r 2025-01-28
Package Version
whatever is in the above release
Reproducer
import ROOT
df = ROOT.RDataFrame("events", "https://lreichen.webtest.cern.ch/minidst.edm4hep.root")
# doesn't work
# df.Range(10).Snapshot("events", "snapshot-test.root", ["_BCalRecoParticle_clusters"])
# works
df.Range(10).Snapshot("events", "snapshot-test.root", ["BCalRecoParticle"])
print(df.GetColumnType("_BCalRecoParticle_clusters"))
# ROOT::VecOps::RVec<podio::ObjectID>
print(df.GetColumnType("BCalRecoParticle"))
# ROOT::VecOps::RVec<edm4hep::ReconstructedParticleData>
Additional context
I am honestly more confused that one of the two cases works.
As far as I know, ROOT needs to have RVec<type> in the dictionary for RVecs to be written out; vector<type> is not enough and, as far as I can tell, we only have the latter in edm4hepDictDict.rootmap and podioDictDict.rootmap. But this is where my knowledge of ROOT dictionaries ends.
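The rootmap check described above can be sketched in plain Python, assuming the usual rootmap layout in which each selected class appears on its own line starting with "class ". The function names and the sample excerpt are hypothetical; the real files would have to be read from wherever the stack installs them.

```python
def classes_in_rootmap(text):
    """Return the set of class names listed on 'class ...' lines of a rootmap."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("class "):
            names.add(line[len("class "):].strip())
    return names


def has_dictionary_entry(text, element_type, container="ROOT::VecOps::RVec"):
    """Check whether container<element_type> is listed, ignoring whitespace."""
    wanted = f"{container}<{element_type}>".replace(" ", "")
    listed = {name.replace(" ", "") for name in classes_in_rootmap(text)}
    return wanted in listed


# Hypothetical excerpt, not copied from the actual podioDictDict.rootmap.
SAMPLE_ROOTMAP = """\
[ libpodioDict.so ]
# List of selected classes
class podio::ObjectID
class vector<podio::ObjectID>
"""

print(has_dictionary_entry(SAMPLE_ROOTMAP, "podio::ObjectID", "vector"))
print(has_dictionary_entry(SAMPLE_ROOTMAP, "podio::ObjectID"))
```

Running the same check on edm4hepDictDict.rootmap for edm4hep::ReconstructedParticleData would show whether the working case has an RVec entry that the failing case lacks.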
It would, of course, be even better if ROOT did not automatically convert everything to RVecs when writing a Snapshot, but we don't have a new enough version for this in the stack... :)
root-project/root#15895