Feature description
As discussed during CHEP24, it would be very useful to be able not only
to extract data from an RDF graph in NumPy/PyTorch/TensorFlow format, but
also to feed data back into RDF in batches (e.g. for NN inference that is
not supported in SOFIE).
When exporting/importing, it would be useful to have the option to
explode/flatten RVecs (VecOps) of the same length.
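To make the flatten/broadcast request concrete, here is a minimal NumPy-only sketch of the semantics I have in mind. It does not use ROOT at all; `flatten_and_broadcast` and the toy columns are hypothetical names for illustration, not an existing or proposed RDF API.

```python
import numpy as np

def flatten_and_broadcast(vector_cols, scalar_cols):
    """Flatten per-event vector columns and repeat scalars once per element.

    vector_cols: dict name -> list of 1D arrays, one array per event
    scalar_cols: dict name -> 1D array with one value per event
    Returns (matrix of shape (n_elements, n_columns), per-event counts).
    """
    names = list(vector_cols)
    counts = np.array([len(v) for v in vector_cols[names[0]]])
    for name in names:
        # All vector columns must have the same length within each event.
        if not np.array_equal(counts, [len(v) for v in vector_cols[name]]):
            raise ValueError(f"column {name} has mismatching per-event lengths")
    flat = [np.concatenate(vector_cols[name]) for name in names]
    # Broadcast each scalar to every element of the event it belongs to.
    flat += [np.repeat(np.asarray(scalar_cols[name]), counts) for name in scalar_cols]
    return np.column_stack(flat), counts

# Toy data: three events with 2, 0 and 1 jets.
jets = {
    "Jet_pt": [np.array([30.0, 20.0]), np.array([]), np.array([50.0])],
    "Jet_eta": [np.array([0.1, -1.2]), np.array([]), np.array([2.0])],
}
met = {"MET_pt": np.array([12.0, 7.0, 40.0])}

matrix, counts = flatten_and_broadcast(jets, met)
print(matrix.shape)  # one row per jet, one column per input
```

The per-event `counts` are kept so that the outputs can later be regrouped into one vector per event.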
Pseudo code example:
def processBatch(nparray):
    # do something with PyTorch
    ...
    return outTensor

rdf.BatchProcess(processBatch,
                 inputCols=["Jet_pt", "Jet_eta", "Jet_mass", "MET_pt"],
                 outputVectorCols=["Jet_regressedPt", "Jet_regressedMass"],
                 outputScalarCols=[],
                 batchSize=100000, flattenRVec=True, broadCastScalars=True)
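For illustration only, the batching and the "feed back" step could behave roughly like the NumPy sketch below. It assumes the inputs are already flattened to one row per jet; `run_in_batches`, `process_batch` and the scaling factors are hypothetical stand-ins (a real callback would run a PyTorch or TensorFlow model), and nothing here is an existing RDF method.

```python
import numpy as np

def process_batch(batch):
    # Stand-in for a real model: returns two outputs per input row.
    pt, mass = batch[:, 0], batch[:, 2]
    return np.column_stack([1.1 * pt, 0.9 * mass])

def run_in_batches(flat_inputs, counts, callback, batch_size):
    """Call `callback` on chunks of rows, then regroup outputs per event.

    Assumes flat_inputs has at least one row.
    """
    outputs = [callback(flat_inputs[start:start + batch_size])
               for start in range(0, len(flat_inputs), batch_size)]
    flat_out = np.concatenate(outputs)
    # Split points between events; events with zero jets give empty arrays.
    offsets = np.cumsum(counts)[:-1]
    return [np.split(flat_out[:, i], offsets) for i in range(flat_out.shape[1])]

# Columns: Jet_pt, Jet_eta, Jet_mass, MET_pt (scalar already broadcast per jet).
flat_inputs = np.array([[30.0, 0.1, 5.0, 12.0],
                        [20.0, -1.2, 4.0, 12.0],
                        [50.0, 2.0, 8.0, 40.0]])
counts = np.array([2, 0, 1])  # jets per event

regressed_pt, regressed_mass = run_in_batches(flat_inputs, counts, process_batch, batch_size=2)
print([len(v) for v in regressed_pt])  # one output vector per event
```

Each returned list has one array per event, which is the shape that output vector columns such as `Jet_regressedPt` would need.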
Alternatives considered
No response
Additional context
No response