DefinePerSample is not supported for RDataSource #15086

Open
@vepadulano

Check duplicate issues.

  • Checked for duplicates

Description

RLoopManager::RunDataSource lacks the logic to update the sample information that the equivalent runners for the empty data source and the TTree data source already have. As a result, DefinePerSample cannot be used effectively when the input data source is an RNTuple: in the output below, the string returned by RSampleInfo::AsString() is empty for every row.

$: ./repro.out
+-----+-----------+
| Row | file_name |
+-----+-----------+
| 0   | ""        |
+-----+-----------+
| 1   | ""        |
+-----+-----------+
| 2   | ""        |
+-----+-----------+
| 3   | ""        |
+-----+-----------+
| 4   | ""        |
+-----+-----------+
| 5   | ""        |
+-----+-----------+
| 6   | ""        |
+-----+-----------+
| 7   | ""        |
+-----+-----------+
| 8   | ""        |
+-----+-----------+
| 9   | ""        |
+-----+-----------+
| 10  | ""        |
+-----+-----------+
| 11  | ""        |
+-----+-----------+
| 12  | ""        |
+-----+-----------+
| 13  | ""        |
+-----+-----------+
| 14  | ""        |
+-----+-----------+

Reproducer

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RNTupleDS.hxx>
#include <ROOT/RNTupleWriter.hxx>
#include <ROOT/RNTupleModel.hxx>
#include <TFile.h>
#include <ROOT/RNTuple.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RVec.hxx>

#include <iostream>
#include <string>
#include <utility>
#include <vector>

using RNTupleModel = ROOT::Experimental::RNTupleModel;
using RNTupleWriter = ROOT::Experimental::RNTupleWriter;

void create_definepersample(const std::string &ntpl_name, const std::vector<std::string> &filenames)
{
    for (const auto &fn : filenames)
    {
        auto model = RNTupleModel::Create();
        auto fldX = model->MakeField<ULong64_t>("x");
        auto ntpl = RNTupleWriter::Recreate(std::move(model), ntpl_name, fn);
        for (ULong64_t entry = 0; entry < 10; entry++)
        {
            *fldX = entry;
            ntpl->Fill();
        }
    }
}

int main()
{
    std::vector<std::string> filenames{
        "sample1.root",
        "sample2.root",
        "sample3.root"};
    std::string ntpl_name = "Events";
    create_definepersample(ntpl_name, filenames);
    ROOT::RDataFrame df{ntpl_name, filenames};
    auto df1 = df.DefinePerSample("file_name", [](unsigned int, const ROOT::RDF::RSampleInfo &si) {
      return si.AsString();
    });
    df1.Display<std::string>({"file_name"}, 30)->Print();
}
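
The missing step described above can be illustrated with a toy model that does not use ROOT at all. This is only a sketch under stated assumptions, not the actual RLoopManager code: `ToySampleInfo`, `ToyPerSampleColumn`, `ToySample` and `RunToyLoop` are hypothetical names invented for illustration. It shows how a per-sample column keeps its default (empty) value when the event loop never announces that a new sample has started.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-sample metadata passed to the callback.
struct ToySampleInfo {
    std::string id; // e.g. the file name of the current sample
};

// Hypothetical stand-in for a per-sample column: its value is recomputed
// only when the event loop announces a new sample via Update().
class ToyPerSampleColumn {
public:
    explicit ToyPerSampleColumn(std::function<std::string(const ToySampleInfo &)> fn)
        : fFn(std::move(fn)) {}
    void Update(const ToySampleInfo &si) { fValue = fFn(si); }
    const std::string &Get() const { return fValue; }

private:
    std::function<std::string(const ToySampleInfo &)> fFn;
    std::string fValue; // stays empty until Update() is called
};

// Hypothetical description of one input sample (one file).
struct ToySample {
    std::string name;
    std::size_t nEntries;
};

// Toy event loop: returns the column value observed for every entry.
// With notifyOnNewSample == false the loop never calls Update(), which is
// the toy analogue of a runner that lacks the sample-update step.
std::vector<std::string> RunToyLoop(const std::vector<ToySample> &samples, bool notifyOnNewSample)
{
    ToyPerSampleColumn column([](const ToySampleInfo &si) { return si.id; });
    std::vector<std::string> seen;
    for (const auto &s : samples) {
        if (notifyOnNewSample)
            column.Update(ToySampleInfo{s.name});
        for (std::size_t i = 0; i < s.nEntries; ++i)
            seen.push_back(column.Get());
    }
    return seen;
}
```

Calling `RunToyLoop(samples, true)` yields the sample name for every entry, while `RunToyLoop(samples, false)` yields only empty strings, which is the same symptom as in the output above.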

ROOT version

Any

Installation method

Any

Operating system

Any

Additional context

No response
