@maxgalli
Collaborator

@maxgalli maxgalli commented Dec 2, 2025

This PR extends the codegen implementation of FastVerticalInterpHistPdf2 and FastVerticalInterpHistPdf2D2 to support non-uniform binning. This is done using the rawBinNumber function in RooFit (link).

Once this is merged, we will be able to test the codegen backend for objects of type RooParametricHist without using the --use-HistPdf keyword when running text2workspace.py.

Summary by CodeRabbit

  • New Features

    • 1D and 2D interpolation now support both uniform and non-uniform binning, with independent handling of X and Y axes.
    • Dynamic generation and use of bin-edge arrays for non-uniform binning to improve flexibility.
  • Performance

    • Preserved fast execution path for uniform binning to maintain existing performance characteristics.

@coderabbitai
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

Modified binning logic in code generation to support both uniform and non-uniform distributions. Implemented conditional pathways: uniform binning uses the fast uniformBinNumber path, while non-uniform binning generates static bin edge arrays and integrates with ROOT's rawBinNumber function. Applied changes to 1D and 2D interpolation handlers.
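The dual-path idea in the walkthrough can be sketched in standalone C++. This is only an illustration under stated assumptions: `uniformBin`, `nonUniformBin`, `isUniform`, and `findBin` are hypothetical names written for this example, not the ROOT functions `uniformBinNumber` or `rawBinNumber`, and the clamping of out-of-range values is an assumption rather than ROOT's documented behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the fast path: O(1) arithmetic on equal-width bins.
inline std::size_t uniformBin(double low, double high, double x, std::size_t numBins) {
    assert(numBins > 0 && high > low);
    if (x <= low) return 0;  // assumption: clamp underflow to the first bin
    double binWidth = (high - low) / static_cast<double>(numBins);
    std::size_t idx = static_cast<std::size_t>((x - low) / binWidth);
    return std::min(idx, numBins - 1);  // assumption: clamp overflow to the last bin
}

// Hypothetical stand-in for the general path: binary search over numBins + 1 edges.
inline std::size_t nonUniformBin(double x, const std::vector<double>& edges) {
    assert(edges.size() >= 2);
    auto it = std::upper_bound(edges.begin(), edges.end(), x);
    if (it == edges.begin()) return 0;  // below the first edge
    std::size_t idx = static_cast<std::size_t>(it - edges.begin()) - 1;
    return std::min(idx, edges.size() - 2);  // at or above the last edge
}

// True if every bin has the same width, up to a relative tolerance.
inline bool isUniform(const std::vector<double>& edges, double relTol = 1e-9) {
    assert(edges.size() >= 2);
    double width = edges[1] - edges[0];
    for (std::size_t i = 2; i < edges.size(); ++i) {
        double diff = (edges[i] - edges[i - 1]) - width;
        if (std::abs(diff) > relTol * std::abs(width)) return false;
    }
    return true;
}

// Select the lookup strategy from the binning itself.
inline std::size_t findBin(double x, const std::vector<double>& edges) {
    if (isUniform(edges)) {
        return uniformBin(edges.front(), edges.back(), x, edges.size() - 1);
    }
    return nonUniformBin(x, edges);
}
```

In the PR the choice between the two paths is made once, while the code is generated, so the uniform case does not pay for a search at evaluation time. The sketch makes the same choice at call time only to stay self-contained.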

Changes

| Cohort / File(s) | Summary |
|---|---|
| 1D Binning Logic (src/CombineCodegenImpl.cxx) | Replaced strict uniform-binning assertion with a dual-path approach. Uniform binning routes to uniformBinNumber; non-uniform generates a static bin edges array and uses rawBinNumber. |
| 2D Interpolation Enhancement (src/CombineCodegenImpl.cxx) | Extended FastVerticalInterpHistPdf2D2 to handle X and Y binning independently. Each axis conditionally uses uniformBinNumber or rawBinNumber with generated bin edges arrays. |
| Removed Constraints (src/CombineCodegenImpl.cxx) | Eliminated mandatory uniform-binning checks for X and Y axes; preserved the fast uniform path while enabling non-uniform support via generated static arrays and ROOT MathFuncs integration. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review bin edge static-array generation and emitted code correctness for non-uniform cases
  • Verify selection logic between uniformBinNumber and rawBinNumber for both axes in 2D paths
  • Confirm correct integration and argument passing to ROOT's rawBinNumber and handling of edge/boundary cases
  • Check for potential performance regressions in the uniform path and binary compatibility of generated code

Poem

🐰 I stitched new edges into the land of bins,

fast lanes for flats, winding roads for whims.
Static edges hum where curves used to hide,
ROOT counts the steps as I hop and glide.
Compile, run, and let interpolation sing.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding non-uniform binning support to the fastVerticalInterpHistPdf2 codegen implementation, which matches the primary objective and content of the changeset. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 612c77f and dc559f6.

📒 Files selected for processing (1)
  • src/CombineCodegenImpl.cxx (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Compile (py3.10, root6.32.2)
  • GitHub Check: Compile (py3.10, root6.26.4)
  • GitHub Check: Compile (py3.12, root6.34.4)
  • GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11
  • GitHub Check: dev3/latest - ROOT LCG master
  • GitHub Check: LCG_108 - ROOT 6.36.02
  • GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
  • GitHub Check: LCG_106 - ROOT 6.32.02
  • GitHub Check: LCG_102 - ROOT 6.26.04
  • GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
🔇 Additional comments (2)
src/CombineCodegenImpl.cxx (2)

81-101: LGTM! Dual-path binning implementation is clean.

The implementation correctly handles both uniform and non-uniform binning cases. The bin edge extraction logic (lines 93-97) properly constructs the edge array, and passing the binEdges vector directly to buildCall aligns with the previous reviewer's suggestion.


171-207: LGTM! 2D binning implementation correctly handles both axes independently.

The implementation properly handles uniform and non-uniform binning for both X and Y dimensions. The bin edge extraction and vector usage directly in buildCall (lines 187, 206) addresses the previous reviewer's feedback to pass vectors directly rather than building intermediate structures.



Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/CombineCodegenImpl.cxx (1)

155-155: Critical bug: Incorrect total bin count calculation.

numBinsY * numBinsY should be numBinsX * numBinsY. This typo will cause incorrect array sizing and out-of-bounds access when X and Y have different bin counts.

-  int numBins = numBinsY * numBinsY;
+  int numBins = numBinsX * numBinsY;
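Why the sizing matters can be shown with a small sketch. It assumes the row-major layout used elsewhere in this review, `binY + numBinsY * binX`; the helper names `flatIndex` and `totalBins` are hypothetical and do not appear in the PR.

```cpp
#include <cassert>
#include <cstddef>

// Flattened 2D bin index with Y as the fastest-running axis,
// i.e. the (binY + numBinsY * binX) layout quoted in the review.
inline std::size_t flatIndex(std::size_t binX, std::size_t binY, std::size_t numBinsY) {
    assert(binY < numBinsY);
    return binY + numBinsY * binX;
}

// Number of entries a buffer needs so that every (binX, binY) pair fits.
inline std::size_t totalBins(std::size_t numBinsX, std::size_t numBinsY) {
    return numBinsX * numBinsY;
}
```

With 5 bins in X and 3 in Y, the largest valid index is `flatIndex(4, 2, 3)`, which equals 14. A buffer sized `numBinsY * numBinsY`, here 9, is too small to hold it, whereas `totalBins(5, 3)` gives 15.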
🧹 Nitpick comments (2)
src/CombineCodegenImpl.cxx (2)

183-237: Correct dual-path binning for 2D axes; consider extracting helper to reduce duplication.

The binning logic for X and Y axes is implemented correctly. However, the bin edge array generation code (constructing static array, formatting to stringstream, calling addToCodeBody) is now repeated three times across 1D and 2D implementations.

Consider extracting a helper function like:

std::string emitBinEdgesArray(CodegenContext& ctx, const RooAbsBinning& binning, int numBins) {
  std::vector<double> binEdges(numBins + 1);
  for (int i = 0; i < numBins; ++i) {
    binEdges[i] = binning.binLow(i);
  }
  binEdges[numBins] = binning.binHigh(numBins - 1);
  
  std::string arrayName = ctx.getTmpVarName();
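  std::stringstream code;
  code << std::setprecision(17);  // default stream precision (6 digits) would truncate the edges; needs <iomanip>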
  code << "static const double " << arrayName << "[] = {";
  for (int i = 0; i <= numBins; ++i) {
    if (i > 0) code << ", ";
    code << binEdges[i];
  }
  code << "};\n";
  ctx.addToCodeBody(code.str(), true);
  return arrayName;
}

This would reduce duplication across lines 93-109, 192-207, and 220-235.


239-240: Minor inconsistency: Use numBinsY for consistency.

Line 240 uses yVar.numBins() while the rest of the function uses numBinsY (from cacheNominal.binY()). For consistency and to avoid potential discrepancies, consider using the already-computed variable.

-  binIdx << "(" << binIdxY << " + " << yVar.numBins() << " * " << binIdxX << ")";
+  binIdx << "(" << binIdxY << " + " << numBinsY << " * " << binIdxX << ")";
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a58a232 and 612c77f.

📒 Files selected for processing (1)
  • src/CombineCodegenImpl.cxx (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: dev3/latest - ROOT LCG master
  • GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
  • GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
  • GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11
  • GitHub Check: LCG_106 - ROOT 6.32.02
  • GitHub Check: LCG_102 - ROOT 6.26.04
  • GitHub Check: LCG_108 - ROOT 6.36.02
  • GitHub Check: Compile (py3.10, root6.32.2)
  • GitHub Check: Compile (py3.12, root6.34.4)
  • GitHub Check: Compile (py3.10, root6.26.4)
🔇 Additional comments (1)
src/CombineCodegenImpl.cxx (1)

81-113: Implementation correctly uses both binning functions with proper signatures.

The dual-path binning approach is sound and verified:

  • Uniform binning correctly calls uniformBinNumber(xLow, xHigh, xVar, numBins, 1.0) matching the signature uniformBinNumber(double low, double high, double val, unsigned int numBins, double coef).

  • Non-uniform binning correctly calls rawBinNumber(xVar, binEdgesArrayName, numBins + 1) matching the signature rawBinNumber(double x, double const *boundaries, std::size_t nBoundaries). The static bin edges array is properly generated with all boundary points (bin lows plus the final high), and the variable name is correctly passed as a C++ identifier in the generated code.

The bin edge extraction logic correctly captures all boundaries: binLow(i) for each bin plus binHigh(numBins - 1) for the final edge.
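The edge extraction described above can be sketched without ROOT. Everything here is an assumption made for illustration: `Bin` and `makeEdges` are hypothetical, and `lookupBin` only borrows the parameter list quoted for rawBinNumber (value, pointer to boundaries, boundary count); its body is a simple stand-in, not ROOT's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-bin description, standing in for binLow(i) / binHigh(i).
struct Bin {
    double low;
    double high;
};

// Collect numBins + 1 boundaries: the low edge of every bin plus the final high edge.
inline std::vector<double> makeEdges(const std::vector<Bin>& bins) {
    assert(!bins.empty());
    std::vector<double> edges;
    edges.reserve(bins.size() + 1);
    for (const Bin& b : bins) edges.push_back(b.low);
    edges.push_back(bins.back().high);
    return edges;
}

// Stand-in lookup taking the same kind of arguments as the quoted signature.
// Out-of-range values are clamped to the first or last bin (an assumption).
inline std::size_t lookupBin(double x, double const* boundaries, std::size_t nBoundaries) {
    assert(nBoundaries >= 2);
    double const* end = boundaries + nBoundaries;
    double const* it = std::upper_bound(boundaries, end, x);
    if (it == boundaries) return 0;
    std::size_t idx = static_cast<std::size_t>(it - boundaries) - 1;
    return std::min(idx, nBoundaries - 2);
}
```

Note that the count passed to the lookup is the number of boundaries, `numBins + 1`, not the number of bins; this matches the `numBins + 1` argument in the calls reviewed here.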

@codecov

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 0% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 22.23%. Comparing base (a58a232) to head (dc559f6).
⚠️ Report is 9 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/CombineCodegenImpl.cxx | 0.00% | 33 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (98.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1188      +/-   ##
==========================================
- Coverage   22.25%   22.23%   -0.02%     
==========================================
  Files         195      195              
  Lines       26154    26172      +18     
  Branches     3884     3887       +3     
==========================================
  Hits         5820     5820              
- Misses      20334    20352      +18     
| Files with missing lines | Coverage Δ |
|---|---|
| src/CombineCodegenImpl.cxx | 0.00% <0.00%> (ø) |
@maxgalli maxgalli requested a review from anigamova December 3, 2025 09:57
@anigamova anigamova requested a review from guitargeek December 3, 2025 10:55
ctx.addToCodeBody(binEdgesCode.str(), true);

// Call ROOT's rawBinNumber for non-uniform bin finding
binIdx = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", xVar, binEdgesArrayName, numBins + 1);
Collaborator

Suggested change
-  binIdx = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", xVar, binEdgesArrayName, numBins + 1);
+  binIdx = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", xVar, binEdges, numBins + 1);

Can you try to use the std::vector<double> binEdges directly to build the function call here?
The code generation context will generate the code with the copied values implicitly then, and also take care of passing the right variable name.
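What "copying the values into the generated code" means can be illustrated with a toy emitter. This is a sketch under loud assumptions: `emitArray` is a hypothetical function written for this example and says nothing about how RooFit's code generation context actually names, formats, or stores the array.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Toy emitter (hypothetical): turns a vector of doubles into the source text
// of an array definition, so the values end up embedded in generated code.
inline std::string emitArray(const std::string& name, const std::vector<double>& values) {
    assert(!values.empty());
    std::ostringstream out;
    // max_digits10 significant digits are enough to round-trip any double.
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    out << "double " << name << "[" << values.size() << "] = {";
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i > 0) out << ", ";
        out << values[i];
    }
    out << "};\n";
    return out.str();
}
```

Letting one shared routine do this, rather than hand-writing a stringstream loop at each call site, is the point of the suggestion: the formatting and the variable naming then live in a single place.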

binEdgesCodeY << "};\n";

ctx.addToCodeBody(binEdgesCodeY.str(), true);
binIdxY = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", arg.y(), binEdgesArrayNameY, numBinsY + 1);
Collaborator

Suggested change
-  binIdxY = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", arg.y(), binEdgesArrayNameY, numBinsY + 1);
+  binIdxY = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", arg.y(), binEdgesY, numBinsY + 1);

Same here

binEdgesCodeX << "};\n";

ctx.addToCodeBody(binEdgesCodeX.str(), true);
binIdxX = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", arg.x(), binEdgesArrayNameX, numBinsX + 1);
Collaborator

Suggested change
-  binIdxX = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", arg.x(), binEdgesArrayNameX, numBinsX + 1);
+  binIdxX = ctx.buildCall("RooFit::Detail::MathFuncs::rawBinNumber", arg.x(), binEdgesX, numBinsX + 1);

And here

Collaborator

@guitargeek guitargeek left a comment

Thank you very much, looks very correct!

I just made a suggestion to avoid generating some of the code manually. You can pass the standard vectors with the bin edges directly when building the function call; the code generation context will copy the values into the generated code for you.

Collaborator

@guitargeek guitargeek left a comment

Thank you very much!

@maxgalli
Collaborator Author

maxgalli commented Dec 4, 2025

> Thank you very much, looks very correct!
>
> I just made a suggestion to avoid generating some of the code manually. You can pass the standard vectors with the bin edges directly when building the function call; the code generation context will copy the values into the generated code for you.

Thanks a lot, I updated it now

Collaborator

@anigamova anigamova left a comment

Looks good!
How about adding tests here? Can also go into a separate PR

@maxgalli
Collaborator Author

maxgalli commented Dec 4, 2025

> Looks good! How about adding tests here? Can also go into a separate PR

I was thinking about it. In principle, this is already part of the test that runs here: until now, it failed with the error raised in these lines, whereas now it will fail because the codegen backend for RooParametricHist is not implemented.
We could either test the function itself (but it just uses rawBinNumber from RooFit, so it is probably already tested in ROOT) or run a full combine workflow with a model that uses fastVerticalInterpHistPdf2 with non-uniformly binned histograms but without RooParametricHist (do we have such a model in the examples? I couldn't find one). Opinions? @anigamova @guitargeek

@anigamova
Collaborator

OK, I see. We just have to work on the AD tests. Ideally we need both combine workflow tests and tests of the individual functions, but of course that will take time and is probably not very relevant for this PR.

@anigamova anigamova merged commit 18685de into cms-analysis:main Dec 4, 2025
21 of 22 checks passed
3 participants