Add codegen implementations of CMSHist classes #1191

PetroZarytskyi · 2025-12-04T15:30:41Z

This PR is based on #1162

Summary by CodeRabbit

Release Notes

New Features
- Enhanced histogram processing with new utility functions for bin indexing and weighted sums
- Added code generation support for histogram-based operations
- Introduced public accessors for histogram data and internal variables
Bug Fixes
- Corrected histogram bin boundary lookup behavior for edge cases
Refactor
- Improved test framework parameter handling and configuration logic

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-04T15:31:13Z

Walkthrough

The PR adds new public accessor methods to histogram classes (getXVar, getFuncValList, GetValues, GetBinEdges), introduces utility math functions for histogram operations, and implements code generation support for CMSHistFunc, CMSHistErrorPropagator, and CMSHistSum. It also modifies bin-edge lookup behavior in FastTemplate and updates test infrastructure with refactored initialization logic.

Changes

Cohort / File(s)	Summary
Histogram class accessors `interface/CMSHistErrorPropagator.h`, `interface/CMSHistSum.h`, `interface/FastTemplate_Old.h`	Added public getters: `getXVar()` returning const reference to x_ proxy, `getFuncValList(fnIdx)` returning per-bin values vector, `GetValues()` and `GetBinEdges()` exposing internal containers; FastHisto constructor changed to defaulted form
Codegen infrastructure `interface/CombineCodegenImpl.h`, `src/CombineCodegenImpl.cxx`	Added forward declarations and new overloads for `codegenImpl()` and `codegenIntegralImpl()` handling CMSHistErrorPropagator, CMSHistFunc, and CMSHistSum; each generates specialized math function calls with cached data; includes reorganized `#include` directives
Math utility functions `interface/CombineMathFuncs.h`	Introduced three new inline functions in `RooFit::Detail::MathFuncs`: `cmsHistFunc()` for bin indexing and value lookup, `cmsHistErrorPropagator()` for coefficient-weighted sums, `cmsHistSum()` for bin-wise weighted sums across samples; added supporting include
Implementation `src/CMSHistSum.cc`	Implemented `getFuncValList()` computing per-bin values with LogQuadLinear transformation, underflow cropping, and bin-type-specific adjustments from binerrors_ and scaledbinmods_
Boundary lookup behavior `src/FastTemplate_Old.cc`	Replaced `std::lower_bound` with `std::upper_bound` across all GetAt and FindBin methods in FastHisto/FastHisto2D/FastHisto3D, altering bin index mapping for edge values
Test infrastructure `test/testCreateNLL.cxx`	Refactored `configureCascadeMinimizerState()` with guarded parameter list initialization; updated `runFit()` signature and implementation with codegen path handling, improved error propagation, and result summaries; added FitSummary struct; adjusted test parameterization and formatting

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

src/FastTemplate_Old.cc: Critical boundary behavior change (lower_bound → upper_bound) affecting bin indexing across three dimensional variants; requires careful validation that edge-case handling aligns with intended semantics
src/CMSHistSum.cc: getFuncValList() contains logic branching on bin types with transformations and scaling operations; verify correctness of underflow cropping and bin modifier application
test/testCreateNLL.cxx: Extensive refactoring of initialization logic and test flow; new struct definitions and enum require validation of test coverage and backend comparison paths
src/CombineCodegenImpl.cxx: Multiple new codegen overloads assembling cached data; placeholder integral implementations ("2.0", "3.0") may be incomplete—confirm intended behavior

Possibly related PRs

Do analytical Barlow Beeston before likelihood evaluation #1137: Modifies same CMSHistErrorPropagator and CMSHistSum classes, adding/removing public interface members
Unit test for template shape model with bare C++ Combine classes #1131: Updates test/testCreateNLL.cxx with similar refactoring of runFit and configureCascadeMinimizerState functions
Add non-uniform binning support for fastVerticalInterpHistPdf2 codegen implementation #1188: Extends codegen implementation in src/CombineCodegenImpl.cxx for histogram-based PDF operations

Suggested reviewers

guitargeek
anigamova

Poem

🐰 Hop! New histograms take their stage,
With getters swift and codegen's page,
Math functions bloom in detail's glow,
Upper bounds help bin edges flow,
Tests adapt with care and might—
Combine's canvas shines so bright! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely summarizes the main change: adding codegen implementations for CMSHist classes, which aligns with the substantial additions across multiple files.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/FastTemplate_Old.cc (1)

1-1: Fix clang-format violations.

The CI pipeline reports formatting differences. Run git-clang-format HEAD~ to fix code style issues.

src/CombineCodegenImpl.cxx (1)

1-327: Address clang-format violations.

The CI pipeline reports formatting differences. Run git-clang-format HEAD~ to fix code style issues.

🧹 Nitpick comments (2)

interface/CombineMathFuncs.h (2)
317-324: Unused parameter x in cmsHistErrorPropagator.

The x parameter is declared but never used in the function body. If this is intentional (e.g., to maintain a consistent interface), consider adding (void)x; or using [[maybe_unused]] to suppress compiler warnings. Otherwise, remove the parameter.
-inline double cmsHistErrorPropagator(double x, std::size_t nFuncs, double const* coefList, double const* funcList) {
+inline double cmsHistErrorPropagator([[maybe_unused]] double x, std::size_t nFuncs, double const* coefList, double const* funcList) {
   // My naive understanding of the logic: multiply functions with coefficients and sum up
   double out = 0.;
   for (std::size_t i = 0; i < nFuncs; ++i) {
     out += coefList[i] * funcList[i];
   }
   return out;
 }
326-334: Parameter coeffs should be const.

The coeffs pointer is only read, not modified. For consistency with binEdges and values, and to enforce const-correctness:
 inline double cmsHistSum(
-    double x, std::size_t nBins, std::size_t nSamples, double* coeffs, double const* binEdges, double const* values) {
+    double x, std::size_t nBins, std::size_t nSamples, double const* coeffs, double const* binEdges, double const* values) {
   unsigned int binIdx = RooFit::Detail::MathFuncs::binNumber(x, 1.0, binEdges, nBins + 1, nBins, 0);
   double val = 0.;
   for (std::size_t iSample = 0; iSample < nSamples; ++iSample)
     val += coeffs[iSample] * values[iSample * nBins + binIdx];

   return val;
 }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 18685de and 6f1e1c3.

📒 Files selected for processing (9)

interface/CMSHistErrorPropagator.h (1 hunks)
interface/CMSHistSum.h (1 hunks)
interface/CombineCodegenImpl.h (2 hunks)
interface/CombineMathFuncs.h (2 hunks)
interface/FastTemplate_Old.h (2 hunks)
src/CMSHistSum.cc (1 hunks)
src/CombineCodegenImpl.cxx (2 hunks)
src/FastTemplate_Old.cc (5 hunks)
test/testCreateNLL.cxx (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

interface/CMSHistSum.h (1)

src/CMSHistSum.cc (2)

getFuncValList (415-431)

getFuncValList (415-415)

interface/CombineCodegenImpl.h (3)

interface/CMSHistErrorPropagator.h (1)

CMSHistErrorPropagator (17-47)

interface/CMSHistSum.h (1)

CMSHistSum (17-48)

src/CMSHistSum.cc (3)

CMSHistSum (22-22)

CMSHistSum (24-170)

CMSHistSum (172-197)

interface/CombineMathFuncs.h (2)

src/CombineCodegenImpl.cxx (1)

values (302-302)

interface/CombineCodegenImpl.h (1)

RooFit (15-33)

interface/FastTemplate_Old.h (2)

src/CMSInterferenceFunc.cxx (1)

values_ (28-28)

src/FastTemplate_Old.cc (2)

FastHisto (56-66)

FastHisto (68-73)

src/FastTemplate_Old.cc (2)

src/CMSInterferenceFunc.cxx (1)

values_ (28-28)

src/HZZ4LRooPdfs.cc (1)

z (3878-3878)

🪛 GitHub Actions: clang-format

src/CombineCodegenImpl.cxx

[error] 1-1: clang-format formatting differences detected. Run 'git-clang-format HEAD~' to fix code style issues in this file.

src/FastTemplate_Old.cc

[error] 1-1: clang-format formatting differences detected. Run 'git-clang-format HEAD~' to fix code style issues in this file.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11
GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
GitHub Check: dev3/latest - ROOT LCG master
GitHub Check: LCG_108 - ROOT 6.36.02
GitHub Check: Compile (py3.12, root6.34.4)

🔇 Additional comments (15)

interface/CMSHistErrorPropagator.h (1)

74-75: LGTM!

The new getXVar() accessor is consistent with existing accessors (coefList(), funcList()) and properly exposes the x variable for the codegen integration.

src/FastTemplate_Old.cc (2)

75-80: Breaking change in bin lookup semantics for edge values.

Switching from std::lower_bound to std::upper_bound changes behavior when x lands exactly on a bin edge. With lower_bound, a value exactly on edge E would map to the bin before E; with upper_bound, it maps to the bin starting at E.

Example with binEdges = [0, 1, 2, 3] and x = 1.0:

lower_bound: returns iterator to 1.0, index = 1, bin = 0

upper_bound: returns iterator to 2.0, index = 2, bin = 1

Please confirm this behavioral change is intentional and that downstream consumers (including the new codegen paths) expect the updated semantics.

83-87: Consistent application of upper_bound.

The same change is applied here as in FindBin. This is consistent, but shares the same behavioral implications noted above.

interface/FastTemplate_Old.h (3)

79-80: LGTM!

The GetValues() accessor follows the same pattern as CMSInterferenceFunc::getValues() (see src/CMSInterferenceFunc.cxx:27), providing consistent API for accessing internal values.

87-87: Verify default initialization behavior.

Switching to = default relies on default member initialization. FastTemplate base class initializes size_ to 0 and values_ to empty, and binEdges_/binWidths_ will be empty vectors. This should be equivalent to the previous explicit default constructor, but please verify no initialization was lost.

129-130: LGTM!

The GetBinEdges() accessor is consistent with GetValues() and provides the necessary access for the new codegen paths in CombineMathFuncs.h.

interface/CombineMathFuncs.h (2)

10-11: LGTM!

The include is necessary for RooFit::Detail::MathFuncs::binNumber used in the new helper functions.

311-315: Verify binNumber function signature.

Please verify that the binNumber call matches the expected signature from RooFit/Detail/MathFuncs.h. The arguments (x, 1.0, binEdges, nBins + 1, nBins, 0) should be confirmed against the function's parameter list to ensure correct parameter types and order.

interface/CMSHistSum.h (1)

77-78: Consider bounds checking for fnIdx.

The getFuncValList method accesses multiple container elements using fnIdx as an index. Without verification of the implementation, please ensure that fnIdx is validated against appropriate bounds (likely n_procs_ or container sizes) to prevent undefined behavior on invalid input.

Consider adding an assertion or bounds check for this public method, or document any assumptions about valid input ranges.

interface/CombineCodegenImpl.h (1)

7-9: LGTM!

The forward declarations and function signatures follow the existing patterns in the header. The additions are consistent with the other codegen implementations.

Also applies to: 20-22, 28-30

test/testCreateNLL.cxx (3)

34-40: LGTM!

The refactored loadSnapshotIfExists is cleaner with explicit null checks before operations.

42-124: Well-structured refactoring of minimizer configuration.

The reorganized configureCascadeMinimizerState function has clearer initialization order and explicit null/type checks. The use of structured bindings and range-based loops improves readability.

393-397: Verify test coverage intent with commented-out configurations.

Two test configurations are commented out (analytic on for both backends). If these are intentionally disabled, consider adding a comment explaining why, or remove them if they're not planned for future use.

src/CombineCodegenImpl.cxx (1)

263-278: Calling evaluate() during codegen introduces side effects that mutate internal caches.

The call to arg.evaluate() on line 264 triggers cache creation as a side effect. This pattern is problematic for a codegen function, which should be side-effect-free and repeatable. The adjacent comment acknowledges uncertainty about whether values are constant.

Consider extracting a const cache accessor or a dedicated cache initialization method that doesn't have evaluate's side effects. This would make the codegen contract clearer and safer if invoked multiple times.

src/CMSHistSum.cc (1)

423-429: Bin type logic may be inconsistent with updateCache.

In updateCache (lines 393-404), bin types 2 and 3 are handled per-process using bintypes_[j][i] where i is the process index. Here, the code checks bintypes_[j][0] regardless of fnIdx, which would only match the first process's bin type configuration. This seems inconsistent with how per-process bin types are intended to work.

Should this be checking bintypes_[j][fnIdx] for per-process bin types 2/3?

coderabbitai · 2025-12-04T15:36:48Z

src/CMSHistSum.cc

+std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {
+  staging_ = compcache_[fnIdx];
+  if (vtype_[fnIdx] == CMSHistFunc::VerticalSetting::LogQuadLinear) {
+    staging_.Exp();
+    staging_.Scale(storage_[process_fields_[fnIdx]].Integral() / staging_.Integral());
+  }
+  staging_.CropUnderflows();
+  std::vector<double> result = staging_.GetValues();
+  for (unsigned j = 0; j < bintypes_.size(); ++j) {
+    if (bintypes_[j][0] == 1) {
+      double x = vbinpars_[j][0]->getVal();
+      result[j] += binerrors_[fnIdx][j] * x;
+    } else if (bintypes_[j][0] == 2 || bintypes_[j][0] == 3)
+      result[j] += scaledbinmods_[fnIdx][j];
+  }
+  return result;
+}


⚠️ Potential issue | 🟠 Major

Missing bounds check on fnIdx parameter.

The function does not validate that fnIdx is within valid range before accessing compcache_[fnIdx], vtype_[fnIdx], binerrors_[fnIdx], and scaledbinmods_[fnIdx]. This could lead to out-of-bounds access if called with an invalid index.

Consider adding bounds validation:

std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) { + if (fnIdx >= static_cast<std::size_t>(n_procs_)) { + throw std::out_of_range("fnIdx exceeds number of processes"); + } staging_ = compcache_[fnIdx];

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {

staging_ = compcache_[fnIdx];

if (vtype_[fnIdx] == CMSHistFunc::VerticalSetting::LogQuadLinear) {

staging_.Exp();

staging_.Scale(storage_[process_fields_[fnIdx]].Integral() / staging_.Integral());

}

staging_.CropUnderflows();

std::vector<double> result = staging_.GetValues();

for (unsigned j = 0; j < bintypes_.size(); ++j) {

if (bintypes_[j][0] == 1) {

double x = vbinpars_[j][0]->getVal();

result[j] += binerrors_[fnIdx][j] * x;

} else if (bintypes_[j][0] == 2 || bintypes_[j][0] == 3)

result[j] += scaledbinmods_[fnIdx][j];

}

return result;

}

std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {

if (fnIdx >= static_cast<std::size_t>(n_procs_)) {

throw std::out_of_range("fnIdx exceeds number of processes");

}

staging_ = compcache_[fnIdx];

if (vtype_[fnIdx] == CMSHistFunc::VerticalSetting::LogQuadLinear) {

staging_.Exp();

staging_.Scale(storage_[process_fields_[fnIdx]].Integral() / staging_.Integral());

}

staging_.CropUnderflows();

std::vector<double> result = staging_.GetValues();

for (unsigned j = 0; j < bintypes_.size(); ++j) {

if (bintypes_[j][0] == 1) {

double x = vbinpars_[j][0]->getVal();

result[j] += binerrors_[fnIdx][j] * x;

} else if (bintypes_[j][0] == 2 || bintypes_[j][0] == 3)

result[j] += scaledbinmods_[fnIdx][j];

}

return result;

}

coderabbitai · 2025-12-04T15:36:49Z

src/CombineCodegenImpl.cxx

+std::string RooFit::Experimental::codegenIntegralImpl(CMSHistErrorPropagator& arg,
+                                                      int code,
+                                                      const char* rangeName,
+                                                      CodegenContext& ctx) {
+  return "2.0";  // TODO: dummy for now
+}


⚠️ Potential issue | 🟡 Minor

Placeholder integral implementation returns hardcoded value.

The codegenIntegralImpl for CMSHistErrorPropagator returns "2.0", which is a placeholder. This should be tracked and replaced with a proper implementation before production use.

Would you like me to open an issue to track implementing the proper integral codegen for CMSHistErrorPropagator?

coderabbitai · 2025-12-04T15:36:49Z

src/CombineCodegenImpl.cxx

+void RooFit::Experimental::codegenImpl(CMSHistSum& arg, CodegenContext& ctx) {
+  arg.evaluate();
+  std::vector<double> const& edges = arg.cache().GetBinEdges();
+  std::size_t nBins = edges.size() - 1;
+  RooArgList const& coefs = arg.coefList();
+  std::size_t nSamples = coefs.size();
+  std::vector<double> values(nBins * nSamples);
+  for (std::size_t iSamples = 0; iSamples < nSamples; ++iSamples) {
+    std::vector<double> sampleValues = arg.getFuncValList(iSamples);
+    for (std::size_t iBin = 0; iBin < nBins; ++iBin)
+      values[iBin + iSamples * nBins] = sampleValues[iBin];
+  }
+
+  ctx.addResult(&arg,
+                ctx.buildCall("RooFit::Detail::MathFuncs::cmsHistSum",
+                              arg.getXVar(),
+                              nBins,
+                              nSamples,
+                              coefs,
+                              edges,
+                              values
+                              ));
+}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for staging_ usage in the codebase rg -n 'staging_' --type cc --type h

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224

🏁 Script executed:

# Look for the CMSHistSum class definition and getFuncValList method rg -n 'class CMSHistSum\|getFuncValList' --type cc --type h -A 3

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224

🏁 Script executed:

# Search for mutex, lock, or synchronization primitives in CMSHistSum files rg -n 'mutex|lock|synchron|thread' src/CMSHistSum.cc interface/CMSHistSum.h --type cc --type h

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224

🏁 Script executed:

# Check the codegenImpl function and CodegenContext for threading info rg -n 'codegenImpl|CodegenContext' src/CombineCodegenImpl.cxx --type cc -B 2 -A 5

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224

🌐 Web query:

CMSHistSum RooFit codegen thread safety concurrent

💡 Result:

Short answer: not safely — RooFit's code generation and some related operations (workspace I/O, codegen/JIT, shared caches) are not thread‑safe for concurrent use; protect them with a mutex or run them in separate processes. Evidence: forum threads and RooFit source show codegen/IO usage is serialized in practice and users report segfaults when calling import/codegen from multiple threads. [1][2]. Recommendation: serialize codegen/workspace import (std::mutex) or fork worker processes. [3]

Sources:

Threaded RooFit issues and advice to lock workspace import (forum discussion). [1]

RooFit codegen implementation / headers (shows centralized codegen structures). [2]

Combine documentation (RooFit-based tools used in CMS — use separate processes or guard shared RooFit operations). [3]

Ensure codegen executes in a single-threaded context; RooFit's code generation is not thread-safe.

The getFuncValList method and codegen operations in RooFit modify shared internal state and are documented as unsafe for concurrent execution. Calling this function from multiple threads will cause race conditions and potential segfaults. Either serialize codegen execution with a mutex, run codegen in separate processes, or document that this function must only be invoked single-threaded.

coderabbitai · 2025-12-04T15:36:49Z

src/CombineCodegenImpl.cxx

+std::string RooFit::Experimental::codegenIntegralImpl(CMSHistSum& arg,
+                                                      int code,
+                                                      const char* rangeName,
+                                                      CodegenContext& ctx) {
+  return "3.0";  // TODO: dummy for now
+}


⚠️ Potential issue | 🟡 Minor

Placeholder integral implementation returns hardcoded value.

Similarly, codegenIntegralImpl for CMSHistSum returns "3.0". This needs a proper implementation.

Would you like me to open an issue to track implementing the proper integral codegen for CMSHistSum?

coderabbitai · 2025-12-04T15:36:49Z

test/testCreateNLL.cxx

+    auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));
+    if (!workspace) {
+      std::cerr << "ERROR: workspace '" << workspaceName << "' not found in '" << fileName << "'.\n";
+      file->ls();
+      //return 2;
+    }

-  // Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
-  utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
-  loadSnapshotIfExists(*workspace, "clean");
+    // Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
+    utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
+    loadSnapshotIfExists(*workspace, "clean");

-  auto *modelConfig =
-      dynamic_cast<RooStats::ModelConfig *>(workspace->genobj(modelConfigName.c_str()));
-  if (!modelConfig) {
-    std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '"
-              << workspaceName << "'.\n";
-    //return 2;
-  }
+    auto* modelConfig = dynamic_cast<RooStats::ModelConfig*>(workspace->genobj(modelConfigName.c_str()));
+    if (!modelConfig) {
+      std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '" << workspaceName << "'.\n";
+      //return 2;
+    }

-  RooAbsPdf *pdf = modelConfig->GetPdf();
-  if (!pdf) {
-    std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
-    //return 2;
-  }
+    RooAbsPdf* pdf = modelConfig->GetPdf();
+    if (!pdf) {
+      std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
+      //return 2;
+    }

-  RooAbsData *data = workspace->data(dataName.c_str());
-  if (!data) {
-    std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName
-              << "'.\n";
-    //return 2;
-  }
+    RooAbsData* data = workspace->data(dataName.c_str());
+    if (!data) {
+      std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName << "'.\n";
+      //return 2;
+    }


⚠️ Potential issue | 🟠 Major

Null pointer checks lack proper test failure assertions.

Multiple null pointer checks (lines 295-299, 306-309, 312-315, 318-321) log errors to std::cerr but do not fail the test or return. The commented-out return statements suggest this was intentional, but the test will continue with null pointers, likely causing crashes or undefined behavior.

Either add ASSERT_TRUE checks or uncomment the returns:

auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str())); - if (!workspace) { - std::cerr << "ERROR: workspace '" << workspaceName << "' not found in '" << fileName << "'.\n"; - file->ls(); - //return 2; - } + ASSERT_NE(workspace, nullptr) << "workspace '" << workspaceName << "' not found in '" << fileName << "'";

Apply similar changes for modelConfig, pdf, data, and poi checks.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));

if (!workspace) {

std::cerr << "ERROR: workspace '" << workspaceName << "' not found in '" << fileName << "'.\n";

file->ls();

//return 2;

}

// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603

utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);

loadSnapshotIfExists(*workspace, "clean");

// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603

utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);

loadSnapshotIfExists(*workspace, "clean");

auto *modelConfig =

dynamic_cast<RooStats::ModelConfig *>(workspace->genobj(modelConfigName.c_str()));

if (!modelConfig) {

std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '"

<< workspaceName << "'.\n";

//return 2;

}

auto* modelConfig = dynamic_cast<RooStats::ModelConfig*>(workspace->genobj(modelConfigName.c_str()));

if (!modelConfig) {

std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '" << workspaceName << "'.\n";

//return 2;

}

RooAbsPdf *pdf = modelConfig->GetPdf();

if (!pdf) {

std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";

//return 2;

}

RooAbsPdf* pdf = modelConfig->GetPdf();

if (!pdf) {

std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";

//return 2;

}

RooAbsData *data = workspace->data(dataName.c_str());

if (!data) {

std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName

<< "'.\n";

//return 2;

}

RooAbsData* data = workspace->data(dataName.c_str());

if (!data) {

std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName << "'.\n";

//return 2;

}

auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));

ASSERT_NE(workspace, nullptr) << "workspace '" << workspaceName << "' not found in '" << fileName << "'";

// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603

utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);

loadSnapshotIfExists(*workspace, "clean");

auto* modelConfig = dynamic_cast<RooStats::ModelConfig*>(workspace->genobj(modelConfigName.c_str()));

if (!modelConfig) {

std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '" << workspaceName << "'.\n";

//return 2;

}

RooAbsPdf* pdf = modelConfig->GetPdf();

if (!pdf) {

std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";

//return 2;

}

RooAbsData* data = workspace->data(dataName.c_str());

if (!data) {

std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName << "'.\n";

//return 2;

}

🤖 Prompt for AI Agents

In test/testCreateNLL.cxx around lines 294 to 321, the null pointer checks for workspace, modelConfig, pdf, and data only log errors to std::cerr and leave commented-out returns, which lets the test continue with null pointers; replace those with proper test failures by either uncommenting the return statements or (preferably in this unit test) add ASSERT_TRUE checks (e.g., ASSERT_TRUE(workspace) << "message") for workspace, modelConfig, pdf, and data (and do the same for the poi check elsewhere), so the test will immediately fail instead of continuing with null pointers.

coderabbitai · 2025-12-04T15:36:50Z

test/testCreateNLL.cxx

+      auto codegenFit = runFit(*workspace, *modelConfig, *pdf, *data, *poi, "codegen_no_grad", &errMessage);
+      if (!codegenFit) {
+        if (cfg.expectFailure) {
+          GTEST_SKIP() << "Expected failure for codegen backend: " << errMessage;
+        } else {
+          ADD_FAILURE() << "Failed to run fit with 'codegen' backend: " << errMessage;
+        }
+        return;
+      }


⚠️ Potential issue | 🟡 Minor

Error message references wrong backend name.

Line 361 reports "Failed to run fit with 'codegen' backend" but the actual backend used on line 356 is "codegen_no_grad". This could cause confusion when debugging test failures.

- ADD_FAILURE() << "Failed to run fit with 'codegen' backend: " << errMessage; + ADD_FAILURE() << "Failed to run fit with 'codegen_no_grad' backend: " << errMessage;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

auto codegenFit = runFit(*workspace, *modelConfig, *pdf, *data, *poi, "codegen_no_grad", &errMessage);

if (!codegenFit) {

if (cfg.expectFailure) {

GTEST_SKIP() << "Expected failure for codegen backend: " << errMessage;

} else {

ADD_FAILURE() << "Failed to run fit with 'codegen' backend: " << errMessage;

}

return;

}

auto codegenFit = runFit(*workspace, *modelConfig, *pdf, *data, *poi, "codegen_no_grad", &errMessage);

if (!codegenFit) {

if (cfg.expectFailure) {

GTEST_SKIP() << "Expected failure for codegen backend: " << errMessage;

} else {

ADD_FAILURE() << "Failed to run fit with 'codegen_no_grad' backend: " << errMessage;

}

return;

}

🤖 Prompt for AI Agents

In test/testCreateNLL.cxx around lines 356 to 364, the failure message references the wrong backend name: the code calls runFit with "codegen_no_grad" but logs "codegen" on failure; change the error strings to mention "codegen_no_grad" (both in the GTEST_SKIP and ADD_FAILURE messages where applicable) so the logged backend matches the actual backend used and avoid confusion during debugging.

vgvassilev · 2025-12-07T10:42:39Z

test/testCreateNLL.cxx

+    if (!ws.getSnapshot(name))
+      return false;
+    return ws.loadSnapshot(name);
+  }


Did we touch these lines actually? If not, let's not change them.

PetroZarytskyi and others added 2 commits December 4, 2025 16:26

Consider a bin edge to fall in the next bin like in ROOT

a538001

[WIP] Add codegen implementations of CMSHist classes

6f1e1c3

coderabbitai bot reviewed Dec 4, 2025

View reviewed changes

vgvassilev reviewed Dec 7, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add codegen implementations of CMSHist classes #1191

Add codegen implementations of CMSHist classes #1191

Uh oh!

PetroZarytskyi commented Dec 4, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 4, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 4, 2025

Uh oh!

coderabbitai bot Dec 4, 2025

Uh oh!

coderabbitai bot Dec 4, 2025

Uh oh!

coderabbitai bot Dec 4, 2025

Uh oh!

coderabbitai bot Dec 4, 2025

Uh oh!

coderabbitai bot Dec 4, 2025

Uh oh!

vgvassilev Dec 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Add codegen implementations of CMSHist classes #1191

Are you sure you want to change the base?

Add codegen implementations of CMSHist classes #1191

Uh oh!

Conversation

PetroZarytskyi commented Dec 4, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

vgvassilev Dec 7, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

PetroZarytskyi commented Dec 4, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 4, 2025 •

edited

Loading