Skip to content

Conversation

@PetroZarytskyi
Copy link
Contributor

@PetroZarytskyi PetroZarytskyi commented Dec 4, 2025

This PR is based on #1162

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced histogram processing with new utility functions for bin indexing and weighted sums
    • Added code generation support for histogram-based operations
    • Introduced public accessors for histogram data and internal variables
  • Bug Fixes

    • Corrected histogram bin boundary lookup behavior for edge cases
  • Refactor

    • Improved test framework parameter handling and configuration logic

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 4, 2025

Walkthrough

The PR adds new public accessor methods to histogram classes (getXVar, getFuncValList, GetValues, GetBinEdges), introduces utility math functions for histogram operations, and implements code generation support for CMSHistFunc, CMSHistErrorPropagator, and CMSHistSum. It also modifies bin-edge lookup behavior in FastTemplate and updates test infrastructure with refactored initialization logic.

Changes

Cohort / File(s) Summary
Histogram class accessors
interface/CMSHistErrorPropagator.h, interface/CMSHistSum.h, interface/FastTemplate_Old.h
Added public getters: getXVar() returning const reference to x_ proxy, getFuncValList(fnIdx) returning per-bin values vector, GetValues() and GetBinEdges() exposing internal containers; FastHisto constructor changed to defaulted form
Codegen infrastructure
interface/CombineCodegenImpl.h, src/CombineCodegenImpl.cxx
Added forward declarations and new overloads for codegenImpl() and codegenIntegralImpl() handling CMSHistErrorPropagator, CMSHistFunc, and CMSHistSum; each generates specialized math function calls with cached data; includes reorganized #include directives
Math utility functions
interface/CombineMathFuncs.h
Introduced three new inline functions in RooFit::Detail::MathFuncs: cmsHistFunc() for bin indexing and value lookup, cmsHistErrorPropagator() for coefficient-weighted sums, cmsHistSum() for bin-wise weighted sums across samples; added supporting include
Implementation
src/CMSHistSum.cc
Implemented getFuncValList() computing per-bin values with LogQuadLinear transformation, underflow cropping, and bin-type-specific adjustments from binerrors_ and scaledbinmods_
Boundary lookup behavior
src/FastTemplate_Old.cc
Replaced std::lower_bound with std::upper_bound across all GetAt and FindBin methods in FastHisto/FastHisto2D/FastHisto3D, altering bin index mapping for edge values
Test infrastructure
test/testCreateNLL.cxx
Refactored configureCascadeMinimizerState() with guarded parameter list initialization; updated runFit() signature and implementation with codegen path handling, improved error propagation, and result summaries; added FitSummary struct; adjusted test parameterization and formatting

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • src/FastTemplate_Old.cc: Critical boundary behavior change (lower_bound → upper_bound) affecting bin indexing across three dimensional variants; requires careful validation that edge-case handling aligns with intended semantics
  • src/CMSHistSum.cc: getFuncValList() contains logic branching on bin types with transformations and scaling operations; verify correctness of underflow cropping and bin modifier application
  • test/testCreateNLL.cxx: Extensive refactoring of initialization logic and test flow; new struct definitions and enum require validation of test coverage and backend comparison paths
  • src/CombineCodegenImpl.cxx: Multiple new codegen overloads assembling cached data; placeholder integral implementations ("2.0", "3.0") may be incomplete—confirm intended behavior

Possibly related PRs

Suggested reviewers

  • guitargeek
  • anigamova

Poem

🐰 Hop! New histograms take their stage,
With getters swift and codegen's page,
Math functions bloom in detail's glow,
Upper bounds help bin edges flow,
Tests adapt with care and might—
Combine's canvas shines so bright!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding codegen implementations for CMSHist classes, which aligns with the substantial additions across multiple files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/FastTemplate_Old.cc (1)

1-1: Fix clang-format violations.

The CI pipeline reports formatting differences. Run git-clang-format HEAD~ to fix code style issues.

src/CombineCodegenImpl.cxx (1)

1-327: Address clang-format violations.

The CI pipeline reports formatting differences. Run git-clang-format HEAD~ to fix code style issues.

🧹 Nitpick comments (2)
interface/CombineMathFuncs.h (2)

317-324: Unused parameter x in cmsHistErrorPropagator.

The x parameter is declared but never used in the function body. If this is intentional (e.g., to maintain a consistent interface), consider adding (void)x; or using [[maybe_unused]] to suppress compiler warnings. Otherwise, remove the parameter.

-inline double cmsHistErrorPropagator(double x, std::size_t nFuncs, double const* coefList, double const* funcList) {
+inline double cmsHistErrorPropagator([[maybe_unused]] double x, std::size_t nFuncs, double const* coefList, double const* funcList) {
   // My naive understanding of the logic: multiply functions with coefficients and sum up
   double out = 0.;
   for (std::size_t i = 0; i < nFuncs; ++i) {
     out += coefList[i] * funcList[i];
   }
   return out;
 }

326-334: Parameter coeffs should be const.

The coeffs pointer is only read, not modified. For consistency with binEdges and values, and to enforce const-correctness:

 inline double cmsHistSum(
-    double x, std::size_t nBins, std::size_t nSamples, double* coeffs, double const* binEdges, double const* values) {
+    double x, std::size_t nBins, std::size_t nSamples, double const* coeffs, double const* binEdges, double const* values) {
   unsigned int binIdx = RooFit::Detail::MathFuncs::binNumber(x, 1.0, binEdges, nBins + 1, nBins, 0);
   double val = 0.;
   for (std::size_t iSample = 0; iSample < nSamples; ++iSample)
     val += coeffs[iSample] * values[iSample * nBins + binIdx];

   return val;
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 18685de and 6f1e1c3.

📒 Files selected for processing (9)
  • interface/CMSHistErrorPropagator.h (1 hunks)
  • interface/CMSHistSum.h (1 hunks)
  • interface/CombineCodegenImpl.h (2 hunks)
  • interface/CombineMathFuncs.h (2 hunks)
  • interface/FastTemplate_Old.h (2 hunks)
  • src/CMSHistSum.cc (1 hunks)
  • src/CombineCodegenImpl.cxx (2 hunks)
  • src/FastTemplate_Old.cc (5 hunks)
  • test/testCreateNLL.cxx (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
interface/CMSHistSum.h (1)
src/CMSHistSum.cc (2)
  • getFuncValList (415-431)
  • getFuncValList (415-415)
interface/CombineCodegenImpl.h (3)
interface/CMSHistErrorPropagator.h (1)
  • CMSHistErrorPropagator (17-47)
interface/CMSHistSum.h (1)
  • CMSHistSum (17-48)
src/CMSHistSum.cc (3)
  • CMSHistSum (22-22)
  • CMSHistSum (24-170)
  • CMSHistSum (172-197)
interface/CombineMathFuncs.h (2)
src/CombineCodegenImpl.cxx (1)
  • values (302-302)
interface/CombineCodegenImpl.h (1)
  • RooFit (15-33)
interface/FastTemplate_Old.h (2)
src/CMSInterferenceFunc.cxx (1)
  • values_ (28-28)
src/FastTemplate_Old.cc (2)
  • FastHisto (56-66)
  • FastHisto (68-73)
src/FastTemplate_Old.cc (2)
src/CMSInterferenceFunc.cxx (1)
  • values_ (28-28)
src/HZZ4LRooPdfs.cc (1)
  • z (3878-3878)
🪛 GitHub Actions: clang-format
src/CombineCodegenImpl.cxx

[error] 1-1: clang-format formatting differences detected. Run 'git-clang-format HEAD~' to fix code style issues in this file.

src/FastTemplate_Old.cc

[error] 1-1: clang-format formatting differences detected. Run 'git-clang-format HEAD~' to fix code style issues in this file.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11
  • GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
  • GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
  • GitHub Check: dev3/latest - ROOT LCG master
  • GitHub Check: LCG_108 - ROOT 6.36.02
  • GitHub Check: Compile (py3.12, root6.34.4)
🔇 Additional comments (15)
interface/CMSHistErrorPropagator.h (1)

74-75: LGTM!

The new getXVar() accessor is consistent with existing accessors (coefList(), funcList()) and properly exposes the x variable for the codegen integration.

src/FastTemplate_Old.cc (2)

75-80: Breaking change in bin lookup semantics for edge values.

Switching from std::lower_bound to std::upper_bound changes behavior when x lands exactly on a bin edge. With lower_bound, a value exactly on edge E would map to the bin before E; with upper_bound, it maps to the bin starting at E.

Example with binEdges = [0, 1, 2, 3] and x = 1.0:

  • lower_bound: returns iterator to 1.0, index = 1, bin = 0
  • upper_bound: returns iterator to 2.0, index = 2, bin = 1

Please confirm this behavioral change is intentional and that downstream consumers (including the new codegen paths) expect the updated semantics.


83-87: Consistent application of upper_bound.

The same change is applied here as in FindBin. This is consistent, but shares the same behavioral implications noted above.

interface/FastTemplate_Old.h (3)

79-80: LGTM!

The GetValues() accessor follows the same pattern as CMSInterferenceFunc::getValues() (see src/CMSInterferenceFunc.cxx:27), providing consistent API for accessing internal values.


87-87: Verify default initialization behavior.

Switching to = default relies on default member initialization. FastTemplate base class initializes size_ to 0 and values_ to empty, and binEdges_/binWidths_ will be empty vectors. This should be equivalent to the previous explicit default constructor, but please verify no initialization was lost.


129-130: LGTM!

The GetBinEdges() accessor is consistent with GetValues() and provides the necessary access for the new codegen paths in CombineMathFuncs.h.

interface/CombineMathFuncs.h (2)

10-11: LGTM!

The include is necessary for RooFit::Detail::MathFuncs::binNumber used in the new helper functions.


311-315: Verify binNumber function signature.

Please verify that the binNumber call matches the expected signature from RooFit/Detail/MathFuncs.h. The arguments (x, 1.0, binEdges, nBins + 1, nBins, 0) should be confirmed against the function's parameter list to ensure correct parameter types and order.

interface/CMSHistSum.h (1)

77-78: Consider bounds checking for fnIdx.

The getFuncValList method accesses multiple container elements using fnIdx as an index. Without verification of the implementation, please ensure that fnIdx is validated against appropriate bounds (likely n_procs_ or container sizes) to prevent undefined behavior on invalid input.

Consider adding an assertion or bounds check for this public method, or document any assumptions about valid input ranges.

interface/CombineCodegenImpl.h (1)

7-9: LGTM!

The forward declarations and function signatures follow the existing patterns in the header. The additions are consistent with the other codegen implementations.

Also applies to: 20-22, 28-30

test/testCreateNLL.cxx (3)

34-40: LGTM!

The refactored loadSnapshotIfExists is cleaner with explicit null checks before operations.


42-124: Well-structured refactoring of minimizer configuration.

The reorganized configureCascadeMinimizerState function has clearer initialization order and explicit null/type checks. The use of structured bindings and range-based loops improves readability.


393-397: Verify test coverage intent with commented-out configurations.

Two test configurations are commented out (analytic on for both backends). If these are intentionally disabled, consider adding a comment explaining why, or remove them if they're not planned for future use.

src/CombineCodegenImpl.cxx (1)

263-278: Calling evaluate() during codegen introduces side effects that mutate internal caches.

The call to arg.evaluate() on line 264 triggers cache creation as a side effect. This pattern is problematic for a codegen function, which should be side-effect-free and repeatable. The adjacent comment acknowledges uncertainty about whether values are constant.

Consider extracting a const cache accessor or a dedicated cache initialization method that doesn't have evaluate's side effects. This would make the codegen contract clearer and safer if invoked multiple times.

src/CMSHistSum.cc (1)

423-429: Bin type logic may be inconsistent with updateCache.

In updateCache (lines 393-404), bin types 2 and 3 are handled per-process using bintypes_[j][i] where i is the process index. Here, the code checks bintypes_[j][0] regardless of fnIdx, which would only match the first process's bin type configuration. This seems inconsistent with how per-process bin types are intended to work.

Should this be checking bintypes_[j][fnIdx] for per-process bin types 2/3?

Comment on lines +415 to +431
std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {
staging_ = compcache_[fnIdx];
if (vtype_[fnIdx] == CMSHistFunc::VerticalSetting::LogQuadLinear) {
staging_.Exp();
staging_.Scale(storage_[process_fields_[fnIdx]].Integral() / staging_.Integral());
}
staging_.CropUnderflows();
std::vector<double> result = staging_.GetValues();
for (unsigned j = 0; j < bintypes_.size(); ++j) {
if (bintypes_[j][0] == 1) {
double x = vbinpars_[j][0]->getVal();
result[j] += binerrors_[fnIdx][j] * x;
} else if (bintypes_[j][0] == 2 || bintypes_[j][0] == 3)
result[j] += scaledbinmods_[fnIdx][j];
}
return result;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing bounds check on fnIdx parameter.

The function does not validate that fnIdx is within valid range before accessing compcache_[fnIdx], vtype_[fnIdx], binerrors_[fnIdx], and scaledbinmods_[fnIdx]. This could lead to out-of-bounds access if called with an invalid index.

Consider adding bounds validation:

 std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {
+  if (fnIdx >= static_cast<std::size_t>(n_procs_)) {
+    throw std::out_of_range("fnIdx exceeds number of processes");
+  }
   staging_ = compcache_[fnIdx];
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {
staging_ = compcache_[fnIdx];
if (vtype_[fnIdx] == CMSHistFunc::VerticalSetting::LogQuadLinear) {
staging_.Exp();
staging_.Scale(storage_[process_fields_[fnIdx]].Integral() / staging_.Integral());
}
staging_.CropUnderflows();
std::vector<double> result = staging_.GetValues();
for (unsigned j = 0; j < bintypes_.size(); ++j) {
if (bintypes_[j][0] == 1) {
double x = vbinpars_[j][0]->getVal();
result[j] += binerrors_[fnIdx][j] * x;
} else if (bintypes_[j][0] == 2 || bintypes_[j][0] == 3)
result[j] += scaledbinmods_[fnIdx][j];
}
return result;
}
std::vector<double> CMSHistSum::getFuncValList(std::size_t fnIdx) {
if (fnIdx >= static_cast<std::size_t>(n_procs_)) {
throw std::out_of_range("fnIdx exceeds number of processes");
}
staging_ = compcache_[fnIdx];
if (vtype_[fnIdx] == CMSHistFunc::VerticalSetting::LogQuadLinear) {
staging_.Exp();
staging_.Scale(storage_[process_fields_[fnIdx]].Integral() / staging_.Integral());
}
staging_.CropUnderflows();
std::vector<double> result = staging_.GetValues();
for (unsigned j = 0; j < bintypes_.size(); ++j) {
if (bintypes_[j][0] == 1) {
double x = vbinpars_[j][0]->getVal();
result[j] += binerrors_[fnIdx][j] * x;
} else if (bintypes_[j][0] == 2 || bintypes_[j][0] == 3)
result[j] += scaledbinmods_[fnIdx][j];
}
return result;
}

Comment on lines +289 to +294
std::string RooFit::Experimental::codegenIntegralImpl(CMSHistErrorPropagator& arg,
int code,
const char* rangeName,
CodegenContext& ctx) {
return "2.0"; // TODO: dummy for now
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Placeholder integral implementation returns hardcoded value.

The codegenIntegralImpl for CMSHistErrorPropagator returns "2.0", which is a placeholder. This should be tracked and replaced with a proper implementation before production use.

Would you like me to open an issue to track implementing the proper integral codegen for CMSHistErrorPropagator?

Comment on lines +296 to +318
void RooFit::Experimental::codegenImpl(CMSHistSum& arg, CodegenContext& ctx) {
arg.evaluate();
std::vector<double> const& edges = arg.cache().GetBinEdges();
std::size_t nBins = edges.size() - 1;
RooArgList const& coefs = arg.coefList();
std::size_t nSamples = coefs.size();
std::vector<double> values(nBins * nSamples);
for (std::size_t iSamples = 0; iSamples < nSamples; ++iSamples) {
std::vector<double> sampleValues = arg.getFuncValList(iSamples);
for (std::size_t iBin = 0; iBin < nBins; ++iBin)
values[iBin + iSamples * nBins] = sampleValues[iBin];
}

ctx.addResult(&arg,
ctx.buildCall("RooFit::Detail::MathFuncs::cmsHistSum",
arg.getXVar(),
nBins,
nSamples,
coefs,
edges,
values
));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for staging_ usage in the codebase
rg -n 'staging_' --type cc --type h

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224


🏁 Script executed:

# Look for the CMSHistSum class definition and getFuncValList method
rg -n 'class CMSHistSum\|getFuncValList' --type cc --type h -A 3

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224


🏁 Script executed:

# Search for mutex, lock, or synchronization primitives in CMSHistSum files
rg -n 'mutex|lock|synchron|thread' src/CMSHistSum.cc interface/CMSHistSum.h --type cc --type h

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224


🏁 Script executed:

# Check the codegenImpl function and CodegenContext for threading info
rg -n 'codegenImpl|CodegenContext' src/CombineCodegenImpl.cxx --type cc -B 2 -A 5

Repository: cms-analysis/HiggsAnalysis-CombinedLimit

Length of output: 224


🌐 Web query:

CMSHistSum RooFit codegen thread safety concurrent

💡 Result:

Short answer: not safely — RooFit's code generation and some related operations (workspace I/O, codegen/JIT, shared caches) are not thread‑safe for concurrent use; protect them with a mutex or run them in separate processes. Evidence: forum threads and RooFit source show codegen/IO usage is serialized in practice and users report segfaults when calling import/codegen from multiple threads. [1][2]. Recommendation: serialize codegen/workspace import (std::mutex) or fork worker processes. [3]

Sources:

  • Threaded RooFit issues and advice to lock workspace import (forum discussion). [1]
  • RooFit codegen implementation / headers (shows centralized codegen structures). [2]
  • Combine documentation (RooFit-based tools used in CMS — use separate processes or guard shared RooFit operations). [3]

Ensure codegen executes in a single-threaded context; RooFit's code generation is not thread-safe.

The getFuncValList method and codegen operations in RooFit modify shared internal state and are documented as unsafe for concurrent execution. Calling this function from multiple threads will cause race conditions and potential segfaults. Either serialize codegen execution with a mutex, run codegen in separate processes, or document that this function must only be invoked single-threaded.

Comment on lines +320 to +325
std::string RooFit::Experimental::codegenIntegralImpl(CMSHistSum& arg,
int code,
const char* rangeName,
CodegenContext& ctx) {
return "3.0"; // TODO: dummy for now
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Placeholder integral implementation returns hardcoded value.

Similarly, codegenIntegralImpl for CMSHistSum returns "3.0". This needs a proper implementation.

Would you like me to open an issue to track implementing the proper integral codegen for CMSHistSum?

Comment on lines +294 to +321
auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));
if (!workspace) {
std::cerr << "ERROR: workspace '" << workspaceName << "' not found in '" << fileName << "'.\n";
file->ls();
//return 2;
}

// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
loadSnapshotIfExists(*workspace, "clean");
// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
loadSnapshotIfExists(*workspace, "clean");

auto *modelConfig =
dynamic_cast<RooStats::ModelConfig *>(workspace->genobj(modelConfigName.c_str()));
if (!modelConfig) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '"
<< workspaceName << "'.\n";
//return 2;
}
auto* modelConfig = dynamic_cast<RooStats::ModelConfig*>(workspace->genobj(modelConfigName.c_str()));
if (!modelConfig) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '" << workspaceName << "'.\n";
//return 2;
}

RooAbsPdf *pdf = modelConfig->GetPdf();
if (!pdf) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
//return 2;
}
RooAbsPdf* pdf = modelConfig->GetPdf();
if (!pdf) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
//return 2;
}

RooAbsData *data = workspace->data(dataName.c_str());
if (!data) {
std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName
<< "'.\n";
//return 2;
}
RooAbsData* data = workspace->data(dataName.c_str());
if (!data) {
std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName << "'.\n";
//return 2;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Null pointer checks lack proper test failure assertions.

Multiple null pointer checks (lines 295-299, 306-309, 312-315, 318-321) log errors to std::cerr but do not fail the test or return. The commented-out return statements suggest this was intentional, but the test will continue with null pointers, likely causing crashes or undefined behavior.

Either add ASSERT_TRUE checks or uncomment the returns:

     auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));
-    if (!workspace) {
-      std::cerr << "ERROR: workspace '" << workspaceName << "' not found in '" << fileName << "'.\n";
-      file->ls();
-      //return 2;
-    }
+    ASSERT_NE(workspace, nullptr) << "workspace '" << workspaceName << "' not found in '" << fileName << "'";

Apply similar changes for modelConfig, pdf, data, and poi checks.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));
if (!workspace) {
std::cerr << "ERROR: workspace '" << workspaceName << "' not found in '" << fileName << "'.\n";
file->ls();
//return 2;
}
// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
loadSnapshotIfExists(*workspace, "clean");
// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
loadSnapshotIfExists(*workspace, "clean");
auto *modelConfig =
dynamic_cast<RooStats::ModelConfig *>(workspace->genobj(modelConfigName.c_str()));
if (!modelConfig) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '"
<< workspaceName << "'.\n";
//return 2;
}
auto* modelConfig = dynamic_cast<RooStats::ModelConfig*>(workspace->genobj(modelConfigName.c_str()));
if (!modelConfig) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '" << workspaceName << "'.\n";
//return 2;
}
RooAbsPdf *pdf = modelConfig->GetPdf();
if (!pdf) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
//return 2;
}
RooAbsPdf* pdf = modelConfig->GetPdf();
if (!pdf) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
//return 2;
}
RooAbsData *data = workspace->data(dataName.c_str());
if (!data) {
std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName
<< "'.\n";
//return 2;
}
RooAbsData* data = workspace->data(dataName.c_str());
if (!data) {
std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName << "'.\n";
//return 2;
}
auto* workspace = dynamic_cast<RooWorkspace*>(file->Get(workspaceName.c_str()));
ASSERT_NE(workspace, nullptr) << "workspace '" << workspaceName << "' not found in '" << fileName << "'";
// Matches https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/master/src/Combine.cc#L505-L603
utils::check_inf_parameters(workspace->allVars(), /*verbosity=*/0);
loadSnapshotIfExists(*workspace, "clean");
auto* modelConfig = dynamic_cast<RooStats::ModelConfig*>(workspace->genobj(modelConfigName.c_str()));
if (!modelConfig) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' not found in workspace '" << workspaceName << "'.\n";
//return 2;
}
RooAbsPdf* pdf = modelConfig->GetPdf();
if (!pdf) {
std::cerr << "ERROR: ModelConfig '" << modelConfigName << "' does not define a pdf.\n";
//return 2;
}
RooAbsData* data = workspace->data(dataName.c_str());
if (!data) {
std::cerr << "ERROR: dataset '" << dataName << "' not found in workspace '" << workspaceName << "'.\n";
//return 2;
}
🤖 Prompt for AI Agents
In test/testCreateNLL.cxx around lines 294 to 321, the null pointer checks for
workspace, modelConfig, pdf, and data only log errors to std::cerr and leave
commented-out returns, which lets the test continue with null pointers; replace
those with proper test failures by either uncommenting the return statements or
(preferably in this unit test) add ASSERT_TRUE checks (e.g.,
ASSERT_TRUE(workspace) << "message") for workspace, modelConfig, pdf, and data
(and do the same for the poi check elsewhere), so the test will immediately fail
instead of continuing with null pointers.

Comment on lines +356 to +364
auto codegenFit = runFit(*workspace, *modelConfig, *pdf, *data, *poi, "codegen_no_grad", &errMessage);
if (!codegenFit) {
if (cfg.expectFailure) {
GTEST_SKIP() << "Expected failure for codegen backend: " << errMessage;
} else {
ADD_FAILURE() << "Failed to run fit with 'codegen' backend: " << errMessage;
}
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Error message references wrong backend name.

Line 361 reports "Failed to run fit with 'codegen' backend" but the actual backend used on line 356 is "codegen_no_grad". This could cause confusion when debugging test failures.

-          ADD_FAILURE() << "Failed to run fit with 'codegen' backend: " << errMessage;
+          ADD_FAILURE() << "Failed to run fit with 'codegen_no_grad' backend: " << errMessage;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
auto codegenFit = runFit(*workspace, *modelConfig, *pdf, *data, *poi, "codegen_no_grad", &errMessage);
if (!codegenFit) {
if (cfg.expectFailure) {
GTEST_SKIP() << "Expected failure for codegen backend: " << errMessage;
} else {
ADD_FAILURE() << "Failed to run fit with 'codegen' backend: " << errMessage;
}
return;
}
auto codegenFit = runFit(*workspace, *modelConfig, *pdf, *data, *poi, "codegen_no_grad", &errMessage);
if (!codegenFit) {
if (cfg.expectFailure) {
GTEST_SKIP() << "Expected failure for codegen backend: " << errMessage;
} else {
ADD_FAILURE() << "Failed to run fit with 'codegen_no_grad' backend: " << errMessage;
}
return;
}
🤖 Prompt for AI Agents
In test/testCreateNLL.cxx around lines 356 to 364, the failure message
references the wrong backend name: the code calls runFit with "codegen_no_grad"
but logs "codegen" on failure; change the error strings to mention
"codegen_no_grad" (both in the GTEST_SKIP and ADD_FAILURE messages where
applicable) so the logged backend matches the actual backend used and avoid
confusion during debugging.

if (!ws.getSnapshot(name))
return false;
return ws.loadSnapshot(name);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we touch these lines actually? If not, let's not change them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants