Do analytical Barlow Beeston before likelihood evaluation #1137

guitargeek · 2025-10-29T14:45:24Z

Doing the analytical Barlow Beeston before evaluating the likelihood has
two big advantages, instead of doing it while evaluating the
CMSHistSum/CMSHistErrorPropagator:

We don't have to hack the dirty state propagation. This was
necessary before, because it is not allowed to change a RooRealVar
server during the evaluation of the client. Also, the hack relies on
the implementation details of the CachingSimNLL (evaluating main pdf
before constraints), so it's also a problem for using RooFit regular
NLL code path.
If the analytical minimization and the likelihood evaluation
factorize, we don't need to worry how to differentiate through the
analytical minimization and the likelihood evaluation together.

Summary by CodeRabbit

New Features
- Optional early analytic Barlow–Beeston evaluation before model evaluation.
- Explicit support for CMS histogram-based PDFs in caching and evaluation.
Refactor
- Bulk handling of histogram PDFs and streamlined cache/evaluation flow.
- Simplified dirty-state management: removed implicit triggers and redundant dirty-flag bookkeeping.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-10-29T14:46:07Z

Walkthrough

Removed per-argument dirty-propagation state and RooAbsArg dirty-inhibit usage; added explicit analytic Barlow–Beeston entry points and collections for CMS histogram PDFs in CachingAddNLL; CachingNLL now runs analytic Barlow–Beeston on histogram PDFs prior to evaluation and during propagation/setup.

Changes

Cohort / File(s)	Summary
CMSHist headers `interface/CMSHistErrorPropagator.h`, `interface/CMSHistSum.h`	Removed `BarlowBeeston::dirty_prop` member; moved/added `void runBarlowBeeston() const` to the public section (removed duplicate/private declaration).
Caching header `interface/CachingNLL.h`	Added forward declarations for `CMSHistSum` and `CMSHistErrorPropagator`; added `void runAnalyticBarlowBeeston()` to `CachingAddNLL`; added private mutable vectors `histSums_` and `histErrorPropagators_`.
CMSHistErrorPropagator impl `src/CMSHistErrorPropagator.cc`	Removed RooAbsArg::setDirtyInhibit usage and dirty-propagation bookkeeping; `runBarlowBeeston()` now calls `updateCache()` at start; removed unconditional Barlow–Beeston invocation from `updateCache`.
CMSHistSum impl `src/CMSHistSum.cc`	Removed `dirty_prop` and RooAbsArg dirty-inhibit handling; `runBarlowBeeston()`/eval entry now unconditionally calls `updateCache()`; `updateCache()` no longer auto-invokes Barlow–Beeston.
Caching NLL impl `src/CachingNLL.cc`	`setup_()` now classifies PDFs via `dynamic_cast` into `multiPdfs_`, `histSums_`, and `histErrorPropagators_`; added `runAnalyticBarlowBeeston()` which iterates stored hist objects and calls their eval/run Barlow–Beeston entry; `evaluate()`/`SimNLL::evaluate` and `propagateData`/`setAnalyticBarlowBeeston` use bulk iteration over these containers.

Sequence Diagram(s)

sequenceDiagram
    participant Sim as CachingSimNLL
    participant Add as CachingAddNLL
    participant Sum as CMSHistSum
    participant Err as CMSHistErrorPropagator

    Note over Sim: Pre-evaluation step before NLL evaluation
    Sim->>Add: runAnalyticBarlowBeeston()
    activate Add

    loop histSums_
        Add->>Sum: evalBarlowBeeston()
        activate Sum
        Sum->>Sum: updateCache()
        Sum->>Sum: runBarlowBeeston()
        deactivate Sum
    end

    loop histErrorPropagators_
        Add->>Err: evalBarlowBeeston()
        activate Err
        Err->>Err: updateCache()
        Err->>Err: runBarlowBeeston()
        deactivate Err
    end

    deactivate Add
    Sim->>Sim: continue NLL evaluation

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40–60 minutes

Verify removal of dirty_prop and all RooAbsArg::setDirtyInhibit uses (cache consistency and races).
Inspect dynamic_cast-based classification in setup_() and lifetime/ownership of stored pointers (histSums_, histErrorPropagators_).
Confirm ordering: runAnalyticBarlowBeeston() invoked at correct times and does not introduce double-updates or missed updates.
Check propagateData / setAnalyticBarlowBeeston cover all CMS histogram PDF variants and that no call-sites expect previous dirty-propagation semantics.

Poem

🐰
I pruned the props and nudged the run in line,
I wake the histograms to do their Barlow shine.
No dirty hops, just tidy calls at dawn,
Caches hum softly as the code moves on. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 4.55% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Do analytical Barlow Beeston before likelihood evaluation' accurately summarizes the main objective of the PR: moving the analytical Barlow–Beeston step to execute before likelihood evaluation instead of during it.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (9)

interface/CMSHistErrorPropagator.h (1)

111-111: Signature change remains source‑compatible.

Adding the doBarlowBeeston=false flag keeps old call sites working while enabling the new path. Consider documenting eval semantics (0 vs 1) in the header for future readers.
interface/CachingNLL.h (1)
133-133: Make runAnalyticBarlowBeeston() const.

It doesn’t need to mutate CachingAddNLL; histSums_/histErrorPropagators_ are pointers and already mutable.

Apply:
-        void runAnalyticBarlowBeeston();
+        void runAnalyticBarlowBeeston() const;
Then adjust the definition in CachingNLL.cc accordingly.
src/CMSHistErrorPropagator.cc (2)
196-200: Force a recompute even if caches are “good”.

To guarantee BB is applied when called out‑of‑band, proactively dirty the bin sentry before updating.

Apply:
 void CMSHistErrorPropagator::evalBarlowBeeston() const {
   if (!bb_.init)
     return;
+  binsentry_.setValueDirty();
   updateCache(1, true);
 }
295-296: Stale comment.

“The arg (1) forces …” predates the new BB gating. Reword to avoid implying BB behavior is tied to eval.
-  updateCache(1); // the arg (1) forces updateCache to fill the caches for all bins
+  updateCache(1); // fill caches for all bins
src/CMSHistSum.cc (2)
422-426: Consider dirtying bin cache before BB.

As with ErrorPropagator, ensure BB runs even if binsentry_ reads “good”.
 void CMSHistSum::evalBarlowBeeston() const {
   if (!bb_.init)
     return;
+  binsentry_.setValueDirty();
   updateCache(true);
 }
521-522: Outdated comment.

Remove mention of “arg (1)”.
-  updateCache(); // the arg (1) forces updateCache to fill the caches for all bins
+  updateCache(); // fill caches for all bins
src/CachingNLL.cc (3)
582-593: Clear BB containers alongside multiPdfs_ to avoid stale/duplicated pointers

histSums_ and histErrorPropagators_ aren’t cleared in setup_. If setup_ ever runs again, these can accumulate duplicates or dangle after pdfs_ is rebuilt. Clear them where multiPdfs_ is cleared.

Apply this diff:
@@
-    multiPdfs_.clear();
+    multiPdfs_.clear();
+    histSums_.clear();
+    histErrorPropagators_.clear();
     for (auto &itp : pdfs_) {
821-828: Persist the enable/disable state for later guards

Store the flag (analyticBBEnabled_) so other paths (propagateData/runAnalyticBarlowBeeston) can act accordingly without redundant work.

Apply within this method:
 void cacheutils::CachingAddNLL::setAnalyticBarlowBeeston(bool flag) {
+  analyticBBEnabled_ = flag;
   for (auto* hist : histErrorPropagators_) {
     hist->setAnalyticBarlowBeeston(flag);
   }
   for (auto* hist : histSums_) {
     hist->setAnalyticBarlowBeeston(flag);
   }
 }
830-837: Early-return when analytic BB is disabled

Skip work if not enabled for this channel.

Apply this diff:
 void cacheutils::CachingAddNLL::runAnalyticBarlowBeeston() {
+  if (!analyticBBEnabled_) return;
   for (auto* hist : histErrorPropagators_) {
     hist->evalBarlowBeeston();
   }
   for (auto* hist : histSums_) {
     hist->evalBarlowBeeston();
   }
 }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 604b623 and c0567b3.

📒 Files selected for processing (6)

interface/CMSHistErrorPropagator.h (2 hunks)
interface/CMSHistSum.h (2 hunks)
interface/CachingNLL.h (3 hunks)
src/CMSHistErrorPropagator.cc (3 hunks)
src/CMSHistSum.cc (3 hunks)
src/CachingNLL.cc (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (6)

interface/CMSHistErrorPropagator.h (1)

src/CMSHistErrorPropagator.cc (4)

evalBarlowBeeston (196-200)

evalBarlowBeeston (196-196)

updateCache (103-194)

updateCache (103-103)

interface/CMSHistSum.h (2)

src/CMSHistErrorPropagator.cc (4)

evalBarlowBeeston (196-200)

evalBarlowBeeston (196-196)

updateCache (103-194)

updateCache (103-103)

src/CMSHistSum.cc (4)

evalBarlowBeeston (422-426)

evalBarlowBeeston (422-422)

updateCache (340-420)

updateCache (340-340)

src/CMSHistErrorPropagator.cc (1)

src/CMSHistSum.cc (6)

updateCache (340-420)

updateCache (340-340)

runBarlowBeeston (428-449)

runBarlowBeeston (428-428)

evalBarlowBeeston (422-426)

evalBarlowBeeston (422-422)

src/CMSHistSum.cc (1)

src/CMSHistErrorPropagator.cc (6)

updateCache (103-194)

updateCache (103-103)

runBarlowBeeston (202-223)

runBarlowBeeston (202-202)

evalBarlowBeeston (196-200)

evalBarlowBeeston (196-196)

interface/CachingNLL.h (5)

interface/CMSHistSum.h (1)

CMSHistSum (17-48)

src/CMSHistSum.cc (3)

CMSHistSum (22-22)

CMSHistSum (24-170)

CMSHistSum (172-197)

interface/CMSHistErrorPropagator.h (1)

CMSHistErrorPropagator (17-47)

src/CMSHistErrorPropagator.cc (3)

CMSHistErrorPropagator (20-20)

CMSHistErrorPropagator (22-38)

CMSHistErrorPropagator (40-52)

src/CachingNLL.cc (2)

runAnalyticBarlowBeeston (830-837)

runAnalyticBarlowBeeston (830-830)

src/CachingNLL.cc (2)

src/CMSHistErrorPropagator.cc (2)

setAnalyticBarlowBeeston (225-282)

setAnalyticBarlowBeeston (225-225)

src/CMSHistSum.cc (2)

setAnalyticBarlowBeeston (451-508)

setAnalyticBarlowBeeston (451-451)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Compile (py3.12, root6.34.4)
GitHub Check: Compile (py3.10, root6.32.2)
GitHub Check: Compile (py3.10, root6.26.4)
GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11
GitHub Check: LCG_108 - ROOT 6.36.02
GitHub Check: dev3/latest - ROOT LCG master
GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
GitHub Check: LCG_106 - ROOT 6.32.02
GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09

🔇 Additional comments (9)

interface/CMSHistErrorPropagator.h (1)

79-80: New entrypoint is clear and minimal.

Public evalBarlowBeeston() aligns with the new pre‑evaluation flow. LGTM.

interface/CachingNLL.h (1)

23-24: Forward declarations are appropriate here.

Avoids heavy includes in the header. LGTM.

interface/CMSHistSum.h (2)

82-83: New evalBarlowBeeston() entrypoint looks good.

Matches the ErrorPropagator API for uniform orchestration.

144-144: Flag‑gated updateCache is a clean separation.

Default false preserves the non‑analytic path. LGTM.

src/CMSHistErrorPropagator.cc (2)

155-158: BB run gated correctly.

runBarlowBeeston() now executes only when requested. This avoids side effects during normal evaluation. LGTM.

225-282: Re‑init path freezes gobs at setup time.

If any gobs can change across datasets, consider refreshing bb_.gobs on each (re)enable. Likely fine if they are true observables.

Please confirm gobs are constant across the fit lifecycle in your workflows.

src/CMSHistSum.cc (3)

340-340: API change is tidy.

Single boolean parameter simplifies the sum path. LGTM.

385-387: Gated BB run keeps evaluation side‑effect free.

Matches the new flow; avoids differentiating through BB when not requested. LGTM.

682-685: Verification confirmed: BB is invoked pre-evaluation in the NLL path.

The review comment's concern is already satisfied by the existing implementation:

CachingSimNLL::evaluate() calls runAnalyticBarlowBeeston() at line 1043 before proceeding with NLL evaluation

CachingAddNLL::runAnalyticBarlowBeeston() invokes evalBarlowBeeston() on all histSums_ instances (lines 834–835)

CMSHistSum::evalBarlowBeeston() explicitly calls updateCache(true), enabling Barlow-Beeston minimization (line 425)

When CMSHistSum is integrated into the NLL framework, the Barlow-Beeston minimization runs first via the proper call sequence before any subsequent NLL evaluations. The code structure ensures this ordering is maintained.

coderabbitai · 2025-10-29T14:54:53Z

interface/CachingNLL.h

+        mutable std::vector<CMSHistSum const*> histSums_;
+        mutable std::vector<CMSHistErrorPropagator const*> histErrorPropagators_;


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Non‑owning raw pointers: verify population and clearing.

Ensure these are:

Populated once during setup and deduplicated.

Cleared on data/PDF reset to avoid stale pointers.

Run:

🏁 Script executed:

#!/bin/bash # Where are the vectors filled/cleared and the new method called? rg -nP 'histSums_|histErrorPropagators_' -C2 rg -n 'runAnalyticBarlowBeeston\\(' -C3

Length of output: 2296

🏁 Script executed:

#!/bin/bash # Search for clearing/resetting these vectors rg -nP 'histSums_\s*\.|histErrorPropagators_\s*\.' -g '*.cc' -g '*.h' | grep -E '(clear|erase|reset|resize)' # Look for methods that might reset cached data rg -nP '(reset|clear|flush)' -A5 -B2 interface/CachingNLL.h src/CachingNLL.cc | head -100 # Search for where these vectors are modified/cleared in context rg -n 'histSums_\s*=|histErrorPropagators_\s*=' src/CachingNLL.cc interface/CachingNLL.h

Length of output: 241

🏁 Script executed:

#!/bin/bash # Find the files first fd -e h -e cc 'CachingNLL' # If found, search for clearing operations fd -e cc 'CachingNLL' -x rg -n 'clear|erase' {} \; fd -e h 'CachingNLL' -x rg -n 'clear|erase' {} \; # Search for destructor or reset methods fd -e cc 'CachingNLL' -x rg -n '~CachingAddNLL|reset.*Data' {} \;

Length of output: 1293

🏁 Script executed:

#!/bin/bash # Specifically search entire repo for histSums_ and histErrorPropagators_ rg -n 'histSums_' rg -n 'histErrorPropagators_' # Check the context around line 525 where integrals_ is cleared with other vectors fd -e cc 'CachingNLL' -x rg -n -B5 -A5 'integrals_\.clear' {} \; # Look for the method that clears integrals_ and other vectors fd -e cc 'CachingNLL' -x rg -n -B10 'integrals_\.clear.*pdfs_\.clear' {} \;

Length of output: 1871

Confirm: Non-owning raw pointers must be cleared on reset.

The vectors histSums_ and histErrorPropagators_ are populated during PDF setup (src/CachingNLL.cc:588, 591) but are never cleared. This is inconsistent with other mutable cache vectors (integrals_, pdfs_, coeffs_, multiPdfs_) which are explicitly cleared in setup_() (line 525) and the destructor (line 413). Without clearing, these vectors retain stale pointers when PDFs or data are reset, creating a use-after-free risk in methods like propagateData(), setAnalyticBarlowBeeston(), and runAnalyticBarlowBeeston().

Add histSums_.clear() and histErrorPropagators_.clear() in setup_() after line 525 and in the destructor after line 413.

Verify deduplication logic during population (lines 587–592) to prevent duplicate pointers.

🤖 Prompt for AI Agents

In interface/CachingNLL.h around lines 152–153, the mutable vectors histSums_ and histErrorPropagators_ hold non-owning raw pointers that are never cleared; add histSums_.clear() and histErrorPropagators_.clear() in setup_() immediately after the other cache clears (after line 525) and likewise in the destructor after the other clears (after line 413) to avoid stale pointers, and while here, inspect and fix the deduplication logic used when populating these vectors (around src/CachingNLL.cc lines 587–592) to ensure duplicate pointers are not inserted (e.g., check for existing pointer before push_back).

src/CachingNLL.cc

vgvassilev · 2025-11-04T14:07:06Z

@guitargeek, do we understand why these bots fail?

guitargeek · 2025-11-05T12:22:55Z

Not yet. I still need to dig into this.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

src/CachingNLL.cc (1)

794-801: Data propagation to CMSHist components is correct but tightly coupled*

Forwarding the channel data via setData(*data_) to all CMSHistErrorPropagator and CMSHistSum instances is exactly what the analytic BB code expects and keeps their internal data_ vectors in sync. Just be aware this relies on histErrorPropagators_ / histSums_ staying in lockstep with pdfs_; if setup_() is ever reused for reconfiguration, those vectors should also be cleared there.

src/CMSHistSum.cc (1)

521-522: Minor: comment now misleading after updateCache signature change

updateCache(); // the arg (1) forces updateCache to fill the caches for all bins is now misleading since there’s no argument. Behavior is still correct, but consider updating or dropping the comment to avoid confusion for future readers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 195763e and 31f2479.

📒 Files selected for processing (8)

interface/CMSHistErrorPropagator.h (3 hunks)
interface/CMSHistFunc.h (2 hunks)
interface/CMSHistSum.h (2 hunks)
interface/CachingNLL.h (3 hunks)
src/CMSHistErrorPropagator.cc (3 hunks)
src/CMSHistFunc.cc (0 hunks)
src/CMSHistSum.cc (3 hunks)
src/CachingNLL.cc (3 hunks)

💤 Files with no reviewable changes (1)

src/CMSHistFunc.cc

🚧 Files skipped from review as they are similar to previous changes (1)

interface/CachingNLL.h

🧰 Additional context used

🧬 Code graph analysis (3)

src/CMSHistSum.cc (1)

src/CMSHistErrorPropagator.cc (6)

updateCache (99-190)

updateCache (99-99)

runBarlowBeeston (198-219)

runBarlowBeeston (198-198)

evalBarlowBeeston (192-196)

evalBarlowBeeston (192-192)

interface/CMSHistErrorPropagator.h (1)

src/CMSHistErrorPropagator.cc (8)

CMSHistErrorPropagator (20-35)

CMSHistErrorPropagator (37-48)

evalBarlowBeeston (192-196)

evalBarlowBeeston (192-192)

initialize (50-96)

initialize (50-50)

updateCache (99-190)

updateCache (99-99)

src/CMSHistErrorPropagator.cc (1)

src/CMSHistSum.cc (6)

updateCache (340-420)

updateCache (340-340)

runBarlowBeeston (428-449)

runBarlowBeeston (428-428)

evalBarlowBeeston (422-426)

evalBarlowBeeston (422-422)

🔇 Additional comments (12)

interface/CMSHistFunc.h (1)

177-190: In-class defaults for flags look good

Initializing initialized_, rebin_, and fast_vertical_ at declaration removes any risk of use-before-init and matches the existing constructor behavior. No issues here.

interface/CMSHistErrorPropagator.h (1)

36-36: Header/API updates align with implementation

Defaulting the constructor, adding evalBarlowBeeston() const, and extending updateCache(int eval, bool doBarlowBeeston=false) are consistent with the .cc implementation and keep existing call sites valid while exposing the new BB entry point. initialized_’s in-class default is also safe.

Also applies to: 77-77, 102-103, 108-110

src/CachingNLL.cc (4)

569-579: Collecting CMSHist pointers in setup_ is safe*

The dynamic_casts to CMSHistSum and CMSHistErrorPropagator and storing raw pointers in histSums_ / histErrorPropagators_ are fine: the underlying objects are owned by the caching pdf wrappers and live as long as pdfs_. setup_() is only called from the constructors, so there’s no risk of stale duplicates from re-entry.

803-810: Analytic BB flag propagation is straightforward

Broadcasting setAnalyticBarlowBeeston(flag) to all CMS histogram propagators and sums matches the design of their BB initialisation routines. Any per-channel masking is already handled one level up in CachingSimNLL::setAnalyticBarlowBeeston, so no extra logic is needed here.

812-819: Channel-level runAnalyticBarlowBeeston helper is well-scoped

Encapsulating the per-channel BB minimisation in CachingAddNLL::runAnalyticBarlowBeeston() and delegating to evalBarlowBeeston() on each CMSHist* instance keeps the logic local and ensures it’s a no-op when BB hasn’t been initialised (bb_.init == false). No issues spotted.

1020-1026: Pre-evaluation BB minimisation in CachingSimNLL::evaluate looks correct

Running pdfs_[i]->runAnalyticBarlowBeeston() as the first step, with both null-check and channel-mask check, cleanly separates analytic BB minimisation from NLL evaluation and avoids the old “dirty state” hack. This keeps masked or missing channels safe and should be cheap when BB is disabled (eval methods early-exit on !bb_.init).

src/CMSHistSum.cc (2)

340-420: Conditional BB inside updateCache preserves cache semantics

Adding the doBarlowBeeston flag and guarding runBarlowBeeston() under if (doBarlowBeeston) keeps the original cache update logic unchanged when BB is off, while allowing an explicit BB pass when requested. Since sentry_/binsentry_ are still used as before, calling updateCache(true) once per evaluation point is enough; follow-up updateCache() calls see binsentry_.good() and leave the BB-adjusted cache untouched.

422-449: evalBarlowBeeston / runBarlowBeeston implementation is consistent

evalBarlowBeeston() correctly no-ops when bb_.init is false and otherwise routes through updateCache(true). The runBarlowBeeston() body mirrors the standard analytic solution pattern (filling dat, valsum, toterr, solving the quadratic, and pushing results back into bb_.push_res), matching the ErrorPropagator implementation. No functional issues visible.

src/CMSHistErrorPropagator.cc (3)

99-190: updateCache(eval, doBarlowBeeston) maintains previous behavior with finer control

The extended signature correctly preserves:

Rebuild of valsum_, err2sum_, toterr_ when the sentry is dirty or eval changes.

Second pass over bins with optional runBarlowBeeston() gated by doBarlowBeeston.
Using last_eval_ in both conditions keeps the 0/1 eval modes distinct (for wrapper propagation vs normal evaluation) while avoiding redundant work at fixed points. Existing call sites (updateCache(0), updateCache(1), and updateCache()) remain valid through the default doBarlowBeeston=false.

192-219: Analytic BB entry/implementation matches CMSHistSum

evalBarlowBeeston() correctly short-circuits when bb_.init is false and triggers a BB pass via updateCache(1, true) otherwise. runBarlowBeeston() follows the expected pattern (fill per-bin arrays from data_, valsum_, toterr_, solve the quadratic, then write back to bb_.push_res), matching the CMSHistSum implementation. This cleanly decouples analytical minimisation from normal cache evaluation.

291-292: Call sites use the new updateCache API appropriately

setupBinPars uses updateCache(1) to fill caches before bin-type analysis, with BB disabled as desired.

applyErrorShifts calls updateCache(0) to derive scaledbinmods_ without re-running BB, which is correct since it should just propagate existing BB solutions.

evaluate, analyticalIntegral, and setData all call updateCache(1) (or defaulted updateCache()), relying on the separate evalBarlowBeeston pre-step when analytic BB is enabled. This separation matches the new design.

Also applies to: 448-449, 463-466, 505-507, 515-516

interface/CMSHistSum.h (1)

82-83: Barlow–Beeston now opt-in via evalBarlowBeeston() and doBarlowBeeston parameter

With updateCache(bool doBarlowBeeston=false) and the new evalBarlowBeeston() helper, Barlow–Beeston is no longer run implicitly on every cache update. Ensure all code paths which relied on analytic BB (especially outside CachingNLL) now either call evalBarlowBeeston() or pass doBarlowBeeston=true before evaluating the NLL.

amarini · 2025-12-01T19:34:58Z

Did you check w/ @nucleosynthesis and @ajgilbert the order of the two? I remember that there has been a fine tuning in order to make sure that the both the discrete profile and the analytical bb were doing the right thing.

anigamova · 2025-12-01T20:12:28Z

About the differences in the fit results, and the failing tests. I've looked into what's happening briefly and it looks like once the analytic minimization is performed - the initial NLL value gets offset because the initial values of the MC stat parameters are set to the best fit values, and it affects the initial gradients, and then this offset is carried through the whole minimization. It also results in slightly worse minimum actually (larger edm). I'm not sure how to fix it exactly, but perhaps we could subtract this initial (constant) offset.

guitargeek · 2025-12-01T20:25:37Z

Thank you! That's an excellent hint 👍 The offset was my blond spot here

guitargeek · 2025-12-01T20:36:38Z

Thanks to the hint of @anigamova I got unstuck here. I realized that the analytical minimization has to be run not only at the beginning of CachingSimNLL::evaluate(), which I thought was the only relevant entry point, but also at the beginning of the evaluation for the NLL class of the individual channels (CachingAddNLL::evaluate()), in order for them to initialize their offsets correctly.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/CMSHistSum.cc (1)
340-426: Ensure analytic BB actually runs once when enabled

With updateCache(bool doBarlowBeeston) gated on !sentry_.good() / !binsentry_.good() and eval != last_eval_, the first call to evalBarlowBeeston() after calling setAnalyticBarlowBeeston(true) can be a no-op if the cache was already filled (both sentries good and last_eval_ == 1). In that case, runBarlowBeeston() is only triggered after some other parameter change dirties the cache, which is unintuitive and means the initial evaluation after enabling analytic BB is done without having solved the BB system.

To force one analytic step immediately after enabling, consider marking the bin cache dirty when you flip the flag on, e.g.:
 void CMSHistSum::setAnalyticBarlowBeeston(bool flag) const {
   // ...
   if (flag && data_.size()) {
     // existing population of bb_.use, bb_.gobs, bb_.push_res, etc.
     bb_.res.resize(n);
-    bb_.init = true;
+    bb_.init = true;
+    // Ensure next evalBarlowBeeston() actually runs runBarlowBeeston().
+    binsentry_.setValueDirty();
   }
 }
The same pattern should be mirrored in CMSHistErrorPropagator::setAnalyticBarlowBeeston to keep behavior consistent.
src/CMSHistErrorPropagator.cc (1)
196-282: Mirror “force one analytic step after enabling” behavior here as well

As in CMSHistSum, enabling analytic BB via setAnalyticBarlowBeeston(true) sets up bb_ and flips bb_.init, but does not change sentry_, binsentry_, or last_eval_. If the propagator’s cache was already clean with last_eval_ == 1, the first call to evalBarlowBeeston():
void CMSHistErrorPropagator::evalBarlowBeeston() const {
  if (!bb_.init)
    return;
  updateCache(1, true);
}
will find both sentries good and eval == last_eval_, so the runBarlowBeeston() block is skipped and analytic BB is only applied after some later change dirties the cache.

To ensure the analytic minimization actually runs once right after enabling, you can mark the bin cache dirty when flag is set:
 void CMSHistErrorPropagator::setAnalyticBarlowBeeston(bool flag) const {
   // ...
   if (flag && data_.size()) {
     // existing population of bb_.use, bb_.gobs, bb_.push_res, etc.
     bb_.res.resize(n);
-    bb_.init = true;
+    bb_.init = true;
+    // Force the next evalBarlowBeeston() to execute runBarlowBeeston().
+    binsentry_.setValueDirty();
   }
 }
This keeps the work minimal (only one forced recomputation) but guarantees that analytic BB is applied immediately when turned on.

🧹 Nitpick comments (2)

src/CachingNLL.cc (2)
569-580: Clear histSums_/histErrorPropagators_ when rebuilding pdfs_

setup_() clears integrals_, pdfs_, coeffs_, prods_, and multiPdfs_, but histSums_ and histErrorPropagators_ are only ever appended to here. Today setup_() is effectively ctor-only, but if it’s ever called again on an existing instance, these vectors will accumulate stale pointers.

Cheap defensive change:
void cacheutils::CachingAddNLL::setup_() 
{
-    fastExit_ = !runtimedef::get("NO_ADDNLL_FASTEXIT");
+    fastExit_ = !runtimedef::get("NO_ADDNLL_FASTEXIT");
     for (int i = 0, n = integrals_.size(); i < n; ++i) delete integrals_[i];
     integrals_.clear(); pdfs_.clear(); coeffs_.clear(); prods_.clear();
+    multiPdfs_.clear();
+    histSums_.clear();
+    histErrorPropagators_.clear();
    ...
This keeps the ownership model unchanged but avoids surprises if setup_ is reused.

598-601: Potentially redundant analytic BB call at both AddNLL and SimNLL levels

CachingAddNLL::evaluate() now calls runAnalyticBarlowBeeston() at entry, and CachingSimNLL::evaluate() also calls runAnalyticBarlowBeeston() on each channel before summing their NLLs. Because the underlying hist classes short‑circuit via sentries and last_eval_, the second call is usually a cheap no‑op, but it’s still redundant work.

Given this, you might consider:

Keeping the SimNLL‑level pre‑eval loop as the canonical place for the analytic BB step when running combined fits, and

Either documenting that direct CachingAddNLL::evaluate() assumes the caller has already run runAnalyticBarlowBeeston(), or leaving the AddNLL‑level call only for that standalone use case.

Not blocking, but worth a quick profile check since this lives in the inner evaluation loop.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d490c03 and dd7e478.

📒 Files selected for processing (6)

interface/CMSHistErrorPropagator.h (2 hunks)
interface/CMSHistSum.h (2 hunks)
interface/CachingNLL.h (3 hunks)
src/CMSHistErrorPropagator.cc (3 hunks)
src/CMSHistSum.cc (3 hunks)
src/CachingNLL.cc (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

interface/CachingNLL.h
interface/CMSHistSum.h

🧰 Additional context used

🧬 Code graph analysis (2)

src/CachingNLL.cc (2)

src/CMSHistErrorPropagator.cc (2)

setAnalyticBarlowBeeston (225-282)

setAnalyticBarlowBeeston (225-225)

src/CMSHistSum.cc (2)

setAnalyticBarlowBeeston (451-508)

setAnalyticBarlowBeeston (451-451)

src/CMSHistErrorPropagator.cc (1)

src/CMSHistSum.cc (6)

updateCache (340-420)

updateCache (340-340)

runBarlowBeeston (428-449)

runBarlowBeeston (428-428)

evalBarlowBeeston (422-426)

evalBarlowBeeston (422-422)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: dev3/latest - ROOT LCG master
GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
GitHub Check: LCG_106 - ROOT 6.32.02
GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11
GitHub Check: LCG_108 - ROOT 6.36.02
GitHub Check: LCG_102 - ROOT 6.26.04
GitHub Check: Compile (py3.10, root6.26.4)
GitHub Check: Compile (py3.12, root6.34.4)
GitHub Check: Compile (py3.10, root6.32.2)

🔇 Additional comments (6)

interface/CMSHistErrorPropagator.h (1)

79-112: Analytic BB API wiring looks consistent

The addition of evalBarlowBeeston() and the extended updateCache(int eval = 1, bool doBarlowBeeston = false) signature are consistent with the implementation in src/CMSHistErrorPropagator.cc and with the new CachingNLL integration. Existing call sites that omit the doBarlowBeeston argument retain previous behavior.

src/CMSHistSum.cc (1)

422-426: evalBarlowBeeston() helper is minimal and fits new flow

evalBarlowBeeston() simply guards on bb_.init and then calls updateCache(true), which keeps the analytical BB step cleanly separated from ordinary cache updates and matches the new CachingAddNLL/CachingSimNLL pre‑evaluation workflow.

src/CMSHistErrorPropagator.cc (2)

103-194: Analytic BB gating is sound; cache conditions are the only trigger

The new doBarlowBeeston flag and eval/sentry logic in updateCache cleanly separate analytic BB from ordinary cache updates and match how evalBarlowBeeston() and CachingNLL drive it. As long as the sentries are marked dirty when parameters or data change, runBarlowBeeston() will only execute when needed.

343-379: Rounding change to use std::floor is fine

Switching from an implicit int(...) cast to std::floor(0.5 + ...) for n and n_p_r preserves the rounding strategy while avoiding dependence on integer width. Given this code already relies on <cmath> (e.g. std::pow), this change is safe and improves clarity.

src/CachingNLL.cc (2)

797-822: Bulk data/flag propagation to CMSHist looks correct*

The new propagateData(), setAnalyticBarlowBeeston(bool), and runAnalyticBarlowBeeston() methods on CachingAddNLL cleanly centralize interactions with CMSHistSum and CMSHistErrorPropagator via the cached histSums_/histErrorPropagators_ vectors. This avoids repeated dynamic_casts in hot paths and keeps the analytic BB pre‑evaluation step localized.

1023-1029: SimNLL‑level analytic BB pre‑step respects null and channel masks

The new loop in CachingSimNLL::evaluate() correctly guards against nullptr entries in pdfs_ and respects channelMasks_ when deciding which per‑channel CachingAddNLL instances should run runAnalyticBarlowBeeston(). That addresses the earlier null‑deref risk and ensures masked channels don’t waste time on analytic BB.

anigamova · 2025-12-01T20:46:48Z

Excellent, thanks @guitargeek!
@nucleosynthesis @ajgilbert Is there something we are missing here?

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

interface/CachingNLL.h (1)
151-152: Non-owning raw pointers must be cleared on reset.

This was flagged in a previous review. The vectors histSums_ and histErrorPropagators_ are populated during PDF setup but are never cleared. Other mutable cache vectors (integrals_, pdfs_, coeffs_, multiPdfs_) are explicitly cleared in setup_(). Without clearing, these vectors retain stale pointers when data/PDFs are reset.

Add clearing statements in setup_() after the existing clears (around line 512 in src/CachingNLL.cc):
 integrals_.clear(); pdfs_.clear(); coeffs_.clear(); prods_.clear();
+histSums_.clear(); histErrorPropagators_.clear();

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd7e478 and 77627ca.

📒 Files selected for processing (6)

interface/CMSHistErrorPropagator.h (2 hunks)
interface/CMSHistSum.h (2 hunks)
interface/CachingNLL.h (3 hunks)
src/CMSHistErrorPropagator.cc (3 hunks)
src/CMSHistSum.cc (3 hunks)
src/CachingNLL.cc (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/CMSHistSum.cc

🧰 Additional context used

🧬 Code graph analysis (2)

src/CachingNLL.cc (2)

src/CMSHistErrorPropagator.cc (2)

setAnalyticBarlowBeeston (224-281)

setAnalyticBarlowBeeston (224-224)

src/CMSHistSum.cc (2)

setAnalyticBarlowBeeston (450-507)

setAnalyticBarlowBeeston (450-450)

interface/CMSHistSum.h (1)

src/CMSHistSum.cc (4)

evalBarlowBeeston (421-425)

evalBarlowBeeston (421-421)

updateCache (339-419)

updateCache (339-339)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
GitHub Check: dev3/latest - ROOT LCG master
GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
GitHub Check: LCG_108 - ROOT 6.36.02
GitHub Check: LCG_106 - ROOT 6.32.02
GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11

🔇 Additional comments (15)

interface/CachingNLL.h (2)

23-24: LGTM!

Forward declarations for CMSHistSum and CMSHistErrorPropagator are correctly placed to enable their use as pointer types in the class members.

132-132: LGTM!

The new runAnalyticBarlowBeeston() method provides a clean public interface for triggering analytic Barlow-Beeston evaluation before likelihood computation.

interface/CMSHistSum.h (2)

82-84: LGTM!

The new evalBarlowBeeston() method provides a dedicated entry point for triggering Barlow-Beeston evaluation, cleanly separating this concern from the general cache update path.

144-144: LGTM!

The updated updateCache(bool doBarlowBeeston = false) signature maintains backward compatibility via the default parameter while enabling conditional Barlow-Beeston execution when called explicitly with true.

interface/CMSHistErrorPropagator.h (2)

79-81: LGTM!

The evalBarlowBeeston() method mirrors the pattern in CMSHistSum, providing a consistent API for triggering analytic Barlow-Beeston evaluation across both histogram-based PDF types.

111-111: LGTM!

The updated updateCache(int eval = 1, bool doBarlowBeeston = false) signature maintains backward compatibility for existing call sites while enabling the new Barlow-Beeston path when explicitly requested.

src/CachingNLL.cc (5)

598-601: LGTM!

Running the analytic Barlow-Beeston minimization at the very start of evaluate() ensures the MC-stat parameters are set to their best-fit values before the likelihood is computed. This addresses the PR objective of factorizing the analytic minimization from likelihood evaluation.

798-804: LGTM!

The bulk iteration over histErrorPropagators_ and histSums_ for data propagation is cleaner than per-PDF type checking in a loop.

807-813: LGTM!

The bulk iteration pattern for setting the analytic Barlow-Beeston flag is consistent with propagateData().

815-822: LGTM!

The new runAnalyticBarlowBeeston() method provides a clear entry point that delegates to evalBarlowBeeston() on each histogram-based PDF. This aligns with the PR objective of running the analytic step before likelihood evaluation.

1022-1032: LGTM!

The pre-evaluation Barlow-Beeston step in CachingSimNLL::evaluate() correctly:

Guards against null pdfs_ entries (line 1026-1027)

Respects channel masking (line 1028-1029)

Runs BB before the main NLL evaluation loop

This addresses the null-deref concern from a previous review and ensures BB is run for channel-level NLLs so they initialize their offsets correctly, as mentioned in PR comments.

src/CMSHistErrorPropagator.cc (4)

102-102: LGTM!

The updated updateCache(int eval, bool doBarlowBeeston) signature enables conditional Barlow-Beeston execution, decoupling the BB step from the general cache update path.

155-157: LGTM!

Gating runBarlowBeeston() behind the doBarlowBeeston flag ensures it only runs when explicitly requested via evalBarlowBeeston(), eliminating the previous unconditional execution during cache updates.

195-199: LGTM!

The evalBarlowBeeston() method correctly guards on bb_.init to short-circuit when analytic BB is not enabled, then delegates to updateCache(1, true) to trigger the BB path. This provides a clean entry point for the pre-evaluation BB step.

201-201: LGTM!

The runBarlowBeeston() method signature is unchanged, maintaining the existing vectorized computation logic. The method is now only invoked conditionally via the doBarlowBeeston flag path.

coderabbitai · 2025-12-01T20:47:05Z

src/CachingNLL.cc

+      if (auto* mpdf = dynamic_cast<const RooMultiPdf*>(itp->pdf())) {
+        multiPdfs_.emplace_back(mpdf, itp.get());
+      }
+      if (auto* hist = dynamic_cast<const CMSHistSum*>(itp->pdf())) {
+        histSums_.emplace_back(hist);
+      }
+      if (auto* hist = dynamic_cast<const CMSHistErrorPropagator*>(itp->pdf())) {
+        histErrorPropagators_.emplace_back(hist);
+      }
    }


⚠️ Potential issue | 🟠 Major

Clear vectors before populating to prevent stale/duplicate pointers.

When setup_() is called (e.g., after data changes), histSums_ and histErrorPropagators_ are populated without first clearing them, while other vectors like integrals_, pdfs_, coeffs_ are cleared at line 512. This can cause duplicate entries on re-setup or stale pointers if PDFs change.

Apply this diff:

for (int i = 0, n = integrals_.size(); i < n; ++i) delete integrals_[i]; integrals_.clear(); pdfs_.clear(); coeffs_.clear(); prods_.clear(); +histSums_.clear(); histErrorPropagators_.clear();

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In src/CachingNLL.cc around lines 571-580, histSums_ and histErrorPropagators_ are appended to without being cleared on re-setup, causing duplicate or stale entries; before the loop that populates these vectors (near the existing clears at line ~512), call histSums_.clear(); and histErrorPropagators_.clear(); (or assign empty vectors) so those containers are emptied prior to repopulation.

codecov · 2025-12-01T20:54:47Z

Codecov Report

❌ Patch coverage is 96.55172% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 20.69%. Comparing base (61aa0b6) to head (1e3f6e3).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
src/CachingNLL.cc	96.29%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1137      +/-   ##
==========================================
- Coverage   20.69%   20.69%   -0.01%     
==========================================
  Files         195      195              
  Lines       26154    26155       +1     
  Branches     3923     3924       +1     
==========================================
  Hits         5412     5412              
- Misses      20742    20743       +1

Files with missing lines	Coverage Δ
interface/CMSHistErrorPropagator.h	`71.42% <ø> (ø)`
interface/CMSHistSum.h	`100.00% <ø> (ø)`
interface/CachingNLL.h	`77.27% <ø> (ø)`
src/CMSHistErrorPropagator.cc	`61.96% <100.00%> (-0.60%)`	⬇️
src/CMSHistSum.cc	`74.58% <100.00%> (-0.32%)`	⬇️
src/CachingNLL.cc	`69.65% <96.29%> (+0.40%)`	⬆️

Files with missing lines	Coverage Δ
interface/CMSHistErrorPropagator.h	`71.42% <ø> (ø)`
interface/CMSHistSum.h	`100.00% <ø> (ø)`
interface/CachingNLL.h	`77.27% <ø> (ø)`
src/CMSHistErrorPropagator.cc	`61.96% <100.00%> (-0.60%)`	⬇️
src/CMSHistSum.cc	`74.58% <100.00%> (-0.32%)`	⬇️
src/CachingNLL.cc	`69.65% <96.29%> (+0.40%)`	⬆️

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

guitargeek · 2025-12-01T21:33:10Z

PR is ready from my side, but indeed some feedback from @nucleosynthesis @ajgilbert would also be appreciated!

nucleosynthesis · 2025-12-02T16:05:56Z

I'm not sure this will have any impact on the discrete profiling stuff so from that side I think its ok. @ajgilbert should comment RE barlow-beeston but in principle I think it should be fine. Is there a test somewhere to check we are getting to the same minimum with the order reversed?

guitargeek · 2025-12-02T16:40:38Z

Yes, for the CMSHistSum there is a test that compares the result against a reference file:

I think the CMSHistErrorPropagator is also covered by some test that is run via the CI YAML config:

https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/main/.github/workflows/cvmfs-ci.yml

I also diffed the output of the full CVMFS CI workflow against the current latest commit in main, and there were no differences

anigamova · 2025-12-04T10:10:45Z

@guitargeek could you please rebase, the master has more tests now. But I think we can merge this anyway.
Thanks!

This will make it more efficient to look them up to run the analytical Barlow Beeston.

Doing the analytical Barlow Beeston before evaluating the likelihood has two big advantages, instead of doing it while evaluating the CMSHistSum/CMSHistErrorPropagator: 1. We don't have to hack the dirty state propagation. This was necessary before, because it is not allowed to change a RooRealVar server during the evaluation of the client. Also, the hack relies on the implementation details of the CachingSimNLL (evaluating main pdf before constraints), so it's also a problem for using RooFit regular NLL code path. 2. If the analytical minimization and the likelihood evaluation factorize, we don't need to worry how to differentiate through the analytical minimization and the likelihood evaluation together.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

interface/CachingNLL.h (1)
23-25: Tighten const‑correctness for analytic Barlow–Beeston orchestration

runAnalyticBarlowBeeston() only forwards to CMSHistSum::runBarlowBeeston() / CMSHistErrorPropagator::runBarlowBeeston() and doesn’t mutate CachingAddNLL itself; the histogram pointer vectors are also populated once in setup_().

You can simplify the const‑cast in CachingAddNLL::evaluate() and the member qualifiers by:
-        void runAnalyticBarlowBeeston();
+        void runAnalyticBarlowBeeston() const;
@@
-        mutable std::vector<CMSHistSum const*> histSums_;
-        mutable std::vector<CMSHistErrorPropagator const*> histErrorPropagators_;
+        std::vector<CMSHistSum const*> histSums_;
+        std::vector<CMSHistErrorPropagator const*> histErrorPropagators_;
and updating the definition in src/CachingNLL.cc to match the const signature.

This keeps the API clearer and avoids unnecessary const_cast usage in evaluate().

Also applies to: 132-133, 151-152
src/CMSHistErrorPropagator.cc (1)
192-214: Early‑exit in runBarlowBeeston() to avoid unnecessary cache refresh when BB is disabled

Starting runBarlowBeeston() with updateCache(); is correct to ensure valsum_/toterr_ are current before solving the analytic BB system. When analytic BB has not been enabled (setAnalyticBarlowBeeston(true) never called, or later disabled), bb_.use remains empty and the loops are no‑ops, but you still pay for an updateCache() on every call from the NLL layer.

You can keep behavior identical while avoiding this overhead in the common “BB off” case by short‑circuiting:
 void CMSHistErrorPropagator::runBarlowBeeston() const {
-  updateCache();
+  if (!bb_.init || bb_.use.empty()) {
+    return;
+  }
+  updateCache();
   const unsigned n = bb_.use.size();
   for (unsigned j = 0; j < n; ++j) {
     bb_.dat[j] = data_[bb_.use[j]];
@@
 void CMSHistErrorPropagator::printMultiline(std::ostream& os, Int_t contents, Bool_t verbose,
                     TString indent) const {
   RooAbsReal::printMultiline(os, contents, verbose, indent);
   updateCache();
(Only runBarlowBeeston() needs the guard; printMultiline legitimately wants to refresh the cache unconditionally.)

This is a minor performance cleanup; semantics stay the same for all enabled/disabled BB configurations.

Also applies to: 473-477
src/CMSHistSum.cc (1)
415-423: Align runBarlowBeeston()/setupBinPars() with current updateCache() API

Two small cleanups here:
Avoid work when analytic BB is disabled (optional)
Like the propagator, runBarlowBeeston() now always calls updateCache(); even when analytic BB hasn’t been initialised. You can skip that cost when bb_.use is empty:
void CMSHistSum::runBarlowBeeston() const {
updateCache();
if (!bb_.init || bb_.use.empty()) {

return;

}
updateCache();
const unsigned n = bb_.use.size();
...
}
This keeps behavior identical while reducing unnecessary cache work when BB is off.
Fix stale comment in setupBinPars
The comment still refers to an argument that no longer exists:
// Now fill the bin contents and errors

updateCache(); // the arg (1) forces updateCache to fill the caches for all bins
// Now fill the bin contents and errors for all bins
updateCache();
This avoids confusion for future readers about a non‑existent `eval` parameter.
Also applies to: 506-512
src/CachingNLL.cc (1)
598-601: Analytic BB orchestration at NLL level matches design; only minor polish possible

The new flow:

CachingAddNLL::evaluate() calls runAnalyticBarlowBeeston() before computing the per‑channel NLL.

CachingSimNLL::evaluate() likewise runs runAnalyticBarlowBeeston() on each active channel, with null and channelMasks_ checks.

propagateData() forwards the current dataset to all CMSHistErrorPropagator and CMSHistSum instances.

setAnalyticBarlowBeeston(bool) and runAnalyticBarlowBeeston() simply forward to the histogram PDFs via the non‑owning pointer vectors.

is consistent with the PR objective of performing the analytic Barlow–Beeston minimization before likelihood evaluation, and it avoids touching RooFit dirty flags inside PDF evaluation.

Two small, non‑blocking polish points:
Const‑correctness – as noted on the header, runAnalyticBarlowBeeston() can be made const, removing the need for the const_cast in evaluate():
-Double_t cacheutils::CachingAddNLL::evaluate() const
+Double_t cacheutils::CachingAddNLL::evaluate() const
{
@@

const_cast<CachingAddNLL&>(*this).runAnalyticBarlowBeeston();

runAnalyticBarlowBeeston();
...
}
@@
-void cacheutils::CachingAddNLL::runAnalyticBarlowBeeston() {
+void cacheutils::CachingAddNLL::runAnalyticBarlowBeeston() const {
for (auto* hist : histErrorPropagators_) {
hist->runBarlowBeeston();
}
for (auto* hist : histSums_) {
hist->runBarlowBeeston();
}
}
2. **Cheap early‑out when no histogram PDFs** (optional) – if a channel has no `CMSHistSum`/`CMSHistErrorPropagator` components, `runAnalyticBarlowBeeston()` currently just iterates two empty vectors. You could make that explicit:

```diff
-void cacheutils::CachingAddNLL::runAnalyticBarlowBeeston() const {
+void cacheutils::CachingAddNLL::runAnalyticBarlowBeeston() const {
+  if (histSums_.empty() && histErrorPropagators_.empty()) {
+    return;
+  }
for (auto* hist : histErrorPropagators_) {
  hist->runBarlowBeeston();
}
for (auto* hist : histSums_) {
  hist->runBarlowBeeston();
}
}
These are minor style/perf tweaks; the current behavior is functionally sound and implements the intended ordering of analytic BB vs. likelihood evaluation.

Also applies to: 797-804, 806-813, 815-822, 1023-1031

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83f4c26 and 1e3f6e3.

📒 Files selected for processing (6)

interface/CMSHistErrorPropagator.h (1 hunks)
interface/CMSHistSum.h (1 hunks)
interface/CachingNLL.h (3 hunks)
src/CMSHistErrorPropagator.cc (1 hunks)
src/CMSHistSum.cc (1 hunks)
src/CachingNLL.cc (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

interface/CMSHistErrorPropagator.h

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: Compile (py3.10, root6.32.2)
GitHub Check: Compile (py3.12, root6.34.4)
GitHub Check: Compile (py3.10, root6.26.4)
GitHub Check: LCG_108 - ROOT 6.36.02
GitHub Check: dev3/latest - ROOT LCG master
GitHub Check: LCG_102 - ROOT 6.26.04
GitHub Check: LCG_106 - ROOT 6.32.02
GitHub Check: CMSSW_14_1_0_pre4 - ROOT 6.30.07
GitHub Check: CMSSW_11_3_4 - ROOT 6.22.09
GitHub Check: CMSSW_14_0_0_pre1 - ROOT 6.26.11

🔇 Additional comments (2)

interface/CMSHistSum.h (1)

82-83: Public runBarlowBeeston() exposure looks appropriate

Making runBarlowBeeston() public and const matches the new NLL‑level orchestration and cleanly separates analytic BB from evaluate(). No functional concerns from this change.

src/CachingNLL.cc (1)

569-580: Histogram PDF discovery and pointer wiring look correct

The additional dynamic_cast checks in setup_() to populate multiPdfs_, histSums_, and histErrorPropagators_ from pdfs_[i]->pdf() are consistent with how CachingPdfBase owns/clones the underlying RooAbsReal instances. Since setup_() is only called from the constructors and the histogram vectors are only populated here, there’s no risk of stale or duplicated pointers from this change.

vgvassilev · 2025-12-04T10:37:04Z

@guitargeek could you please rebase, the master has more tests now. But I think we can merge this anyway. Thanks!

The repo settings have a setting that enables a button in GitHub that allows anybody with permissions to rebase a PR with a single click. We can enable it.

anigamova · 2025-12-04T10:42:24Z

@vgvassilev thanks for the suggestion, changed it in the settings

vgvassilev · 2025-12-04T16:09:38Z

@vgvassilev thanks for the suggestion, changed it in the settings

When we rebase we have to make sure we select rebase and not merge as it inserts merge commits which bloat the history when the pr gets in.

coderabbitai bot reviewed Oct 29, 2025

View reviewed changes

guitargeek force-pushed the bb_b4 branch 2 times, most recently from c4cf610 to c08a0c1 Compare October 30, 2025 11:46

guitargeek force-pushed the bb_b4 branch from c08a0c1 to 195763e Compare November 5, 2025 12:23

guitargeek mentioned this pull request Nov 5, 2025

Dummy codegen implementations of CMSHist classes #1162

Draft

guitargeek force-pushed the bb_b4 branch from 195763e to 31f2479 Compare December 1, 2025 19:02

coderabbitai bot reviewed Dec 1, 2025

View reviewed changes

guitargeek force-pushed the bb_b4 branch from 31f2479 to d490c03 Compare December 1, 2025 19:22

guitargeek force-pushed the bb_b4 branch from d490c03 to dd7e478 Compare December 1, 2025 20:34

guitargeek force-pushed the bb_b4 branch from dd7e478 to 77627ca Compare December 1, 2025 20:37

coderabbitai bot reviewed Dec 1, 2025

View reviewed changes

guitargeek force-pushed the bb_b4 branch from 77627ca to 83f4c26 Compare December 1, 2025 21:11

anigamova mentioned this pull request Dec 2, 2025

Add channel masks tests and CMSHistFunc test to the CMake configuration #1187

Merged

guitargeek added 2 commits December 4, 2025 11:20

Cache pointers CMSHistSum and ErrorPropagator in CachingNLL

9204f16

This will make it more efficient to look them up to run the analytical Barlow Beeston.

guitargeek force-pushed the bb_b4 branch from 83f4c26 to 1e3f6e3 Compare December 4, 2025 10:20

coderabbitai bot reviewed Dec 4, 2025

View reviewed changes

anigamova merged commit 3956794 into cms-analysis:main Dec 4, 2025
16 of 17 checks passed

guitargeek deleted the bb_b4 branch December 4, 2025 10:51

coderabbitai bot mentioned this pull request Dec 4, 2025

Add codegen implementations of CMSHist classes #1191

Open

		mutable std::vector<CMSHistSum const*> histSums_;
		mutable std::vector<CMSHistErrorPropagator const*> histErrorPropagators_;

Do analytical Barlow Beeston before likelihood evaluation #1137

Do analytical Barlow Beeston before likelihood evaluation #1137

Uh oh!

Conversation

guitargeek commented Oct 29, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 29, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

vgvassilev commented Nov 4, 2025

Uh oh!

guitargeek commented Nov 5, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

amarini commented Dec 1, 2025

Uh oh!

anigamova commented Dec 1, 2025

Uh oh!

guitargeek commented Dec 1, 2025

Uh oh!

guitargeek commented Dec 1, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

anigamova commented Dec 1, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

guitargeek commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nucleosynthesis commented Dec 2, 2025

Uh oh!

guitargeek commented Dec 2, 2025

Uh oh!

anigamova commented Dec 4, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

vgvassilev commented Dec 4, 2025

Uh oh!

anigamova commented Dec 4, 2025

Uh oh!

Uh oh!

vgvassilev commented Dec 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

guitargeek commented Oct 29, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 29, 2025 •

edited

Loading

codecov bot commented Dec 1, 2025 •

edited

Loading

guitargeek commented Dec 1, 2025 •

edited

Loading