
Conversation

@usarica (Contributor) commented Apr 9, 2021:

The robust fit fails to recognize convergence when the crossing point is on the order of the crossing tolerance. In such cases the POI's NLL comes close enough to the specified level, but NaN is still returned. This PR fixes that issue. It also introduces the runtime define `FITTER_CROSSING_PERSISTENT_TRIALS`: if this value is >0, the crossing minimizations continue until that many trials have been conducted.
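A minimal sketch of the retry behavior this define is described as enabling (illustrative Python, not the actual C++ in combine; the function and variable names here are hypothetical):

```python
# Illustrative sketch only, not the combine implementation: retry the
# crossing minimisation until it returns a finite value or the trial
# budget (the FITTER_CROSSING_PERSISTENT_TRIALS-style define) runs out.
import math

def find_crossing_persistently(minimise_once, max_trials):
    """minimise_once() returns the crossing point, or NaN on failure."""
    result = float("nan")
    trials = 0
    while math.isnan(result) and trials < max_trials:
        result = minimise_once()
        trials += 1
    return result

# Hypothetical minimiser that only succeeds on its third attempt.
attempts = iter([float("nan"), float("nan"), 0.999687])
crossing = find_crossing_persistently(lambda: next(attempts), max_trials=5)
```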

@usarica (Contributor, Author) commented Apr 9, 2021:

The last commit also fixes a residual issue with steep log-likelihood functions: it replaces the guess made between the present and the previous locations with a full quadratic interpolation through the three closest points. The precision requirement is kept the same.
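A toy sketch of the three-point idea (illustrative Python, not the actual combine code; the function names and the toy "NLL" below are made up for the example):

```python
# Toy sketch: approximate the scan function f(x) = deltaNLL(x) - level by
# a parabola through the three points closest to the crossing, and take
# the root of that parabola inside the bracket as the next guess.
import math

def quadratic_crossing(xs, fs):
    """Return the root of the parabola through (xs[i], fs[i]) that lies
    inside [min(xs), max(xs)], or None if no real root falls there."""
    (x0, x1, x2), (f0, f1, f2) = xs, fs
    # Newton's divided differences give the interpolating parabola
    # p(x) = a*x^2 + b*x + c exactly through the three points.
    d01 = (f1 - f0) / (x1 - x0)
    d12 = (f2 - f1) / (x2 - x1)
    a = (d12 - d01) / (x2 - x0)
    if a == 0.0:  # collinear points: fall back to the linear crossing
        return x0 - f0 / d01 if d01 != 0.0 else None
    b = d01 - a * (x0 + x1)
    c = f0 - a * x0 * x0 - b * x0
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    roots = ((-b + s * math.sqrt(disc)) / (2.0 * a) for s in (1.0, -1.0))
    lo, hi = min(xs), max(xs)
    return next((r for r in roots if lo <= r <= hi), None)

# Steep toy "NLL" with its crossing at x = 0.5: a linear guess between two
# points can overshoot badly, while the parabola recovers it exactly here.
f = lambda x: 40.0 * (x - 0.3) ** 2 - 1.6
x = quadratic_crossing((0.45, 0.52, 0.60), (f(0.45), f(0.52), f(0.60)))
```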

@ajgilbert (Collaborator) commented:

Hi, I would like to understand better what the issues are before considering merging this PR. The robustFit algo is used heavily in CMS and any changes to the algorithm need to be carefully understood and validated (in general we recommend contacting the combine developers first if you encounter a problem like this).

Anyway, the issue that "Robust fit fails to recognize convergence when the crossing point is on the order of the crossing tolerance" - could this be addressed by setting a more appropriate min/max on the parameter being scanned, or is the issue not as simple as that? It's true the algorithm may struggle if the range on the parameter is many orders of magnitude larger than the distance to the crossing.

@usarica (Contributor, Author) commented Apr 12, 2021:

> Hi, I would like to understand better what the issues are before considering merging this PR. The robustFit algo is used heavily in CMS and any changes to the algorithm need to be carefully understood and validated (in general we recommend contacting the combine developers first if you encounter a problem like this).
>
> Anyway, the issue that "Robust fit fails to recognize convergence when the crossing point is on the order of the crossing tolerance" - could this be addressed by setting a more appropriate min/max on the parameter being scanned, or is the issue not as simple as that? It's true the algorithm may struggle if the range on the parameter is many orders of magnitude larger than the distance to the crossing.

I don't think setting the min/max of the parameter manually is desirable when running several hundred jobs, which is the case I am dealing with; it is not a scalable solution. Moreover, as I mentioned, the last commit fixes an important bug: the interpolation equation assumes certain properties that do not hold in general, and I have come across cases where the returned interval is off by an order of magnitude relative to the small uncertainty it is supposed to have. The solution is well tested; the results agree with the Minos algorithm whenever Minos succeeds.

@usarica (Contributor, Author) commented Apr 12, 2021:

By the way, the default behavior is almost exactly the same if `FITTER_CROSSING_PERSISTENT_TRIALS` is not set; the only meaningful change is how the interpolation is done in the first if-condition inside the for-loop.

@ajgilbert (Collaborator) commented:

Well, I think setting sensible ranges on all floating parameters is good practice; it also helps Minuit make reasonable initial steps at the start of the minimisation. If this is really a problem, then I guess you could also play with the stepSize or tolerance parameters to achieve a similar effect.

It would be helpful to have more detail on the bug you mention though. Can you provide a minimal example that we can test?

@usarica (Contributor, Author) commented Apr 12, 2021:

> Well I think setting sensible ranges on all floating parameters is good practice. This also helps minuit make reasonable initial steps at the start of the minimisation. If this is really a problem, then I guess you could also play with the stepSize or tolerance parameters to achieve a similar effect.
>
> It would be helpful to have more detail on the bug you mention though. Can you provide a minimal example that we can test?

I agree with you in general, but it doesn't always work in practice because the different datacards I have to work with also span different ranges (I am running some T&P fits using combine, so each bin has a different signal or background shape, or signal fraction). I put initial guesses on the central value and ranges, but I can't narrow the range finely enough. That is why I decided to improve the functionality of robustFit instead; I still have several hundred datacards, so tuning them by hand was becoming impractical.

Let me upload an example workspace on lxplus and point to it in this thread.

@usarica (Contributor, Author) commented Apr 12, 2021:


Here it is: `/afs/cern.ch/work/u/usarica/public/HCombRobFitFix`. I added the instructions and sample results in the README; you can reproduce them by running the commands included there.

@ajgilbert (Collaborator) commented:

Thanks, so I was able to get to (I think) the correct crossing in a fairly small number of steps using:

```shell
combine /afs/cern.ch/work/u/usarica/public/HCombRobFitFix/higgsCombineSinglesSnapshot2.MultiDimFit.mH120.root -M MultiDimFit --freezeParameters r --redefineSignalPOIs frac_sig -v 1 --alignEdges=1 --saveNLL --saveSpecifiedNuis=rgx{.*} --algo singles --saveWorkspace --X-rtd MINIMIZER_no_analytic -n SinglesFinalSnapshot --snapshotName=MultiDimFit --cminDefaultMinimizerStrategy 0 --robustFit 1 --stepSize 0.0001 --X-rtd FITTER_DYN_STEP
```

which gives:

```
Searching for crossing at nll = 0.494466 in the interval 1, 0
  frac_sig    lvl-here   lvl-there    stepping
  0.999900    +0.37360   +0.49447   -0.000100
  0.999700    +0.02609   +0.37360   -0.000200
  0.999496    -0.45808   +0.02609   -0.000204
  0.999690    +0.00461   -0.45808   -0.000010
  0.999680    -0.01612   +0.00461   -0.000010
  0.999687    +0.00030   -0.01612   -0.000002
```

But then indeed I see the issue with the interpolated value here. So that part should probably be fixed. At this point I'd prefer to apply a minimal fix for this algo (perhaps even just returning rStart at the last point would be sufficient), and then consider your improvements for a new improved algo that could eventually become the default. Will discuss with the other developers.
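For reference, plain arithmetic on the last two rows of the stepping table above shows where a simple linear (secant) interpolation between the two bracketing points would land; this is only an illustration using the logged numbers, not the algorithm's internal computation:

```python
# Secant (linear) interpolation between the last two scan points in the
# log above; values are copied from the table, lvl-here plays the role of
# f(x) = deltaNLL(x) - level.
x1, f1 = 0.999680, -0.01612
x2, f2 = 0.999687, +0.00030
x_cross = x2 - f2 * (x2 - x1) / (f2 - f1)  # lands just below 0.999687
```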

@usarica (Contributor, Author) commented Apr 13, 2021:

> But then indeed I see the issue with the interpolated value here. So that part should probably be fixed. At this point I'd prefer to apply a minimal fix for this algo (perhaps even just returning rStart at the last point would be sufficient), and then consider your improvements for a new improved algo that could eventually become the default. Will discuss with the other developers.

Thanks for looking at this issue. I still don't think this adjustment can be done in a general setting (the minimum and range of frac_sig can be quite different across these workspaces, and there are as many as 100k of them by my last count), but I can certainly wait for the discussion with the other developers.
