feat: add sum of weights for BTA_ttbar workflow, fix typo in suball script and make basepath in BTA producers configurable #111

Open
wants to merge 29 commits into
base: master
from

Conversation

philippgadow
Contributor

Description

This pull request introduces several improvements and fixes to the BTA workflow and associated scripts and configurations:

  1. Feature Addition:

    • Implemented support for calculating the sum of weights, including variations such as LHE scale weights and PS weights, in the BTA_ttbar workflow (LHE PDF weights are not considered for the moment).
    • Enabled configuration of basepath and output_dir in BTA_producer and BTA_ttbar_producer workflows for greater flexibility in file management.
  2. Bug Fixes:

    • Typo Correction: Fixed an issue in the suball script where the variable paser was incorrectly referenced as parser, restoring proper command-line parsing.
    • Use events.PuppiMET.pt everywhere in the BTA_ttbar_producer workflow if the analysis is Run3.
  3. Configuration Update:

    • Modified test_env.yml to replace defaults with nodefaults in the conda channel configuration, removing dependencies on Anaconda, which is not free to use for organisations with more than 200 collaborators.
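
The Run3 MET choice from the bug-fix list above can be pictured with a minimal sketch. The helper name `select_met_pt`, the `is_run3` flag, and the fallback to `events.MET.pt` for other analyses are assumptions made for illustration; the PR description only states that `events.PuppiMET.pt` is used for Run3.

```python
from types import SimpleNamespace


def select_met_pt(events, is_run3):
    """Return the PuppiMET pt for Run3 analyses, otherwise the MET pt.

    Hypothetical helper: the non-Run3 fallback is an assumption.
    """
    return events.PuppiMET.pt if is_run3 else events.MET.pt


# Stand-in for a NanoAOD events object, so the sketch runs without coffea
events = SimpleNamespace(
    PuppiMET=SimpleNamespace(pt=[41.0, 12.5]),
    MET=SimpleNamespace(pt=[43.0, 15.0]),
)
print(select_met_pt(events, is_run3=True))
```

Routing every MET access through one helper like this would avoid mixing the two collections within a single workflow.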

Collaborator

@Ming-Yan Ming-Yan left a comment

Hi @philippgadow, this all looks good to me! Thanks for the PR.

Are there any further changes you want to include in this PR? If not, we are good to merge :)

Collaborator

@Ming-Yan Ming-Yan left a comment

Sorry, I noticed one change after approving it. @philippgadow, could you please have a look at this change?

@philippgadow philippgadow requested a review from Ming-Yan March 10, 2025 16:35
Collaborator

@Ming-Yan Ming-Yan left a comment

Hi @philippgadow, thank you for incorporating the suggestions. I have some minor suggestions for moving things to common selections/functions.

Could you please have a look?

Comment on lines 192 to 198
(muons.pt > 20)
& (abs(muons.eta) < 2.4)
& muons.tightId # pass cut-based tight ID
& (muons.pfRelIso04_all < 0.12) # muon isolation cut
& (
muons.pfRelIso04_all < 0.15
) # muon isolation cut (tight: https://twiki.cern.ch/twiki/bin/viewauth/CMS/SWGuideMuonIdRun2#Particle_Flow_isolation and https://github.com/cms-sw/cmssw/blob/75451d59a7acc30aec874be9a6b9a8835f2f7b3e/PhysicsTools/NanoAOD/python/muons_cff.py#L249)
]
Collaborator

Indeed, this seems to match the selection mu_idiso in our common selections. Could you please switch to using this mask? The purpose is to make sure the selections stay in sync :)
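
To illustrate why a shared mask keeps selections in sync, here is a minimal sketch with the cuts quoted in the snippet above (pt > 20, |eta| < 2.4, tight ID, relative isolation < 0.15) collected in one function. The name `tight_muon_mask` and its signature are hypothetical; this is not the repository's `mu_idiso`, whose interface is not shown in this thread, and plain Python lists stand in for the awkward arrays.

```python
def tight_muon_mask(pt, eta, tight_id, rel_iso,
                    pt_min=20.0, eta_max=2.4, iso_max=0.15):
    """Return one boolean per muon: True if it passes all ID/isolation cuts.

    Hypothetical helper; thresholds are copied from the quoted PR snippet.
    """
    return [
        p > pt_min and abs(e) < eta_max and bool(t) and iso < iso_max
        for p, e, t, iso in zip(pt, eta, tight_id, rel_iso)
    ]


# Every workflow calling the same function applies identical cuts
mask = tight_muon_mask(
    pt=[25.0, 18.0, 40.0],
    eta=[0.5, 1.0, -2.5],
    tight_id=[True, True, True],
    rel_iso=[0.05, 0.05, 0.05],
)
print(mask)  # [True, False, False]
```

With the thresholds defined in one place, a change to the isolation cut propagates to every workflow that uses the mask.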

@@ -1,7 +1,7 @@
 name: btv_coffea
 channels:
   - conda-forge
-  - defaults
+  - nodefaults
Collaborator

I think we can remove this :)

Comment on lines +835 to +858

    def transfer_file(self, local_outfile_path, outfile_path):
        transfer_command = f"xrdcp -p --silent {local_outfile_path} {outfile_path}"
        result = os.system(transfer_command)
        # Check if xrdcp failed
        if result != 0:
            print("xrdcp failed, attempting to transfer with gfal-copy")
            transfer_command = (
                f"gfal-copy -p -f -t 4200 {local_outfile_path} {outfile_path}"
            )
            result = os.system(transfer_command)
            if result == 0:
                print("File transferred successfully with gfal-copy")
            else:
                print("gfal-copy also failed")
        else:
            print("File transferred successfully with xrdcp")
        if result == 0:
            os.system(f"rm {local_outfile_path}")
        else:
            print("File transfer failed, need to transfer manually")
            # append file path to a list for manual transfer which is stored in output_dir
            with open(f"{self.output_dir}/manual_transfer.txt", "a") as f:
                f.write(f"{transfer_command}\n")
Collaborator

Can we move this into a common function so that other workflows can use it if needed?
https://github.com/cms-btv-pog/BTVNanoCommissioning/blob/master/src/BTVNanoCommissioning/helpers/func.py

I would also suggest adding some documentation to the optional changes section of the docs :)
https://btvnanocommissioning.readthedocs.io/en/latest/developer.html#optional-changes
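
One way such a shared helper might look is sketched below: a standalone function that tries `xrdcp` first and `gfal-copy` second, with the same command-line flags as the method quoted above. The module-level signature, the `output_dir` default, and the injectable `run` argument (added so the logic can be exercised without either tool installed) are assumptions, not the code that ended up in the repository.

```python
import os
import shlex
import subprocess


def _run(cmd):
    """Run a command and return its exit code (127 if the tool is missing)."""
    try:
        return subprocess.run(cmd, check=False).returncode
    except FileNotFoundError:
        return 127


def transfer_file(local_path, remote_path, output_dir=".", run=_run):
    """Copy local_path to remote_path with xrdcp, falling back to gfal-copy.

    Returns True on success (the local copy is then deleted). On failure the
    last attempted command is appended to <output_dir>/manual_transfer.txt.
    """
    commands = [
        ["xrdcp", "-p", "--silent", local_path, remote_path],
        ["gfal-copy", "-p", "-f", "-t", "4200", local_path, remote_path],
    ]
    for cmd in commands:
        if run(cmd) == 0:
            print(f"File transferred successfully with {cmd[0]}")
            os.remove(local_path)
            return True
        print(f"{cmd[0]} failed")
    # Both tools failed: record the last command for a manual retry
    with open(os.path.join(output_dir, "manual_transfer.txt"), "a") as f:
        f.write(shlex.join(cmd) + "\n")
    return False
```

Passing the command as an argument list avoids shell quoting problems with unusual file names, and returning a boolean lets each workflow decide how to react to a failed transfer.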

Comment on lines +620 to +702
"total_lhe_scaleweights_1": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 1] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 1
    else ak.Array([0.0])
),
"total_lhe_scaleweights_2": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 2] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 2
    else ak.Array([0.0])
),
"total_lhe_scaleweights_3": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 3] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 3
    else ak.Array([0.0])
),
"total_lhe_scaleweights_4": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 4] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 4
    else ak.Array([0.0])
),
"total_lhe_scaleweights_5": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 5] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 5
    else ak.Array([0.0])
),
"total_lhe_scaleweights_6": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 6] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 6
    else ak.Array([0.0])
),
"total_lhe_scaleweights_7": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 7] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 7
    else ak.Array([0.0])
),
"total_lhe_scaleweights_8": (
    ak.Array(
        [ak.sum(lhe_scale_w_arrays[:, 8] * events.genWeight)]
    )
    if lhe_pdf_w_arrays is not None
    and number_lhe_scaleweights > 8
    else ak.Array([0.0])
),
"total_psweights_0": (
    ak.Array([ak.sum(ps_w_arrays[:, 0] * events.genWeight)])
    if ps_w_arrays is not None and number_of_psweights > 0
    else ak.Array([0.0])
),
"total_psweights_1": (
    ak.Array([ak.sum(ps_w_arrays[:, 1] * events.genWeight)])
    if ps_w_arrays is not None and number_of_psweights > 1
    else ak.Array([0.0])
),
"total_psweights_2": (
    ak.Array([ak.sum(ps_w_arrays[:, 2] * events.genWeight)])
    if ps_w_arrays is not None and number_of_psweights > 2
    else ak.Array([0.0])
),
"total_psweights_3": (
    ak.Array([ak.sum(ps_w_arrays[:, 3] * events.genWeight)])
    if ps_w_arrays is not None and number_of_psweights > 3
    else ak.Array([0.0])
Collaborator

I was wondering whether we could have a unified function that does the work for all kinds of weights, something like:

import awkward as ak

def sumw(events, mcweights="LHEScaleWeight"):
    # number of variations stored per event for this weight collection
    weight_size = max(ak.count(events[mcweights], axis=-1))
    weights = {}
    for i in range(weight_size):
        # sum over events of variation i multiplied by the generator weight
        weights[f"total_{mcweights}_{i}"] = ak.sum(events[mcweights][:, i] * events.genWeight)
    return weights

Then you can include the returned sums of weights:

    LHEweight = sumw(events, mcweights="LHEScaleWeight")
    f["sumw"] = {**LHEweight, ....}

Then I think this functionality can later be used in other workflows :)
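
The quantity discussed in this comment is, for each variation index i, the sum over events of the i-th variation weight multiplied by the event's generator weight. Below is a dependency-free sketch of that arithmetic, with plain Python lists standing in for the awkward arrays. The function name `sum_of_weights`, the `prefix` argument, and the choice to skip events that carry fewer variations are assumptions for illustration.

```python
def sum_of_weights(gen_weight, variations, prefix):
    """Sum variation weights times generator weight, one total per variation.

    gen_weight: one generator weight per event.
    variations: one list of variation weights per event.
    prefix:     label used to build the output keys (hypothetical naming).
    """
    n_var = max((len(v) for v in variations), default=0)
    totals = {}
    for i in range(n_var):
        totals[f"total_{prefix}_{i}"] = sum(
            v[i] * g
            for v, g in zip(variations, gen_weight)
            if i < len(v)  # assumption: events lacking variation i are skipped
        )
    return totals


# Two events with two variations each
lhe = sum_of_weights([1.0, 2.0], [[0.5, 1.0], [1.5, 2.0]], "LHEScaleWeight")
ps = sum_of_weights([1.0, 2.0], [[1.0], [1.0]], "PSWeight")
# Merge all collections into one dictionary, as suggested in the comment
sumw = {**lhe, **ps}
print(sumw)
```

Because the keys are generated in a loop, the per-index dictionary entries do not have to be written out by hand, and an empty collection simply contributes no entries.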

3 participants