Description
Production is using CMSSW_15_0_2 to run a remini-renano workflow and seeing rather poor CPU efficiencies.
An example log is /eos/cms/store/logs/prod/recent/PRODUCTION/pdmvserv_Run2024I_EGamma0_MINIv6NANOv15_250321_075704_7868/DataProcessing/cmsgwms-submit9.fnal.gov-2270711-1-log.tar
(there are others in that same area; this is just the one I grabbed)
Notably in the log there are quite a number of O(6-8 minute) pauses.
Running from a node at CERN, copying the three input files takes a few minutes and then the workflow runs at reasonable efficiency (e.g., it finished after 40 minutes, while the job with remote file reads was only 10% done after 70 minutes).
Running on the original files also shows pauses between events being processed (but not at the same event numbers as in the original job). The files in my example are:
xrdcp root://xrootd-cms.infn.it//store/data/Run2024I/EGamma0/AOD/PromptReco-v1/000/386/605/00000/9ad0cdbe-0470-45a8-90fa-9ccbd0ef4087.root .
xrdcp root://xrootd-cms.infn.it//store/data/Run2024I/EGamma0/AOD/PromptReco-v1/000/386/605/00000/5d7bb4b0-674e-4584-afe4-e71b63a95c1b.root .
xrdcp root://xrootd-cms.infn.it//store/data/Run2024I/EGamma0/AOD/PromptReco-v1/000/386/605/00000/6c68d214-3fd6-4b7b-9809-d0c8705a24cf.root .
(at T2_US_Vanderbilt, so not very close to CERN. I believe the original job ran in Bari, which would be even further away.)
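
For anyone who wants to locate the multi-minute pauses in one of these logs, here is a rough Python sketch. It assumes the framework log contains the usual "Begin processing the Nth record ... at DD-Mon-YYYY HH:MM:SS.mmm TZ" lines, and that the timestamps all share one time zone (the zone label is ignored). The function name `find_pauses`, the 5-minute threshold, and the sample lines are made up for illustration and are not taken from the production log.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp shape at the end of a "Begin processing" line,
# e.g. "... on stream 0 at 21-Mar-2025 10:00:00.000 CET".
STAMP = re.compile(
    r"Begin processing the .*? at "
    r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3})"
)


def find_pauses(lines, threshold=timedelta(minutes=5)):
    """Return (previous, current, gap) for consecutive records whose
    start times are at least `threshold` apart."""
    pauses = []
    previous = None
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # not a "Begin processing" line
        # The time zone label is ignored; %b expects English month names.
        current = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S.%f")
        if previous is not None and current - previous >= threshold:
            pauses.append((previous, current, current - previous))
        previous = current
    return pauses


# Made-up sample lines, only to show the call; not from the real log.
sample = [
    "Begin processing the 1st record. Run 1, Event 100, LumiSection 1 on stream 0 at 21-Mar-2025 10:00:00.000 CET",
    "Begin processing the 2nd record. Run 1, Event 101, LumiSection 1 on stream 1 at 21-Mar-2025 10:00:02.500 CET",
    "unrelated log line",
    "Begin processing the 3rd record. Run 1, Event 102, LumiSection 1 on stream 0 at 21-Mar-2025 10:07:02.500 CET",
]
for start, end, gap in find_pauses(sample):
    print(f"pause of {gap} between {start} and {end}")
```

On a real log, the lines could be fed in with `open(path, errors="replace")` after unpacking the tarball. Comparing the pause positions between runs would then show directly whether they move around, as noted above.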