@ferencek
Contributor

ferencek commented Oct 6, 2025

PR description:

This PR enables SiPixel digi morphing in the offline reconstruction from 2025 onward. The HLT version of digi morphing, which runs on GPUs, was introduced in #48734.

PR validation:

None

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backported to CMSSW_15_1_X and CMSSW_15_0_X.

@cmsbuild
Contributor

cmsbuild commented Oct 6, 2025

cms-bot internal usage

@cmsbuild
Contributor

cmsbuild commented Oct 6, 2025

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49065/46295

@cmsbuild
Contributor

cmsbuild commented Oct 6, 2025

A new Pull Request was created by @ferencek for master.

It involves the following packages:

  • Configuration/Eras (operations)

@cmsbuild, @davidlange6, @fabiocos, @ftenchini, @mandrenguyen can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @fabiocos, @makortel this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor

mmusich commented Oct 6, 2025

@cmsbuild, please test

@mandrenguyen
Contributor

Looking through the log, it seems these unit tests are failing:

 + TEST_ERRORS='---> test TestDQMOnlineClient-beam_dqm_sourceclient had ERRORS
 ---> test TestDQMOnlineClient-beampixel_dqm_sourceclient had ERRORS'
 ++ grep -ai 'had errors' /data/cmsbld/jenkins/workspace/ib-run-pr-tests/unitTests/log.txt
 + TEST_ERRORS='---> test TestDQMOnlineClient-beam_dqm_sourceclient had ERRORS
 ---> test TestDQMOnlineClient-beampixel_dqm_sourceclient had ERRORS'
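
The `grep -ai 'had errors'` step in the trace above can be sketched in Python for local log inspection. This is only an illustration: the function name `failed_tests` and the regular expression are assumptions based on the two lines shown, not part of the cms-bot tooling.

```python
import re

# Matches lines such as "---> test <name> had ERRORS".
# IGNORECASE mirrors the -i flag of the grep call in the trace.
FAILED_TEST = re.compile(r"--->\s*test\s+(\S+)\s+had errors", re.IGNORECASE)


def failed_tests(log_text):
    """Return the unique failing test names, in order of first appearance."""
    names = []
    for match in FAILED_TEST.finditer(log_text):
        name = match.group(1)
        if name not in names:
            names.append(name)
    return names


# Sample input copied from the log excerpt above.
sample = """\
---> test TestDQMOnlineClient-beam_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-beampixel_dqm_sourceclient had ERRORS
"""

for name in failed_tests(sample):
    print(name)
```

Deduplicating by name keeps the result short when the same failure is echoed more than once, as it is in the shell trace.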

@mmusich
Contributor

mmusich commented Oct 6, 2025

indeed (see log):

----- Begin Fatal Exception 06-Oct-2025 13:39:34 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 355380 lumi: 19 event: 26716348 stream: 0
   [1] Running path 'p'
   [2] Calling method for module SiPixelClusterProducer/'siPixelClustersPreSplitting'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::DetSetVector<PixelDigi>
Looking for module label: siPixelDigisMorphed
Looking for productInstanceName: 

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "TryToContinue = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------

@mmusich
Contributor

mmusich commented Oct 6, 2025

so the client

process.load("RecoLocalTracker.Configuration.RecoLocalTracker_cff")

takes it from here:

from RecoLocalTracker.SiPixelClusterizer.siPixelClustersPreSplitting_cff import *

This comes from the auto-generated file, without being changed by the process modifier.

Whereas:

from RecoLocalTracker.SiPixelClusterizer.SiPixelClusterizerPreSplitting_cfi import *

gets it from here:

from Configuration.ProcessModifiers.siPixelDigiMorphing_cff import siPixelDigiMorphing
siPixelDigiMorphing.toModify(siPixelClustersPreSplitting,
    src = 'siPixelDigisMorphed'
)

So there is a mismatch, which explains why the relvals work fine whereas the clients do not.
My simple advice is just to change RecoLocalTracker_cff and use the process-modified one instead.

On second thought, this might break some of the Patatrack workflows; to be tested.
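
A toy sketch of this kind of mismatch, assuming the failure arises when the clusterizer's `src` is switched to `siPixelDigisMorphed` while the path being run never schedules a module with that label (which is what the ProductNotFound message suggests). `ToyModule`, `ToyModifier` and `missing_inputs` are hypothetical stand-ins written for this illustration; they are not the FWCore.ParameterSet API, and the `siPixelDigis` label is used only as an example input.

```python
# Toy stand-ins for illustration only; NOT the FWCore.ParameterSet classes.
class ToyModule:
    """A module label plus a dictionary of parameters (hypothetical helper)."""

    def __init__(self, label, **params):
        self.label = label
        self.params = dict(params)


class ToyModifier:
    """Applies parameter overrides only while the modifier is active."""

    def __init__(self, active=False):
        self.active = active

    def toModify(self, module, **overrides):
        if self.active:
            module.params.update(overrides)


def missing_inputs(path):
    """Return (consumer, src) pairs whose src label does not run earlier in the path."""
    seen = set()
    missing = []
    for module in path:
        src = module.params.get("src")
        if src is not None and src not in seen:
            missing.append((module.label, src))
        seen.add(module.label)
    return missing


morphing = ToyModifier(active=True)
digis = ToyModule("siPixelDigis")  # example label, assumed for the sketch
morphed = ToyModule("siPixelDigisMorphed", src="siPixelDigis")
clusters = ToyModule("siPixelClustersPreSplitting", src="siPixelDigis")
morphing.toModify(clusters, src="siPixelDigisMorphed")

client_path = [digis, clusters]          # morphing producer never scheduled
full_path = [digis, morphed, clusters]   # morphing producer runs first

print(missing_inputs(client_path))
print(missing_inputs(full_path))
```

In the sketch, only the path that omits the morphing producer reports an unsatisfied input, so the modified consumer and the list of scheduled modules have to be changed together.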

@mandrenguyen
Contributor

urgent
@ferencek Can you provide a fix following the recommendation of @mmusich today?

cmsbuild added the urgent label Oct 6, 2025

@ferencek
Contributor Author

ferencek commented Oct 6, 2025

> urgent @ferencek Can you provide a fix following the recommendation of @mmusich today?

working on it...

@cmsbuild
Contributor

cmsbuild commented Oct 6, 2025

-1

Failed Tests: UnitTests
Size: This PR adds an extra 24KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2be3f1/48481/summary.html
COMMIT: 16c52de
CMSSW: CMSSW_16_0_X_2025-10-06-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49065/48481/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 2 errors in the following unit tests:

---> test TestDQMOnlineClient-beam_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-beampixel_dqm_sourceclient had ERRORS

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2086 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3940073
  • DQMHistoTests: Total failures: 2303
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3937750
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 218 log files, 188 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@cmsbuild
Contributor

cmsbuild commented Oct 6, 2025

+1

Size: This PR adds an extra 28KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2be3f1/48489/summary.html
COMMIT: b65ef5d
CMSSW: CMSSW_16_0_X_2025-10-06-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S,NVIDIA_T4
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49065/48489/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

AMD_MI300X Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 242 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 146621
  • DQMHistoTests: Total failures: 27522
  • DQMHistoTests: Total nulls: 7
  • DQMHistoTests: Total successes: 119092
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 42 log files, 45 edm output root files, 11 DQM output files
  • TriggerResults: no differences found

AMD_W7900 Comparison Summary

Summary:

NVIDIA_H100 Comparison Summary

Summary:

NVIDIA_L40S Comparison Summary

Summary:

NVIDIA_T4 Comparison Summary

Summary:
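
As a quick sanity check on the two populated DQMHistoTests summaries in this thread (the default comparison for commit 16c52de and the AMD_MI300X comparison for commit b65ef5d), the counts can be tallied as below. The helper `tally` is hypothetical, and treating failures, nulls, successes and skipped as mutually exclusive categories is an assumption, although the quoted numbers are consistent with it.

```python
def tally(total, failures, nulls, successes, skipped):
    """Check that the categories add up to the total and compute the failure share."""
    accounted = failures + nulls + successes + skipped
    return {
        "consistent": accounted == total,
        "failure_pct": round(100.0 * failures / total, 2),
    }


# Numbers copied from the comparison summaries quoted in this thread.
default_run = tally(total=3940073, failures=2303, nulls=0, successes=3937750, skipped=20)
mi300x = tally(total=146621, failures=27522, nulls=7, successes=119092, skipped=0)

print("default:", default_run)
print("AMD_MI300X:", mi300x)
```

The two runs compare different numbers of files (51 versus 11 DQM output files), so the failure shares are not directly comparable.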

@mandrenguyen
Contributor

@cms-sw/dqm-l2 your signature was requested after fixing an issue with the unit tests.
This one is quite urgent; can you have a quick look, please?

@mandrenguyen
Contributor

+1
@cms-sw/dqm-l2 please have a look

@nothingface0
Contributor

Taking a look now

@nothingface0
Contributor

nothingface0 commented Oct 7, 2025

From the comparison failures, I see several histograms not being filled, e.g., these here, or here.

Is this expected?

Edit: P5 tests are OK. If the differences are expected (I cannot really judge with my limited knowledge), we can sign it.

@mmusich
Contributor

mmusich commented Oct 7, 2025

> Is this expected?

If duplicates vanish, that is a good thing (in principle).

@nothingface0
Contributor

+dqm

  • Rushed checks of histo comparison probably OK
  • P5 tests ok

@cmsbuild
Contributor

cmsbuild commented Oct 7, 2025

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

cmsbuild merged commit 00b7923 into cms-sw:master Oct 7, 2025
25 checks passed

@mmusich
Contributor

mmusich commented Oct 7, 2025

@nothingface0

> Rushed checks of histo comparison probably OK

just adding for completeness, there was an entire release validation check about this link to ValDB (including re-reco of an entire data-taking era link). I would not call this "rushed check of histos"....

@nothingface0
Contributor

> just adding for completeness, there was an entire release validation check about this link to ValDB (including re-reco of an entire data-taking era link). I would not call this "rushed check of histos"....

I was referring to the check that I just did hastily, I had no knowledge of the work that you mentioned.

ferencek deleted the SiPixelDigiMorphing_enabledFrom2025 branch October 7, 2025 08:16

@mmusich
Contributor

mmusich commented Oct 7, 2025

> I was referring to the check that I just did hastily, I had no knowledge of the work that you mentioned.

Sure, I commented for the record, to dispel the impression this got merged in a rush without detailed checks on the physics outcome.
