Skip to content

MTD geometry: update ETL geometry to most recent drawings (scenarios v9 and v10) #47535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 20, 2025

Conversation

fabiocos
Copy link
Contributor

@fabiocos fabiocos commented Mar 8, 2025

PR description:

Implementation of the latest ETL geometry drawings for both the full nominal scenario (as v9) and the reduced "1.7" scenario (as v10). For more details see the presentation by @leonardolanteri at the MTD DPG meeting https://indico.cern.ch/event/1522057/contributions/6403842/attachments/3027548/5343884/ETL%20Geom%20Update%20-%20March%202025.pdf .

The new corresponding CMS scenarios D118 and D118 are built on top of D117, integrating the updated v4 version of BTL. The Configuration/Geometry RESDME file is not updated, as this is independently done in the PR #47531 .

PR validation:

All the geometry unit tests have been successfully run and checked up to reconstruction navigation.

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2025

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2025

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2025

A new Pull Request was created by @fabiocos for master.

It involves the following packages:

  • Configuration/Geometry (upgrade, geometry)
  • Configuration/PyReleaseValidation (pdmv, upgrade)
  • Configuration/StandardSequences (operations)
  • DataFormats/ForwardDetId (upgrade, simulation)
  • Geometry/CMSCommonData (upgrade, geometry)
  • Geometry/MTDCommonData (upgrade, geometry)
  • Geometry/MTDGeometryBuilder (upgrade, geometry)
  • RecoMTD/DetLayers (reconstruction, upgrade)

@AdrianoDee, @Dr15Jones, @Moanwar, @antoniovilela, @bsunanda, @civanch, @cmsbuild, @davidlange6, @DickyChant, @fabiocos, @jfernan2, @kpedro88, @makortel, @mandrenguyen, @mdhildreth, @miquork, @rappoccio, @srimanob, @subirsarkar can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @Martin-Grunewald, @VinInn, @VourMa, @apsallid, @bsunanda, @denizsun, @dgulhan, @felicepantaleo, @makortel, @martinamalberti, @missirol, @mmusich, @mtosi, @rovere, @salimcerci, @sameasy, @slomeo, @vargasa, @youyingli this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@fabiocos
Copy link
Contributor Author

fabiocos commented Mar 8, 2025

please test workflow 33234.0,33634.0

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2025

-1

Failed Tests: UnitTests RelVals
Size: This PR adds an extra 104KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-016bd3/44861/summary.html
COMMIT: 642dd5f
CMSSW: CMSSW_15_1_X_2025-03-08-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/47535/44861/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 7 errors in the following unit tests:

---> test GeometryMTDGeometryBuilderTestDriver had ERRORS
---> test GeometryMTDCommonDataTestDriver had ERRORS
---> test GeometryTrackerGeometryBuilderTestDriver had ERRORS
and more ...

RelVals

----- Begin Fatal Exception 08-Mar-2025 20:55:40 CET-----------------------
An exception of category 'MTDTopologyEP' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module L1CaloJetProducer/'l1tCaloJet'
   [4] Prefetching for module L1TowerCalibrator/'l1tTowerCalibration'
   [5] Prefetching for module L1EGCrystalClusterEmulatorProducer/'l1tEGammaClusterEmuProducer'
   [6] Prefetching for module EcalEBTrigPrimProducer/'simEcalEBTriggerPrimitiveDigis'
   [7] Prefetching for module MixingModule/'mix'
   [8] Calling method for EventSetup module MTDTopologyEP/'mtdTopology'
Exception Message:
Inconsistent size of ETL structure arrays
----- End Fatal Exception -------------------------------------------------

@fabiocos
Copy link
Contributor Author

fabiocos commented Mar 9, 2025

hold

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 9, 2025

Pull request has been put on hold by @fabiocos
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 9, 2025

@bsunanda
Copy link
Contributor

@subirsarkar please approve this

@subirsarkar
Copy link

+Upgrade

@civanch
Copy link
Contributor

civanch commented Mar 11, 2025

+1

@jfernan2
Copy link
Contributor

+1

@fabiocos
Copy link
Contributor Author

@cms-sw/pdmv-l2 comments?

@AdrianoDee
Copy link
Contributor

+pdmv

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 3c82c29 into cms-sw:master Mar 20, 2025
11 checks passed
@fabiocos
Copy link
Contributor Author

@cms-sw/orp-l2 @cms-sw/geometry-l2 @cms-sw/mtd-dpg-l2 @leonardolanteri

This PR seems to produce systematic crashes in the IB for the scenario D119, which were not seen in these tests, apparently due to excessive memory usage.

We already know that there are some overlaps to fix, and we are working towards an updated version. We will investigate this issue in parallel.

@AdrianoDee
Copy link
Contributor

This PR seems to produce systematic crashes in the IB for the scenario D119, which were not seen in these tests, apparently due to excessive memory usage.

For the future: is this something we could spot via the profiling tests (with enable profiling)?

@fabiocos
Copy link
Contributor Author

@AdrianoDee there is already some memory monitoring in the regular test output, and in the JobReport, and I do not seem to see there evidence of large anomalies, comparing for instance with D118 which is not showing issues in the IB

@slomeo
Copy link
Contributor

slomeo commented Mar 21, 2025

We already know that there are some overlaps to fix,

Hi @fabiocos,
As far as I know, the D117 is built on top of the D116, which has no overlaps in the MUON system (only in the ECAL). Do you think these overlaps could be the cause of this crash?

@makortel
Copy link
Contributor

This PR seems to produce systematic crashes in the IB for the scenario D119, which were not seen in these tests, apparently due to excessive memory usage.

For the future: is this something we could spot via the profiling tests (with enable profiling)?

Not really, because at the moment we don't have a reliable and useful memory profiler (IgProf has not been reliable on el8 onwards).

Written that, the default PR tests have included MaxMemoryPreload monitoring for quite some time in all runTheMatrix workflows. That tool reports the maximum amount of allocated memory at any given time during the job, that we hope will eventually be able catch (automatically) this kind of problems (there is a O(70 MB) variability in ONNX #46966 that presently limits is usefulness). The report has already been cleaned up from the 2-week-old PR tests. But the matrix test outputs were still there (but baseline was not), and that shows

Memory Report: total memory requested: 179389809566
Memory Report:  max memory used: 4726481128
Memory Report:  presently used: 1952053920
Memory Report:  # allocations calls:   246775088
Memory Report:  # deallocations calls: 244158994

Maximum memory used (over 10 events) was 4.4 GB, whereas the IB test is apparently killed after

MemoryCheck: module PoolOutputModule:FEVTDEBUGHLToutput VSIZE 21203.2 640 RSS 17164.2 494.672

so the old PR tests did not suggest this PR would cause a memory problem.

@makortel
Copy link
Contributor

Let's move further discussion to #47652

@fabiocos fabiocos deleted the fc-etlv9v10-20250306 branch April 9, 2025 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants