Mpi local run #30
base: MPI_updates

Conversation
…-overflow Fix event number overflow in ES packer - 150X
[15_0_X] Reduce minYsizeB1 and minYsizeB2 CA cuts for Phase1
[15_0_X] Remove size scalar from HcalRecHitSoALayout as unused
…ScoutingPFMonitor dataset
add back `@standardDQM+@miniAODDQM+@nanoAODDQM` sequences to wf 145.415
Implement a compile-time warp size constant [15.0]
…Clusterizer_GPU_DEBUG Fix `SiPixelClusterizer` alpaka code when `GPU_DEBUG` is defined [15.0.x]
…tingDQM_15_0_X [15.0.X] add a test workflow for testing the new `@hltScouting` DQM sequence in `ScoutingPFMonitor` dataset
add dictionary for `std::pair<short,int>` [`15_0_X`]
…0pre1 Backport: Developing offline JetMET DQM for Scouting jets
[15.0.X] Updated HLT BTV validation paths from DeepCSV to PNet
…e_throwOnMissing_15_0_X [15.0.X] fix `throwOnMissing` logic in `ObjectSelectorBase`
…y_ctor Fix warning about implicitly-declared copy constructor [15.0.x]
add a filter sequence if it is present in the fragment [15.0.X]
Fixes to the DTH parser based on tests with real DTH output. Also fixes the case of an orbit containing only one event, and disables the checksum by default in the unit test, because it does not work for data at the moment.
…an be parametrized through DaqDirector; in the future it will be detected from ramdisk metadata.
* Added fileDiscoveryMode, which can be used live instead of fileBroker. Lower performance is expected on NFS due to the many file operations; atomicity in grabbing files is ensured by renaming each file to a unique name (even over NFS).
* For the new mode, added an eventCounter function (to models) which can do early counting of events in the file, used if the (deprecated) json index file is not provided and the file does not come with a file header providing the event count and (optionally) the file size.
* Autodetection of the raw file header without the file broker is implemented.
* Unit tests implemented for various scenarios.
* The new daqParameters json file is copied by the fakeBU.
…r47047_15_0_x [15.0.X] fix `customizeHLTfor47047` to work also on menus already customized
…_15_0_X Backport to produce 2024 and 2025 Tau Embedding samples
…gration_tests Implement additional framework integration tests [15.0.x]
[GEM][backport] turning on the applyMasking for 2025 Run 3 GEM data taking [15.0.x]
…tern Implement `edm::ProductNamePattern` [15.0.x]
This EDProducer will clone all the event products declared by its configuration, using their ROOT dictionaries.
Let multiple CMSSW processes on the same or different machines coordinate event processing and transfer data products over MPI. The implementation is based on four CMSSW modules. Two are responsible for setting up the communication channels and coordinating the event processing:
- the MPIController
- the MPISource

and two are responsible for the transfer of data products:
- the MPISender
- the MPIReceiver

The MPIController is an EDProducer running in a regular CMSSW process. After setting up the communication with an MPISource, it transmits to it all EDM run, lumi and event transitions, and instructs the MPISource to replicate them in the second process.

The MPISource is a Source controlling the execution of a second CMSSW process. After setting up the communication with an MPIController, it listens for EDM run, lumi and event transitions, and replicates them in its process.

Both MPIController and MPISource produce an MPIToken, a special data product that encapsulates the information about the MPI communication channel.

The MPISender is an EDProducer that can read a collection of a predefined type from the Event, serialise it using its ROOT dictionary, and send it over the MPI communication channel.

The MPIReceiver is an EDProducer that can receive a collection of a predefined type over the MPI communication channel, deserialise it using its ROOT dictionary, and put it in the Event.

Both MPISender and MPIReceiver are templated on the type to be transmitted and de/serialised. Each MPISender and MPIReceiver is configured with an instance value that is used to match one MPISender in one process to one MPIReceiver in another process; using different instance values allows the use of multiple MPISenders/MPIReceivers in a process.

Both MPISender and MPIReceiver obtain the MPI communication channel by reading an MPIToken from the event. They also produce a copy of the MPIToken, so other modules can consume it to declare a dependency on the previous modules.

An automated test is available in the test/ directory.
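As a rough illustration, the controller/sender side could be wired up along the following lines. This is a minimal sketch based only on the description above: the module labels, the parameter names (`token`, `source`, `instance`), and the `TrackMPISender` specialisation of the templated sender are all hypothetical, not taken from the actual code.

```python
# Controller/sender process: a minimal sketch; all plugin and parameter
# names are illustrative assumptions, not the actual interface.
import FWCore.ParameterSet.Config as cms

process = cms.Process("OFFLOAD")
process.source = cms.Source("EmptySource")
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(10))

# Sets up the MPI channel, forwards run/lumi/event transitions to the
# remote MPISource, and produces an MPIToken describing the channel.
process.mpiController = cms.EDProducer("MPIController")

# Hypothetical specialisation of the templated MPISender for one collection
# type; 'instance' pairs it with the remote MPIReceiver using the same value.
process.tracksSender = cms.EDProducer("TrackMPISender",
    token = cms.InputTag("mpiController"),    # MPIToken with the channel
    source = cms.InputTag("generalTracks"),   # collection to serialise and send
    instance = cms.uint32(1)
)

process.path = cms.Path(process.mpiController + process.tracksSender)
```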
Let MPISender and MPIReceiver consume, send/receive and produce collections of arbitrary types, as long as they have a ROOT dictionary and can be persisted. Note that any transient information is lost during the transfer, and needs to be recreated by the receiving side. The documentation and tests are updated accordingly. A sketch of the receiving side is shown below.

Warning: this approach is a work in progress! TODO:
- improve framework integration
- add checks between send/receive types
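The receiving side would mirror the sender sketch above: the MPISource replays the transitions it receives, and an MPIReceiver with a matching instance value puts the deserialised product into the Event. Again a sketch only, with the same caveat that all labels and parameter names are assumptions:

```python
# Remote process driven by the MPISource: a minimal sketch; all plugin and
# parameter names are illustrative assumptions, not the actual interface.
import FWCore.ParameterSet.Config as cms

process = cms.Process("REMOTE")

# Replicates the run/lumi/event transitions received from the MPIController
# and produces an MPIToken describing the channel.
process.source = cms.Source("MPISource")

# With the arbitrary-type version, the received product is deserialised via
# its ROOT dictionary; 'instance' matches the remote sender's value.
process.tracksReceiver = cms.EDProducer("MPIReceiver",
    token = cms.InputTag("source"),
    instance = cms.uint32(1)
)

process.path = cms.Path(process.tracksReceiver)
```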
```cpp
int numProducts;
token.channel()->receiveProduct(instance_, numProducts);
edm::LogVerbatim("MPIReceiver") << "Received number of products: " << numProducts;
// int numProducts;
```
Why are these commented out?
I think "local run" may be misleading: both approaches (single Could you rename the option to "useMPINameServer" ? |