2025 06 02 webex
• Discuss issue 733
• Upcoming Forum
• Continue with “dynamic processes with pset” proposal
Howard will revive Sessions attributes in response to updates to this ticket. Sonja will send Howard a blurb about the modified MPI spawn functionality available in ParTec MPI.
Continue discussing example in the dynamic process proposal.
More discussion of the example. Dominik will add a check for MPI_INFO_NULL being returned from MPI_Session_get_pset_info before querying the info object.
Return to deconstruction of MPI_Session_ipsetop. Could this function be used to get placement info for a subsequent spawn operation? Ralph talked about PMIx_Group_construct: we don't actually need to know all of the members when constructing a group.
For next time we'll try to split the example into two: one for growing only, one for shrinking only. For growing, add a sub-example for spawning rather than relying on something under the hood.
May cancel next meeting as at least one of our key players will be out.
Meeting summary for MPI Sessions WG (06/02/2025)
The team discussed datatype persistence concerns in MPI and explored potential solutions for session management and attribute handling. They examined process set implementation details, including membership verification and information retrieval mechanisms, while also addressing challenges with MPI_Session_ipsetop and communication protocols. The group concluded by planning future meetings and code improvements, including splitting the example cases and preparing for the upcoming MPI 5.0 ratification.
• Dominik to add a check for whether MPI_Session_get_pset_info returns MPI_INFO_NULL before calling MPI_Info_get in the example.
• Dominik to create separate examples for the growing and shrinking use cases in the dynamic process set proposal.
• Dominik, Sonja, and Martin to work on a spawning example using the output psets.
• Sonja to send Howard information about the ParTec spawn extension.
• Howard to include the dynamic process set proposal and flexible spawn mechanism in the working group summary for the upcoming forum.
• Howard to consider adding attributes to sessions if the MPI 5.0 ratification does not proceed as expected.
Howard discussed concerns about datatype persistence in MPI after sessions finalize, explaining that a cleanup mechanism could lead to Valgrind errors if not managed properly. He expressed reluctance to add API bloat and mentioned that someone proposed adding an MPI session type commit feature to better support C++ interfaces, which would allow for attribute-free methods during session closure.
The team discussed restoring attributes that were previously removed due to concerns from a Cisco representative, with Howard proposing to reinstate them for MPI 5.1. Dominik raised questions about session management and PMIx support, which Howard confirmed were not problematic. The group also prepared for an upcoming forum, with Sonja offering to provide examples of dynamic process sets and non-blocking spawning mechanisms, and Howard inquiring about Dan Holmes' current contact information.
The team discussed the handling of process sets in MPI, particularly when a process is not a member. They reviewed the current implementation, where MPI_Session_get_pset_info returns MPI_INFO_NULL if the process is not a member, and considered adding a new attribute to explicitly indicate membership. Sonja suggested this approach would be similar to the current size attribute, which is not standard. The group also discussed that group operations like difference and intersection are local and should work even if a process is not a member of a process set.
Dominik and Sonja discussed a proposal to omit a list of process sets and allow the use of psets from other sources, which would not appear in the list but could still be queried for information. Howard expressed satisfaction with the current standard but suggested adding a check to ensure the info object is not MPI_INFO_NULL before calling MPI_Info_get. Michael clarified that the MPI_Session_get_pset_info call is primarily testing for pset membership, rather than being dependent on the session.
The team discussed the relationship between process sets and sessions, noting that session-specific process set information is not available from the start and must be created separately. Dominik observed that in the current implementation, multiple sessions share the same runtime environment and there is limited separation between sessions regarding process sets. The discussion then shifted to MPI_Session_ipsetop, where Sonja expressed concerns about the lack of fine-grained control and transparency in the operation, suggesting a need for a more user-friendly solution.
Sonja proposed querying the runtime for potential changes to application operations, allowing users to trigger actions based on the returned information. Dominik and Howard discussed output psets as placeholders for processes and the need to separate optimization and spawn information. They considered having placement and resource info provided in a side document, similar to the Memkind approach, to be used later in spawn calls for manipulating process settings.
Howard explained that the current implementation only requires process set names to create an intercommunicator between spawned processes and the parent group. He described how the leaders of each group exchange their lists of processes, which are then used to construct the groups with PMIx_Group_construct. This exchange of information allows the intercommunicator to be created without additional connectivity details beyond the process set names.
Ralph informed the group that he has rewritten Howard's DPM code to allow for the creation of group sessions without knowing all participants. Howard acknowledged this and noted that it shifts the problem, as they still need a unique tag string that all participants agree on. Dominik suggested that the dictionary functionality could help with exchanging the tag used by all participating processes to create a group. Howard confirmed this and explained that in the current design, only one process, acting as a controller, would call MPI_Session_ipsetop to interact with the runtime.
The team discussed issues with MPI_Session_ipsetop and communication protocols, with Howard expressing concern about having too many concurrent processes. Dominik explained that the current system avoids calling MPI_Session_ipsetop while requests are pending, but Howard suggested deconstructing MPI_Session_ipsetop for future improvements. Ralph proposed finding alternative ways to identify groups without relying on unique strings, and the team agreed to explore these options in their next meeting.
The team discussed splitting Example 2 into separate cases for growing and shrinking to better analyze potential issues. They agreed to meet again on the 16th, with Sonja and Dominik planning to work on the spawning use case and Martin Schreiber potentially leading the discussion. Howard requested a small change to the code, asking Dominik to add a check for MPI_INFO_NULL before calling MPI_Info_get. The team also briefly touched on the upcoming MPI 5.0 ratification and the potential need for a quick fix if it fails, though Howard expressed confidence in its success.
**AI-generated content may be inaccurate or misleading. Always check for accuracy.**