OSHMEM/MCA/SPML/UCX: added support for team management functions #13177

Merged
merged 1 commit into from
Apr 24, 2025

Conversation

roiedanino
Contributor

@roiedanino roiedanino commented Apr 6, 2025

Implemented team-management functions according to OpenSHMEM 1.5 spec

Tests PR: https://github.com/openshmem-org/tests-mellanox/pull/59/files

github-actions bot commented Apr 6, 2025

Hello! The Git Commit Checker CI bot found a few problems with this PR:

26401af: OSHMEM/MCA/SPML/UCX: removed unnecessary team_type...

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@roiedanino roiedanino force-pushed the shmem/1.5-support-ucx branch 2 times, most recently from 41068b3 to cff7c93 Compare April 7, 2025 10:43
Hello! The Git Commit Checker CI bot found a few problems with this PR:

450086a: OSHMEM/MCA/SPML/UCX: WIP

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@roiedanino roiedanino force-pushed the shmem/1.5-support-ucx branch from 450086a to ba0d28e Compare April 21, 2025 14:49
@roiedanino
Contributor Author

Hi @MamziB can you please take a look?

@roiedanino roiedanino force-pushed the shmem/1.5-support-ucx branch 2 times, most recently from a024387 to a6babaf Compare April 21, 2025 15:14
janjust previously approved these changes Apr 22, 2025
Comment on lines 1864 to 1870
ucx_new_team->config = calloc(1, sizeof(mca_spml_ucx_team_config_t));

if (config != NULL) {
    memcpy(&ucx_new_team->config->super, config, sizeof(shmem_team_config_t));
}

ucx_new_team->config = (mca_spml_ucx_team_config_t*)config;
Contributor

ucx_new_team->config is leaked here: the buffer allocated with calloc is overwritten by the final assignment before it is ever freed.
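
One way to avoid the leak is to keep the team's own copy of the config and drop the second assignment. The sketch below is only an illustration, not the code that was merged: team_config_t, ucx_team_config_t, ucx_team_t, team_set_config and team_release_config are hypothetical stand-ins for the real Open MPI types and functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for shmem_team_config_t and
 * mca_spml_ucx_team_config_t; the real layouts may differ. */
typedef struct { int num_contexts; } team_config_t;
typedef struct { team_config_t super; } ucx_team_config_t;
typedef struct { ucx_team_config_t *config; } ucx_team_t;

/* Give the team its own zero-initialized copy of the caller's config.
 * Returns 0 on success, -1 if the allocation fails. */
static int team_set_config(ucx_team_t *team, const team_config_t *config)
{
    assert(team != NULL);

    team->config = calloc(1, sizeof(*team->config));
    if (team->config == NULL) {
        return -1;
    }
    if (config != NULL) {
        memcpy(&team->config->super, config, sizeof(*config));
    }
    /* No second assignment to team->config: overwriting the pointer
     * here would orphan the buffer allocated above. */
    return 0;
}

/* Release the team-owned copy when the team is destroyed. */
static void team_release_config(ucx_team_t *team)
{
    assert(team != NULL);
    free(team->config);
    team->config = NULL;
}
```

Owning a copy also means the team does not depend on the lifetime of the caller's config object.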

/* In order to simplify pe translations start and stride are calculated with respect to
* world_team */
ucx_new_team = (mca_spml_ucx_team_t *)malloc(sizeof(mca_spml_ucx_team_t));
ucx_new_team->start = parent_start + start;
Contributor

@MamziB MamziB Apr 22, 2025

This is incorrect. To compute the new team's start relative to the world team, we must also account for the parent's stride:
ucx_new_team->start = parent_start + (start * parent_stride);
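
The arithmetic behind this can be sketched as follows, assuming each team is described by a (start, stride, size) triplet in world-team PE numbers. Index i of the parent maps to world PE parent_start + i * parent_stride, so child index j, which is parent index start + j * stride, maps to (parent_start + start * parent_stride) + j * (stride * parent_stride). The struct and function names below are hypothetical, and the composed stride is derived from that model rather than taken from the PR.

```c
#include <assert.h>

/* Hypothetical strided team descriptor, expressed in world-team PE numbers. */
typedef struct {
    int start;   /* world PE of team member 0 */
    int stride;  /* distance in world PEs between consecutive members */
    int size;    /* number of members */
} strided_team_t;

/* Build a child team from a (start, stride, size) triplet that is given
 * in the parent's PE numbering, storing it in world-team terms. */
static strided_team_t split_strided(strided_team_t parent,
                                    int start, int stride, int size)
{
    strided_team_t child;

    assert(start >= 0 && stride > 0 && size > 0);

    child.start  = parent.start + start * parent.stride;
    child.stride = stride * parent.stride;
    child.size   = size;
    return child;
}

/* Translate a PE index within a team to its world-team PE number. */
static int team_pe_to_world(strided_team_t team, int pe)
{
    assert(pe >= 0 && pe < team.size);
    return team.start + pe * team.stride;
}
```

For example, a parent covering world PEs 2, 5, 8, 11, 14 (start 2, stride 3, size 5) split with start 1, stride 2, size 2 selects parent indices 1 and 3, that is, world PEs 5 and 11. The uncorrected expression parent_start + start would have placed the child's first member at world PE 3, which is not in the parent team at all.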

int parent_stride;
int my_pe;

SPML_UCX_ASSERT(((start + size * stride) <= oshmem_num_procs()) && (start < size) && (stride > 0) && (size > 0));
Contributor

size can be any positive value, and start is a PE index in the parent team, so start < size is not required to hold.
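
A bounds check that does not rely on start < size could look like the sketch below. It is an assumption about what a reasonable check might be, not the assertion that was eventually merged: it only requires that the last selected index, start + (size - 1) * stride, still lies inside the parent team. The function name split_args_valid is hypothetical, and a strictly positive stride is assumed, as in the original assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true when the (start, stride, size) triplet selects only PEs
 * that exist in a parent team of parent_size members. */
static bool split_args_valid(int parent_size, int start, int stride, int size)
{
    long last;

    assert(parent_size > 0);

    if (start < 0 || stride <= 0 || size <= 0) {
        return false;
    }
    /* Index, in the parent team, of the last PE selected by the triplet. */
    last = (long)start + (long)(size - 1) * (long)stride;
    return last < (long)parent_size;
}
```

With a parent of 8 PEs, the triplet start 4, stride 1, size 2 passes this check even though start is greater than size, which is the case the original assertion rejected.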

@MamziB
Contributor

MamziB commented Apr 23, 2025

looks good now

@roiedanino roiedanino force-pushed the shmem/1.5-support-ucx branch from c8cce70 to d0a4e07 Compare April 24, 2025 07:31
@roiedanino
Contributor Author

@jsquyres @janjust, can you please merge this?

@janjust janjust merged commit 1d301f7 into open-mpi:main Apr 24, 2025
15 checks passed