Conversation

lifflander (Contributor) commented Dec 18, 2024

Fixes #1

I'm thinking about adding MPI_Get_processor_name in this PR, or we could do it in a script instead.
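For context, emitting each rank's processor name so a script can map slow ranks back to physical nodes might look like the sketch below. This is not taken from the PR; the "host: <rank>: <name>" output format is an assumption for illustration.

```cpp
#include <mpi.h>
#include <cstdio>

// Sketch (not from the PR): print each rank's processor name so a
// post-processing script can associate timings with physical nodes.
int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  char name[MPI_MAX_PROCESSOR_NAME];
  int len = 0;
  MPI_Get_processor_name(name, &len);  // len excludes the terminator

  // Hypothetical output format, one line per rank:
  std::printf("host: %d: %.*s\n", rank, len, name);

  MPI_Finalize();
  return 0;
}
```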

@lifflander lifflander linked an issue Dec 18, 2024 that may be closed by this pull request
lifflander (Contributor, Author) commented Dec 18, 2024

This is the output it produces on 8 ranks with 20 iterations; it is meant to be read in by a script:

gather: 0: 3.26723: breakdown: 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272 0.154272
gather: 1: 3.21715: breakdown: 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019 0.149019
gather: 2: 3.24052: breakdown: 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853 0.151853
gather: 3: 3.28802: breakdown: 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015 0.149015
gather: 4: 3.27613: breakdown: 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472 0.147472
gather: 5: 3.27223: breakdown: 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965 0.152965
gather: 6: 3.29078: breakdown: 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535 0.147535
gather: 7: 3.25363: breakdown: 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264 0.151264

@cwschilly cwschilly changed the title #1: initial: implement slow node benchmark #1: Implement new benchmark Dec 18, 2024
)

target_link_libraries(${SLOW_NODE_EXE} PUBLIC MPI::MPI_CXX)
target_link_libraries(${SLOW_NODE_EXE} PUBLIC Kokkos::kokkoskernels)

I had to change this line to link against my installed Trilinos:

target_link_libraries(${SLOW_NODE_EXE} PUBLIC KokkosKernels::kokkoskernels)

lifflander (Contributor, Author) replied:

Yes, we need an option to switch between finding Trilinos and linking that way vs. directly using Kokkos kernels.
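A sketch of what such a toggle might look like; the option name is hypothetical, and the exact find_package/target names exported by a given Trilinos or Kokkos Kernels install may differ:

```cmake
# Hypothetical toggle (not from the PR): choose how Kokkos Kernels is found.
option(SLOW_NODE_USE_TRILINOS "Link Kokkos Kernels via a Trilinos install" OFF)

if(SLOW_NODE_USE_TRILINOS)
  # Trilinos-installed Kokkos Kernels exports the KokkosKernels:: namespace.
  target_link_libraries(${SLOW_NODE_EXE} PUBLIC KokkosKernels::kokkoskernels)
else()
  # Standalone Kokkos Kernels, as the current CMakeLists links it.
  target_link_libraries(${SLOW_NODE_EXE} PUBLIC Kokkos::kokkoskernels)
endif()
```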

lifflander merged commit 1176e51 into master Mar 4, 2025
5 checks passed
Development

Successfully merging this pull request may close these issues.

Implement new benchmark

4 participants