#2183: LDMS: add stream for phase data #2184
Code used for reading the data:

#include <stdio.h>
#include <unistd.h>
#include <ldms/ldms.h>
#include <ldms/ldmsd_stream.h>
#include <ovis_util/util.h>

// Callback invoked for each message received on the subscribed stream
int stream_handler(ldmsd_stream_client_t c, void *ctxt,
                   ldmsd_stream_type_t stream_type,
                   const char *data, size_t data_len,
                   json_entity_t entity)
{
  // Process the received data
  printf("%.*s\n", (int)data_len, data);
  return 0;
}

int main()
{
  ldms_init(256);
  ldms_t ldms = ldms_xprt_new_with_auth("sock", "none", NULL);
  int rc = ldms_xprt_connect_by_name(ldms, "localhost", "10444", NULL, NULL);
  if (rc) {
    // A non-zero return code means the connection failed
    printf("Error code %d \n", rc);
  }
  ldmsd_stream_subscribe("LB_data", stream_handler, NULL);
  while (1) {
    // Keep running to continue receiving data, without spinning the CPU
    sleep(1);
  }
  ldms_xprt_put(ldms);
  return 0;
}

Config file:

Script to launch the LDMS daemon:

#!/bin/bash
TOP=/ovis/LDMS_install
export LD_LIBRARY_PATH=$TOP/lib/:$TOP/lib:$LD_LIBRARY_PATH
export LDMSD_PLUGIN_LIBPATH=$TOP/lib/ovis-ldms
export ZAP_LIBPATH=$TOP/lib/ovis-ldms
export PATH=$TOP/sbin:$TOP/bin:$PATH
export PYTHONPATH=$TOP/lib/python2.7/site-packages
ldmsd -x sock:10444 -c /ldmsd.conf -l /tmp/demo_ldmsd_log -v DEBUG -r $(pwd)/ldmsd.pid
diff --git a/src/vt/configs/features/features_defines.h b/src/vt/configs/features/features_defines.h
index 3e645c2fb..39c01bbc1 100644
--- a/src/vt/configs/features/features_defines.h
+++ b/src/vt/configs/features/features_defines.h
@@ -63,7 +63,7 @@
#define vt_feature_memory_pool 0 || vt_feature_cmake_memory_pool
#define vt_feature_priorities 0 || vt_feature_cmake_priorities
#define vt_feature_fcontext 0 || vt_feature_cmake_fcontext
-#define vt_feature_ldms 0 || vt_feature_cmake_ldms
+#define vt_feature_ldms 0 || vt_feature_cmake_ldms
#define vt_feature_mimalloc 0 || vt_feature_cmake_mimalloc
#define vt_feature_mpi_access_guards 0 || vt_feature_cmake_mpi_access_guards
#define vt_feature_zoltan 0 || vt_feature_cmake_zoltan
diff --git a/src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc b/src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc
index 1f7f8adc8..0e23ab61b 100644
--- a/src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc
+++ b/src/vt/vrt/collection/balance/lb_invoke/lb_manager.cc
@@ -631,11 +631,10 @@ void LBManager::stagePreLBStatistics(const StatisticMapType &statistics) {
nlohmann::json j;
j["pre-LB"] = lb::jsonifyPhaseStatistics(statistics);
-
- #if vt_check_enabled(ldms)
+#if vt_check_enabled(ldms)
j["ts"] = MPI_Wtime();
theNodeLBData()->writeJSONToLDMS(j);
- #endif
+#endif
if (!statistics_writer_) {
createStatisticsFile();
diff --git a/src/vt/vrt/collection/balance/node_lb_data.cc b/src/vt/vrt/collection/balance/node_lb_data.cc
index 2d87ed386..bff24dd3e 100644
--- a/src/vt/vrt/collection/balance/node_lb_data.cc
+++ b/src/vt/vrt/collection/balance/node_lb_data.cc
@@ -67,7 +67,6 @@
#include INCLUDE_FMT_FORMAT
-
namespace vt { namespace vrt { namespace collection { namespace balance {
void NodeLBData::setProxy(objgroup::proxy::Proxy<NodeLBData> in_proxy) {
@@ -169,8 +168,12 @@ void NodeLBData::initializeLDMS() {
const auto hostname = getenv("VT_LDMS_HOSTNAME");
const auto port = getenv("VT_LDMS_PORT");
- const auto returnCode = ldms_xprt_connect_by_name(ldms_, hostname, port, NULL, NULL);
- vtWarnIf(returnCode == 0, fmt::format("ldms_xprt_connect_by_name failed with code {} \n", returnCode));
+ const auto returnCode =
+ ldms_xprt_connect_by_name(ldms_, hostname, port, NULL, NULL);
+ vtWarnIf(
+ returnCode == 0,
+ fmt::format("ldms_xprt_connect_by_name failed with code {} \n", returnCode)
+ );
}
#endif
@@ -340,9 +343,8 @@ void NodeLBData::outputLBDataForPhase(PhaseType phase) {
void NodeLBData::writeJSONToLDMS([[maybe_unused]] const nlohmann::json& j) {
if (ldms_prev_submission_ == 0) {
ldms_prev_submission_ = MPI_Wtime();
- } else if (
- (MPI_Wtime() - ldms_prev_submission_) * 1000.0 < (double)ldms_milli_freq_
- ) {
+ } else if ((MPI_Wtime() - ldms_prev_submission_) * 1000.0 <
+ (double)ldms_milli_freq_) {
return;
} else {
ldms_prev_submission_ = MPI_Wtime();
@@ -352,7 +354,10 @@ void NodeLBData::writeJSONToLDMS([[maybe_unused]] const nlohmann::json& j) {
const auto returnVal = ldmsd_stream_publish(
ldms_, "vtLBStats", LDMSD_STREAM_JSON, jsonStr.c_str(), jsonStr.length() + 1
);
- vtWarnIf(returnVal == 0, fmt::format("ldmsd_stream_publish returned {}!\n", returnVal));
+ vtWarnIf(
+ returnVal == 0,
+ fmt::format("ldmsd_stream_publish returned {}!\n", returnVal)
+ );
}
#endif
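
For readers following the `writeJSONToLDMS` hunk above: the rate limiting it performs can be sketched on its own, without MPI or LDMS. The class below is a hypothetical illustration and not part of vt; `SubmissionThrottle` and `shouldSubmit` are made-up names, and it uses `std::chrono` with an explicit flag instead of `MPI_Wtime()` and the `ldms_prev_submission_ == 0` sentinel. As in the hunk, the first call is let through, and later calls are dropped until the minimum interval has elapsed.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of vt): decides whether enough time has
// passed since the last accepted submission.
class SubmissionThrottle {
public:
  using Clock = std::chrono::steady_clock;

  explicit SubmissionThrottle(std::chrono::milliseconds min_interval)
    : min_interval_(min_interval) {
    assert(min_interval.count() >= 0);
  }

  // Returns true if a submission at time `now` should go through.
  // The first call always passes; later calls pass only once at least
  // `min_interval` has elapsed since the last accepted submission.
  bool shouldSubmit(Clock::time_point now) {
    if (has_prev_ && now - prev_ < min_interval_) {
      return false;
    }
    prev_ = now;
    has_prev_ = true;
    return true;
  }

private:
  std::chrono::milliseconds min_interval_;
  Clock::time_point prev_{};
  bool has_prev_ = false;
};
```

Passing the time point in as an argument, rather than reading the clock inside the method, keeps the logic testable with fixed timestamps; a caller would typically pass `SubmissionThrottle::Clock::now()`.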
@Logan590 Please fix the commit message for ca0cdd3 (just start it with https://github.com/DARMA-tasking/vt/wiki/Pull-Requests#commit-formatting
@lifflander, there are many versions of LDMS. I'm currently using version 4.3.5 to work on this issue. Does that seem good to you?
if (auto ldms_freq = getenv("VT_LDMS_MILLI_FREQ")) {
  ldms_milli_freq_ = atoi(ldms_freq);
}
const auto xPrt = getenv("VT_LDMS_XPRT");
const auto auth = getenv("VT_LDMS_AUTH");
ldms_ = ldms_xprt_new_with_auth(xPrt, NULL, auth, NULL);
vtWarnIf(ldms_, "ldms_xprt_new_with_auth failed!");

const auto hostname = getenv("VT_LDMS_HOSTNAME");
const auto port = getenv("VT_LDMS_PORT");
Those variables should be documented along with the LDMS test procedure described earlier. Maybe a wiki page?
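
As a starting point for that documentation, here is a rough sketch of the five variables read in the snippet under review, gathered in one place. Everything except the variable names is an assumption: `envOr`, `LdmsEnvConfig`, and `readLdmsEnvConfig` are hypothetical helpers that do not exist in vt, and the fallback values (`sock`, `none`, `localhost`, `10444`) are borrowed from the reader example and launch script earlier in this thread rather than from any vt default. The one hard fact it relies on is that `std::getenv` returns a null pointer for an unset variable, so each value is checked before use.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of vt): returns the variable's value,
// or `fallback` when the variable is unset.
inline std::string envOr(const char* name, const std::string& fallback) {
  assert(name != nullptr);
  const char* value = std::getenv(name);
  return value != nullptr ? std::string(value) : fallback;
}

// Hypothetical container for the LDMS-related environment variables.
struct LdmsEnvConfig {
  std::string xprt;      // VT_LDMS_XPRT: transport name
  std::string auth;      // VT_LDMS_AUTH: authentication plugin name
  std::string hostname;  // VT_LDMS_HOSTNAME: host running ldmsd
  std::string port;      // VT_LDMS_PORT: port ldmsd listens on
  int milli_freq;        // VT_LDMS_MILLI_FREQ: min. ms between submissions
};

// Fallback values are illustrative assumptions, not vt defaults.
inline LdmsEnvConfig readLdmsEnvConfig() {
  LdmsEnvConfig cfg;
  cfg.xprt = envOr("VT_LDMS_XPRT", "sock");
  cfg.auth = envOr("VT_LDMS_AUTH", "none");
  cfg.hostname = envOr("VT_LDMS_HOSTNAME", "localhost");
  cfg.port = envOr("VT_LDMS_PORT", "10444");
  cfg.milli_freq = std::atoi(envOr("VT_LDMS_MILLI_FREQ", "0").c_str());
  return cfg;
}
```

With this shape, a run would export the variables before launching (for example `VT_LDMS_HOSTNAME=localhost VT_LDMS_PORT=10444`) and any variable left unset falls back to the illustrative value instead of passing a null pointer along.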
@JacobDomagala Can you please have a look?
JacobDomagala
left a comment
Looks good!
cz4rs
left a comment
Looks good to me in the parts I haven't touched.
lifflander
left a comment
Looks good to me!
Fixes #2183