Kernel logger upgrades for better output and better code comparisons#276
Kernel logger upgrades for better output and better code comparisons#276csiefer2 wants to merge 6 commits intokokkos:developfrom
Conversation
|
Yeah, yeah, I'll fix clang later. Right now I want feedback. |
|
The CI check formatting-check with clang-tidy is fine to not pass for now for the sake of a PR review. However, the CI check for simple-build is showing Also, JM2C, but it may be good to separate this out into two PRs. I think the first change does not depend on whether the Kokkos Tools user sets |
|
@vlkale This is a debugging tool that fences all over the place. Host-side conditionals are basically irrelevant in terms of the performance hit the fences create. As for the compile error, that's because the headers are no longer consistent with the headers in Kokkos. I'll add a fix to that in a second. |
Signed-off-by: Chris Siefert <csiefer@sandia.gov>
|
Compiles just fine now... |
Signed-off-by: Chris Siefert <csiefer@sandia.gov>
| if (eid.type == DeviceType::Serial) | ||
| device_label += "Serial"; | ||
| else if (eid.type == DeviceType::OpenMP) | ||
| device_label += "OpenMP"; | ||
| else if (eid.type == DeviceType::Cuda) | ||
| device_label += "Cuda"; | ||
| else if (eid.type == DeviceType::HIP) | ||
| device_label += "HIP"; | ||
| else if (eid.type == DeviceType::OpenMPTarget) | ||
| device_label += "OpenMPTarget"; | ||
| else if (eid.type == DeviceType::HPX) | ||
| device_label += "HPX"; | ||
| else if (eid.type == DeviceType::Threads) | ||
| device_label += "Threads"; | ||
| else if (eid.type == DeviceType::SYCL) | ||
| device_label += "SYCL"; | ||
| else if (eid.type == DeviceType::OpenACC) | ||
| device_label += "OpenACC"; | ||
| else if (eid.type == DeviceType::Unknown) | ||
| device_label += "Unknown"; | ||
| else | ||
| device_label += "Unknown to KokkosTools"; |
There was a problem hiding this comment.
I wouldn't be opposed to pushing this part (or even the whole function) to Kokkos_Profiling_Interface.hpp. Getting string out of the device id doesn't seem to be specific to the kernel logger tool (also see https://github.com/kokkos/kokkos-tools/pull/265/files#diff-839f34fcb31addd9a48252bebf4d37cf674f6fcdd16539c7039813192674b9c0R165-R187).
There was a problem hiding this comment.
Concur. I also have to maintain this function in Trilinos, so I would love to put it in Kokkos_Profiling_Interface.hpp.
There was a problem hiding this comment.
If @crtrott doesn't object, I'll open a Kokkos PR with that.
Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
True about fencing (aside: maybe in the future this tool could reduce fencing more, e.g., just allowing for global fencing to be turned off, if some Kokkos users want that). Stepping back though and forgetting about efficiency, I think I may be actually misunderstanding the purpose. I thought you are handling a case where the output of kernel-logger is not different across different Kokkos backends for a particular Kokkos application like LAMMPS, but also is non-deterministic for a given Kokkos backend. Did I get that right from "comparing two versions of the code with roughly the same flow, but some different kernel calls here and there. Specifically, code paths with and without UnifiedMemory." ?
Thanks and I realized just now that this is the issue that Kokkos Tools Github Issue #275 brings up. |
I'm interested using the tool to track down cases in the case code flow is deterministic, but the answer is not due to an unintended race condition. |
| using namespace Kokkos::Tools::Experimental; | ||
| std::string device_label("("); | ||
| ExecutionSpaceIdentifier eid = identifier_from_devid(deviceId); | ||
| if (eid.type == DeviceType::Serial) |
There was a problem hiding this comment.
This is a switch case.
| if (varVal) { | ||
| std::string v = std::string(varVal); | ||
| // default to false | ||
| if (v == "1" || v == "ON" || v == "on" || v == "TRUE" || v == "true" || |
There was a problem hiding this comment.
Most of the time, it seems that this repo wants either 0 or 1 for the value of an environment variable.
Then use atoi.
I think it is important that all environment variables follow consistent conventions.
There was a problem hiding this comment.
@romintomasetti This is a Tpetra-ism to allow it to take anything reasonable as an option. If @vlkale or @crtrott doesn't like it that's fine, but we've found the flexibility to be useful in Trilinos
| "KokkosP: Executing parallel-scan kernel on device %s with unique " | ||
| "execution identifier %llu\n", | ||
| devID, (unsigned long long)(*kID)); | ||
| deviceIdToString(devID).c_str(), (unsigned long long)(output)); |
There was a problem hiding this comment.
| deviceIdToString(devID).c_str(), (unsigned long long)(output)); | |
| deviceIdToString(devID).c_str(), suppressCounts() ? 0 : *kID); |
That's twisted, but sure, why not? Co-authored-by: Tomasetti Romin <romin.tomasetti@gmail.com>
This PR does two things:
Upgrades the output on the Kernel logger to look more like the Tpetra DeepCopyTools --- rather than output space ID's we demanagle that and output names.
Allow you to use the KOKKOS_TOOLS_PROFILE_SUPPRESS_COUNTS variable to suppress the kernel counts in the output --- this is really useful if you're comparing two versions of the code with roughly the same flow, but some different kernel calls here and there. Specifically, code paths with and without UnifiedMemory. The call counting makes diff really hard and so turning it off can make diff easier.