Component(s)
No response
Request
An explicitly opt-in system-level profiling mode that includes kernel thread profiles.
Use case
Hello,
I’ve been working with Alloy and had a question / proposal around kernel-space visibility.
During several performance investigations, I’ve observed workloads spending a significant portion of CPU time in kernel paths (e.g., memory management, I/O). This time is clearly visible in tools like perf, but it is not surfaced in Pyroscope profiles.
My understanding is that this stems from the current process-centric attribution model. In particular, the discovery() logic filters out PIDs/TIDs that do not have attributes like exe, cwd, or commandline. As a result, kernel threads are not included. As a quick experiment, I modified this behavior to ignore the absence of these attributes, and was able to collect kernel thread profiles and visualize them in Grafana.
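To make the idea concrete, here is a minimal sketch of how an opt-in filter could tell kernel threads apart from ordinary processes. It is not the actual Alloy discovery code: `isKernelThread`, `shouldProfile`, and the `includeKernelThreads` switch are hypothetical names used for illustration only. The sketch assumes Linux, where the `flags` field (field 9) of `/proc/<pid>/stat` carries the `PF_KTHREAD` bit (`0x00200000`) for kernel threads.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pfKthread is the PF_KTHREAD bit from the Linux kernel's include/linux/sched.h.
const pfKthread = 0x00200000

// isKernelThread reports whether the contents of /proc/<pid>/stat describe
// a kernel thread. Hypothetical helper, not part of Alloy.
func isKernelThread(stat string) (bool, error) {
	// comm (field 2) is wrapped in parentheses and may itself contain
	// spaces or ')', so parse only what follows the last ')'.
	end := strings.LastIndexByte(stat, ')')
	if end < 0 {
		return false, fmt.Errorf("malformed stat line: no closing parenthesis")
	}
	// After comm: state, ppid, pgrp, session, tty_nr, tpgid, flags, ...
	fields := strings.Fields(stat[end+1:])
	if len(fields) < 7 {
		return false, fmt.Errorf("malformed stat line: too few fields")
	}
	flags, err := strconv.ParseUint(fields[6], 10, 64)
	if err != nil {
		return false, fmt.Errorf("parsing flags field: %w", err)
	}
	return flags&pfKthread != 0, nil
}

// shouldProfile sketches the opt-in decision. hasProcessAttrs stands in for
// "exe, cwd, or commandline is present" (an assumption about the current
// filter); includeKernelThreads is the proposed opt-in switch.
func shouldProfile(hasProcessAttrs, kernelThread, includeKernelThreads bool) bool {
	if hasProcessAttrs {
		return true
	}
	return includeKernelThreads && kernelThread
}

func main() {
	// Sample stat prefixes (truncated after the flags field for brevity).
	kthreadd := "2 (kthreadd) S 0 0 0 0 -1 2129984 0"
	userProc := "1234 (my app) S 1 1234 1234 0 -1 4194304 0"

	for _, stat := range []string{kthreadd, userProc} {
		kt, err := isKernelThread(stat)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("kernel thread: %v, profiled with opt-in: %v\n",
			kt, shouldProfile(!kt, kt, true))
	}
}
```

With the switch left off, `shouldProfile` keeps the existing behaviour of dropping targets without process attributes, so the default cardinality would not change.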
This highlights an observability gap: from a Pyroscope/Grafana perspective, a workload may appear “healthy,” while in reality a significant portion of CPU time is spent in kernel paths.
I understand the concerns around cardinality and overhead if kernel threads were included by default. That said, would it make sense to support an explicitly opt-in “system-level profiling mode”?
Curious if this has been considered before, or if there are architectural constraints that would make this difficult to support. I’d appreciate your thoughts on this.
Thanks!
-Sri