Merged
55 changes: 55 additions & 0 deletions doc/PCM-EXPORTER.md
@@ -63,7 +63,62 @@ The default output of pcm-sensor-server endpoint in a browser:

![image](https://user-images.githubusercontent.com/25432609/226344012-8783e154-998e-48a7-a2ca-f2c42af9c843.png)

## Security Warning

pcm-sensor-server collects and serves internal CPU metrics of the host system. Do not expose its HTTP/HTTPS endpoints to untrusted or publicly accessible networks. Prefer binding to localhost or a dedicated management interface (see `-l|--listen` above), and put a firewall and/or an authenticated reverse proxy in front of it if remote access is required. High request rates can overload the host and lead to a denial of service.

## Integration with Grafana

The PCM exporter can be used together with Grafana to obtain these Intel processor metrics (see [how-to](../scripts/grafana/README.md)):

![pcm grafana output](https://raw.githubusercontent.com/wiki/intel/pcm/pcm-dashboard-full.png)

# Low-Level Metric Reference

## Global PCM Events

| Event Name | Description |
|-----------------------------|-----------------------------------------------------------------------------|
| Measurement_Interval_in_us | Time in microseconds (us) taken to complete the last measurement |
| Number_of_sockets | Number of CPU sockets in the system |

## Core Counters per socket

OS_ID is the ID that the OS assigns to a logical CPU core; it denotes the socket id, core id and thread id.

Each event below carries the same labels {socket="socket id",core="core id",thread="thread id"} as
the OS_ID of its section, plus a source="socket/core/thread" label that denotes the scope
(socket, core or thread) the value accounts for.

For example, Instructions_Retired_Any{socket="0",core="1",thread="1",source="core"} refers to
Instructions_Retired_Any for socket 0, core 1, thread 1, and accounts for the total instructions
retired by the specified core.
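
The label layout described above can be taken apart with a few lines of Python. This is only a minimal sketch: the helper name `parse_sample` and the trailing sample value `123456` are hypothetical, and the two regular expressions assume the simple `name{key="value",...} value` shape shown in the example rather than the full exposition format.

```python
import re

# Assumed shape: name{key="value",...} optionally followed by a numeric value.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}(?:\s+(?P<value>\S+))?$'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Split one labelled sample into (event name, label dict, value or None)."""
    match = _SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labelled sample: {line!r}")
    labels = dict(_LABEL_RE.findall(match.group("labels")))
    raw_value = match.group("value")
    value = float(raw_value) if raw_value is not None else None
    return match.group("name"), labels, value


# The value 123456 is made up for illustration.
name, labels, value = parse_sample(
    'Instructions_Retired_Any{socket="0",core="1",thread="1",source="core"} 123456'
)
```

Here `labels["source"]` is `"core"`, so the value accounts for the whole core identified by the socket and core labels, not only for thread 1.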

| Event | Description |
|------------------------------------------------|--------------------------------------------------------------|
| Instructions_Retired_Any | Total number of retired instructions |
| Clock_Unhalted_Thread | Counts the number of core cycles while the thread is not |
| | in a halt state. |
| Clock_Unhalted_Ref | Counts the number of reference cycles that the thread is |
| | not in a halt state. The thread enters the halt state when |
| | it is running the HLT instruction. This event is not |
| | affected by thread frequency changes but counts as if the |
| | thread is running at the maximum frequency all the time. |
| L3_Cache_Misses | Total number of L3 Cache misses |
| L3_Cache_Hits | Total number of L3 Cache hits |
| L2_Cache_Misses | Total number of L2 Cache misses |
| L2_Cache_Hits | Total number of L2 Cache hits |
| L3_Cache_Occupancy | Computes L3 Cache Occupancy |
| SMI_Count | SMI (System Management Interrupt) count |
| Invariant_TSC | Calculates the invariant TSC clocks (the invariant TSC |
| | means that the TSC continues at a fixed rate regardless of |
| | the C-state or frequency of the processor as long as the |
| | processor remains in the ACPI S0 state). |
| Thermal_Headroom | Degrees Celsius remaining before reaching the TjMax temperature |
| CStateResidency | This is the percentage of time that the core (or the whole |
| | package) spends in a particular level of C-state |
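
The raw counters in this table are often combined into ratios. The sketch below is an assumption about typical usage, not something the server computes for you: the function names and sample numbers are hypothetical, and the input is assumed to hold counter differences taken over one measurement interval for a single scope.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def derived_metrics(delta):
    """Compute common ratios from per-interval counter differences (assumed input)."""
    l3_total = delta["L3_Cache_Hits"] + delta["L3_Cache_Misses"]
    l2_total = delta["L2_Cache_Hits"] + delta["L2_Cache_Misses"]
    return {
        # Instructions retired per unhalted core cycle.
        "ipc": ratio(delta["Instructions_Retired_Any"], delta["Clock_Unhalted_Thread"]),
        # Unhalted core cycles relative to unhalted reference cycles.
        "relative_frequency": ratio(delta["Clock_Unhalted_Thread"], delta["Clock_Unhalted_Ref"]),
        "l3_hit_ratio": ratio(delta["L3_Cache_Hits"], l3_total),
        "l2_hit_ratio": ratio(delta["L2_Cache_Hits"], l2_total),
    }


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
example = derived_metrics({
    "Instructions_Retired_Any": 2_000_000,
    "Clock_Unhalted_Thread": 1_000_000,
    "Clock_Unhalted_Ref": 800_000,
    "L3_Cache_Hits": 300,
    "L3_Cache_Misses": 100,
    "L2_Cache_Hits": 900,
    "L2_Cache_Misses": 100,
})
```

Returning `None` for a zero denominator keeps an idle interval (no unhalted cycles, no cache accesses) from raising an exception.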

References:

- https://software.intel.com/content/www/us/en/develop/articles/intel-performance-counter-monitor.html
- https://software.intel.com/content/dam/develop/external/us/en/documents-tps/325384-sdm-vol-3abcd.pdf - Chapter 18 Performance Monitoring
48 changes: 2 additions & 46 deletions doc/PCM-SENSOR-SERVER-README.md
@@ -1,47 +1,3 @@
# Global PCM Events
# PCM Sensor Server Metric Reference

| Event Name | Description |
|-----------------------------|-----------------------------------------------------------------------------|
| Measurement_Interval_in_us | How many us elapsed to complete the last measurement |
| Number_of_sockets | Number of CPU sockets in the system |


# Core Counters per socket

OS_ID is the OS assigned ID of the logical CPU core and denotes the socket id, core id and thread id.

The events below are followed by the same {socket="socket id",core="core id",thread="thread id"} as
the OS_ID of their section with source="socket/core/thread" appended that denotes what the quantity
of the event accounts for.

For example Instructions_Retired_Any{socket="0",core="1",thread="1",source="core"} refers to
Instructions_Retired_Any for socket 0, core 1, thread 1, and accounts for the total instructions
retired of the specified core.

| Event | Description |
|------------------------------------------------|--------------------------------------------------------------|
| Instructions_Retired_Any | Total number of Retired instructions |
| Clock_Unhalted_Thread | |
| Clock_Unhalted_Ref | Counts the number of reference cycles that the thread is |
| | not in a halt state. The thread enters the halt state when |
| | it is running the HLT instruction. This event is not |
| | affected by thread frequency changes but counts as if the |
| | thread is running at the maximum frequency all the time. |
| L3_Cache_Misses | Total number of L3 Cache misses |
| L3_Cache_Hits | Total number of L3 Cache hits |
| L2_Cache_Misses | Total number of L2 Cache misses |
| L2_Cache_Hits | Total number of L3 Cache hits |
| L3_Cache_Occupancy | Computes L3 Cache Occupancy |
| SMI_Count | SMI (System Management Interrupt) count |
| Invariant_TSC | Calculates the invariant TSC clocks (the invariant TSC |
| | means that the TSC continues at a fixed rate regardless of |
| | the C-state or frequency of the processor as long as the |
| | processor remains in the ACPI S0 state. |
| Thermal_Headroom | Celsius degrees before reaching TjMax temperature |
| CStateResidency | This is the percentage of time that the core (or the whole |
| | package) spends in a particular level of C-state | |

References:

https://software.intel.com/content/www/us/en/develop/articles/intel-performance-counter-monitor.html
https://software.intel.com/content/dam/develop/external/us/en/documents-tps/325384-sdm-vol-3abcd.pdf - Chapter 18 Performance Monitoring
The PCM sensor server metric documentation has moved to [PCM-EXPORTER.md](PCM-EXPORTER.md).
2 changes: 1 addition & 1 deletion perfmon
Submodule perfmon updated 37 files
+2 −2 .github/workflows/bandit.yml
+3 −3 .github/workflows/codeql.yml
+1 −1 .github/workflows/create-perf-json.yml
+1 −1 .github/workflows/scorecard.yml
+28 −3 ADL/events/alderlake_goldencove_core.json
+3 −3 ADL/events/alderlake_gracemont_core.json
+3 −3 ADL/events/alderlake_uncore.json
+3 −3 ADL/events/alderlake_uncore_experimental.json
+741 −120 ARL/events/arrowlake_crestmont_core.json
+63 −63 ARL/events/arrowlake_lioncove_core.json
+4 −4 ARL/events/arrowlake_skymont_core.json
+3 −3 ARL/events/arrowlake_uncore.json
+3 −3 ARL/events/arrowlake_uncore_experimental.json
+1,606 −148 CWF/events/clearwaterforest_core.json
+3 −3 CWF/events/clearwaterforest_uncore.json
+3 −3 CWF/events/clearwaterforest_uncore_experimental.json
+28 −3 EMR/events/emeraldrapids_core.json
+3 −3 EMR/events/emeraldrapids_uncore.json
+3 −3 EMR/events/emeraldrapids_uncore_experimental.json
+29 −4 GNR/events/graniterapids_core.json
+5 −5 GNR/events/graniterapids_uncore.json
+22 −4 GNR/events/graniterapids_uncore_experimental.json
+63 −63 LNL/events/lunarlake_lioncove_core.json
+58 −4 LNL/events/lunarlake_skymont_core.json
+77 −5 LNL/events/lunarlake_uncore.json
+197 −5 LNL/events/lunarlake_uncore_experimental.json
+5 −5 PTL/events/pantherlake_cougarcove_core.json
+61 −7 PTL/events/pantherlake_darkmont_core.json
+89 −5 PTL/events/pantherlake_uncore.json
+227 −0 PTL/events/pantherlake_uncore_experimental.json
+28 −3 SPR/events/sapphirerapids_core.json
+3 −3 SPR/events/sapphirerapids_uncore.json
+3 −3 SPR/events/sapphirerapids_uncore_experimental.json
+1,033 −83 SRF/events/sierraforest_core.json
+3 −3 SRF/events/sierraforest_uncore.json
+3 −3 SRF/events/sierraforest_uncore_experimental.json
+62 −57 mapfile.csv