EthDevOps/astral-power-monitor

ASUS ROG Astral RTX 5090 — Per-Pin Current Sensors on Linux

Linux tool for reading per-pin voltage and current from the 12V-2x6 (ATX12V) power connector on ASUS ROG Astral RTX 5090/5080 GPUs. This is the same data that GPU Tweak III's "Power Detector+" feature shows on Windows.

How it works

Each ASUS ROG Astral RTX 5090 O32G Gaming board carries an ITE IT8915FN embedded controller that monitors the voltage and current of every pin on the 12V-2x6 power connector. On Windows, ASUS GPU Tweak III reads this data through the undocumented NvAPI_I2CReadEx API. On Linux, the NVIDIA driver exposes the GPU's i2c buses as standard /dev/i2c-* devices, so the IT8915 can be read directly with SMBus byte-data reads.

Quick start

# Single reading, all GPUs
sudo python3 astral_power_monitor.py

# Continuous monitoring, 1-second refresh
sudo python3 astral_power_monitor.py -w 1

# JSON output (for Prometheus/Grafana/scripts)
sudo python3 astral_power_monitor.py --json

# Alert if any pin exceeds 9.2A (ASUS default threshold)
sudo python3 astral_power_monitor.py --alert 9.2

# Monitor only GPU 0 with extra rails
sudo python3 astral_power_monitor.py -g 0 --extra

Example output

GPU 0  [i2c-15]
────────────────────────────────────────────────────
   Pin    Voltage    Current     Power
   ───  ─────────  ─────────  ────────
     1    12.256V     1.160A    14.22W
     2    12.264V     1.240A    15.21W
     3    12.256V     1.220A    14.95W
     4    12.264V     1.220A    14.96W
     5    12.264V     1.220A    14.96W
     6    12.264V     1.240A    15.21W
   ───  ─────────  ─────────  ────────
   Tot    12.261V     7.300A    89.51W

JSON output

{
  "timestamp": 1776955414.94,
  "gpus": [
    {
      "gpu_index": 0,
      "pci_addr": "1:00.0",
      "bus": 15,
      "pins": [
        {"pin": 1, "voltage_v": 12.288, "current_a": 0.32, "power_w": 3.93},
        {"pin": 2, "voltage_v": 12.288, "current_a": 0.34, "power_w": 4.18},
        ...
      ],
      "total_connector_power_w": 24.82,
      "total_connector_current_a": 2.02,
      "avg_voltage_v": 12.289
    }
  ]
}
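
A script that consumes the --json output might pick out the most heavily loaded pin on each GPU. The following is a minimal sketch, assuming the field names shown in the sample above; summarize_report is a hypothetical helper and is not part of the tool, and the embedded sample is abbreviated to two pins for illustration.

```python
import json


def summarize_report(report):
    """Return one summary row per GPU from a parsed --json report.

    Assumes the field names shown in the sample output above.
    """
    rows = []
    for gpu in report["gpus"]:
        # Pin carrying the highest current on this connector
        hottest = max(gpu["pins"], key=lambda p: p["current_a"])
        rows.append({
            "gpu_index": gpu["gpu_index"],
            "max_pin": hottest["pin"],
            "max_current_a": hottest["current_a"],
            "total_power_w": gpu["total_connector_power_w"],
        })
    return rows


# Abbreviated, illustrative sample (two pins only) in the documented shape
SAMPLE = json.loads("""
{
  "timestamp": 1776955414.94,
  "gpus": [
    {
      "gpu_index": 0,
      "pci_addr": "1:00.0",
      "bus": 15,
      "pins": [
        {"pin": 1, "voltage_v": 12.288, "current_a": 0.32, "power_w": 3.93},
        {"pin": 2, "voltage_v": 12.288, "current_a": 0.34, "power_w": 4.18}
      ],
      "total_connector_power_w": 24.82,
      "total_connector_current_a": 2.02,
      "avg_voltage_v": 12.289
    }
  ]
}
""")

for row in summarize_report(SAMPLE):
    print(row)
```

In a pipeline, the same function could be fed from json.load(sys.stdin), assuming each invocation of the tool prints a single JSON document.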

Daemon mode

The daemon (astral_daemon.py) provides continuous monitoring with Prometheus metrics and automatic fire-hazard detection.

Installation

pip install -r requirements.txt   # prometheus_client, pyyaml

# Copy config
sudo cp config.yaml /etc/astral-power-monitor.yaml

# Install service
sudo mkdir -p /opt/astral-power-monitor
sudo cp astral_daemon.py astral_power_monitor.py /opt/astral-power-monitor/
sudo cp astral-power-monitor.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now astral-power-monitor

Running manually

# With default config location (/etc/astral-power-monitor.yaml)
sudo python3 astral_daemon.py

# With custom config
sudo python3 astral_daemon.py -c ./config.yaml

# Validate config and exit
python3 astral_daemon.py --validate -c ./config.yaml

Prometheus metrics

The daemon exposes metrics on port 9101 (configurable) at /metrics:

| Metric | Labels | Description |
| --- | --- | --- |
| astral_pin_voltage_volts | gpu, pin | Per-pin voltage |
| astral_pin_current_amps | gpu, pin | Per-pin current |
| astral_pin_power_watts | gpu, pin | Per-pin power |
| astral_connector_power_watts | gpu | Total connector power |
| astral_connector_current_amps | gpu | Total connector current |
| astral_connector_voltage_volts | gpu | Average connector voltage |
| astral_pin_current_stddev_amps | gpu | Stddev of pin currents (imbalance) |
| astral_alert_state | gpu | 0=ok, 1=warning, 2=critical |
| astral_enforced_power_limit_watts | gpu | Throttled power limit (0=none) |
| astral_extra_rail_voltage_volts | gpu, rail | Extra rail voltage |
| astral_extra_rail_current_amps | gpu, rail | Extra rail current |
| astral_extra_rail_power_watts | gpu, rail | Extra rail power |

Example Prometheus scrape config:

scrape_configs:
  - job_name: 'astral-gpu-power'
    static_configs:
      - targets: ['gpu-server:9101']
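
To check the alert state from a script rather than through Prometheus, the text served at /metrics can be parsed directly. This is a rough sketch, assuming the daemon emits the standard Prometheus text exposition format with a single gpu label on astral_alert_state; the label values and the sample text are illustrative, and parse_alert_states is a hypothetical helper.

```python
import re

# Matches lines such as: astral_alert_state{gpu="0"} 2.0
ALERT_LINE = re.compile(r'^astral_alert_state\{gpu="([^"]*)"\}\s+(\S+)$')


def parse_alert_states(metrics_text):
    """Return {gpu_label: alert_level} from Prometheus exposition text."""
    states = {}
    for line in metrics_text.splitlines():
        match = ALERT_LINE.match(line.strip())
        if match:
            # Gauge values are floats in the text format; levels are 0, 1, 2
            states[match.group(1)] = int(float(match.group(2)))
    return states


# Illustrative sample, not captured from a real daemon
SAMPLE_METRICS = """\
# HELP astral_alert_state 0=ok, 1=warning, 2=critical
# TYPE astral_alert_state gauge
astral_alert_state{gpu="0"} 0.0
astral_alert_state{gpu="1"} 2.0
astral_pin_current_amps{gpu="0",pin="1"} 1.16
"""

print(parse_alert_states(SAMPLE_METRICS))
```

The text itself could be fetched with urllib.request.urlopen against the host and port used in the scrape config.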

Fire-hazard detection

The fire-hazard detector monitors pin current imbalance across the 6 power pins. Uneven current distribution can indicate a loose connector, damaged pin, or poor cable contact — conditions that can cause localized heating and connector damage.

Detection combines the standard deviation of the pin currents with an absolute per-pin current limit:

| Level | Trigger | Action |
| --- | --- | --- |
| Warning | stddev > 0.1A or any pin > 8.0A | Log warning, set astral_alert_state=1 |
| Critical | stddev > 0.5A or any pin > 9.2A | Log critical, set astral_alert_state=2 |
| Throttle | 3 consecutive critical readings | Reduce GPU power limit via nvidia-smi -pl |
| Restore | Alert clears back to OK | Restore original power limit |

All thresholds, the throttle target wattage, consecutive count, and restore behavior are configurable in config.yaml.
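
The rules in the table can be sketched as a small classifier plus a counter for consecutive critical readings. This is an illustration under stated assumptions, not the daemon's actual code: it uses the population standard deviation (the source does not say which variant is used), and classify, ThrottleTracker and throttle_command are hypothetical names. The defaults mirror the documented thresholds.

```python
import statistics

OK, WARNING, CRITICAL = 0, 1, 2  # values reported by astral_alert_state


def classify(currents, warning_stddev=0.1, critical_stddev=0.5,
             pin_warning=8.0, pin_critical=9.2):
    """Map one set of per-pin currents (amps) to an alert level."""
    spread = statistics.pstdev(currents)  # assumption: population stddev
    peak = max(currents)
    if spread > critical_stddev or peak > pin_critical:
        return CRITICAL
    if spread > warning_stddev or peak > pin_warning:
        return WARNING
    return OK


class ThrottleTracker:
    """Counts consecutive critical readings (hypothetical helper)."""

    def __init__(self, critical_count_before_throttle=3):
        self.needed = critical_count_before_throttle
        self.streak = 0

    def update(self, level):
        """Return True on the reading that should trigger throttling."""
        self.streak = self.streak + 1 if level == CRITICAL else 0
        return self.streak == self.needed


def throttle_command(gpu_index, watts=250):
    """Build the nvidia-smi call that lowers the power limit."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


# Pin currents from the example output under load: well balanced
print(classify([1.16, 1.24, 1.22, 1.22, 1.22, 1.24]))
```

Returning True only on the reading that completes the streak keeps the throttle from being re-applied on every subsequent critical sample; whether the daemon behaves this way is an assumption.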

Configuration

See config.yaml for all options. Key settings:

poll_interval: 1.0          # seconds between readings

prometheus:
  port: 9101                # metrics endpoint port

fire_hazard:
  warning_stddev: 0.1       # amps — pin current imbalance warning
  critical_stddev: 0.5      # amps — pin current imbalance critical
  pin_current_warning: 8.0  # amps — absolute per-pin warning
  pin_current_critical: 9.2 # amps — absolute per-pin critical (ASUS default)
  throttle_power_limit_watts: 250  # watts — throttle target on critical
  critical_count_before_throttle: 3 # consecutive readings before throttling
  restore_on_clear: true    # restore original power limit when alert clears

Requirements

CLI tool (astral_power_monitor.py)

  • ASUS ROG Astral RTX 5090 or 5080 GPU (other ASUS models may also have the IT8915 — untested)
  • Linux with NVIDIA proprietary driver loaded (i2c buses must be exposed)
  • Python 3.6+ (stdlib only, no external dependencies)
  • Root access or membership in the i2c group

Daemon (astral_daemon.py)

All of the above, plus:

  • prometheus_client (pip)
  • pyyaml (pip)
  • See requirements.txt

Hardware details

The sensor chip: ITE IT8915FN

The IT8915FN is an embedded controller on the GPU PCB that monitors individual power pins on the 12V-2x6 connector. It sits on the GPU's internal i2c bus and is accessed through the NVIDIA i2c adapter.

| Property | Value |
| --- | --- |
| Chip | ITE IT8915FN |
| I2C address | 0x2B (7-bit) |
| I2C bus | NVIDIA i2c adapter 1 (per GPU) |
| Access method | SMBus byte-data reads (I2C_SMBUS ioctl) |

Note: The IT8915 does not support i2c block reads or combined write+read transfers. Each register must be read individually using SMBus byte-data protocol. This is why the tool uses the I2C_SMBUS ioctl rather than raw read()/write() on the i2c-dev fd.
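
A single SMBus byte-data read through the I2C_SMBUS ioctl can be sketched with ctypes as below. The ioctl numbers and structure layout follow the Linux i2c-dev interface; the function names and the injectable ioctl parameter (used so the logic can be exercised without hardware) are assumptions of this sketch, not the tool's actual code.

```python
import ctypes
import fcntl
import os

# Constants from the Linux i2c-dev interface
I2C_SLAVE = 0x0703
I2C_SMBUS = 0x0720
I2C_SMBUS_READ = 1
I2C_SMBUS_BYTE_DATA = 2

IT8915_ADDR = 0x2B  # 7-bit address from the table above


class SmbusData(ctypes.Union):
    # Mirrors union i2c_smbus_data (block is I2C_SMBUS_BLOCK_MAX + 2 bytes)
    _fields_ = [("byte", ctypes.c_uint8),
                ("word", ctypes.c_uint16),
                ("block", ctypes.c_uint8 * 34)]


class SmbusIoctlData(ctypes.Structure):
    # Mirrors struct i2c_smbus_ioctl_data
    _fields_ = [("read_write", ctypes.c_uint8),
                ("command", ctypes.c_uint8),
                ("size", ctypes.c_uint32),
                ("data", ctypes.POINTER(SmbusData))]


def read_byte_data(fd, register, ioctl=fcntl.ioctl):
    """Read one register with the SMBus byte-data protocol."""
    data = SmbusData()
    args = SmbusIoctlData(I2C_SMBUS_READ, register,
                          I2C_SMBUS_BYTE_DATA, ctypes.pointer(data))
    ioctl(fd, I2C_SMBUS, args)
    return data.byte


def read_registers(fd, start, count, ioctl=fcntl.ioctl):
    """Read `count` consecutive registers one byte at a time."""
    return bytes(read_byte_data(fd, start + i, ioctl) for i in range(count))


def open_it8915(bus):
    """Open /dev/i2c-<bus> and select the IT8915 (needs root or i2c group)."""
    fd = os.open("/dev/i2c-%d" % bus, os.O_RDWR)
    fcntl.ioctl(fd, I2C_SLAVE, IT8915_ADDR)
    return fd
```

On real hardware, something like read_registers(open_it8915(15), 0x80, 24) would fetch the per-pin block, where 15 is the bus number from the example output and will differ between systems.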

Register map

The power monitoring data lives at registers 0x80–0x97 (24 bytes). It is organized as 6 rails of 4 bytes each:

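| Offset | Bytes | Content | Format |
| --- | --- | --- | --- |
| 0x80–0x81 | 2 | Rail 0 voltage | 16-bit big-endian, millivolts |
| 0x82–0x83 | 2 | Rail 0 current | 16-bit big-endian, milliamps |
| 0x84–0x87 | 4 | Rail 1 | (same format) |
| 0x88–0x8B | 4 | Rail 2 | (same format) |
| 0x8C–0x8F | 4 | Rail 3 | (same format) |
| 0x90–0x93 | 4 | Rail 4 | (same format) |
| 0x94–0x97 | 4 | Rail 5 | (same format) |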

Pin order is reversed: Rail 0 (offset 0x80) corresponds to Pin 6, and Rail 5 (offset 0x94) corresponds to Pin 1.

Register layout (24 bytes from 0x80):
  Byte  0- 1: Rail 0 Voltage  →  Pin 6
  Byte  2- 3: Rail 0 Current  →  Pin 6
  Byte  4- 5: Rail 1 Voltage  →  Pin 5
  Byte  6- 7: Rail 1 Current  →  Pin 5
  Byte  8- 9: Rail 2 Voltage  →  Pin 4
  Byte 10-11: Rail 2 Current  →  Pin 4
  Byte 12-13: Rail 3 Voltage  →  Pin 3
  Byte 14-15: Rail 3 Current  →  Pin 3
  Byte 16-17: Rail 4 Voltage  →  Pin 2
  Byte 18-19: Rail 4 Current  →  Pin 2
  Byte 20-21: Rail 5 Voltage  →  Pin 1
  Byte 22-23: Rail 5 Current  →  Pin 1
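
The layout above translates into a short decoder. This is a minimal sketch of the documented format (16-bit big-endian millivolts and milliamps, rails in reverse pin order); decode_pins and the synthetic input bytes are hypothetical and only serve to illustrate the mapping.

```python
import struct


def decode_pins(raw):
    """Decode the 24 bytes read from 0x80..0x97 into per-pin readings."""
    if len(raw) != 24:
        raise ValueError("expected 24 bytes, got %d" % len(raw))
    pins = []
    for rail in range(6):
        # Each rail: big-endian u16 millivolts, then big-endian u16 milliamps
        millivolts, milliamps = struct.unpack_from(">HH", raw, rail * 4)
        volts = millivolts / 1000
        amps = milliamps / 1000
        pins.append({
            "pin": 6 - rail,  # Rail 0 is Pin 6, Rail 5 is Pin 1
            "voltage_v": volts,
            "current_a": amps,
            "power_w": round(volts * amps, 2),
        })
    return sorted(pins, key=lambda p: p["pin"])


# Synthetic input: Rail 0 (Pin 6) first, Rail 5 (Pin 1) last
RAW = b"".join(struct.pack(">HH", mv, ma) for mv, ma in [
    (12264, 1240), (12264, 1220), (12264, 1220),
    (12256, 1220), (12264, 1240), (12256, 1160),
])

for pin in decode_pins(RAW):
    print(pin)
```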

Extra rails (0x98–0xA3)

Two additional voltage/current pairs exist beyond the 6-pin connector data:

| Offset | Content | Notes |
| --- | --- | --- |
| 0x98–0x9B | Rail 7 (voltage + current) | Likely PCIe slot power or aggregate measurement |
| 0xA0–0xA3 | Rail 8 (voltage + current) | Slightly lower voltage (~12.25V), purpose unconfirmed |

These can be shown with the --extra flag.

Other observed registers

During exploration, additional data regions were identified but not fully decoded:

| Range | Observation |
| --- | --- |
| 0x60–0x6F | 8 x 16-bit BE values, vary between GPUs; possibly per-phase VRM voltage or temperature |
| 0x70–0x75 | 3 x 16-bit values near 0x1000; possibly ADC reference or another voltage |
| 0xC0–0xCF | 4 x 32-bit values; possibly power aggregates (one reading ~8000 at idle, plausible as mW) |

GPU-to-bus discovery

The tool auto-discovers GPUs by:

  1. Scanning /sys/bus/i2c/devices/i2c-*/name for entries matching NVIDIA i2c adapter 1 at <PCI_ADDR>
  2. Probing each for the IT8915 by reading the 16-bit Rail 0 voltage at registers 0x80–0x81 and checking that it falls in the 10–14V range
  3. Correlating PCI bus addresses with nvidia-smi to assign GPU indices
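
The first two steps can be sketched as follows. The sysfs name pattern is taken from the description above; find_candidate_buses and looks_like_it8915 are hypothetical helpers, the sysfs root is a parameter so the scan can be pointed at a test directory, and the nvidia-smi correlation in step 3 is left out.

```python
import re
from pathlib import Path

# Adapter name format as described in step 1
ADAPTER_RE = re.compile(r"NVIDIA i2c adapter 1 at (\S+)")


def find_candidate_buses(sysfs_root="/sys/bus/i2c/devices"):
    """Return sorted (bus_number, pci_addr) pairs for NVIDIA adapter 1."""
    found = []
    for name_file in Path(sysfs_root).glob("i2c-*/name"):
        match = ADAPTER_RE.search(name_file.read_text().strip())
        if match:
            # Directory is named i2c-<N>; N is the /dev/i2c-<N> bus number
            bus = int(name_file.parent.name.split("-", 1)[1])
            found.append((bus, match.group(1)))
    return sorted(found)


def looks_like_it8915(rail0_millivolts):
    """Step 2 plausibility check: Rail 0 voltage within 10 to 14 V."""
    return 10000 <= rail0_millivolts <= 14000
```

Checking the voltage range rather than a chip ID is a heuristic, so a device at the same address that happens to return a plausible value could pass the probe.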

On Windows vs. Linux

| | Windows | Linux |
| --- | --- | --- |
| API | NvAPI_I2CReadEx (undocumented) | /dev/i2c-* via I2C_SMBUS ioctl |
| Dependencies | NVIDIA NVAPI (nvapi64.dll) | Python stdlib only |
| Bus selection | portId: 1 in NVAPI struct | NVIDIA i2c adapter 1 in sysfs |
| Address format | 0x56 (8-bit, left-shifted) | 0x2B (7-bit) |

Verified behavior

Tested on a 4x ASUS ROG Astral RTX 5090 O32G Gaming system (NVIDIA driver 590.48.01, Ubuntu 24.04):

  • At idle: ~200–340 mA per pin, ~12.29V, total connector power ~15–25W
  • Under CUDA stress: ~1.2A per pin, ~12.26V (slight droop), total connector power ~90W
  • Extra Rail 7: ~0.6A idle, ~3.7A under load — significant power rail

References and credits

License

MIT
