
[BUG] NVIDIA NVML ioctl stalls (/dev/nvidiactl) trigger runner-stall loop and abort #1612

@akoumjian


Describe the bug

btop repeatedly logs:

ERROR: Stall in Runner thread, restarting!

and then aborts with:

terminate called without an active exception

Environment

  • btop package version: 1.3.0-1 (apt, Ubuntu noble)
  • Also tested: source build v1.4.5 with GPU_SUPPORT=true (same behavior)
  • GPU: NVIDIA RTX PRO 6000 Blackwell Workstation Edition
  • NVIDIA driver: 570.195.03

Reproduction

  1. Start btop with GPU enabled (show_gpu_info=Auto/On).
  2. Leave running for a while.
  3. Observe repeated runner stall messages.
  4. The process eventually aborts with "terminate called without an active exception".

Expected behavior

  • If GPU polling hangs/fails, btop should degrade/skip those metrics and continue running.
  • It should not enter repeated runner restarts and abort.
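To make the expected behavior concrete, here is a minimal sketch of a guard that turns a collector off after repeated failures instead of retrying it forever. `CollectorGuard` and its methods are hypothetical names for illustration only and are not part of btop.

```cpp
#include <cstddef>

// Hypothetical guard (not btop code): disables a collector after
// `max_failures` consecutive failures so the rest of the UI keeps running.
class CollectorGuard {
public:
    explicit CollectorGuard(std::size_t max_failures)
        : max_failures_(max_failures) {}

    // True while the collector should still be polled.
    bool enabled() const { return failures_ < max_failures_; }

    // A successful poll resets the streak, but only while still enabled;
    // once disabled, the guard stays disabled.
    void record_success() {
        if (enabled()) failures_ = 0;
    }

    // A failed or timed-out poll extends the streak up to the limit.
    void record_failure() {
        if (failures_ < max_failures_) ++failures_;
    }

private:
    std::size_t max_failures_;
    std::size_t failures_ = 0;
};
```

With a guard like this, the draw loop would check `enabled()` before polling the GPU and simply omit the GPU box once the limit is reached.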

Actual behavior

  • Repeated "Stall in Runner thread, restarting!" errors.
  • Eventual abort (SIGABRT) and process exit.

Key evidence

strace -f shows worker threads blocking in NVIDIA ioctls on /dev/nvidiactl before abort.

Relevant pattern observed:

  • ioctl(9, _IOC(_IOC_READ|_IOC_WRITE, 0x46, 0x2a, 0x20), ...) (fd 9 is /dev/nvidiactl)
  • the thread then prints "terminate called without an active exception"
  • it then raises SIGABRT (tgkill(...))

This strongly suggests that the NVML/GPU polling path is stalling (the same family as prior NVML-related reports).
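For anyone comparing their own strace output, the `_IOC(...)` form above can be converted to and from the raw request number. This is a small sketch assuming the common asm-generic Linux ioctl bit layout (8-bit nr, 8-bit type, 14-bit size, 2-bit direction); `IoctlFields`, `decode_ioctl` and `encode_ioctl` are illustrative names. The type byte 0x46 is ASCII 'F'. My reading that nr 0x2a is the driver's generic RM control call is an assumption I have not verified against this driver version.

```cpp
#include <cstdint>

// Fields of a Linux ioctl request number (asm-generic layout assumed).
struct IoctlFields {
    unsigned dir;   // 2 bits: 1 = write, 2 = read, 3 = read|write
    unsigned size;  // 14 bits: size of the argument struct in bytes
    unsigned type;  // 8 bits: driver "magic" byte
    unsigned nr;    // 8 bits: command number within the driver
};

// Split a raw request number into its four fields.
constexpr IoctlFields decode_ioctl(std::uint32_t request) {
    return IoctlFields{
        (request >> 30) & 0x3u,
        (request >> 16) & 0x3FFFu,
        (request >> 8) & 0xFFu,
        request & 0xFFu,
    };
}

// Rebuild the raw request number from the fields strace prints.
constexpr std::uint32_t encode_ioctl(unsigned dir, unsigned type,
                                     unsigned nr, unsigned size) {
    return (static_cast<std::uint32_t>(dir) << 30) |
           (static_cast<std::uint32_t>(size) << 16) |
           (static_cast<std::uint32_t>(type) << 8) |
           static_cast<std::uint32_t>(nr);
}
```

Under that layout, the call in the trace corresponds to `encode_ioctl(3, 0x46, 0x2a, 0x20)`, which is the raw value 0xC020462A.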

What I already tested

  • show_gpu_info=Off => stable (no runner stalls in multi-minute soak).
  • show_gpu_info=On + nvml_measure_pcie_speeds=False => still stalls.
  • shown_boxes excluding proc but keeping GPU => still stalls.
  • Upgrading from apt 1.3.0 to locally built 1.4.5 did not eliminate stalls on this system.

Possibly related issues

Ask

Could GPU/NVML collection be isolated with hard timeouts/fallback so blocked NVML calls cannot stall the runner and cascade into abort?
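To illustrate the kind of isolation I mean, here is a minimal C++17 sketch, not a patch against btop. `call_with_timeout` is a hypothetical helper: it runs a possibly blocking call on a detached thread and gives up waiting after a deadline. I use `std::promise` with a detached `std::thread` rather than `std::async`, because the destructor of a future returned by `std::async` blocks until the task finishes, which would reintroduce the stall.

```cpp
#include <chrono>
#include <exception>
#include <future>
#include <memory>
#include <optional>
#include <thread>
#include <utility>

// Hypothetical helper (not part of btop): run `fn` on a detached thread and
// wait at most `timeout` for its result. Returns std::nullopt on timeout or
// if `fn` throws.
template <typename T, typename Fn>
std::optional<T> call_with_timeout(Fn fn, std::chrono::milliseconds timeout) {
    // The promise is shared so the worker can outlive this function safely.
    auto promise = std::make_shared<std::promise<T>>();
    std::future<T> future = promise->get_future();

    std::thread([promise, fn = std::move(fn)]() mutable {
        try {
            promise->set_value(fn());
        } catch (...) {
            promise->set_exception(std::current_exception());
        }
    }).detach();

    // Give up if the call has not completed in time. The worker keeps
    // running in the background until the blocked call eventually returns.
    if (future.wait_for(timeout) != std::future_status::ready) {
        return std::nullopt;
    }
    try {
        return future.get();
    } catch (...) {
        return std::nullopt;
    }
}
```

A caller could wrap the whole GPU collection step, for example `call_with_timeout<int>(poll_gpu, std::chrono::milliseconds(500))` with `poll_gpu` standing in for the real collector, and skip the GPU metrics for that cycle when it returns `std::nullopt`. One limitation: a thread blocked inside an ioctl cannot be stopped from user space, so each timed-out call leaves a worker behind until the driver returns. A real implementation would need to avoid starting a new worker while an old one is still stuck, or move NVML polling into a separate process.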
