/proc/cvitek/vi_dbg is read via popen("cat | grep | awk") on a hot poll path
Summary
On firmware 2.3.6 (PCIe variant), two processes (kvm_system and NanoKVM-Server) read /proc/cvitek/vi_dbg via shell pipelines on a recurring poll. The read itself takes about 1 second on my box, almost all of it spent blocked on the Cvitek driver. Each shell pipeline launches 3-4 short-lived processes that live for that full second, and the two callers don't share the result.
Even if the 1-second driver read is expected, userspace shouldn't be paying that cost through repeated shell pipelines from multiple independent callers.
Hardware / firmware
- NanoKVM PCIe
- Firmware 2.3.6 (image v1.4.1, application 2.3.6)
- Kernel:
Linux kvm-7c12 5.10.4-tag- riscv64
- The box had 99 days of uptime when the numbers were worst
Measurements
/proc/cvitek/vi_dbg takes ~1 second per read
Five consecutive timed runs of the exact pipeline kvm_system uses:
$ /usr/bin/time -f "%e sec (%U user + %S sys)" sh -c \
"cat /proc/cvitek/vi_dbg | grep VIFPS | awk '{print \$3}'"
60
1.01 sec (0.00 user + 0.04 sys)
60
1.01 sec (0.01 user + 0.03 sys)
60
1.01 sec (0.01 user + 0.03 sys)
60
1.01 sec (0.01 user + 0.03 sys)
60
1.01 sec (0.00 user + 0.03 sys)
~40 ms of CPU per call, the rest is sleep time waiting on the driver.
Process churn is continuous
Sampling /proc/stat's processes counter over 5 seconds on an otherwise idle box:
procs created in 5s: 21 (≈ 4/sec)
That's one pipeline per second give or take. Parent PIDs are kvm_system (2705) and NanoKVM-Server (2720):
PID PPID STAT COMMAND
4008 2720 S sh -c cat /proc/cvitek/vi_dbg | grep -A 17 VIDevFPS | awk '{print $3}'
4015 2705 S sh -c cat /proc/cvitek/vi_dbg | grep VIFPS | awk '{print $3}'
4016 4015 D cat /proc/cvitek/vi_dbg <- uninterruptible sleep on the driver
Because each pipeline takes 1 second and they fire roughly every second, there's effectively always one process in D state contributing to load average.
Load average drifts up over uptime
Fresh boot: load ~1.0. After 99 days of uptime:
load average: 8.08 7.84 7.47 (single-core RISC-V SoC)
NanoKVM-Server %CPU: 77
A reboot resets it. On one core, load in the 4-8 range is enough that the NanoKVM-Server HTTP workers start getting blocked long enough for /api/auth/login to return 500 under any concurrency. I suspect that's what the "web UI takes 9 minutes to load" reports are seeing (#223), though I haven't verified that part.
Where the code is
Two C++ call sites, both in support/sg2002/:
- support/sg2002/kvm_system/main/lib/system_state/system_state.cpp:245
fp = popen("cat /proc/cvitek/vi_dbg | grep VIFPS | awk '{print $3}'", "r");
In kvm_update_hdmi_state(). The counter is seeded as static uint8_t check_times = 4 and gated by ++check_times > 5, so the read fires on iterations 2, 8, 14… i.e. once every 6 ticks of the caller loop.
- support/sg2002/additional/kvm/src/kvm_vision.cpp:220
char cmd[100] = "cat /proc/cvitek/vi_dbg | grep -A 17 VIDevFPS | awk '{print $3}'";
FILE* fp = popen(cmd, "r");
The NanoKVM-Server side is linked against libkvm.so. server/common/kvm_vision.go has // #cgo LDFLAGS: -lkvm, and:
$ strings server/dl_lib/libkvm.so | grep vi_dbg
cat /proc/cvitek/vi_dbg | grep -A 17 VIDevFPS | awk '{print $3}'
So the Go server's shell children are coming from libkvm.so through cgo, not from anything in server/*.go.
Bonus bug worth fixing while you're in there
system_state.cpp:245-249:
fp = popen("cat /proc/cvitek/vi_dbg | grep VIFPS | awk '{print $3}'", "r");
if (fp == NULL) {
pclose(fp); // pclose(NULL) — undefined behavior
return;
}
If popen returns NULL, pclose(NULL) is undefined. Doesn't matter in practice because popen rarely fails here, but worth cleaning up.
Suggested directions
The 1-second driver read is the underlying cost and none of these eliminates it. Each one is still a real win on its own:
- Replace the shell pipelines with native fopen/read plus a small parser. Removes the user/sys CPU of the shell children (a few percent of one core continuously) and all the fork/exec/pipe churn. Also fixes the pclose(NULL) bug.
- Share the parsed result between callers. Right now kvm_system and libkvm.so are independently polling the same slow file and paying for it twice. One shared cache with a short TTL (seconds) would halve the read rate immediately.
- Look at poll cadence. HDMI-state transitions don't happen at 1 Hz. Widening the interval to 10-30 seconds would remove most of the D-state time regardless of how the read is done.
- Separately: investigate why the proc read blocks for ~1 second. That's in the Cvitek driver/stats layer and probably a different conversation with different maintainers.
Related reports that might share this cause
Several of these show the "slow after days of uptime, reboot fixes it" pattern, which lines up with what I'm measuring. Not all of them will trace to this (some are clearly HDMI or network), but enough of them fit the pattern to make it worth flagging.
Offer
I can submit a PR for the popen → native-read refactor (both C++ sites) and the pclose(NULL) fix if that'd be useful. The Go-side change would need someone who can rebuild libkvm.so.
Used Claude to help read the code and draft this. All measurements are from my own device; code references verified locally.