Description
Summary
The gnmic collect command crashes with an unrecoverable Go runtime fatal error on every startup when targets are loaded via the file loader. The error is a concurrent map read and map write inside TargetsManager.setTargetState() at targets_manager.go:1111.
The crash reproduced consistently with every configuration and target count tried, on both v0.44.1 and v0.45.0.
gnmic subscribe is not affected and works correctly with the same configuration.
Environment
| Field | Detail |
|---|---|
| gnmic versions tested | v0.44.1, v0.45.0 |
| OS | Ubuntu 24.04 LTS |
| Go runtime (from stacktrace) | go1.24.12 |
| gNMI targets | 5 targets (4x Arista EOS, 1x Juniper) |
| Loader type | file loader (targets.yaml) |
| Output type | prometheus_write |
Steps to Reproduce
Step 1: gnmic.yaml:

    username: lab-user
    password: lab-pass
    port: 6030
    timeout: 30s
    insecure: true
    encoding: proto

    subscriptions:
      cpu-util:
        mode: stream
        stream-mode: sample
        sample-interval: 30s
        paths:
          - '/system/cpus/cpu/state/total/instant'

    outputs:
      prometheus-write:
        type: prometheus_write
        url: "http://10.0.0.100:9090/api/v1/write"
        metric-prefix: "gnmic"

    loader:
      type: file
      path: "/opt/gnmic/config/targets.yaml"
      interval: 60s
      start-delay: 10s
      enable-metrics: true
      watch: true

    api-server:
      address: :7890

Step 2: targets.yaml:

    LAB-LEAF-01:
      address: 10.0.0.1:6030
      insecure: true
      event-tags:
        Hostname: lab-arista-leaf-01
        IP: 10.0.0.1
        site: LAB
        role: LEAF
        vendor: ARISTA
    LAB-LEAF-02:
      address: 10.0.0.2:6030
      insecure: true
      event-tags:
        Hostname: lab-arista-leaf-02
        IP: 10.0.0.2
        site: LAB
        role: LEAF
        vendor: ARISTA
    LAB-SPINE-01:
      address: 10.0.0.3:6030
      insecure: true
      event-tags:
        Hostname: lab-arista-spine-01
        IP: 10.0.0.3
        site: LAB
        role: SPINE
        vendor: ARISTA
    LAB-ROUTER-01:
      address: 10.0.0.4:50051
      insecure: true
      event-tags:
        Hostname: lab-juniper-router-01
        IP: 10.0.0.4
        site: LAB
        role: ROUTER
        vendor: JUNIPER

Step 3: Run:

    gnmic collect --config /opt/gnmic/config/gnmic.yaml

Step 4: Observe the crash. gnmic prints one line, then immediately crashes.
Expected Behaviour
gnmic collect starts successfully, loads targets from the file loader, establishes gNMI subscriptions, and begins sending telemetry to the configured output.
Actual Behaviour
gnmic collect crashes immediately after file_loader initialization with fatal error: concurrent map read and map write. Process exits with code 2. No subscriptions are established and no telemetry is collected.
Crash Output
Only line printed before crash:

    2026/03/16 11:47:16.253975 [file_loader] initialized loader type "file": {"path":"/opt/gnmic/config/targets.yaml","interval":60000000000,"start-delay":10000000000,"enable-metrics":true}

Full panic and stack trace:

    fatal error: concurrent map read and map write

    goroutine 66 [running]:
    internal/runtime/maps.fatal({0x5274c79?, 0xc0000a4808?})
        /opt/hostedtoolcache/go/1.24.12/x64/src/runtime/panic.go:1058 +0x18
    github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).setTargetState(0xc0007419a0, {0xc000042468, 0x13}, {0x51fefc3, 0x7}, {0x0, 0x0})
        /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:1111 +0x3a5
    github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).start(0xc0007419a0, 0xc001596280)
        /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:528 +0x1611
    github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).apply(0xc0007419a0, {0xc000042468, 0x13}, 0xc000211200)
        /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:315 +0x1048
    github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).Start.func3()
        /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:224 +0xb95
    created by github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).Start in goroutine 1
        /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:194 +0xb13

    goroutine 1 [chan receive]:
    github.com/openconfig/gnmic/pkg/collector.(*Collector).Start(0xc00080d050)
        /home/runner/work/gnmic/gnmic/pkg/collector/collector.go:135 +0x271
    github.com/openconfig/gnmic/pkg/collector.(*Collector).CollectorRunE(0x5068256b49c4d6?, 0xc000bd5bb0?, {0x0?, 0x0?, 0x0?})
        /home/runner/work/gnmic/gnmic/pkg/collector/collector.go:218 +0x65
    github.com/spf13/cobra.(*Command).execute(0xc0008bb508, {0xc000adfc20, 0x3, 0x3})
        /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1015 +0xaaa
    github.com/spf13/cobra.(*Command).ExecuteC(0xc0006dc008)
        /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1148 +0x46f

Root Cause Analysis
The crash is in TargetsManager.setTargetState() at targets_manager.go:1111. The collect command calls TargetsManager.Start(), which spawns a goroutine at line 194 that calls apply() → start() → setTargetState(). This goroutine writes to a shared map (the target state map) while another goroutine is concurrently reading it. The Go runtime detects unsynchronized concurrent map access and aborts the process with a fatal error that cannot be recovered.
The subscribe command does not hit this path because it does not use TargetsManager; targets are loaded directly at startup.
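For illustration, one common way to remove this class of race is to guard every read and write of the shared map with a sync.RWMutex. The sketch below is not gnmic's code and not necessarily the fix the maintainers would choose: the targetStates type, its set/get methods, and the runConcurrently helper are hypothetical stand-ins for the state map touched by setTargetState().

```go
package main

import (
	"fmt"
	"sync"
)

// targetStates is a hypothetical stand-in for the shared target state map.
// It is NOT gnmic's actual type; it only illustrates the locking pattern.
type targetStates struct {
	mu     sync.RWMutex
	states map[string]string
}

func newTargetStates() *targetStates {
	return &targetStates{states: make(map[string]string)}
}

// set records a target's state while holding the write lock.
func (t *targetStates) set(name, state string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.states[name] = state
}

// get returns a target's state while holding the read lock.
func (t *targetStates) get(name string) (string, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	s, ok := t.states[name]
	return s, ok
}

// runConcurrently starts one writer and one reader goroutine per target,
// waits for all of them, and returns the number of recorded states.
func runConcurrently(ts *targetStates, names []string) int {
	var wg sync.WaitGroup
	for _, n := range names {
		wg.Add(2)
		go func(name string) {
			defer wg.Done()
			ts.set(name, "running")
		}(n)
		go func(name string) {
			defer wg.Done()
			ts.get(name)
		}(n)
	}
	wg.Wait()

	ts.mu.RLock()
	defer ts.mu.RUnlock()
	return len(ts.states)
}

func main() {
	ts := newTargetStates()
	names := []string{"LAB-LEAF-01", "LAB-LEAF-02", "LAB-SPINE-01", "LAB-ROUTER-01"}
	fmt.Println(runConcurrently(ts, names))
}
```

Without the lock, the same reader and writer goroutines could trigger the fatal error shown in the crash output. Building or testing with Go's -race flag is the usual way to find such unsynchronized accesses before they crash at runtime.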
Secondary Issue: event-tags Not Propagated to Event Stream in collect Mode
When using the file loader in collect mode, event-tags defined per target in targets.yaml are loaded correctly into the target configuration (confirmed via the REST API at /api/v1/targets) but are never injected into the event stream.
The subscribe command correctly injects event-tags into every event. The collect command does not, regardless of version.
The changelog notes: "Targets static tags are now properly propagated to outputs when a cache is used." However, enabling cache: type: oc on the output causes gnmic to hang indefinitely on startup with no error output.
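To make the expected behaviour concrete, the sketch below shows the kind of merge that would put a target's static event-tags on each event. It is an assumption-laden illustration, not gnmic's implementation: the event type and the withStaticTags function are hypothetical, and the choice that tags already on the event take precedence over static tags is also an assumption.

```go
package main

import "fmt"

// event is a hypothetical, simplified event record (not gnmic's event type).
type event struct {
	Name string
	Tags map[string]string
}

// withStaticTags returns a copy of ev whose tags include the target's static
// tags. Assumption: a tag already present on the event wins over a static tag
// with the same key.
func withStaticTags(ev event, static map[string]string) event {
	merged := make(map[string]string, len(ev.Tags)+len(static))
	for k, v := range static {
		merged[k] = v
	}
	for k, v := range ev.Tags {
		merged[k] = v
	}
	return event{Name: ev.Name, Tags: merged}
}

func main() {
	ev := event{
		Name: "cpu-util",
		Tags: map[string]string{"source": "10.0.0.1:6030"},
	}
	// Static tags as defined for LAB-LEAF-01 in targets.yaml.
	static := map[string]string{
		"Hostname": "lab-arista-leaf-01",
		"IP":       "10.0.0.1",
		"site":     "LAB",
		"role":     "LEAF",
		"vendor":   "ARISTA",
	}
	out := withStaticTags(ev, static)
	fmt.Println(out.Tags["Hostname"], out.Tags["site"], out.Tags["source"])
}
```

In subscribe mode, events already carry these tags. The report is that in collect mode they reach the target configuration but not the events.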
Workaround
Use gnmic subscribe instead of gnmic collect. Subscribe mode works correctly with all features including event-tags. The limitation is that it does not support hot-reload of targets at runtime.
Versions Tested
- v0.45.0 (dev) — crashes with identical stack trace
- v0.44.1 (latest stable) — crashes with identical stack trace
- v0.44.1 subscribe mode — works correctly, no crash, event-tags present