[bug] collect command: fatal panic "concurrent map read and map write" in TargetsManager.setTargetState() — 100% reproducible #835

@mahapule84

Summary

The gnmic collect command crashes with a fatal Go runtime panic on every startup when targets are loaded via the file loader. The panic is a concurrent map read and write race condition inside TargetsManager.setTargetState() at targets_manager.go:1111.

The crash occurs on every run, independent of the configuration and the number of targets tried, and reproduces on both v0.44.1 and v0.45.0.

gnmic subscribe is not affected and works correctly with the same configuration.


Environment

| Field | Detail |
| --- | --- |
| gnmic versions tested | v0.44.1, v0.45.0 |
| OS | Ubuntu 24.04 LTS |
| Go runtime (from stacktrace) | go1.24.12 |
| gNMI targets | 5 targets (4x Arista EOS, 1x Juniper) |
| Loader type | file loader (targets.yaml) |
| Output type | prometheus_write |

Steps to Reproduce

Step 1 — gnmic.yaml:

username: lab-user
password: lab-pass
port: 6030
timeout: 30s
insecure: true
encoding: proto

subscriptions:
  cpu-util:
    mode: stream
    stream-mode: sample
    sample-interval: 30s
    paths:
      - '/system/cpus/cpu/state/total/instant'

outputs:
  prometheus-write:
    type: prometheus_write
    url: "http://10.0.0.100:9090/api/v1/write"
    metric-prefix: "gnmic"

loader:
  type: file
  path: "/opt/gnmic/config/targets.yaml"
  interval: 60s
  start-delay: 10s
  enable-metrics: true
  watch: true

api-server:
  address: :7890

Step 2 — targets.yaml:

LAB-LEAF-01:
  address: 10.0.0.1:6030
  insecure: true
  event-tags:
    Hostname: lab-arista-leaf-01
    IP: 10.0.0.1
    site: LAB
    role: LEAF
    vendor: ARISTA

LAB-LEAF-02:
  address: 10.0.0.2:6030
  insecure: true
  event-tags:
    Hostname: lab-arista-leaf-02
    IP: 10.0.0.2
    site: LAB
    role: LEAF
    vendor: ARISTA

LAB-SPINE-01:
  address: 10.0.0.3:6030
  insecure: true
  event-tags:
    Hostname: lab-arista-spine-01
    IP: 10.0.0.3
    site: LAB
    role: SPINE
    vendor: ARISTA

LAB-ROUTER-01:
  address: 10.0.0.4:50051
  insecure: true
  event-tags:
    Hostname: lab-juniper-router-01
    IP: 10.0.0.4
    site: LAB
    role: ROUTER
    vendor: JUNIPER

Step 3 — Run:

gnmic collect --config /opt/gnmic/config/gnmic.yaml

Step 4 — Observe crash. gnmic prints one line then immediately panics.


Expected Behaviour

gnmic collect starts successfully, loads targets from the file loader, establishes gNMI subscriptions, and begins sending telemetry to the configured output.

Actual Behaviour

gnmic collect crashes immediately after file_loader initialization with fatal error: concurrent map read and map write. Process exits with code 2. No subscriptions are established and no telemetry is collected.


Crash Output

Only line printed before crash:
2026/03/16 11:47:16.253975 [file_loader] initialized loader type "file": {"path":"/opt/gnmic/config/targets.yaml","interval":60000000000,"start-delay":10000000000,"enable-metrics":true}

Full panic and stack trace:
fatal error: concurrent map read and map write

goroutine 66 [running]:
internal/runtime/maps.fatal({0x5274c79?, 0xc0000a4808?})
    /opt/hostedtoolcache/go/1.24.12/x64/src/runtime/panic.go:1058 +0x18
github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).setTargetState(0xc0007419a0, {0xc000042468, 0x13}, {0x51fefc3, 0x7}, {0x0, 0x0})
    /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:1111 +0x3a5
github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).start(0xc0007419a0, 0xc001596280)
    /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:528 +0x1611
github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).apply(0xc0007419a0, {0xc000042468, 0x13}, 0xc000211200)
    /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:315 +0x1048
github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).Start.func3()
    /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:224 +0xb95
created by github.com/openconfig/gnmic/pkg/collector/managers/targets.(*TargetsManager).Start in goroutine 1
    /home/runner/work/gnmic/gnmic/pkg/collector/managers/targets/targets_manager.go:194 +0xb13

goroutine 1 [chan receive]:
github.com/openconfig/gnmic/pkg/collector.(*Collector).Start(0xc00080d050)
    /home/runner/work/gnmic/gnmic/pkg/collector/collector.go:135 +0x271
github.com/openconfig/gnmic/pkg/collector.(*Collector).CollectorRunE(0x5068256b49c4d6?, 0xc000bd5bb0?, {0x0?, 0x0?, 0x0?})
    /home/runner/work/gnmic/gnmic/pkg/collector/collector.go:218 +0x65
github.com/spf13/cobra.(*Command).execute(0xc0008bb508, {0xc000adfc20, 0x3, 0x3})
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1015 +0xaaa
github.com/spf13/cobra.(*Command).ExecuteC(0xc0006dc008)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.9.1/command.go:1148 +0x46f


Root Cause Analysis

The crash is in TargetsManager.setTargetState() at targets_manager.go:1111. In collect mode, TargetsManager.Start() spawns a goroutine at line 194 that calls apply() → start() → setTargetState(). This goroutine writes to a shared map (the target state map) while another goroutine is concurrently reading it without synchronization. The Go runtime detects the unsynchronized map access and aborts the process with a fatal error, which cannot be caught with recover().

The subscribe command does not hit this path as it does not use TargetsManager — targets are loaded directly at startup.
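One common way to remove this class of race is to guard the map with a sync.RWMutex. The sketch below is illustrative only: stateStore, set and get are hypothetical names, not gnmic's actual types or methods, and the real fix in TargetsManager may look different.

```go
package main

import (
	"fmt"
	"sync"
)

// stateStore is a hypothetical stand-in for the target state map.
// It is NOT gnmic's actual type; it only illustrates the locking pattern.
type stateStore struct {
	mu     sync.RWMutex
	states map[string]string
}

func newStateStore() *stateStore {
	return &stateStore{states: make(map[string]string)}
}

// set takes the write lock, so a write never overlaps with a read.
func (s *stateStore) set(name, state string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.states[name] = state
}

// get takes the read lock; several readers may hold it at the same time.
func (s *stateStore) get(name string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	st, ok := s.states[name]
	return st, ok
}

func main() {
	store := newStateStore()
	targets := []string{"LAB-LEAF-01", "LAB-LEAF-02", "LAB-SPINE-01", "LAB-ROUTER-01"}

	var wg sync.WaitGroup
	for _, name := range targets {
		wg.Add(2)
		// Writer: plays the role of the goroutine that applies a target.
		go func(n string) {
			defer wg.Done()
			store.set(n, "running")
		}(name)
		// Reader: plays the role of any concurrent consumer of the state.
		go func(n string) {
			defer wg.Done()
			store.get(n)
		}(name)
	}
	wg.Wait()

	st, _ := store.get("LAB-SPINE-01")
	fmt.Println(st)
}
```

Without the lock, the same access pattern on a plain map is a data race that the Go runtime may abort on. Running the affected code under the Go race detector (the -race build flag) should report both conflicting accesses, which would identify the reader that the stack trace above does not show.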


Secondary Issue: event-tags Not Propagated to Event Stream in collect Mode

When using the file loader in collect mode, event-tags defined per target in targets.yaml are loaded correctly into the target configuration (confirmed via the REST API at /api/v1/targets) but are never injected into the event stream.

The subscribe command correctly injects event-tags into every event. The collect command does not, regardless of version.
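To make the expected behaviour concrete, the following is a minimal sketch of what injecting a target's static event-tags into each event amounts to. The event struct, the withTargetTags helper and the "source" tag are assumptions made for illustration; they are not gnmic's actual types or code, and the precedence rule on key conflicts is a choice made for this sketch only.

```go
package main

import "fmt"

// event is a hypothetical, simplified event.
// gnmic's real event type is not reproduced here.
type event struct {
	Name string
	Tags map[string]string
}

// withTargetTags returns a copy of ev whose tags also include the target's
// static event-tags. Tags already present on the event win on key conflicts
// (an assumption of this sketch, not documented gnmic behaviour).
func withTargetTags(ev event, targetTags map[string]string) event {
	merged := make(map[string]string, len(ev.Tags)+len(targetTags))
	for k, v := range targetTags {
		merged[k] = v
	}
	for k, v := range ev.Tags {
		merged[k] = v
	}
	return event{Name: ev.Name, Tags: merged}
}

func main() {
	// Static tags as defined for LAB-LEAF-01 in targets.yaml.
	targetTags := map[string]string{
		"Hostname": "lab-arista-leaf-01",
		"site":     "LAB",
		"role":     "LEAF",
	}
	ev := event{Name: "cpu-util", Tags: map[string]string{"source": "10.0.0.1:6030"}}

	out := withTargetTags(ev, targetTags)
	fmt.Println(out.Tags["Hostname"], out.Tags["site"], out.Tags["source"])
}
```

In collect mode the equivalent of this merge step appears to be skipped, since the tags are visible in the target configuration but absent from the emitted events.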

The changelog notes: "Targets static tags are now properly propagated to outputs when a cache is used." However, enabling cache: type: oc on the output causes gnmic to hang indefinitely on startup with no error output.


Workaround

Use gnmic subscribe instead of gnmic collect. Subscribe mode works correctly with all features including event-tags. The limitation is that it does not support hot-reload of targets at runtime.


Versions Tested

  • v0.45.0 (dev) — crashes with identical stack trace
  • v0.44.1 (latest stable) — crashes with identical stack trace
  • v0.44.1 subscribe mode — works correctly, no crash, event-tags present
