[Collector] [StatusReporter] Status reporting can deadlock each other #12495

@splunkericl

Component(s)

What happened?

Describe the bug
During collector startup, if a component reports a fatal status asynchronously, collector startup blocks completely because the StatusReporter deadlocks (mutex blocking).

Steps to reproduce

  1. Set up a pipeline with a component that starts a server asynchronously, for example the HEC receiver.
  2. Trigger an error that prevents the receiver from starting, for example a port conflict in the HEC receiver, so that a fatal error event is reported: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/splunkhecreceiver/receiver.go#L184

What did you expect to see?

  1. The collector exits with the async error: https://github.com/open-telemetry/opentelemetry-collector/blob/main/otelcol/collector.go#L335

What did you see instead?

  1. The collector deadlocks because other components are still starting up and cannot report status: https://github.com/open-telemetry/opentelemetry-collector/blob/main/service/internal/graph/graph.go#L426
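
To make the suspected interaction concrete, here is a minimal Go sketch of the pattern. It is not the collector's code: `reporter`, `reportFatal`, `reportStarting`, and `startupBlocks` are hypothetical names, and the sketch assumes the fatal status is forwarded over a channel while the reporter's mutex is held, with that channel only being read once startup has finished.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// reporter is a hypothetical stand-in for a status reporter.
// It is NOT the collector's actual type.
type reporter struct {
	mu       sync.Mutex
	asyncErr chan error // assumption: only read after startup completes
}

// reportFatal sends the error while holding the mutex.
// acquired is closed once the lock is held, so the demo is deterministic.
func (r *reporter) reportFatal(err error, acquired chan<- struct{}) {
	r.mu.Lock()
	defer r.mu.Unlock()
	close(acquired)
	r.asyncErr <- err // blocks if nobody receives and the buffer is full
}

// reportStarting models any other component reporting status during startup.
func (r *reporter) reportStarting() {
	r.mu.Lock()
	defer r.mu.Unlock()
}

// startupBlocks simulates startup and returns true if it did not finish
// within the timeout. bufSize is the capacity of the async error channel.
func startupBlocks(bufSize int) bool {
	r := &reporter{asyncErr: make(chan error, bufSize)}

	// A component whose server fails asynchronously reports a fatal status.
	acquired := make(chan struct{})
	go r.reportFatal(errors.New("port conflict"), acquired)
	<-acquired // the failing component has taken the mutex

	// Startup continues: another component tries to report its status.
	done := make(chan struct{})
	go func() {
		r.reportStarting()
		close(done)
	}()

	select {
	case <-done:
		<-r.asyncErr // startup finished, so the async error can now be read
		return false
	case <-time.After(200 * time.Millisecond):
		return true // the mutex is still held by the blocked sender
	}
}

func main() {
	fmt.Println("unbuffered channel, startup blocked:", startupBlocks(0))
	fmt.Println("buffered channel, startup blocked:", startupBlocks(1))
}
```

With capacity 0 the sender blocks while holding the mutex, so `startupBlocks(0)` returns `true`; with capacity 1 the send completes and `startupBlocks(1)` returns `false`. The buffered variant is only there to contrast the two behaviors and is not a claim about how the collector should be fixed.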

Collector version

v0.114

Environment information

Environment

OS: macOS
Compiler (if manually compiled): go 1.23

OpenTelemetry Collector configuration

Log output

Additional context

No response
