[receiver/sqlquery] Add Connection Health Check Metric for Database Connectivity Status #43837

@josepcorrea

Description

Component(s)

receiver/sqlquery

Is your feature request related to a problem? Please describe.

The sqlqueryreceiver currently offers no explicit observability for critical connection failures. If the receiver fails to establish a connection to the configured SQL database, the expected business metrics simply stop being emitted, and the only indication of the failure is an error message in the Collector's logs.

This presents a significant observability gap because:

Passive Detection: Relying on the absence of metrics or requiring users to scrape and parse the Collector's internal logs makes failure detection slow, difficult to automate, and highly reactive.

Lack of Alerting Primitive: There is no dedicated explicit metric for "connection status" that can be used to set up immediate, high-priority, and reliable alerts in a monitoring platform (e.g., Prometheus, Grafana). We are frustrated that a critical dependency failure (database connectivity) only surfaces as a log entry instead of a dedicated, alertable signal.

Describe the solution you'd like

We propose extending the sqlqueryreceiver to include an optional configuration for a dedicated Connection Health Check. This check should attempt to establish/verify the connection to the DB and report the outcome as a standardized metric.
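A minimal Go sketch of such a check, assuming the receiver holds a `database/sql` handle: `*sql.DB` provides `PingContext`, which verifies the connection and establishes one if necessary. The `pinger` interface, `connectionStatus`, `probe`, `stubDB`, and the two-second timeout are hypothetical names and values chosen for illustration. They are not part of the receiver today.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pinger is the subset of *sql.DB this sketch needs; *sql.DB satisfies it
// through its PingContext method.
type pinger interface {
	PingContext(ctx context.Context) error
}

// defaultTimeout is an assumed bound on how long one check may take.
const defaultTimeout = 2 * time.Second

// connectionStatus returns 1 if the ping succeeds within timeout, otherwise 0.
func connectionStatus(ctx context.Context, db pinger, timeout time.Duration) int64 {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	if err := db.PingContext(ctx); err != nil {
		return 0
	}
	return 1
}

// probe runs one check with a background context and the default timeout.
func probe(db pinger) int64 {
	return connectionStatus(context.Background(), db, defaultTimeout)
}

// stubDB is a hypothetical stand-in for a real database handle.
type stubDB struct{ err error }

func (s stubDB) PingContext(context.Context) error { return s.err }

var errRefused = errors.New("connection refused")

func main() {
	fmt.Println("healthy:", probe(stubDB{}))
	fmt.Println("unreachable:", probe(stubDB{err: errRefused}))
}
```

Depending on an interface rather than on `*sql.DB` directly keeps the check testable without a live database.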

This metric should be a simple Gauge type, for instance:

Metric Name: sqlquery.connection.status

Value:

1 if the connection attempt to the database was successful.

0 if the connection attempt failed.

Attributes: The metric must include identifying attributes such as driver, host, and database (or the configured receiver name) to allow users to scope the alert accurately.

This connection status metric should be emitted regularly (ideally aligned with the collection interval) to ensure real-time status updates, allowing users to configure immediate alerts whenever the value drops to 0.
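The shape described above could be sketched in Go as follows. This is an illustration using only the standard library, not the Collector's metric API: `target`, `statusPoint`, `newStatusPoint`, and `emitEvery` are hypothetical names, and the attribute keys `driver`, `host`, and `database` follow the wording of this proposal rather than any agreed convention.

```go
package main

import (
	"fmt"
	"time"
)

// target identifies the database a receiver instance is configured for.
type target struct{ Driver, Host, Database string }

// statusPoint is a simplified stand-in for one gauge data point.
type statusPoint struct {
	Name  string
	Value int64 // 1 = connection succeeded, 0 = connection failed
	Attrs map[string]string
	Time  time.Time
}

// newStatusPoint builds the gauge point for one health check result.
func newStatusPoint(t target, up bool) statusPoint {
	var v int64
	if up {
		v = 1
	}
	return statusPoint{
		Name:  "sqlquery.connection.status",
		Value: v,
		Attrs: map[string]string{
			"driver":   t.Driver,
			"host":     t.Host,
			"database": t.Database,
		},
		Time: time.Now(),
	}
}

// emitEvery runs check once per interval, n times, and emits a point each
// time, so a failure shows up as an explicit 0 rather than as missing data.
func emitEvery(interval time.Duration, n int, t target, check func() bool, emit func(statusPoint)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for i := 0; i < n; i++ {
		<-ticker.C
		emit(newStatusPoint(t, check()))
	}
}

func main() {
	t := target{Driver: "postgres", Host: "db.example.internal", Database: "orders"}
	round := 0
	check := func() bool { round++; return round != 2 } // simulate one failed check
	emitEvery(10*time.Millisecond, 3, t, check, func(p statusPoint) {
		fmt.Println(p.Name, p.Value, p.Attrs)
	})
}
```

In a real receiver the interval would presumably be the configured collection interval, and the point would be emitted through the Collector's metrics pipeline instead of a callback.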

Describe alternatives you've considered

Monitoring for Metric Absence: This is the current workaround, and it is deficient. It depends on "no-data" alerting rules, which lead to a high mean time to detect (MTTD).

Using a separate processor to convert logs to metrics: This approach is overly complex and resource-intensive (parsing Collector logs for specific connection errors) when the receiver itself has immediate knowledge of the failure. It increases pipeline fragility and complexity.

Configuring a dummy query (e.g., SELECT 1) as a regular metric: While functional, this reports on query execution rather than on the health of the connection itself. A dedicated health check metric is more explicit, more robust, and clearer in intent.

Additional context

A dedicated connection health metric is a standard best practice across many telemetry components and would significantly enhance the operational visibility of the Collector pipeline. It cleanly separates failure modes:

Query Failure: Handled by existing query-specific metric logic (e.g., generating an error count).

Critical Connectivity Failure: Handled by the new dedicated sqlquery.connection.status metric.
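The separation of the two failure modes might look like this in Go. This is a sketch under assumptions: the `database` interface, `scrapeResult`, `scrape`, and the stub type are hypothetical, and the per-cycle error count stands in for whatever query-specific logic the receiver uses.

```go
package main

import (
	"errors"
	"fmt"
)

// database is a hypothetical two-method view of a database client.
type database interface {
	Ping() error
	RunQuery() error
}

// scrapeResult holds the two signals one collection cycle would produce.
type scrapeResult struct {
	ConnectionStatus int64 // 1 = reachable, 0 = unreachable
	QueryErrors      int64 // queries that failed on a healthy connection
}

// scrape checks connectivity first and only then runs the query, so each
// failure mode is reported through its own signal.
func scrape(db database) scrapeResult {
	if err := db.Ping(); err != nil {
		// Connectivity failure: report status 0 and skip the query.
		return scrapeResult{ConnectionStatus: 0}
	}
	res := scrapeResult{ConnectionStatus: 1}
	if err := db.RunQuery(); err != nil {
		// Query failure: the connection is fine, the query is not.
		res.QueryErrors = 1
	}
	return res
}

// stub simulates a database with configurable failures.
type stub struct{ pingErr, queryErr error }

func (s stub) Ping() error     { return s.pingErr }
func (s stub) RunQuery() error { return s.queryErr }

var (
	errUnreachable = errors.New("connection refused")
	errBadQuery    = errors.New("syntax error")
)

func main() {
	fmt.Printf("healthy:       %+v\n", scrape(stub{}))
	fmt.Printf("query failure: %+v\n", scrape(stub{queryErr: errBadQuery}))
	fmt.Printf("unreachable:   %+v\n", scrape(stub{pingErr: errUnreachable}))
}
```

With this split, an alert on a connection status of 0 fires only for connectivity problems, while query errors can be tracked and alerted on separately.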
