Beat receivers do not correctly report status back to the Elastic Agent #8210

Open
@cmacknz

The easiest way to reproduce this is to work from #8148, which enables running system/metrics as a Beat receiver. System process metrics are the most reliable way to produce input permission errors, which force the input into a degraded state.

The starting configuration in elastic-agent.yml is:

outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    api_key: "example-key"
    preset: balanced

inputs:
  - type: system/metrics
    id: unique-system-metrics-input
    data_stream.namespace: default
    use_output: default
    streams:
      - metricsets:
        - cpu
        data_stream.dataset: system.cpu
      - metricsets:
        - memory
        data_stream.dataset: system.memory
      - metricsets:
        - network
        data_stream.dataset: system.network
      - metricsets:
        - filesystem
        data_stream.dataset: system.filesystem
      - metricsets:
        - process
        data_stream.dataset: system.process
        degrade_on_partial: true # Force degraded on permission errors

Then run the following, which shows that status reporting works without the otel runtime:

sudo ./elastic-agent install --develop --unprivileged -f

sudo elastic-development-agent status
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a degraded state
   └─ system/metrics-default
      ├─ status: (HEALTHY) Healthy: communicating with pid '12591'
      └─ system/metrics-default-unique-system-metrics-input
         └─ status: (DEGRADED) Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 517 processes, most likely a "permission denied" error. Enable debug logging to determine the exact cause.

Then switch to the otel runtime with:

outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    api_key: "example-key"
    preset: balanced

inputs:
  - type: system/metrics
    _runtime_experimental: "otel" # THIS IS THE ONLY CHANGE
    id: unique-system-metrics-input
    data_stream.namespace: default
    use_output: default
    streams:
      - metricsets:
        - cpu
        data_stream.dataset: system.cpu
      - metricsets:
        - memory
        data_stream.dataset: system.memory
      - metricsets:
        - network
        data_stream.dataset: system.network
      - metricsets:
        - filesystem
        data_stream.dataset: system.filesystem
      - metricsets:
        - process
        data_stream.dataset: system.process
        degrade_on_partial: true

With the otel runtime, the input is reported as healthy even though the same permission errors occur:

sudo ./elastic-agent install --develop --unprivileged -f

sudo elastic-development-agent status
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (HEALTHY) Running
   └─ pipeline:logs/_agent-component/system/metrics-default
      ├─ status: StatusOK
      ├─ exporter:elasticsearch/_agent-component/default
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/system/metrics-default
         └─ status: StatusOK
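
The difference between the two runs can be checked mechanically. The following is a minimal sketch, not part of the agent: it assumes the `status` output keeps the tree layout shown above, and the helper names (`unit_states`, `unhealthy`) are made up for illustration. The two sample strings are abbreviated copies of the outputs in this issue.

```python
import re

# Matches both "status: (DEGRADED) ..." and the otel-style "status: StatusOK".
STATUS_RE = re.compile(r"status:\s*(?:\((?P<state>[A-Z]+)\)|(?P<otel>Status\w+))")

# Abbreviated copies of the two outputs above (long messages shortened).
PROCESS_RUNTIME = """\
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a degraded state
   └─ system/metrics-default
      ├─ status: (HEALTHY) Healthy: communicating with pid '12591'
      └─ system/metrics-default-unique-system-metrics-input
         └─ status: (DEGRADED) Error fetching data for metricset system.process
"""

OTEL_RUNTIME = """\
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (HEALTHY) Running
   └─ pipeline:logs/_agent-component/system/metrics-default
      ├─ status: StatusOK
      ├─ exporter:elasticsearch/_agent-component/default
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/system/metrics-default
         └─ status: StatusOK
"""


def unit_states(status_text):
    """Return (node name, state) pairs from the status tree text."""
    results = []
    current = None
    for line in status_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            results.append((current, match.group("state") or match.group("otel")))
            continue
        # Any other tree node owns the next status line that follows it.
        name = line.lstrip("│├└─┌ \t")
        if name:
            current = name
    return results


def unhealthy(status_text):
    """List nodes that are neither HEALTHY nor StatusOK, ignoring fleet."""
    ok = {"HEALTHY", "StatusOK"}
    ignored = {"fleet"}  # standalone agent: fleet is STOPPED in both runs
    return [(n, s) for n, s in unit_states(status_text)
            if s not in ok and n not in ignored]


if __name__ == "__main__":
    print("process runtime:", unhealthy(PROCESS_RUNTIME))
    print("otel runtime:   ", unhealthy(OTEL_RUNTIME))
```

With the process runtime, the sketch reports the agent and the input unit as DEGRADED. With the otel runtime it reports nothing, which is the bug described here.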

The following log entry was present, showing that the error occurred as expected but the Beat status was not updated:

{"log.level":"error","@timestamp":"2025-05-21T21:02:29.325Z","log.origin":{"function":"github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).handleFetchError","file.name":"module/wrapper.go","file.line":333},"message":"Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 521 processes, most likely a \"permission denied\" error. Enable debug logging to determine the exact cause.","log":{"source":"elastic-agent"},"component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log":{"source":"system/metrics-default"},"service.name":"Craigs-Macbook-Pro.local","ecs.version":"1.6.0","ecs.version":"1.6.0"}
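
To confirm that the fetch error is still being logged while the status stays healthy, the agent logs can be filtered for entries like the one above. This is a rough sketch that assumes the logs are newline-delimited JSON with the same field names as that entry. The function name `fetch_errors` is hypothetical, and the sample line is an abbreviated copy of the entry above with the message shortened.

```python
import json

# Suffix of the "log.origin.function" value seen in the log entry above.
FETCH_ERROR_FUNC = "handleFetchError"

# Abbreviated copy of the log entry above (message shortened, some fields dropped).
SAMPLE_LINE = (
    '{"log.level":"error","@timestamp":"2025-05-21T21:02:29.325Z",'
    '"log.origin":{"function":"github.com/elastic/beats/v7/metricbeat/mb/'
    'module.(*metricSetWrapper).handleFetchError",'
    '"file.name":"module/wrapper.go","file.line":333},'
    '"message":"Error fetching data for metricset system.process",'
    '"component":{"id":"system/metrics-default","type":"system/metrics"}}'
)


def fetch_errors(log_lines):
    """Return (component id, message) for each metricset fetch error logged."""
    found = []
    for raw in log_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict) or entry.get("log.level") != "error":
            continue
        origin = entry.get("log.origin") or {}
        if str(origin.get("function", "")).endswith(FETCH_ERROR_FUNC):
            component = entry.get("component") or {}
            found.append((component.get("id"), entry.get("message", "")))
    return found


if __name__ == "__main__":
    for component_id, message in fetch_errors([SAMPLE_LINE]):
        print(component_id, "->", message)
```

The original entry repeats the `log` and `ecs.version` keys. Python's `json.loads` keeps the last value for a duplicated key, so the sketch reads only fields that appear once.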

This problem could exist in either the Beat or the Agent; it is currently unclear what the missing step is.
