Description
- Relates to https://github.com/elastic/ingest-dev/issues/5116
- This issue is more about the format of the status, not whether it is present.
The easiest way to reproduce this is to work from #8148, which enables running system/metrics as a Beat receiver. System process metrics are the most reliable way to produce input permission errors that force the input into a degraded state.
The starting configuration in elastic-agent.yml is:
outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    api_key: "example-key"
    preset: balanced
inputs:
  - type: system/metrics
    id: unique-system-metrics-input
    data_stream.namespace: default
    use_output: default
    streams:
      - metricsets:
        - cpu
        data_stream.dataset: system.cpu
      - metricsets:
        - memory
        data_stream.dataset: system.memory
      - metricsets:
        - network
        data_stream.dataset: system.network
      - metricsets:
        - filesystem
        data_stream.dataset: system.filesystem
      - metricsets:
        - process
        data_stream.dataset: system.process
        degrade_on_partial: true # Force degraded on permission errors
Then run the following, which shows that status reporting works without the otel runtime:
sudo ./elastic-agent install --develop --unprivileged -f
sudo elastic-development-agent status
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a degraded state
   └─ system/metrics-default
      ├─ status: (HEALTHY) Healthy: communicating with pid '12591'
      └─ system/metrics-default-unique-system-metrics-input
         └─ status: (DEGRADED) Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 517 processes, most likely a "permission denied" error. Enable debug logging to determine the exact cause.
Then switch to the otel runtime with:
outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    api_key: "example-key"
    preset: balanced
inputs:
  - type: system/metrics
    _runtime_experimental: "otel" # THIS IS THE ONLY CHANGE
    id: unique-system-metrics-input
    data_stream.namespace: default
    use_output: default
    streams:
      - metricsets:
        - cpu
        data_stream.dataset: system.cpu
      - metricsets:
        - memory
        data_stream.dataset: system.memory
      - metricsets:
        - network
        data_stream.dataset: system.network
      - metricsets:
        - filesystem
        data_stream.dataset: system.filesystem
      - metricsets:
        - process
        data_stream.dataset: system.process
        degrade_on_partial: true
This then shows the input as healthy:
sudo ./elastic-agent install --develop --unprivileged -f
sudo elastic-development-agent status
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (HEALTHY) Running
   └─ pipeline:logs/_agent-component/system/metrics-default
      ├─ status: StatusOK
      ├─ exporter:elasticsearch/_agent-component/default
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/system/metrics-default
         └─ status: StatusOK
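To make the difference between the two runs checkable in a script, here is a minimal Python sketch that extracts the state token from each `status:` line of the human-readable output. It assumes the text format shown in this issue (`(STATE) message` for the process runtime, `StatusOK`-style tokens for the otel runtime); the function names and the shortened sample strings are hypothetical and not part of Elastic Agent.

```python
import re

# Matches "status: (STATE) message" (process runtime) and
# "status: StatusOK"-style tokens (otel runtime), as seen in this issue.
STATUS_RE = re.compile(r"status:\s+(?:\((?P<paren>[A-Z]+)\)|(?P<otel>Status\w+))")


def status_states(status_output: str) -> list[str]:
    """Return the state token of every status line, in order of appearance."""
    states = []
    for line in status_output.splitlines():
        match = STATUS_RE.search(line)
        if match:
            states.append(match.group("paren") or match.group("otel"))
    return states


def reports_degraded(status_output: str) -> bool:
    """True if any status line carries the DEGRADED state."""
    return "DEGRADED" in status_states(status_output)


# Samples copied from the outputs in this issue; the long error message is shortened.
PROCESS_RUNTIME_OUTPUT = """\
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a degraded state
   └─ system/metrics-default
      ├─ status: (HEALTHY) Healthy: communicating with pid '12591'
      └─ system/metrics-default-unique-system-metrics-input
         └─ status: (DEGRADED) Error fetching data for metricset system.process
"""

OTEL_RUNTIME_OUTPUT = """\
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (HEALTHY) Running
   └─ pipeline:logs/_agent-component/system/metrics-default
      ├─ status: StatusOK
      ├─ exporter:elasticsearch/_agent-component/default
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/system/metrics-default
         └─ status: StatusOK
"""

if __name__ == "__main__":
    print("process runtime degraded:", reports_degraded(PROCESS_RUNTIME_OUTPUT))
    print("otel runtime degraded:", reports_degraded(OTEL_RUNTIME_OUTPUT))
```

With the samples above, only the process runtime output contains a DEGRADED state, which is the discrepancy this issue describes.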
The following log line was present, showing that the error occurred as expected but the Beat status was not updated:
{"log.level":"error","@timestamp":"2025-05-21T21:02:29.325Z","log.origin":{"function":"github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).handleFetchError","file.name":"module/wrapper.go","file.line":333},"message":"Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 521 processes, most likely a \"permission denied\" error. Enable debug logging to determine the exact cause.","log":{"source":"elastic-agent"},"component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log":{"source":"system/metrics-default"},"service.name":"Craigs-Macbook-Pro.local","ecs.version":"1.6.0","ecs.version":"1.6.0"}
This problem could exist in either the Beat or the Agent; it is currently unclear which step is missing.