Skip to content

Conversation

@belimawr
Copy link
Contributor

@belimawr belimawr commented Dec 3, 2025

Proposed commit message

Correctly report V2 inputs as failed to the manager. If a V2 input Run
method returns an error, it is now correctly reported to the input
manager. The Filestream input reports the input as degraded if the
pipeline fails to connect.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

## Disruptive User Impact
## Author's Checklist

How to test this PR locally

Run the tests

cd x-pack/filebeat
mage buildSystemTestBinary
go test -tags=integration -v -count=1 -run=TestPipelineConnectionError ./tests/integration

Manual test

1. Download Elastic Agent snapthost

Make sure to download the latest snapshot in the correct version for your OS/Architecture

wget https://artifacts-snapshot.elastic.co/elastic-agent-package/9.3.0-9fb1a0c3/downloads/beats/elastic-agent/elastic-agent-9.3.0-SNAPSHOT-linux-x86_64.tar.gz

2. Build Agentbeat

cd x-pack/agentbeat
DEV=true mage -v build

3. Extract Elastic Agent and copy the Agentbeat binary

# Extract Elastic Agent using your favorite tool
cd elastic-agent-9.3.0-SNAPSHOT-linux-arm64/data/elastic-agent-24d365/components

4. Copy the Agentbeat binary you built, replacing the one from the archive

cp ~/devel/beats/x-pack/agentbeat/agentbeat .
cd ../../../

5.Create a log file for Filestream to ingest

Using flog

docker run -it --rm mingrammer/flog -n 50 > /tmp/flog.log

6.Create the elastic-agent.yml

The connection credentials to Elasticsearch are irrelevant, no data will be shipped, so they can point to an inexistent Elasticsearch host.

elastic-agent.yml

outputs:
  default:
    type: elasticsearch
    hosts: [http://localhost:9200]

inputs:
  - type: filestream
    id: filestream-unit-id
    type: filestream
    streams:
      - type: filestream
        id: filestream-input-id
        paths:
          - /tmp/flog.log
        processors:
          - add_fields:
              INVALID_CONFIG_KEY: true
              fields:
                labels:
                  foo: bar
  - type: cel
    id: cel-unit-id
    type: cel
    streams:
      - type: cel
        id: cel-input-id
        interval: 1m
        resource.url: https://api.ipify.org/?format=text
        program: |
          {"events": [{"ip": string(get(state.url).Body)}]}
        processors:
          - add_fields:
              INVALID_CONFIG_KEY: true
              fields:
                labels:
                  foo: bar

agent.monitoring:
  enabled: false
  logs: false
  http:
    enabled: false

agent.grpc:
  address: localhost
  port: 4242

agent.logging.level: info
agent.logging.to_stderr: true

7.Look for the logs showing Filestream and Cel input failing

Filestream:

{
  "@timestamp": "2025-12-08T09:56:57.403-0500",
  "component": {
    "id": "filestream-default",
    "state": "HEALTHY"
  },
  "ecs.version": "1.6.0",
  "log": {
    "source": "elastic-agent"
  },
  "log.level": "warn",
  "log.origin": {
    "file.line": 1040,
    "file.name": "coordinator/coordinator.go",
    "function": "github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator.logComponentStateChange"
  },
  "message": "Unit state changed filestream-default-filestream-unit-id (HEALTHY->DEGRADED): Harvester for Filestream input \"filestream-input-id\" failed: error while connecting to output with pipeline: unexpected INVALID_CONFIG_KEY option in processors.6.add_fields",
  "unit": {
    "id": "filestream-default-filestream-unit-id",
    "old_state": "HEALTHY",
    "state": "DEGRADED",
    "type": "input"
  }
}

Cel:

{
  "@timestamp": "2025-12-08T09:56:57.761-0500",
  "component": {
    "id": "cel-default",
    "state": "HEALTHY"
  },
  "ecs.version": "1.6.0",
  "log": {
    "source": "elastic-agent"
  },
  "log.level": "error",
  "log.origin": {
    "file.line": 1040,
    "file.name": "coordinator/coordinator.go",
    "function": "github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator.logComponentStateChange"
  },
  "message": "Unit state changed cel-default-cel-unit-id (HEALTHY->FAILED): Input 'cel' failed with: input cel-input-id failed: unexpected INVALID_CONFIG_KEY option in processors.6.add_fields",
  "unit": {
    "id": "cel-default-cel-unit-id",
    "old_state": "HEALTHY",
    "state": "FAILED",
    "type": "input"
  }
}

8.Look at the Elastic Agent status

The Filestream input must be in a degraded state and Cel in a failed state. Both inputs must have errored with

unexpected INVALID_CONFIG_KEY option in processors...
% ./elastic-agent status
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a failed state
   ├─ cel-default
   │  ├─ status: (HEALTHY) Healthy: communicating with pid '53315'
   │  └─ cel-default-cel-unit-id
   │     └─ status: (FAILED) Input 'cel' failed with: input cel-input-id failed: unexpected INVALID_CONFIG_KEY option in processors.6.add_fields
   └─ filestream-default
      ├─ status: (HEALTHY) Healthy: communicating with pid '53289'
      └─ filestream-default-filestream-unit-id
         └─ status: (DEGRADED) Harvester for Filestream input "filestream-input-id" failed: error while connecting to output with pipeline: unexpected INVALID_CONFIG_KEY option in processors.6.add_fields

Related issues

## Use cases
## Screenshots
## Logs

This commit updates the status of V2 Inputs managed by the
input-cursor.managedInput when the pipeline fails to connect.

One of the instances where the pipeline fails to connect is when
invalid processors are defined in the input configuration.
@belimawr belimawr self-assigned this Dec 3, 2025
@belimawr belimawr added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Dec 3, 2025
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Dec 3, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Dec 3, 2025

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@mergify
Copy link
Contributor

mergify bot commented Dec 3, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @belimawr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@belimawr belimawr changed the title [WIP] Update input status when pipeline fails to connect [Filebeat] Correctly report V2 inputs as failed to the Manager Dec 8, 2025
@belimawr belimawr added the backport-active-all Automated backport with mergify to all the active branches label Dec 8, 2025
@belimawr belimawr marked this pull request as ready for review December 8, 2025 15:24
@belimawr belimawr requested a review from a team as a code owner December 8, 2025 15:24
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

Copy link
Member

@AndersonQ AndersonQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure we're on the same page. I don't see it recovering from the degraded/failed state. Is is so because those error are irrecoverable?

@belimawr
Copy link
Contributor Author

Just to be sure we're on the same page. I don't see it recovering from the degraded/failed state. Is is so because those error are irrecoverable?

Kinda. It depends on how the input was started. The parts of the codebase I changed are not responsible for restarting failed/degraded inputs, that's the manager or the input runner responsibility.

Aside from that, all the cases I can think of are not recoverable. The error is coming from connecting to the publishing pipeline, as far as I know, only configuration errors will cause issues and those are not recoverable.

@belimawr belimawr merged commit 462751c into elastic:main Dec 11, 2025
25 checks passed
@github-actions
Copy link
Contributor

@Mergifyio backport 8.19 9.1 9.2

@mergify
Copy link
Contributor

mergify bot commented Dec 11, 2025

backport 8.19 9.1 9.2

✅ Backports have been created

Details

mergify bot pushed a commit that referenced this pull request Dec 11, 2025
Correctly report V2 inputs as failed to the manager. If a V2 input Run
method returns an error, it is now correctly reported to the input
manager. The Filestream input reports the input as degraded if the
pipeline fails to connect.

(cherry picked from commit 462751c)

# Conflicts:
#	x-pack/filebeat/tests/integration/managerV2_test.go
mergify bot pushed a commit that referenced this pull request Dec 11, 2025
Correctly report V2 inputs as failed to the manager. If a V2 input Run
method returns an error, it is now correctly reported to the input
manager. The Filestream input reports the input as degraded if the
pipeline fails to connect.

(cherry picked from commit 462751c)
mergify bot pushed a commit that referenced this pull request Dec 11, 2025
Correctly report V2 inputs as failed to the manager. If a V2 input Run
method returns an error, it is now correctly reported to the input
manager. The Filestream input reports the input as degraded if the
pipeline fails to connect.

(cherry picked from commit 462751c)
belimawr added a commit that referenced this pull request Dec 12, 2025
… (#48060)

Correctly report V2 inputs as failed to the manager. If a V2 input Run
method returns an error, it is now correctly reported to the input
manager. The Filestream input reports the input as degraded if the
pipeline fails to connect.

(cherry picked from commit 462751c)

Co-authored-by: Tiago Queiroz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-active-all Automated backport with mergify to all the active branches Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Agent state is Healthy when input fails to start

4 participants