Allow using beats receivers for self-monitoring #8031
This pull request does not have a backport label. Could you fix it @swiatekm? 🙏
The test failures are due to the beats update. I'm going to do that in a separate PR for clarity: #8041.
This pull request is now in conflicts. Could you fix it? 🙏
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
# Conflicts: # internal/pkg/agent/application/monitoring/v1_monitor.go # Conflicts: # internal/pkg/otel/configtranslate/otelconfig.go
This LGTM, and testing locally I see the monitoring receivers running. It would be nice if we could make sure that only the beat receivers are used for monitoring; right now the tests only verify that at least the beat receivers are used for monitoring.
sudo elastic-development-agent status --output=full
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (HEALTHY) Running
   ├─ info
   │  ├─ id: 9293b312-f874-4866-bb1c-ebc21244c75c
   │  ├─ version: 9.1.0
   │  └─ commit: 3b2fe0010f4075f5dd47fad96a4b2c1dc5a97f52
   ├─ filestream-default
   │  ├─ status: (HEALTHY) Healthy: communicating with pid '82070'
   │  ├─ filestream-default
   │  │  ├─ status: (HEALTHY) Healthy
   │  │  └─ type: OUTPUT
   │  └─ filestream-default-your-input-id
   │     ├─ status: (HEALTHY) Healthy
   │     └─ type: INPUT
   ├─ system/metrics-default
   │  ├─ status: (HEALTHY) Healthy: communicating with pid '82069'
   │  ├─ system/metrics-default
   │  │  ├─ status: (HEALTHY) Healthy
   │  │  └─ type: OUTPUT
   │  └─ system/metrics-default-unique-system-metrics-input
   │     ├─ status: (HEALTHY) Healthy
   │     └─ type: INPUT
   ├─ pipeline:logs/_agent-component/beat/metrics-monitoring
   │  ├─ status: StatusOK
   │  ├─ exporter:elasticsearch/_agent-component/monitoring
   │  │  └─ status: StatusOK
   │  └─ receiver:metricbeatreceiver/_agent-component/beat/metrics-monitoring
   │     └─ status: StatusOK
   ├─ pipeline:logs/_agent-component/filestream-monitoring
   │  ├─ status: StatusOK
   │  ├─ exporter:elasticsearch/_agent-component/monitoring
   │  │  └─ status: StatusOK
   │  └─ receiver:filebeatreceiver/_agent-component/filestream-monitoring
   │     └─ status: StatusOK
   └─ pipeline:logs/_agent-component/http/metrics-monitoring
      ├─ status: StatusOK
      ├─ exporter:elasticsearch/_agent-component/monitoring
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/http/metrics-monitoring
         └─ status: StatusOK
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Co-authored-by: Khushi Jain <[email protected]>
💛 Build succeeded, but was flaky
cc @swiatekm
@Mergifyio backport 9.0
✅ Backports have been created
* Handle nil case in monitoring config parsing
* Allow using otel runtime for self-monitoring
  # Conflicts:
  # internal/pkg/agent/application/monitoring/v1_monitor.go
  # Conflicts:
  # internal/pkg/otel/configtranslate/otelconfig.go
* Modify e2e test
* Make the monitoring e2e test more restrictive
* Revert "Handle nil case in monitoring config parsing"
  This reverts commit bb11a0f.
* Check receiver statuses in e2e test
* Send data from beats processes and receivers to different namespaces
* Check all component statuses in E2E test
* Fix typo
  Co-authored-by: Khushi Jain <[email protected]>
---------
Co-authored-by: Khushi Jain <[email protected]>
(cherry picked from commit a31d56f)
* Handle nil case in monitoring config parsing
* Allow using otel runtime for self-monitoring
  # Conflicts:
  # internal/pkg/agent/application/monitoring/v1_monitor.go
  # Conflicts:
  # internal/pkg/otel/configtranslate/otelconfig.go
* Modify e2e test
* Make the monitoring e2e test more restrictive
* Revert "Handle nil case in monitoring config parsing"
  This reverts commit bb11a0f.
* Check receiver statuses in e2e test
* Send data from beats processes and receivers to different namespaces
* Check all component statuses in E2E test
* Fix typo
---------
(cherry picked from commit a31d56f)
Co-authored-by: Mikołaj Świątek <[email protected]>
Co-authored-by: Khushi Jain <[email protected]>
* upstream/main:
  Guard against `nil` pointer dereference (elastic#8107)
  Generate NOTICE.txt with only modules used by binaries (elastic#8053)
  Retry enrollment requests when an error is returned, add enrollment timeout (elastic#8056)
  Changelog for 8.17.6 version (elastic#8062) (elastic#8106)
  [main][Automation] Update versions (elastic#8098)
  Allow using beats receivers for self-monitoring (elastic#8031)
  Adding new configuration setting: `agent.upgrade.rollback.window` (elastic#8065)
  [Integration Testing] Allow tests to declare themselves as needing a FIPS environment (elastic#8083)
  fix(agentless): overcome SIGPIPE in agentless promotion pipeline (elastic#8094)
  ksm autosharing integration configuration update (elastic#8086)
What does this PR do?
Adds the ability to use beats receivers for agent self-monitoring. To do so, we add a new configuration key to `agent.monitoring` named `_runtime_experimental`, identical to how you can currently switch inputs to the Otel runtime.
In terms of implementation, the changes are very straightforward. In the monitoring injection manager, we set the runtime manager for inputs we add, if it's set in the monitoring configuration.
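A minimal sketch of that injection step, not the PR's actual code: `monitoringConfig` and `injectMonitoringInputs` are hypothetical names, the per-input key name and the value `otel` are assumptions, and the input IDs are borrowed from the status output posted earlier in this conversation.

```go
package main

import "fmt"

// monitoringConfig is a hypothetical stand-in for the agent.monitoring settings.
type monitoringConfig struct {
	Enabled bool
	// RuntimeExperimental mirrors agent.monitoring._runtime_experimental.
	// It is empty when the key is not set.
	RuntimeExperimental string
}

// injectMonitoringInputs builds the self-monitoring inputs and copies the
// runtime setting onto each of them, but only when it is set in the
// monitoring configuration. The per-input key name is an assumption.
func injectMonitoringInputs(cfg monitoringConfig) []map[string]any {
	if !cfg.Enabled {
		return nil
	}
	ids := []string{
		"filestream-monitoring",
		"beat/metrics-monitoring",
		"http/metrics-monitoring",
	}
	inputs := make([]map[string]any, 0, len(ids))
	for _, id := range ids {
		input := map[string]any{"id": id}
		if cfg.RuntimeExperimental != "" {
			input["_runtime_experimental"] = cfg.RuntimeExperimental
		}
		inputs = append(inputs, input)
	}
	return inputs
}

func main() {
	cfg := monitoringConfig{Enabled: true, RuntimeExperimental: "otel"}
	for _, input := range injectMonitoringInputs(cfg) {
		fmt.Println(input["id"], input["_runtime_experimental"])
	}
}
```

When the key is unset, the inputs carry no runtime override in this sketch, so they would keep whatever the default runtime is.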
Most of this PR's code changes lie in tests, and more specifically in the `TestAgentMonitoring` E2E test. This test compares the data collected by agent self-monitoring using beats processes to an equivalent Otel configuration of beats receivers in Hybrid mode. Instead of doing that, we can now just change `agent.monitoring._runtime_experimental`, so the test becomes much simpler conceptually.
I have simplified some of the test logic, but I haven't yet made it compare metrics. This should be doable now, but we have another PR (#8009) in flight doing it, so I held off.
Why is it important?
We want to be able to use beats receivers for agent self-monitoring.
Checklist
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding changes to the default configuration files
- [ ] I have added an entry in `./changelog/fragments` using the changelog tool
How to test this PR locally
Build the agent locally and use the following configuration:
The Kibana dashboards for the agent integration show whether the data is actually being ingested. You can verify that beats receivers are being used for self-monitoring by looking at their CPU usage: it should be 0.
Related issues