[otel]: Add e2e test for monitoring metrics in otel mode #8009
base: main
This pull request does not have a backport label. Could you fix it @khushijain21? 🙏
Does this overlap with what #7622 intends to do?
This pull request is now in conflicts. Could you fix it? 🙏
Thanks for clarifying!
})
}

var configTemplateOTel = `
Once #8031 exists, instead of manually reproducing the configurations you could just install agent twice in the tests.
You can see an example of agent running twice in the same test in elastic-agent/testing/integration/install_test.go, lines 154 to 165 in 3b2fe00:
out, err := fixture.Install(ctx, &opts)
if err != nil {
	t.Logf("install output: %s", out)
	require.NoError(t, err)
}
// Check that Agent was installed in the custom base path
topPath := filepath.Join(basePath, "Elastic", "Agent")
require.NoError(t, installtest.CheckSuccess(ctx, fixture, topPath, &installtest.CheckOpts{Privileged: opts.Privileged}))
t.Run("check agent package version", testAgentPackageVersion(ctx, fixture, true))
t.Run("check second agent installs with --namespace", testSecondAgentCanInstall(ctx, fixture, basePath, false, opts))
Then you will have a true side-by-side comparison, with agent generating the configs itself. CC @mauri870: I think this idea probably applies to the test you are working on as well, once the same thing is possible for non-monitoring metrics.
Reproducing the configs manually just decoupled you from having to wait for #8031, but once the feature flag to switch to beat receivers for monitoring exists, installing agent twice will make sure we test with the latest config agent generates itself.
The --namespace feature is what is used to implement the --develop support in https://github.com/elastic/elastic-agent?tab=readme-ov-file#development-installations.
Looks like #8031 is merged now.
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
Take a look at:
main...leehinman:elastic-agent:4876_agent_monitoring_tests
I think it has some ideas you can use. Specifically the changes to the rawQuery, including the sort, the use of match_phrase over match, and the initial criteria to find the events. Doing that made the ignoredFields much smaller and the results were consistent.
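To make the suggestion concrete, here is a minimal sketch of a search body that uses match_phrase clauses and an explicit sort. It is not taken from the linked branch: the helper name buildRawQuery, the choice of @timestamp as the sort key, and the ascending order are all assumptions for illustration. On analyzed text fields, match ORs the analyzed tokens by default while match_phrase requires them in sequence, and a fixed sort makes the returned order repeatable between runs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildRawQuery is a hypothetical helper (not from the PR). It returns an
// Elasticsearch search body with one match_phrase clause per field/value
// pair and a sort on @timestamp (assumed sort key) in ascending order.
func buildRawQuery(phrases map[string]string) map[string]any {
	must := make([]map[string]any, 0, len(phrases))
	for field, value := range phrases {
		must = append(must, map[string]any{
			"match_phrase": map[string]any{field: value},
		})
	}
	return map[string]any{
		"query": map[string]any{
			"bool": map[string]any{"must": must},
		},
		"sort": []map[string]any{
			{"@timestamp": map[string]any{"order": "asc"}},
		},
	}
}

func main() {
	q := buildRawQuery(map[string]string{"component.id": "elastic-agent"})
	out, err := json.MarshalIndent(q, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```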
query: map[string]any{
	// metric-elastic_agent.elastic_agent-* stores cpu metrics emitted by EA AND all running beats
	// here, we only compare elastic-agent self metrics for simplicity
	"component.id": "elastic-agent",
I found it was easier to pick a field name that led to a unique kind of metric. For example, beat.stats.memstats.rss
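One way to select documents by the presence of a field is an Elasticsearch exists query. The sketch below is an assumption about how that could look in the test, not the code from either branch; existsFilter is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// existsFilter is a hypothetical helper (not from the PR). It builds a bool
// query whose filter keeps only documents that contain the given field.
func existsFilter(field string) map[string]any {
	return map[string]any{
		"query": map[string]any{
			"bool": map[string]any{
				"filter": []map[string]any{
					{"exists": map[string]any{"field": field}},
				},
			},
		},
	}
}

func main() {
	// Field name taken from the comment above.
	q := existsFilter("beat.stats.memstats.rss")
	out, err := json.MarshalIndent(q, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```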
This pull request is now in conflicts. Could you fix it? 🙏
investigating why memory related stats (such as beat.stats.memory.rss) are not available with
if failureThreshold != nil {
httpStream[failureThresholdKey] = *failureThreshold
// Do not create http streams if runtime-manager is otel and binary is of beat type
if compInfo.RuntimeManager != component.OtelRuntimeManager || !strings.HasSuffix(binaryName, "beat") {
http/metrics pulls process-related metrics from beats and sends them to the metrics-elastic_agent.elastic_agent.* index. These metrics are not applicable for beatreceivers, hence dropping these streams for this special case.
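The condition in the diff is easy to misread because of the negations, so here is a small standalone sketch of the same predicate. The constant otelRuntime stands in for component.OtelRuntimeManager, and the "process" runtime value and the binary names in main are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// otelRuntime is a stand-in for component.OtelRuntimeManager; the real
// constant's value is not shown in this thread.
const otelRuntime = "otel"

// shouldCreateHTTPStream mirrors the condition from the diff: streams are
// skipped only when the runtime is otel AND the binary name ends in "beat".
func shouldCreateHTTPStream(runtimeManager, binaryName string) bool {
	return runtimeManager != otelRuntime || !strings.HasSuffix(binaryName, "beat")
}

func main() {
	// Illustrative inputs; "process" is an assumed non-otel runtime value.
	cases := []struct{ runtime, binary string }{
		{"process", "filebeat"},
		{"otel", "filebeat"},
		{"otel", "endpoint-security"},
	}
	for _, c := range cases {
		fmt.Printf("%s/%s -> %v\n", c.runtime, c.binary,
			shouldCreateHTTPStream(c.runtime, c.binary))
	}
}
```

Under these assumptions only the otel/filebeat case yields false; the other two keep their http streams.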
💛 Build succeeded, but was flaky
What does this PR do?
This PR adds e2e tests for self-monitoring metrics exposed using beatreceivers. It also asserts document equivalency for metrics exposed by normal mode vs otel mode.
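As a rough illustration of what a document equivalency check can look like, the sketch below compares the sets of field paths in two documents while skipping an ignore list. It is an assumption about the general approach, not the assertion used in this PR: flatten and diffFields are hypothetical helpers, and the sample documents are made up. Only field presence is compared, since metric values are expected to differ between runs.

```go
package main

import (
	"fmt"
	"sort"
)

// flatten records the dotted path of every leaf field in doc.
func flatten(prefix string, doc map[string]any, out map[string]struct{}) {
	for k, v := range doc {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		if nested, ok := v.(map[string]any); ok {
			flatten(key, nested, out)
			continue
		}
		out[key] = struct{}{}
	}
}

// diffFields returns the sorted field paths that appear in only one of the
// two documents, excluding any path listed in ignored.
func diffFields(a, b map[string]any, ignored []string) []string {
	skip := make(map[string]struct{}, len(ignored))
	for _, f := range ignored {
		skip[f] = struct{}{}
	}
	fa, fb := map[string]struct{}{}, map[string]struct{}{}
	flatten("", a, fa)
	flatten("", b, fb)

	var diff []string
	collect := func(from, other map[string]struct{}) {
		for k := range from {
			_, inOther := other[k]
			_, isIgnored := skip[k]
			if !inOther && !isIgnored {
				diff = append(diff, k)
			}
		}
	}
	collect(fa, fb)
	collect(fb, fa)
	sort.Strings(diff)
	return diff
}

func main() {
	// Made-up sample documents for illustration.
	processDoc := map[string]any{
		"component": map[string]any{"id": "elastic-agent"},
		"agent":     map[string]any{"id": "abc"},
	}
	otelDoc := map[string]any{
		"component": map[string]any{"id": "elastic-agent"},
		"host":      "example",
	}
	fmt.Println(diffFields(processDoc, otelDoc, []string{"host"}))
}
```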
Why is it important?
Required to safely transition to running elastic-agent in otel mode.
Checklist
./changelog/fragments using the changelog tool
Related issues