@pickypg
Member

@pickypg pickypg commented Oct 1, 2025

This provides a separate OTel configuration to allow users to better understand what data they are shipping with the AutoOps ES module.

What does this PR do?

This adds a new OTel configuration available for every OS that helps users to understand what data the AutoOps ES module exports.

Why is it important?

Many users have asked what data AutoOps collects, and this lets them explore it locally before shipping it anywhere.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added an entry in ./changelog/fragments using the changelog tool

Disruptive User Impact

None. This adds a new, optional configuration that the user can run to better understand AutoOps.

How to test this PR locally

Run the Elastic Agent against an Elasticsearch cluster.

./elastic-agent --config otel_samples/autoops_es_debug.yml

You will want to supply these environment variables (with values filled in relevant to your environment):

# Values are examples and have no effect other than to appear in output unchanged
AUTOOPS_TEMP_RESOURCE_ID=example-id
AUTOOPS_TOKEN=example_token

# Values must work:
AUTOOPS_ES_URL=https://localhost:9200

# Username / Password _OR_ API Key
AUTOOPS_ES_USERNAME=my-username
AUTOOPS_ES_PASSWORD=my-secure-password

AUTOOPS_ES_API_KEY=myapikey==
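
As a rough pre-flight sketch (not part of this PR), the rule above could be checked before starting the agent: three variables are always expected, plus either the username/password pair or the API key. The helper name `checkAutoOpsEnv` is hypothetical, and treating an empty value as "missing" is an assumption; the agent itself may behave differently when a variable is absent.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// checkAutoOpsEnv is a hypothetical helper: it verifies that the variables
// listed in the PR description are non-empty. The lookup function is passed
// in so the check can be exercised without touching the real environment.
func checkAutoOpsEnv(getenv func(string) string) error {
	// Assumption: these three are always needed by the sample configuration.
	for _, name := range []string{"AUTOOPS_TEMP_RESOURCE_ID", "AUTOOPS_TOKEN", "AUTOOPS_ES_URL"} {
		if getenv(name) == "" {
			return fmt.Errorf("%s is not set", name)
		}
	}

	// Username / password OR API key, as described above.
	hasBasic := getenv("AUTOOPS_ES_USERNAME") != "" && getenv("AUTOOPS_ES_PASSWORD") != ""
	hasAPIKey := getenv("AUTOOPS_ES_API_KEY") != ""
	if !hasBasic && !hasAPIKey {
		return errors.New("set AUTOOPS_ES_USERNAME and AUTOOPS_ES_PASSWORD, or AUTOOPS_ES_API_KEY")
	}
	return nil
}

func main() {
	if err := checkAutoOpsEnv(os.Getenv); err != nil {
		fmt.Fprintln(os.Stderr, "environment check failed:", err)
		os.Exit(1)
	}
	fmt.Println("environment looks complete")
}
```

Running this in the same shell that will launch `./elastic-agent` gives a quick signal that nothing was forgotten; it does not verify that the URL or credentials actually work.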

@pickypg pickypg self-assigned this Oct 1, 2025
@pickypg pickypg added the enhancement New feature or request label Oct 1, 2025
@pickypg pickypg requested review from a team as code owners October 1, 2025 22:58
@pickypg pickypg added the backport-9.2 Automated backport to the 9.2 branch label Oct 1, 2025
@pickypg pickypg requested a review from a team as a code owner October 1, 2025 22:58
@pickypg pickypg enabled auto-merge (squash) October 1, 2025 22:58
@pickypg pickypg requested a review from a team as a code owner October 1, 2025 23:02
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

cc @pickypg

Contributor

@ycombinator ycombinator left a comment

Seeing as these new pipelines with the debug exporters are identical to their otlphttp exporter counterparts except, of course, for the exporter used, I wonder if you could consolidate these into one set of pipelines that defined both exporters and used a routingconnector based on an environment variable to route the data to one or the other exporter?

That would save a lot of duplication and the need for ensuring that the two sets of pipelines are kept in sync whenever one changes.

@pickypg
Member Author

pickypg commented Oct 2, 2025

the need for ensuring that the two sets of pipelines are kept in sync whenever one changes.

I see this as a major upside of your proposed change. However, wouldn't using a connector add a potential performance barrier? This is my own lack of knowledge, but is that a one-time check or a per-telemetry-event check? If it's a one-time setup cost, then I think it's the way to go. If it's a cost for every piece of data flowing through the agent, then I'd rather pay the overhead of an extra config file than that downstream cost, because large clusters will send a lot of data through the pipeline and the debug output is mostly there to make customers comfortable using it, rather than for regular use.

@ycombinator
Contributor

ycombinator commented Oct 2, 2025

wouldn't using a connector add a potential performance barrier? This is my own lack of knowledge, but is that a one-time check or would that be a per-telemetry event check?

I'm not familiar with how the routingconnector works either (pinging @swiatekm or @andrzej-stencel in case they have some insights) but I suspect it inspects every event since the conditions are based on OTTL. I'm not sure if there's any optimization if the condition doesn't reference any event fields but just refers to "static" environment variables instead.


If we decide to go with two sets of nearly-identical configuration files, we might want to consider introducing a _templates or similar folder that's not included in the Agent distribution but is used at build time to generate the two near-identical configuration files from the same template file via something like a mage otel:generateSamples target that's also a dependency of the mage check target, similar to the mage otel:readme target.

@pickypg
Member Author

pickypg commented Oct 2, 2025

If we decide to go with two sets of nearly-identical configuration files, we might want to consider introducing a _templates or similar folder

I like this. I have been actively thinking about how to provide a configuration that adds TLS / SSL settings without being a total duplication. If I had different ways to piece together the receivers, I could have the exporters and pipeline provided by a different file, then allow the "user" to merge them together as needed (even better with #10267).

@ycombinator
Contributor

If I had different ways to piece together the receivers, I could have the exporters and pipeline provided by a different file, then allow the "user" to merge them together as needed

I'm far from an OTel configuration expert, and I'm starting to think there are probably already mechanisms or tricks to merge OTel configuration files defining different components, and/or ideas for generating configuration from templates, that someone has already come up with. So I'd really like to hear from folks like @andrzej-stencel and @swiatekm, who are active in the OTel community, before we head down any concrete paths.

@pickypg
Member Author

pickypg commented Oct 2, 2025

Happy to hear their insights, but unfortunately I don't think "merging" really works in any functional way yet without

I tried locally with the latest Elastic Agent to use ?merge_paths and it failed with

invalid uri: "otel_samples/autoops_es_ssl.yml?merge_paths=receivers::metricbeatreceiver::metricbeat::modules"

@andrzej-stencel
Contributor

The Routing connector approach will surely have a performance penalty. Whether it's big, small, or negligible would need to be tested. However, I think that using the Routing connector in this case will harm configuration readability and comprehensibility, which is the opposite of what we seem to want here: the example configurations for debugging should be simple, to enhance user understanding.

I'm in favor of generating the similar configurations from a template, as mentioned above.

Contributor

@andrzej-stencel andrzej-stencel left a comment

It's good as it is now. Would be good to prevent duplication by e.g. templating.

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Oct 7, 2025
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

Contributor

@ycombinator ycombinator left a comment

LGTM for now. Would be good to reduce duplication via templating as discussed in this PR but that can be done via a follow up PR.

@pickypg pickypg merged commit 8f1cc86 into main Oct 8, 2025
23 checks passed
@pickypg pickypg deleted the autoops/debug-output branch October 8, 2025 22:12
mergify bot pushed a commit that referenced this pull request Oct 8, 2025
This provides a separate OTel configuration to allow users to
better understand what data they are shipping with the AutoOps ES
module.

(cherry picked from commit 8f1cc86)
pickypg added a commit that referenced this pull request Oct 9, 2025
This provides a separate OTel configuration to allow users to
better understand what data they are shipping with the AutoOps ES
module.

(cherry picked from commit 8f1cc86)

Co-authored-by: Chris Earle <[email protected]>

Labels

backport-9.2 Automated backport to the 9.2 branch enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team


8 participants