@pavolloffay
Member

pavolloffay commented Nov 6, 2025

@adrielp
Member

adrielp commented Nov 17, 2025

Count me in as a contributing member! Looking forward to this SIG. Here's a related thread in the #otel-semantic-conventions Slack channel from a couple of weeks ago that may be of interest to this SIG.

* [pavolloffay/opentelemetry-mcp-server](https://github.com/pavolloffay/opentelemetry-mcp-server): Focuses on collector configuration.
* [austinlparker/otel-mcp](https://github.com/austinlparker/otel-mcp): Handles collector configuration and data profiling.
* [mottibec/otelcol-mcp](https://github.com/mottibec/otelcol-mcp): Focuses on collector configuration.
* [shiftyp/otel-mcp-server](https://github.com/shiftyp/otel-mcp-server): Provides data profiling, but requires OpenSearch.

Happy to answer any specific questions about this project!

@pavolloffay
Member Author

@niwoerner @shiftyp @adrielp I have added you to the proposal. Thanks!

@shiftyp

shiftyp left a comment

Just some questions specifically related to the use case around data profiling, which I take to mean connecting telemetry itself to an agent flow, not just the configuration and troubleshooting use case for the OTEL stack itself.

- OpenTelemetry collector configuration

Phase 2: Data profiling via collector (Months 1-2)
- OpenTelemetry collector extension that provides an API to query and profile the processed telemetry data

Just as context, this was the focus of my particular project: an MCP server that generated OpenSearch / Elasticsearch queries to feed relevant telemetry data into an agent (tested with Claude). I used it specifically with Claude Code to combine telemetry intelligence with code context to answer root-cause questions about incidents, analyze the performance of code changes, etc. This is probably where I could contribute the most in terms of thought partnership, although my effort could be used towards various goals.


### Project Scope and Architecture

The scope of this project is to create OpenTelemetry MCP server(s) to simplify deployment and day-2 operations.

Member

As we'll likely end up with different MCP servers for different use-cases, I'd love to establish a unified interface as part of this project.

Both for streamlined development of (multiple) servers AND for a consistent way for end users to set up and configure the server(s).

Member Author

Agreed on the consistent configuration and deployment. However, I am not sure what you mean by streamlined development?

Member

That the different kinds of OTel MCP servers would follow the same implementation/development patterns where possible, so that we avoid ending up with 5 different MCP servers with 5 different implementations.

@pavolloffay
Member Author

Thanks for the review, @niwoerner. I have updated the proposal based on your feedback.

@austinlparker
Member

A few notes --

  • I think the scope of this is very broad as it stands, touching multiple discrete components/parts of the project with divergent use cases and goals.
  • I think there's discrete use cases (eg, mcp endpoints in Weaver, or ability to do config validation) that are valuable, but I also think that these should probably be accepted at a per-component level. We don't necessarily need a dedicated project for adding weaver support to weaver, or the collector, for example.
  • I also think that there's a little bit of an argument for "hey, is anyone asking for this?" I think we do need to be able to experiment like this, but I also think that there's some level of... "hey, if people want to go off and build these MCP servers as an extension and then get a user base and try to upstream it" then that's fine. It's different to have stuff with proven fit/use cases.

I think it'd be totally appropriate to try and build a community around this independent of the main project and consider how different otel components could integrate MCP. I even think that there's smaller/point stuff (eg, a config validator MCP for the collector) that should be addressed at the existing SIG level.

@jpkrohling
Member

Question based on the discussion we had at the GC call today: would the Docs SIG be a good initial place for this SIG?

@niwoerner
Member

Note: I'm referring to AI-Tooling instead of MCP servers because I believe not every AI-related functionality would require an MCP server, but it could still benefit the OTel ecosystem.

There's huge potential to simplify the adoption/implementation of OTel with AI-Tooling and, as mentioned, the range of OTel components that could benefit from it is broad.

While I agree that the concrete implementation should happen on component level in communication with the affected SIG, there are a few challenges which come with that.

We'll end up with different implementations + inconsistencies across those tools, resulting in different user/developer-experiences. Additionally, there might be redundant efforts for tooling with similar goals if there is no coordination happening across SIGs. Also, it would be great for users to have a common place to look up "Which official AI-Tooling is available in the context of OTel today? What is currently being developed and might be available soon?".

An "MCP project/SIG" is perhaps not the right term, but I believe what @pavolloffay and I are looking for is a shared place to track and align the development of AI-Tooling in the context of OTel.

Having a cross-cutting SIG could be the right place to coordinate the development of AI-Tooling as it'd impact several different implementation/specification SIGs.

I see the point that there is no critical need for a dedicated SIG to be able to experiment with or develop these kinds of tools right now, so I understand the idea of placing this project into an existing SIG and, based on demand, eventually splitting it out at a later point.

Could the Developer Experience SIG be a good starting point? At the end of the day, the goal of AI-Tooling is to facilitate the usage of OpenTelemetry.

@pavolloffay
Member Author

We'll end up with different implementations + inconsistencies across those tools, resulting in different user/developer-experiences.

I fully support this view. It is vital that we coordinate to avoid redundancy, as overlapping tools will negatively impact the MCP's effectiveness.

Having a cross-cutting SIG could be the right place to coordinate the development of AI-Tooling as it'd impact several different implementation/specification SIGs.

Exactly. A key driver for this proposal is to establish a centralized forum for discussion, acknowledging that the actual implementation will be distributed across various components (such as the collector for config schema retrieval).

Could the Developer Experience SIG be a good starting point? At the end of the day, the goal of AI-Tooling is to facilitate the usage of OpenTelemetry.

It works for me, if we can promote the MCP topics in that SIG and other people can join.

@lmolkova
Member

lmolkova commented Dec 19, 2025

I agree that MCP fits into the broader Developer Experience SIG, but DevEx could mean pretty much anything.
To the point raised by @julianocosta89 in #3128 (comment): the original scope of DevEx (survey) was reached, and the current efforts align more with the blueprints project proposal (#3094).

It would make sense to me to sunset the DevEx SIG and, if current members are interested in working on the MCP server, they could join that effort instead. As a result we would have a better scoped project with clear deliverables.

@tsloughter
Member

I don't think we should sunset the DevEx SIG. I haven't gotten my thoughts together on the matter, but I plan to propose a renewed focus on integration of OpenTelemetry into projects. In interviews we've done for our upcoming blog posts, patterns have come up that I think we've all known about, such as in-house libraries for providing more ergonomic APIs and setup, as well as templates for Docker image building that incorporate, I think, things like resource attributes and auto instrumentation. These are places I'd like to see the DevEx SIG focus.

The other area where I've seen new movement from outside the org, and which this SIG originally discussed but was not able to tackle, is local telemetry handling for development. While such a project could end up as a piece of the collector, it may make sense for it to be developed/stewarded through the DevEx SIG.

Of course, having a lot of potential projects but not the people power to do them isn't a reason to keep a SIG alive, and I do admit this has been an issue, including for myself. My hope has been that the MCP servers initiative would help keep the DevEx SIG productive, allowing us to eventually tackle the other topics.

@lmolkova
Member

@tsloughter - if you believe there is scope and energy to keep the DevEx SIG alive, that's great and I have no objections, but what I think we should do is scope it down to the specific problems you folks outlined in your interviews. If we find that the lack of an MCP server is an existing developer experience problem, MCP could be in scope.

If MCP is part of DevEx, then we should have a single group of leads and staff working on both - from this proposal it seems there are two independent groups, repositories, sets of goals. What are we achieving by having these two projects being under the same SIG?

@tsloughter
Member

@lmolkova ah, I see your concern. It won't be two independent groups. There will be a separate repository; I'd expect the same from any project DevEx approaches that has an actual code deliverable, except those that live in existing repos. It could be that the collector MCP server is ultimately submitted as a component of the collector in its repos, but there have been multiple MCP servers identified.

All those involved in the DevEx SIG agreed we'd work on the MCP servers. My ideas for additional tasks are not to replace that effort at the moment but to be discussed and worked on only enough to table until such a time that we are able to work on them (at least this is my thinking, this is not something decided on by the SIG yet).

All that said, I can see the argument for MCP servers being their own SIG. I don't necessarily feel strongly either way, only strongly that it doesn't mean the end of the DevEx SIG :) since I think there is much to do.

@pavolloffay
Member Author

Thanks for all the comments. I have updated the PR and fixed the review comments.

The PR is approved by the DevEx maintainers and @svrnm. I would like to move the PR forward.

@alolita
Member

alolita left a comment

Excited about this proposal. I'm interested in understanding the specific scope that this generic MCP implementation would support.

If I look at the deliverables section -

  • Collector MCP server
    • Configuration use-cases
    • Data profiling use-cases: writing PII rules, high cardinality attributes, broken traces, single span traces
  • Standalone MCP Server
    • Instrumentation use-cases
    • Collector provisioning and configuration use-cases
    • Understanding changes in released artifacts

We need to be clear about which deliverables are features developers would find most valuable (e.g. instrumentation) vs. which features are operationally focused (e.g. provisioning or configuration management).

I recommend the core team identify the top 3 features for developers vs operational / SRE features to get started.

@jerbly

jerbly commented Jan 7, 2026

As @austinlparker wrote:

  • I think there's discrete use cases (eg, mcp endpoints in Weaver, or ability to do config validation) that are valuable, but I also think that these should probably be accepted at a per-component level. We don't necessarily need a dedicated project for adding weaver support to weaver, or the collector, for example.

@lmolkova pointed me at this PR in the Weaver SIG meeting this morning where I demonstrated this PR: open-telemetry/weaver#1113 (gif included) - an MCP server for Weaver allowing you to search, get* and live-check any registry, OTel and/or a custom registry.

Just putting this out there to add to the discussion. I'm happy to help.

@tedsuo
Contributor

tedsuo commented Jan 7, 2026

Hi @pavolloffay! This is a cool project, I'm excited for us to think harder about how we can best leverage AI to make OTel easier.

In its current form, it still feels like this proposal is really broad. It also feels like something that would require the involvement of the Collector and SDK maintainers. I don't want to speak for all maintainers, but those SIGs have a lot on their plate with graduation and stability-related projects.

Is there a way to narrow the scope of this project so that it does not require attention from those SIGs at this time? I don't want to block people who are excited to work on MCP, but I also don't want that work to generate a lot of review activity from other maintainers, as that would likely lead to things getting blocked.

@CharlieTLe

CharlieTLe left a comment

Will the MCP server know about all components from the core and the contrib repos? Will there be a way to prevent bias in the suggestions it provides towards any specific vendor when asked for example to set up an exporter for metrics?

@pavolloffay
Member Author

@svrnm @jpkrohling @alolita @tedsuo @lmolkova @trask

I have made some major updates to this PR. These changes were discussed in the DevEx SIG and we agree that the proposal should focus on integrating OpenTelemetry with agentic workflows. We have also added more details in goals and deliverables.

Please take a look. We would like to get unblocked and start working on the deliverables. There is interest in the community in helping to build this.


The sheer size and velocity of the OpenTelemetry ecosystem add to this difficulty. The project encompasses instrumentation for over 12 languages and includes diverse components like the Collector, OpAMP, and Weaver. Each component is released independently with its own setup requirements and release schedule. For example, the Collector is released bi-weekly, while auto-instrumentation libraries follow different schedules.

Maintenance is also complex. The ecosystem evolves rapidly, introducing frequent breaking changes. Our analysis of the Collector changelogs indicates that approximately 29% of changes are breaking. Keeping up with these updates requires significant manual effort to review release notes, update configuration files, and modify code.

Member Author

To support this, I briefly looked at the changelogs and categorized the changes: https://github.com/pavolloffay/community/blob/mcp-changelog-analysis/FINAL_CHANGELOG_REPORT.md


The Collector follows a fast two-week release cadence, which requires constant maintenance to stay up to date and avoid breaking changes. Additionally, configuring the collector correctly and writing valid OTTL statements is important for effective usage, but requires domain expertise and isn't always trivial. General-purpose coding agents struggle here because they lack up-to-date knowledge of recent releases and aren't specialized for Collector workflows.

* Enable agents to read and write valid Collector configuration.
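
As a rough illustration of what "valid Collector configuration" could mean for an agent, the sketch below checks one structural rule of the Collector config format: every component listed in a `service.pipelines` entry must also be declared in the matching top-level section. The function name `find_undefined_components` and the sample config are hypothetical and not part of the proposal; the config is given as an already-parsed Python dict, and the check deliberately ignores connectors and extensions, so it is only a partial validity check.

```python
from typing import Any

# Top-level sections whose component IDs a pipeline may reference.
SECTIONS = ("receivers", "processors", "exporters")


def find_undefined_components(config: dict[str, Any]) -> list[str]:
    """Return one message per pipeline entry that references an undeclared component.

    Hypothetical helper: it only checks receivers, processors and exporters,
    and does not know about connectors or extensions.
    """
    problems: list[str] = []
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    for pipeline_name, pipeline in pipelines.items():
        for section in SECTIONS:
            declared = config.get(section) or {}
            for component_id in (pipeline or {}).get(section) or []:
                if component_id not in declared:
                    problems.append(
                        f"pipeline {pipeline_name!r}: {section[:-1]} "
                        f"{component_id!r} is not declared under {section!r}"
                    )
    return problems


# Sample config (an assumption for illustration): the traces pipeline uses a
# memory_limiter processor that is never declared under "processors".
example_config = {
    "receivers": {"otlp": {"protocols": {"grpc": {}}}},
    "processors": {"batch": {}},
    "exporters": {"debug": {}},
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch", "memory_limiter"],
                "exporters": ["debug"],
            }
        }
    },
}

for message in find_undefined_components(example_config):
    print(message)
```

A tool exposed to an agent could return such messages as structured feedback, so the agent can repair its own output before proposing a config change.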

Member Author

An example of how an agentic workflow can help with maintaining collector docs: https://gist.github.com/pavolloffay/c78595721676576b64768c247d1e22c5

Signed-off-by: Pavol Loffay <p.loffay@gmail.com>

## Project Scope and Architecture

The scope of this project is to enable **Agentic Workflows** for OpenTelemetry to simplify deployment, configuration, and day-2 operations across the OpenTelemetry project (collectors, SDKs, instrumentation, semantic conventions). To support this process, a standardized interface is required for Agents and LLMs to interact with the OpenTelemetry ecosystem. For instance, [The Model Context Protocol (MCP)](https://modelcontextprotocol.io/) or [Agent Skills](https://agentskills.io/home) provide an idiomatic approach for this interaction.

Member

It's not entirely clear to me whether the scope is to FIND that standardized interface and then provide that research to the project so SIGs can build from there, to BUILD that standardized interface, or to CONSULT other SIGs on building it. Those are very different goals, and from what I read here it's about BUILD?

Having had some time to think about this, I think this needs to be clarified first. If it is about BUILD, we need to understand what is going to be built and how: is there going to be a binary that people can run, a hosted MCP server that people can access, or a skills.md for all of OTel somewhere?

Member Author

Our goal is to build MCP server(s) or Agent Skills that will enable agentic workflows for the projects in the OpenTelemetry ecosystem and make sure they provide a coherent user experience. We would like to start with the projects defined in goals and deliverables.

Individual SIGs can help as well (e.g. collector SIG offered help in this PR) or build the MCP themselves (e.g. weaver). Our goal is to make sure all OpenTelemetry MCP servers will provide a coherent end user experience in terms of installation, docs and functionality.

I have slightly rephrased the paragraph:

The scope of this project is to enable Agentic Workflows for OpenTelemetry to simplify deployment, configuration, and day-2 operations across the OpenTelemetry project (collectors, SDKs, instrumentation, semantic conventions). To support this workflow, a standardized interface is required for Agents and LLMs to interact with the OpenTelemetry ecosystem. The project will focus on the Model Context Protocol (MCP) and Agent Skills concepts to provide this interface for agents to interact with the OpenTelemetry projects. The goal of this SIG is to deliver a reference implementation of MCP server(s) and/or Agent Skills for the OpenTelemetry project and ensure coherent behaviour and end user experience.

Member

Thanks, I think this is a crucial update. Because of that, I also need to nitpick on the wording "Individual SIGs can help as well or build the MCP themselves":

They need to co-own this

There needs to be a mutual agreement between SIG maintainers (collector, weaver, ...) and the MCP Project Members within SIG DevEx about the what, how and especially who. This goes in both directions: the MCP project members need to work with the SIG maintainers, but likewise a SIG that wants to build an MCP server themselves should work with the project members. The mutual agreement can be that one of them does the majority of the work (which translates to "help" and "build"), but we need to avoid one group working around the other down the line and ending up with "shadow implementations".

Member Author

I have updated this paragraph with this framing:

The goal of this SIG is to deliver a reference implementation of MCP server(s) and/or Agent Skills for the OpenTelemetry project in coordination with existing SIGs to ensure coherent behaviour and end user experience. We will establish bi-directional collaboration to ensure implementation ownership is mutually agreed upon, such that each new component has a clear owner/maintainer aligned with best practices of the targeted SIGs.

Member

Thanks!

@vidam-io

@pavolloffay
I'm an internal platform operator running an observability platform powered by OpenTelemetry.

Our setup:

  • Users submit Helm values through our internal portal
  • Our platform automatically converts these values into OTel Collector configs and Helm manifests
  • The generated resources are then deployed to users' Kubernetes clusters
  • We run multiple OTel Collectors per cluster (for K8s metrics, logs/metrics/traces ingestion, etc.)

The challenge:
Users often struggle with writing correct Helm values and OTel Collector configs.


Question 1: Helm chart support

I see MCP support is being discussed for OTel Collector config generation. Is there any consideration for extending similar assistance to Helm chart values?

Question 2: Transform preview

What are your thoughts on live preview for log parsing and OTTL transformations, similar to ottl.run?

I was planning to build something like this internally, so I was pleasantly surprised to find it already exists. Is there any plan to integrate such preview capabilities into official tooling or MCP support?


Labels

area/project-proposal Submitting a filled out project template

Development

Successfully merging this pull request may close these issues.

Onboard SIG: MCP Server