Add Model Context Protocol (MCP) project proposal #3128
Conversation
|
Count me in as a contributing member! Looking forward to this SIG. Here's a related thread in the #otel-semantic-conventions Slack channel from a couple of weeks ago that may be of interest to this SIG. |
| * [pavolloffay/opentelemetry-mcp-server](https://github.com/pavolloffay/opentelemetry-mcp-server): Focuses on collector configuration. | ||
| * [austinlparker/otel-mcp](https://github.com/austinlparker/otel-mcp): Handles collector configuration and data profiling. | ||
| * [mottibec/otelcol-mcp](https://github.com/mottibec/otelcol-mcp): Focuses on collector configuration. | ||
| * [shiftyp/otel-mcp-server](https://github.com/shiftyp/otel-mcp-server): Provides data profiling, but requires OpenSearch. |
Happy to answer any specific questions about this project!
|
@niwoerner @shiftyp @adrielp I have added you to the proposal. Thanks! |
shiftyp
left a comment
Just some questions specifically related to the use case around data profiling, which I take to mean connecting telemetry itself to an agent flow, not just the configuration and troubleshooting use case for the OTEL stack itself.
projects/mcp-server.md
Outdated
| - OpenTelemetry collector configuration | ||
|
|
||
| Phase 2: Data profiling via collector (Months 1-2) | ||
| - OpenTelemetry collector extension which provides API to query and profile the processed telemetry data |
Just as context, this was the focus of my particular project: an MCP server that generated OpenSearch / Elasticsearch queries to feed relevant telemetry data into an agent (tested with Claude). I used it specifically with Claude Code to combine telemetry intelligence with code context to answer root-cause questions about incidents, analyze the performance of code changes, etc. This is probably where I could contribute the most in terms of thought partnership, although my effort could be used towards various goals.
projects/mcp-server.md
Outdated
|
|
||
| ### Project Scope and Architecture | ||
|
|
||
| The scope of this project is to create OpenTelemetry MCP server(s) to simplify deployment and day-2 operations. |
As we'll likely end up with different MCP servers for different use cases, I'd love to establish a unified interface as part of this project.
This would serve both a streamlined development of (multiple) servers and a consistent way for the end user to set up and configure the server(s).
Agree on the consistent configuration and deployment. However, what do you mean by streamlined development?
I mean that the different kinds of OTel MCP servers would follow the same implementation/development patterns (where possible), so that we avoid ending up with 5 different MCP servers with 5 different implementations.
|
Thanks for the review @niwoerner . I have updated the proposal based on your feedback. |
|
A few notes --
I think it'd be totally appropriate to try and build a community around this independent of the main project and consider how different otel components could integrate MCP. I even think that there's smaller/point stuff (eg, a config validator MCP for the collector) that should be addressed at the existing SIG level. |
|
Question based on the discussion we had at the GC call today: would the Docs SIG be a good initial place for this SIG? |
|
Note: I'm referring to There's huge potential to simplify the adoption/implementation of OTel with AI tooling and, as mentioned, the scope of OTel components that could benefit from it is broad. While I agree that the concrete implementation should happen at the component level in communication with the affected SIG, a few challenges come with that. We'll end up with different implementations and inconsistencies across those tools, resulting in different user/developer experiences. Additionally, there might be redundant efforts for tooling with similar goals if there is no coordination across SIGs. Also, it would be great for users to have a common place to look up "Which official AI tooling is available in the context of OTel today? What is currently being developed and might be available soon?". An "MCP project/SIG" is perhaps not the right term, but I believe what @pavolloffay and I are looking for is a shared place to track and align the development of AI tooling in the context of OTel. A cross-cutting SIG could be the right place to coordinate this development, as it would impact several different implementation/specification SIGs. I see the point that there is no critical need for a dedicated SIG to experiment with or develop this type of tool right now, so I understand the idea of placing this project into an existing SIG and, based on demand, eventually splitting it out at a later point. Could the |
I fully support this view. It is vital that we coordinate to avoid redundancy, as overlapping tools will negatively impact the MCP's effectiveness.
Exactly. A key driver for this proposal is to establish a centralized forum for discussion, acknowledging that the actual implementation will be distributed across various components (such as the collector for config schema retrieval).
It works for me, if we can promote the MCP topics in that SIG and other people can join. |
|
I agree that MCP fits into the broader Developer Experience SIG, but DevEx could mean pretty much anything. It would make sense to me to sunset the DevEx SIG and, if current members are interested in working on the MCP server, they could join that effort instead. As a result we would have a better-scoped project with clear deliverables. |
|
I don't think we should sunset the DevEx SIG. I haven't gotten my thoughts together on the matter, but I plan to propose a renewed focus on integration of OpenTelemetry into projects. In interviews we've done for our upcoming blog posts, patterns have come up that I think we've all known about, such as in-house libraries for providing more ergonomic APIs and setup, as well as templates for Docker image building that incorporate, I think, things like resource attributes and auto-instrumentation. These are places I'd like to see the DevEx SIG focus. The other area, where I've seen new movement from outside the org and which this SIG had originally discussed but was not able to tackle, is local telemetry handling for development. While such a project could end up as a piece of the Collector, it may make sense for it to be developed/stewarded through the DevEx SIG. Of course, having a lot of potential projects but not the people power to do them isn't a reason to keep a SIG alive, and I do admit this has been an issue, including on my part. My hope has been that the MCP servers initiative helps keep the DevEx SIG productive, allowing us to eventually tackle the other topics. |
|
@tsloughter - if you believe there is scope and energy to keep the DevEx SIG alive, that's great and I have no objections, but what I think we should do is scope it down to the specific problems you folks outlined in your interviews. If we find that the lack of an MCP server is an existing developer experience problem, MCP could be in scope. If MCP is part of DevEx, then we should have a single group of leads and staff working on both. From this proposal it seems there are two independent groups, repositories, and sets of goals. What are we achieving by having these two projects under the same SIG? |
|
@lmolkova ah, I see your concern. It won't be two independent groups. There will be a separate repository, I'd expect the same from any project DevEx approaches that has an actual code deliverable -- except those that live in existing repos. It could be that the collector mcp server ultimately is submitted as a component of the collector in its repos but there have been multiple mcp servers identified. All those involved in the DevEx SIG agreed we'd work on the MCP servers. My ideas for additional tasks are not to replace that effort at the moment but to be discussed and worked on only enough to table until such a time that we are able to work on them (at least this is my thinking, this is not something decided on by the SIG yet). All that said, I can see the argument for MCP servers being their own SIG. I don't necessarily feel strongly either way, only strongly that it doesn't mean the end of the DevEx SIG :) since I think there is much to do. |
|
Thanks for all the comments. I have updated the PR and addressed the review comments. The PR is approved by the DevEx maintainers and @svrnm. I would like to move the PR forward. |
alolita
left a comment
Excited about this proposal. I'm interested in understanding the specific scope that this generic MCP implementation would support.
If I look at the deliverables section -
- Collector MCP server
- Configuration use-cases
- Data profiling use-cases: writing PII rules, high cardinality attributes, broken traces, single span traces
- Standalone MCP Server
- Instrumentation use-cases
- Collector provisioning and configuration use-cases
- Understanding changes in released artifacts
We need to be clear about which deliverables are features developers would find most valuable (e.g. instrumentation) vs. which features are operationally focused (e.g. provisioning or configuration management).
I recommend the core team identify the top 3 features for developers vs operational / SRE features to get started.
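To make the data profiling use-cases listed above more concrete, here is a minimal Python sketch of two of them: flagging high-cardinality attributes and finding single-span traces. It is illustrative only: the span dictionaries, the `trace_id` and `attributes` keys, the function names, and the threshold are assumptions made for this example, not part of the proposal or of any OpenTelemetry API.

```python
from collections import Counter, defaultdict


def high_cardinality_attributes(spans, threshold):
    """Return attribute keys with more than `threshold` distinct values."""
    values = defaultdict(set)
    for span in spans:
        for key, value in span.get("attributes", {}).items():
            values[key].add(value)
    return sorted(key for key, seen in values.items() if len(seen) > threshold)


def single_span_traces(spans):
    """Return trace IDs that appear on exactly one span."""
    counts = Counter(span["trace_id"] for span in spans)
    return sorted(trace_id for trace_id, n in counts.items() if n == 1)


# Hypothetical sample data: plain dicts standing in for exported spans.
spans = [
    {"trace_id": "t1", "attributes": {"http.request.method": "GET", "user.id": "u1"}},
    {"trace_id": "t1", "attributes": {"http.request.method": "GET", "user.id": "u2"}},
    {"trace_id": "t2", "attributes": {"http.request.method": "POST", "user.id": "u3"}},
]

noisy_keys = high_cardinality_attributes(spans, threshold=2)
lonely_traces = single_span_traces(spans)
print(noisy_keys, lonely_traces)
```

A profiling tool exposed to an agent could return results of this shape, and the agent could then suggest, for example, dropping or hashing the flagged attributes.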
|
As @austinlparker wrote:
@lmolkova pointed me at this PR in the Weaver SIG meeting this morning where I demonstrated this PR: open-telemetry/weaver#1113 (gif included) - an MCP server for Weaver allowing you to search, get* and live-check any registry, OTel and/or a custom registry. Just putting this out there to add to the discussion. I'm happy to help. |
|
Hi @pavolloffay! This is a cool project, I'm excited for us to think harder about how we can best leverage AI to make Otel easier. In its current form, it still feels like this proposal is really broad. It also feels like something that would require the involvement of the Collector and SDK maintainers. I don't want to speak for all maintainers, but those SIGs have a lot on their plate with graduation and stability-related projects. Is there a way to narrow the scope of this project so that it does not require attention from those SIGs at this time? I don't want to block people who are excited to work on MCP, but I also don't want that work to generate a lot of review activity from other maintainers, as that would likely lead to things getting blocked. |
CharlieTLe
left a comment
Will the MCP server know about all components from the core and the contrib repos? Will there be a way to prevent bias towards any specific vendor in the suggestions it provides, for example when asked to set up an exporter for metrics?
|
@svrnm @jpkrohling @alolita @tedsuo @lmolkova @trask I have made some major updates to this PR. These changes were discussed in the DevEx SIG and we agree that the proposal should focus on integrating OpenTelemetry with agentic workflows. We have also added more details in goals and deliverables. Please take a look. We would like to get unblocked and start working on the deliverables. There is interest in the community in helping to build this. |
|
|
||
| The sheer size and velocity of the OpenTelemetry ecosystem add to this difficulty. The project encompasses instrumentation for over 12 languages and includes diverse components like the Collector, OpAMP, and Weaver. Each component is released independently with its own setup requirements and release schedule. For example, the Collector is released bi-weekly, while auto-instrumentation libraries follow different schedules. | ||
|
|
||
| Maintenance is also complex. The ecosystem evolves rapidly, introducing frequent breaking changes. Our analysis of the Collector changelogs indicates that approximately 29% of changes are breaking. Keeping up with these updates requires significant manual effort to review release notes, update configuration files, and modify code. |
To support this, I have briefly looked at the changelogs and categorized the changes: https://github.com/pavolloffay/community/blob/mcp-changelog-analysis/FINAL_CHANGELOG_REPORT.md
|
|
||
| The Collector follows a fast two-week release cadence, which requires constant maintenance to stay up to date and avoid breaking changes. Additionally, configuring the Collector correctly and writing valid OTTL statements is important for effective usage, but requires domain expertise and isn't always trivial. General-purpose coding agents struggle here because they lack up-to-date knowledge of recent releases and aren't specialized for Collector workflows. | ||
|
|
||
| * Enable agents to read and write valid Collector configuration. |
An example of how an agentic workflow can help with maintaining the Collector docs: https://gist.github.com/pavolloffay/c78595721676576b64768c247d1e22c5
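As a rough illustration of what "valid Collector configuration" could mean for an agent, the Python sketch below checks that every component referenced in `service.pipelines` is also declared in the top-level `receivers`, `processors`, `exporters`, or `connectors` sections. The function name and the sample config are hypothetical, and this covers only one structural check; it is not the Collector's own validation logic.

```python
def undefined_pipeline_components(config):
    """List (pipeline, section, component) references with no matching definition."""
    # Connectors may be referenced as a receiver or as an exporter in a pipeline.
    connectors = set(config.get("connectors") or {})
    defined = {
        "receivers": set(config.get("receivers") or {}) | connectors,
        "processors": set(config.get("processors") or {}),
        "exporters": set(config.get("exporters") or {}) | connectors,
    }
    problems = []
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    for name, pipeline in pipelines.items():
        for section, known in defined.items():
            for component in pipeline.get(section) or []:
                if component not in known:
                    problems.append((name, section, component))
    return problems


# Hypothetical config, already parsed from YAML into a dict.
config = {
    "receivers": {"otlp": {}},
    "processors": {"batch": {}},
    "exporters": {"debug": {}},
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch", "memory_limiter"],
                "exporters": ["debug"],
            }
        }
    },
}

problems = undefined_pipeline_components(config)
print(problems)
```

Here the `memory_limiter` processor is used in the `traces` pipeline without being declared, which is the kind of mistake an agent could catch before a config is deployed.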
Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
projects/mcp-server.md
Outdated
|
|
||
| ## Project Scope and Architecture | ||
|
|
||
| The scope of this project is to enable **Agentic Workflows** for OpenTelemetry to simplify deployment, configuration, and day-2 operations across the OpenTelemetry project (collectors, SDKs, instrumentation, semantic conventions). To support this process, a standardized interface is required for Agents and LLMs to interact with the OpenTelemetry ecosystem. For instance [The Model Context Protocol (MCP)](https://modelcontextprotocol.io/) or [Agent Skills](https://agentskills.io/home) provide an idiomatic approach for this interaction. |
It's not entirely clear to me whether the scope is to FIND that standardized interface and provide that research to the project so SIGs can build from there, to BUILD that standardized interface, or to CONSULT other SIGs on building it. Those are very different goals, and from what I read here it's about BUILD?
Having had some time to think about this, I think this needs to be clarified first. If it is about BUILD, we need to understand what is going to be built and how: is there going to be a binary that people can run, a hosted MCP server that people can access, or a skills.md for all of OTel somewhere?
Our goal is to build MCP server(s) or Agent Skills that will enable agentic workflows for the projects in the OpenTelemetry ecosystem and make sure they provide a coherent user experience. We would like to start with the projects defined in goals and deliverables.
Individual SIGs can help as well (e.g. the Collector SIG offered help in this PR) or build the MCP server themselves (e.g. Weaver). Our goal is to make sure all OpenTelemetry MCP servers provide a coherent end-user experience in terms of installation, docs, and functionality.
I have slightly rephrased the paragraph:
The scope of this project is to enable Agentic Workflows for OpenTelemetry to simplify deployment, configuration, and day-2 operations across the OpenTelemetry project (collectors, SDKs, instrumentation, semantic conventions). To support this workflow, a standardized interface is required for Agents and LLMs to interact with the OpenTelemetry ecosystem. The project will focus on the Model Context Protocol (MCP) and Agent Skills concepts to provide this interface for agents to interact with the OpenTelemetry projects. The goal of this SIG is to deliver a reference implementation of MCP server(s) and/or Agent Skills for the OpenTelemetry project and ensure coherent behaviour and end user experience.
Thanks, I think this is a crucial update. Because of that, I also need to nitpick on the wording "Individual SIGs can help as well or build the MCP themselves":
They need to co-own this.
There needs to be a mutual agreement between SIG maintainers (Collector, Weaver, ...) and the MCP project members within SIG DevEx about the what, how and especially who. This goes in both directions: the MCP project members need to work with the SIG maintainers, but likewise a SIG that wants to build an MCP server themselves should work with the project members. The mutual agreement can be that one of them does the majority of the work (which translates to "help" and "build"), but we need to avoid a situation down the line where one group works around the other and we end up with "shadow implementations".
I have updated this paragraph with this framing:
The goal of this SIG is to deliver a reference implementation of MCP server(s) and/or Agent Skills for the OpenTelemetry project in coordination with existing SIGs to ensure coherent behaviour and end user experience. We will establish bi-directional collaboration to ensure implementation ownership is mutually agreed upon, such that each new component has a clear owner/maintainer aligned with best practices of the targeted SIGs.
Thanks!
Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
|
@pavolloffay Our setup:
The challenge:
Question 1: Helm chart support
I see MCP support is being discussed for OTel Collector config generation. Is there any consideration for extending similar assistance to Helm chart values?
Question 2: Transform preview
What are your thoughts on a live preview for log parsing and OTTL transformations, similar to ottl.run? I was planning to build something like this internally, so I was pleasantly surprised to find it already exists. Is there any plan to integrate such preview capabilities into official tooling or MCP support? |
Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
Resolves #3129
Related to open-telemetry/opentelemetry.io#8331