| 1 | +## Embeddable Prometheus Exporters in OpenTelemetry Collectors |
| 2 | + |
| 3 | +* **Owners:** |
| 4 | + * [@ArthurSens](https://github.com/ArthurSens) |
| 5 | + |
| 6 | +* **Implementation Status:** Not implemented |
| 7 | + |
| 8 | +* **Related Issues and PRs:** |
| 9 | + * https://github.com/prometheus/exporter-toolkit/pull/357 |
| 10 | + * https://github.com/prometheus/node_exporter/pull/3459 |
| 11 | + * https://github.com/ArthurSens/prometheus-opentelemetry-collector (proof of concept) |
| 12 | + |
| 13 | +* **Other docs or links:** |
| 14 | + |
| 15 | +> TL;DR: This proposal introduces a mechanism to embed Prometheus exporters as native OpenTelemetry Collector receivers, reducing duplication of effort between the two ecosystems and enabling the "single binary" promise for telemetry collection without forcing reimplementation of hundreds of existing Prometheus exporters. |
| 16 | + |
| 17 | +## Why |
| 18 | + |
| 19 | +The OpenTelemetry Collector ecosystem faces a significant challenge: many components in collector-contrib are "drop-in" replacements for existing Prometheus exporters but often become unmaintained before reaching stability. This duplication of effort occurs because the promise of "one binary to collect all telemetry" is valuable to users, leading to reimplementation of functionality that already exists in mature Prometheus exporters. |
| 20 | + |
| 21 | +This issue became particularly visible during OpenTelemetry's CNCF Graduation attempt, where feedback highlighted that users often feel frustrated when upgrading versions. In response, the Collector SIG decided to be stricter about accepting new components and more proactive in removing unmaintained or low-quality ones. |
| 22 | + |
| 23 | +Meanwhile, the Prometheus community has developed hundreds of exporters over many years, many of which are stable and well-maintained. Creating parallel implementations in the OpenTelemetry ecosystem wastes community resources and often results in "drive-by contributions" that are abandoned shortly after acceptance. |
| 24 | + |
| 25 | +### Pitfalls of the current solution |
| 26 | + |
| 27 | +1. **Duplication of Work**: Infrastructure monitoring receivers are reimplemented in OpenTelemetry when functionally equivalent Prometheus exporters already exist. |
| 28 | + |
| 29 | +2. **Unmaintained Components**: Many OpenTelemetry receivers that replicate Prometheus exporter functionality become unmaintained in early development stages. |
| 30 | + |
| 31 | +3. **Quality and Stability Issues**: The pressure to provide comprehensive coverage leads to accepting components that may not meet quality standards, contributing to collector-contrib's stability problems. |
| 32 | + |
| 33 | +4. **Diverging Ecosystems**: Two communities are solving the same problems independently, fragmenting effort and expertise. |
| 34 | + |
| 35 | +5. **Maintenance Burden**: Both ecosystems must independently maintain similar functionality for monitoring the same infrastructure components. |
| 36 | + |
| 37 | +## Goals |
| 38 | + |
| 39 | +* Enable embedding of Prometheus exporters as native OpenTelemetry Collector receivers via the OpenTelemetry Collector Builder (OCB). |
| 40 | +* Reduce duplication of effort between Prometheus and OpenTelemetry communities. |
| 41 | +* Maintain the "single binary" promise for users who want comprehensive telemetry collection. |
| 42 | +* Leverage existing, mature Prometheus exporters instead of reimplementing them on the OpenTelemetry Collector side. |
| 43 | +* Unify the two ecosystems to increase the likelihood of attracting more maintainers and contributors. |
| 44 | + |
| 45 | +### Audience |
| 46 | + |
| 47 | +* Prometheus exporter maintainers and developers |
| 48 | +* OpenTelemetry Collector users and contributors |
| 49 | +* Organizations using both Prometheus and OpenTelemetry in their observability stack |
| 50 | +* Distribution builders (e.g., Grafana Alloy, OllyGarden Rose, AWS ADOT, DataDog DDOT, Elastic EDOT) |
| 51 | + |
| 52 | +## Non-Goals |
| 53 | + |
| 54 | +* Replace the existing OpenTelemetry Prometheus receiver (which scrapes Prometheus endpoints). |
| 55 | +* Force all Prometheus exporters to implement embedding interfaces immediately. |
| 56 | +* Automatically remove existing OpenTelemetry receivers that overlap with Prometheus exporters (this follows OpenTelemetry's own component removal policy). |
| 57 | +* Mandate that OpenTelemetry collector-contrib includes embedded Prometheus exporters (distributions can be customized). |
| 58 | + |
| 59 | +## How |
| 60 | + |
| 61 | +### Overview |
| 62 | + |
| 63 | +Prometheus exporters function similarly to OpenTelemetry Collector receivers: they gather information from infrastructure and expose it as metrics. The key differences are the output format and the collection mechanism. Prometheus exporters expose an HTTP endpoint (typically `/metrics`) that is scraped, while OpenTelemetry receivers push metrics into a pipeline. |
| 64 | + |
| 65 | +This proposal introduces a bridge between these two paradigms by: |
| 66 | + |
| 67 | +1. **Defining new interfaces** in the `exporter-toolkit` project that Prometheus exporters can implement |
| 68 | +2. **Providing adapter code** that converts Prometheus Registry metrics to OpenTelemetry's pmetric format |
| 69 | +3. **Implementing OpenTelemetry Collector receiver interfaces** that wrap the adapter and Prometheus exporter |
| 70 | + |
| 71 | +### Architecture |
| 72 | + |
| 73 | +``` |
| 74 | +┌─────────────────────────────────────────────────────────────┐ |
| 75 | +│ OpenTelemetry Collector │ |
| 76 | +│ │ |
| 77 | +│ ┌────────────────────────────────────────────────────┐ │ |
| 78 | +│ │ Prometheus Exporter Receiver │ │ |
| 79 | +│ │ │ │ |
| 80 | +│ │ ┌─────────────────────────────────────────────┐ │ │ |
| 81 | +│ │ │ Prometheus Exporter │ │ │ |
| 82 | +│ │ │ (implements ExporterLifecycleManager) │ │ │ |
| 83 | +│ │ │ │ │ │ |
| 84 | +│ │ │ ┌──────────────────────────────────────┐ │ │ │ |
| 85 | +│ │ │ │ Prometheus Registry │ │ │ │ |
| 86 | +│ │ │ │ (Collectors gathering metrics) │ │ │ │ |
| 87 | +│ │ │ └──────────────────────────────────────┘ │ │ │ |
| 88 | +│ │ └─────────────────────────────────────────────┘ │ │ |
| 89 | +│ │ │ │ │ |
| 90 | +│ │ ▼ │ │ |
| 91 | +│ │ ┌─────────────────────────────────────────────┐ │ │ |
| 92 | +│ │ │ Exporter-toolkit (Registry → pmetric) │ │ │ |
| 93 | +│ │ │ (Periodic collection + conversion) │ │ │ |
| 94 | +│ │ └─────────────────────────────────────────────┘ │ │ |
| 95 | +│ │ │ │ │ |
| 96 | +│ └──────────────────────┼─────────────────────────────┘ │ |
| 97 | +│ ▼ │ |
| 98 | +│ ┌──────────────────────┐ │ |
| 99 | +│ │ Consumer.Metrics │ │ |
| 100 | +│ │ (Pipeline data) │ │ |
| 101 | +│ └──────────────────────┘ │ |
| 102 | +│ │ │ |
| 103 | +│ ▼ │ |
| 104 | +│ ┌──────────────────────┐ │ |
| 105 | +│ │ Processors │ │ |
| 106 | +│ └──────────────────────┘ │ |
| 107 | +│ │ │ |
| 108 | +│ ▼ │ |
| 109 | +│ ┌──────────────────────┐ │ |
| 110 | +│ │ Exporters │ │ |
| 111 | +│ └──────────────────────┘ │ |
| 112 | +└─────────────────────────────────────────────────────────────┘ |
| 113 | +``` |
| 114 | + |
| 115 | +### New Interfaces in exporter-toolkit |
| 116 | + |
| 117 | +The `exporter-toolkit` project will provide new interfaces that Prometheus exporters can implement: |
| 118 | + |
| 119 | +#### ExporterLifecycleManager Interface |
| 120 | + |
| 121 | +```go |
| 122 | +// ExporterLifecycleManager is the interface that Prometheus exporters must implement |
| 123 | +// to be embedded in the OTel Collector. |
| 124 | +type ExporterLifecycleManager interface { |
| 125 | + // Start sets up the exporter and returns a prometheus.Registry |
| 126 | + // containing all the metrics collectors. |
| 127 | + Start(ctx context.Context, exporterConfig Config) (*prometheus.Registry, error) |
| 128 | + |
| 129 | + // Shutdown is used to release resources when the receiver is shutting down. |
| 130 | + Shutdown(ctx context.Context) error |
| 131 | +} |
| 132 | +``` |
| 133 | + |
| 134 | +#### Configuration Interfaces |
| 135 | + |
| 136 | +```go |
| 137 | +// ConfigUnmarshaler is the interface used to unmarshal the exporter-specific |
| 138 | +// configuration using mapstructure and struct tags. |
| 139 | +type ConfigUnmarshaler interface { |
| 140 | + // GetConfigStruct returns a pointer to the config struct that mapstructure |
| 141 | + // will populate. The struct should have appropriate mapstructure tags. |
| 142 | + GetConfigStruct() Config |
| 143 | +} |
| 144 | + |
| 145 | +// Config is the interface that exporter-specific configurations must implement. |
| 146 | +type Config interface { |
| 147 | + // Validate checks if the configuration is valid. |
| 148 | + Validate() error |
| 149 | +} |
| 150 | +``` |
| 151 | + |
| 152 | +#### Receiver Configuration |
| 153 | + |
| 154 | +```go |
| 155 | +// ReceiverConfig holds the common configuration for all Prometheus exporter receivers. |
| 156 | +type ReceiverConfig struct { |
| 157 | + // ScrapeInterval defines how often to collect metrics from the exporter. |
| 158 | + // Default: 30s |
| 159 | + ScrapeInterval time.Duration `mapstructure:"scrape_interval"` |
| 160 | + |
| 161 | + // ExporterConfig holds the exporter-specific configuration. |
| 162 | + ExporterConfig map[string]interface{} `mapstructure:"exporter_config"` |
| 163 | +} |
| 164 | +``` |
| 165 | + |
| 166 | +### Prometheus Registry to pmetric Conversion |
| 167 | + |
| 168 | +The adapter will include a scraper component that: |
| 169 | + |
| 170 | +1. Calls `registry.Gather()` to collect metrics from the Prometheus Registry |
| 171 | +2. Converts Prometheus metric families to OpenTelemetry's pmetric format |
| 172 | + |
| 173 | +This conversion logic can leverage or adapt existing conversion code from the OpenTelemetry Prometheus receiver. |
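As a rough illustration of the type-mapping part of that conversion, the sketch below follows the mapping commonly described in OpenTelemetry's Prometheus compatibility guidance (counter to monotonic cumulative Sum, gauge and untyped to Gauge, classic histogram to Histogram, summary to Summary). The `promType`, `otelKind`, and `target` types are local stand-ins invented for this example, not the real client model or pmetric types, and native histograms are deliberately left out.

```go
package main

import "fmt"

// promType and otelKind are local stand-ins for the metric type enums of the
// Prometheus client model and pmetric; they are not the real library types.
type promType string
type otelKind string

const (
	promCounter   promType = "counter"
	promGauge     promType = "gauge"
	promHistogram promType = "histogram"
	promSummary   promType = "summary"
	promUntyped   promType = "untyped"

	otelSum       otelKind = "Sum"
	otelGauge     otelKind = "Gauge"
	otelHistogram otelKind = "Histogram"
	otelSummary   otelKind = "Summary"
)

// target describes the OpenTelemetry metric kind chosen for a metric family.
type target struct {
	Kind       otelKind
	Monotonic  bool // only meaningful for Sum
	Cumulative bool // aggregation temporality, for kinds that carry one
}

// mapType picks the OpenTelemetry kind for a Prometheus metric family type.
// Unknown types return an error instead of being silently dropped.
func mapType(t promType) (target, error) {
	switch t {
	case promCounter:
		return target{Kind: otelSum, Monotonic: true, Cumulative: true}, nil
	case promGauge, promUntyped:
		return target{Kind: otelGauge}, nil
	case promHistogram:
		return target{Kind: otelHistogram, Cumulative: true}, nil
	case promSummary:
		return target{Kind: otelSummary}, nil
	default:
		return target{}, fmt.Errorf("unsupported Prometheus metric type %q", t)
	}
}
```

Returning an error for unknown types is a design choice of this sketch; an adapter could equally log and skip such families.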
| 174 | + |
| 175 | +### OpenTelemetry Collector Receiver Implementation |
| 176 | + |
| 177 | +The `exporter-toolkit` will provide a complete implementation of OpenTelemetry's receiver interfaces: |
| 178 | + |
| 179 | +1. **component.Factory** - for component type and default configuration |
| 180 | +2. **component.Component** - for lifecycle management |
| 181 | +3. **receiver.Factory** - for creating receiver instances |
| 182 | +4. **receiver.Metrics** - for producing pmetric data |
| 183 | + |
| 184 | +This implementation will: |
| 185 | +- Start the Prometheus exporter and obtain its Registry |
| 186 | +- Run a periodic scrape loop based on the configured interval |
| 187 | +- Convert scraped metrics to pmetric format |
| 188 | +- Push metrics to the OpenTelemetry pipeline consumer |
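The lifecycle described above (start, periodic scrape, push, shutdown) might look roughly like the sketch below. It uses only the standard library: `sample`, `gatherer`, `consumer`, `countingConsumer`, and `scrapeReceiver` are hypothetical stand-ins for the converted pmetric data, the Prometheus Registry, and the pipeline consumer, so none of these names should be read as the eventual `exporter-toolkit` API.

```go
package main

import (
	"context"
	"sync"
	"time"
)

// sample stands in for converted metric data (pmetric.Metrics in a real receiver).
type sample struct {
	Name  string
	Value float64
}

// gatherer stands in for the registry's Gather step plus conversion.
type gatherer interface {
	Gather() ([]sample, error)
}

// gatherFunc adapts a plain function to the gatherer interface.
type gatherFunc func() ([]sample, error)

func (f gatherFunc) Gather() ([]sample, error) { return f() }

// consumer stands in for the next component in the metrics pipeline.
type consumer interface {
	ConsumeMetrics(ctx context.Context, batch []sample) error
}

// countingConsumer is a toy consumer that only counts received batches.
type countingConsumer struct {
	mu      sync.Mutex
	batches int
}

func (c *countingConsumer) ConsumeMetrics(_ context.Context, _ []sample) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.batches++
	return nil
}

// Batches reports how many batches have been consumed so far.
func (c *countingConsumer) Batches() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.batches
}

// scrapeReceiver owns the periodic scrape loop. interval must be positive.
type scrapeReceiver struct {
	interval time.Duration
	source   gatherer
	next     consumer
	cancel   context.CancelFunc
	done     chan struct{}
}

func newScrapeReceiver(interval time.Duration, g gatherer, c consumer) *scrapeReceiver {
	return &scrapeReceiver{interval: interval, source: g, next: c}
}

// Start launches the loop in the background. The loop runs on its own context
// so that it outlives the (possibly short-lived) context passed to Start.
func (r *scrapeReceiver) Start(_ context.Context) error {
	ctx, cancel := context.WithCancel(context.Background())
	r.cancel = cancel
	r.done = make(chan struct{})
	go func() {
		defer close(r.done)
		ticker := time.NewTicker(r.interval)
		defer ticker.Stop()
		for {
			r.scrapeOnce(ctx) // first scrape happens immediately
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
			}
		}
	}()
	return nil
}

// scrapeOnce gathers one batch and forwards it; a failed gather skips the cycle.
func (r *scrapeReceiver) scrapeOnce(ctx context.Context) {
	batch, err := r.source.Gather()
	if err != nil {
		return // a real receiver would log or record the failure
	}
	_ = r.next.ConsumeMetrics(ctx, batch)
}

// Shutdown stops the loop and waits for the goroutine to exit.
func (r *scrapeReceiver) Shutdown(_ context.Context) error {
	if r.cancel != nil {
		r.cancel()
		<-r.done
	}
	return nil
}
```

In this sketch a failed gather or push does not stop the loop, so one bad cycle does not end collection; whether the real receiver should surface such errors through logs or its own telemetry is left open here.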
| 189 | + |
| 190 | +### Using with OpenTelemetry Collector Builder |
| 191 | + |
| 192 | +Once a Prometheus exporter implements the new interfaces, it can be included in custom OpenTelemetry Collector distributions via OCB: |
| 193 | + |
| 194 | +```yaml |
| 195 | +# ocb-config.yaml |
| 196 | +receivers: |
| 197 | + - gomod: github.com/prometheus/node_exporter v1.x.x |
| 198 | +``` |
| 199 | + |
| 200 | +OCB will recognize the exporter as a valid receiver and include it in the built collector binary. |
| 201 | + |
| 202 | +### Example Configuration |
| 203 | + |
| 204 | +In the OpenTelemetry Collector configuration: |
| 205 | + |
| 206 | +```yaml |
| 207 | +receivers: |
| 208 | + node_exporter: |
| 209 | + scrape_interval: 30s |
| 210 | + exporter_config: |
| 211 | + # Node exporter specific configuration |
| 212 | + collectors: |
| 213 | + - cpu |
| 214 | + - diskstats |
| 215 | + - filesystem |
| 216 | + |
| 217 | +exporters: |
| 218 | + otlp: |
| 219 | + endpoint: otelcol:4317 |
| 220 | + |
| 221 | +service: |
| 222 | + pipelines: |
| 223 | + metrics: |
| 224 | + receivers: [node_exporter] |
| 225 | + exporters: [otlp] |
| 226 | +``` |
| 227 | + |
| 228 | +### Implementation Steps |
| 229 | + |
| 230 | +1. **Extend exporter-toolkit** with the new interfaces and OpenTelemetry Collector receiver implementation |
| 231 | +2. **Implement the Prometheus to pmetric converter** (potentially adapting existing code) |
| 232 | +3. **Update one reference exporter** (e.g., node_exporter) to implement the new interfaces as a proof of concept |
| 233 | +4. **Validate** with OpenTelemetry Collector Builder |
| 234 | +5. **Document** the integration pattern for other exporters |
| 235 | +6. **Gradually adopt** across other Prometheus exporters based on community interest |
| 236 | + |
| 237 | +### Migration and Compatibility |
| 238 | + |
| 239 | +* **No breaking changes** to existing Prometheus exporters that don't adopt the interfaces |
| 240 | +* **Opt-in adoption** - exporters can choose when/if to implement embedding support |
| 241 | +* **Backward compatibility** - embedded exporters still work as standalone exporters |
| 242 | + |
| 243 | + |
| 244 | +### Known Problems |
| 245 | +
|
| 246 | +1. **Dependency conflicts**: Prometheus exporters and OpenTelemetry collector-contrib use different versions of shared dependencies. Building a distribution with both may require dependency alignment or `replace` directives. |
| 247 | + |
| 248 | +2. **Scope of adoption**: It's unclear how many Prometheus exporters will adopt these interfaces. The proposal targets exporters in the `prometheus` and `prometheus-community` GitHub organizations initially. |
| 249 | + |
| 250 | +3. **Metric semantics**: Subtle differences in how Prometheus and OpenTelemetry handle certain metric types may require careful mapping. |
| 251 | + |
| 252 | +## Alternatives |
| 253 | + |
| 254 | +### 1. Continue parallel implementation |
| 255 | + |
| 256 | +Continue the current approach, in which the OpenTelemetry community reimplements Prometheus exporter functionality. |
| 257 | + |
| 258 | +**Rejected because**: This perpetuates the duplication of effort, maintenance burden, and quality issues that prompted this proposal. |
| 259 | + |
| 260 | +### 2. Separate process managed by collector |
| 261 | + |
| 262 | +Start Prometheus exporters as separate processes managed by the OpenTelemetry Collector, the OpAMP supervisor, or a Kubernetes operator. The collector would scrape these processes' `/metrics` endpoints. |
| 263 | + |
| 264 | +**Trade-offs**: |
| 265 | +- Pros: No code changes needed to exporters; simpler dependency management |
| 266 | +- Cons: Loses the "single binary" promise; increased operational complexity; higher resource usage; more complex deployment |
| 267 | + |
| 268 | +This could serve as a complementary approach for exporters that cannot be embedded (e.g., those with complex dependencies, those requiring special privileges, or those not written in Go). |
| 269 | + |
| 270 | +### 3. Use the OpenTelemetry Prometheus receiver exclusively |
| 271 | + |
| 272 | +Rely solely on OpenTelemetry's existing Prometheus receiver that scrapes Prometheus exporters. |
| 273 | + |
| 274 | +**Rejected because**: This doesn't solve the "single binary" goal and still requires running separate exporter processes. It also doesn't address the underlying duplication problem for new infrastructure integrations. |
| 275 | + |
| 276 | +### 4. Wait for universal OTLP adoption |
| 277 | + |
| 278 | +Wait until all infrastructure components natively export OTLP metrics. |
| 279 | + |
| 280 | +**Rejected because**: This will take many years and may never fully happen. Prometheus exporters represent significant existing investment and will continue to be developed. |
| 281 | + |
| 282 | +## Action Plan |
| 283 | + |
| 284 | +The tasks to implement this proposal: |
| 285 | + |
| 286 | +* [ ] |