Description
Metric Aggregation Processor Proposal
Objectives
The objective of the metric aggregation processor is to provide OpenTelemetry (OT) Collector users the ability to use exporters to send cumulative metrics to backend systems like Prometheus.
Background
Currently, the Prometheus exporter for the OT Collector does not function as expected. The OTLP exporter in the SDK is pass-through and the Collector has no aggregation functionality, so the Collector receives delta aggregations and instantaneous metric events and exports them directly without converting them to cumulative values (#1255). In the Go SDK, delta values are aggregated by the Accumulator and the Integrator (SDK Processor), but the Collector has no similar component. Any cumulative exporter implemented in the future will encounter the same problem (a proposed exporter: #1150). A processor that maintains the state of each metric (time series) and applies incoming deltas to cumulative values solves this problem.
Requirements
The processor should convert instantaneous and delta OTLP metrics to cumulative OTLP metrics. The following table proposes a mapping from instantaneous and delta metric kinds to cumulative metric kinds based on OTel-Proto PR open-telemetry/opentelemetry-collector#168. In the table, red entries are invalid type and kind combinations; white entries correspond to OTLP metrics that are already cumulative (or gauge) and should pass through the processor; green entries correspond to OTLP metrics that the processor should aggregate.
*Grouping scalar indicates the most recent value to occur in a collection interval.
**Discussions around PR open-telemetry/opentelemetry-collector#168 are still ongoing. This table reflects the current status and will need to be updated when that status changes.
For green entries, the processor should maintain the cumulative value as state. Metric state should be identified by an OTLP metric descriptor and a label set. For each metric of a stateful kind, the processor should aggregate cumulative values for every data point by labels.
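The state-keeping requirement above can be sketched in Go as follows. This is a minimal illustration, not the Collector's API: `seriesKey`, `canonicalLabels`, `cumulativeState`, and `applyDelta` are hypothetical names, the metric descriptor is reduced to a name, and values are plain `int64` sums. Labels are sorted before being encoded so that the same label set always maps to the same state entry.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey identifies one time series. Hypothetical: a real processor
// would key on the full OTLP metric descriptor, not only the name.
type seriesKey struct {
	name   string
	labels string // canonical encoding of the label set
}

// canonicalLabels encodes a label set in sorted key order, so that two
// maps with the same contents produce the same string. The "k=v," encoding
// is a simplification and assumes keys and values contain no '=' or ','.
func canonicalLabels(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		b.WriteString(k)
		b.WriteByte('=')
		b.WriteString(labels[k])
		b.WriteByte(',')
	}
	return b.String()
}

// cumulativeState holds the running sum for every series seen so far.
type cumulativeState struct {
	sums map[seriesKey]int64
}

func newCumulativeState() *cumulativeState {
	return &cumulativeState{sums: make(map[seriesKey]int64)}
}

// applyDelta adds a delta data point to its series and returns the new
// cumulative value for that series.
func (s *cumulativeState) applyDelta(name string, labels map[string]string, delta int64) int64 {
	key := seriesKey{name: name, labels: canonicalLabels(labels)}
	s.sums[key] += delta
	return s.sums[key]
}

func main() {
	state := newCumulativeState()
	get := map[string]string{"method": "GET"}
	post := map[string]string{"method": "POST"}

	fmt.Println(state.applyDelta("requests", get, 5))  // 5
	fmt.Println(state.applyDelta("requests", post, 2)) // 2
	fmt.Println(state.applyDelta("requests", get, 3))  // 8
}
```

Each label set accumulates independently, which is why the third call reports 8 for `method=GET` while `method=POST` stays at 2.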
Design Idea
The metric aggregation processor should be similar to the SDK Controller; it should contain:
- Accumulator, which is the entry point of OTLP metric events and manages incremental changes of each metric
- Integrator, which takes checkpoints of the incremental changes from the Accumulator and applies them to cumulative values
- Ticker, which initiates the process of collection and sends metrics to the next metric consumer
- Custom Aggregators, which maintain the state of metrics; a unique metric is defined by the combination of a metric name, a kind, a type and a label set.
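The components listed above could fit together roughly as in the following Go sketch. It is an assumption-laden outline rather than the SDK's or the Collector's actual design: `accumulator`, `integrator`, and `runCollector` are hypothetical names, a series is reduced to a string key, and the custom aggregators are reduced to `int64` sums. The accumulator is guarded by a mutex so it can take concurrent updates, while the integrator is touched only by the collection loop.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// accumulator collects deltas between collections. Safe for concurrent use.
type accumulator struct {
	mu      sync.Mutex
	pending map[string]int64
}

func newAccumulator() *accumulator {
	return &accumulator{pending: make(map[string]int64)}
}

// record adds an incoming delta for a series.
func (a *accumulator) record(series string, delta int64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.pending[series] += delta
}

// checkpoint returns the deltas gathered since the last checkpoint and
// resets the accumulator.
func (a *accumulator) checkpoint() map[string]int64 {
	a.mu.Lock()
	defer a.mu.Unlock()
	out := a.pending
	a.pending = make(map[string]int64)
	return out
}

// integrator folds checkpoints into cumulative totals. It is not
// synchronized because only the collection loop calls it.
type integrator struct {
	totals map[string]int64
}

func newIntegrator() *integrator {
	return &integrator{totals: make(map[string]int64)}
}

// integrate applies a checkpoint and returns a copy of the cumulative totals.
func (i *integrator) integrate(deltas map[string]int64) map[string]int64 {
	for series, d := range deltas {
		i.totals[series] += d
	}
	snapshot := make(map[string]int64, len(i.totals))
	for series, v := range i.totals {
		snapshot[series] = v
	}
	return snapshot
}

// runCollector plays the Ticker role: on every tick it checkpoints the
// accumulator, integrates the result, and hands the cumulative values to
// export, which stands in for the next metric consumer.
func runCollector(acc *accumulator, integ *integrator, interval time.Duration,
	stop <-chan struct{}, export func(map[string]int64)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			export(integ.integrate(acc.checkpoint()))
		case <-stop:
			return
		}
	}
}

func main() {
	acc := newAccumulator()
	integ := newIntegrator()

	// Simulate concurrent updates arriving from the pipeline.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			acc.record("requests", 1)
		}()
	}
	wg.Wait()

	stop := make(chan struct{})
	exported := make(chan map[string]int64)
	go runCollector(acc, integ, 10*time.Millisecond, stop, func(m map[string]int64) {
		exported <- m
	})
	fmt.Println(<-exported) // first collection: map[requests:4]
	close(stop)
}
```

Swapping the pending map out under the lock keeps `checkpoint` short, so concurrent `record` calls are blocked only briefly during a collection.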
The metric aggregation processor should accept concurrent updates of the metric state by taking in OTLP metrics passed from the pipeline, and should perform collection periodically to send cumulative metrics to exporters, acting as an SDK push Controller. The following diagram illustrates the workflow.