Commit 5e033a8

fionaliao and tacole02 authored
Allow primitive OTEL delta ingestion to be enabled (#11631)
Introduces the `-distributor.otel-native-delta-ingestion` flag (and the corresponding per-tenant setting), which enables primitive OTEL delta metrics ingestion via the OTLP endpoint. This feature was implemented in Prometheus in prometheus/prometheus#16360. This PR allows Mimir users to enable it too.

As per the Prometheus PR:

> This allows otlp metrics with delta temporality to be ingested and stored as-is, with metric type unknown. To get "increase" or "rate", `sum_over_time(metric[<interval>])` (`/ <interval>`) can be used.

> This is the first step towards implementing prometheus/proposals#48. That proposal has additional suggestions around type-aware functions and making the rate() and increase() functions work for deltas too. However, there are some questions around the best way to do querying over deltas, so having this simple implementation without changing any PromQL functions allows us to get some form of delta ingestion out there and gather some feedback to decide the best way to go further.

---------

Co-authored-by: Taylor C <41653732+tacole02@users.noreply.github.com>
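
The querying approach quoted above can be illustrated with a minimal Go sketch. It is not Mimir or Prometheus code: the `sample` type and the `sumOverTime` helper are hypothetical names, and the left-open, right-closed window is an assumption about how a range selector picks samples. It only shows why summing raw delta values over a window gives the increase, and why dividing by the window length in seconds gives a per-second rate.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for one stored delta data point.
type sample struct {
	timestampMs int64
	value       float64
}

// sumOverTime adds up the values of all samples whose timestamps fall in the
// window (endMs-rangeMs, endMs]. The window bounds are an assumption made for
// this sketch.
func sumOverTime(samples []sample, endMs, rangeMs int64) float64 {
	var total float64
	for _, s := range samples {
		if s.timestampMs > endMs-rangeMs && s.timestampMs <= endMs {
			total += s.value
		}
	}
	return total
}

func main() {
	// Four delta samples, 15 seconds apart. Each value is the change since
	// the previous report, not a running total.
	samples := []sample{{15000, 5}, {30000, 10}, {45000, 5}, {60000, 10}}

	// Equivalent in spirit to sum_over_time(metric[1m]).
	increase := sumOverTime(samples, 60000, 60000)
	// Equivalent in spirit to sum_over_time(metric[1m]) / 60.
	rate := increase / 60

	fmt.Println("increase:", increase, "rate per second:", rate)
}
```

Because the stored values are already per-interval changes, no counter-reset handling is needed in this sketch, which is the main difference from how `rate()` treats cumulative counters.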
1 parent c220554 commit 5e033a8

9 files changed, +232 -0 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@
 * [FEATURE] Query-frontend: Allow use of Mimir Query Engine (MQE) via the experimental CLI flags `-query-frontend.query-engine` or `-query-frontend.enable-query-engine-fallback` or corresponding YAML. #11417 #11775
 * [FEATURE] Querier, query-frontend, ruler: Enable experimental support for duration expressions in PromQL, which are simple arithmetics on numbers in offset and range specification. #11344
 * [FEATURE] You can configure Mimir to export traces in OTLP exposition format through the standard `OTEL_` environment variables. #11618
+* [FEATURE] Distributor: Add experimental `-distributor.otel-native-delta-ingestion` option to allow primitive delta metrics ingestion via the OTLP endpoint. #11631
 * [FEATURE] distributor: Allow configuring tenant-specific HA tracker failover timeouts. #11774
 * [FEATURE] OTLP: Add experimental support for promoting OTel scope metadata (name, version, schema URL, attributes) to metric labels, prefixed with `otel_scope_`. Enable via the `-distributor.otel-promote-scope-metadata` flag. #11795
 * [ENHANCEMENT] Ingester: Display user grace interval in the tenant list obtained through the `/ingester/tenants` endpoint. #11961

cmd/mimir/config-descriptor.json

Lines changed: 11 additions & 0 deletions
@@ -5818,6 +5818,17 @@
           "fieldType": "boolean",
           "fieldCategory": "experimental"
         },
+        {
+          "kind": "field",
+          "name": "otel_native_delta_ingestion",
+          "required": false,
+          "desc": "Whether to enable native ingestion of delta OTLP metrics, which will store the raw delta sample values without conversion. If disabled, delta metrics will be rejected. Delta support is in an early stage of development. The ingestion and querying process is likely to change over time.",
+          "fieldValue": null,
+          "fieldDefaultValue": false,
+          "fieldFlag": "distributor.otel-native-delta-ingestion",
+          "fieldType": "boolean",
+          "fieldCategory": "experimental"
+        },
         {
           "kind": "field",
           "name": "ingest_storage_read_consistency",

cmd/mimir/help-all.txt.tmpl

Lines changed: 2 additions & 0 deletions
@@ -1425,6 +1425,8 @@ Usage of ./cmd/mimir/mimir:
     	[experimental] Whether to keep identifying OTel resource attributes in the target_info metric on top of converting to job and instance labels.
   -distributor.otel-metric-suffixes-enabled
     	Whether to enable automatic suffixes to names of metrics ingested through OTLP.
+  -distributor.otel-native-delta-ingestion
+    	[experimental] Whether to enable native ingestion of delta OTLP metrics, which will store the raw delta sample values without conversion. If disabled, delta metrics will be rejected. Delta support is in an early stage of development. The ingestion and querying process is likely to change over time.
   -distributor.otel-promote-resource-attributes comma-separated-list-of-strings
     	[experimental] Optionally specify OTel resource attributes to promote to labels.
   -distributor.otel-promote-scope-metadata

docs/sources/mimir/configure/about-versioning.md

Lines changed: 2 additions & 0 deletions
@@ -115,6 +115,8 @@ The following features are currently experimental:
     - `-distributor.otel-convert-histograms-to-nhcb`
   - Enable promotion of OTel scope metadata to metric labels
     - `-distributor.otel-promote-scope-metadata`
+  - Enable native ingestion of delta OTLP metrics. This means storing the raw delta sample values without converting them to cumulative values and having the metric type set to "Unknown". Delta support is in an early stage of development. The ingestion and querying process is likely to change over time. You can find considerations around querying and gotchas in the [corresponding Prometheus documentation](https://prometheus.io/docs/prometheus/3.4/feature_flags/#otlp-native-delta-support).
+    - `-distributor.otel-native-delta-ingestion`
 - Hash ring
   - Disabling ring heartbeat timeouts
     - `-distributor.ring.heartbeat-timeout=0`

docs/sources/mimir/configure/configuration-parameters/index.md

Lines changed: 7 additions & 0 deletions
@@ -4191,6 +4191,13 @@ ruler_alertmanager_client_config:
 # CLI flag: -distributor.otel-promote-scope-metadata
 [otel_promote_scope_metadata: <boolean> | default = false]

+# (experimental) Whether to enable native ingestion of delta OTLP metrics, which
+# will store the raw delta sample values without conversion. If disabled, delta
+# metrics will be rejected. Delta support is in an early stage of development.
+# The ingestion and querying process is likely to change over time.
+# CLI flag: -distributor.otel-native-delta-ingestion
+[otel_native_delta_ingestion: <boolean> | default = false]
+
 # (experimental) The default consistency level to enforce for queries when using
 # the ingest storage. Supports values: strong, eventual.
 # CLI flag: -ingest-storage.read-consistency

pkg/distributor/otel.go

Lines changed: 5 additions & 0 deletions
@@ -60,6 +60,7 @@ type OTLPHandlerLimits interface {
 	OTelKeepIdentifyingResourceAttributes(id string) bool
 	OTelConvertHistogramsToNHCB(id string) bool
 	OTelPromoteScopeMetadata(id string) bool
+	OTelNativeDeltaIngestion(id string) bool
 }

 // OTLPHandler is an http.Handler accepting OTLP write requests.
@@ -283,6 +284,7 @@ func newOTLPParser(
 		keepIdentifyingResourceAttributes := limits.OTelKeepIdentifyingResourceAttributes(tenantID)
 		convertHistogramsToNHCB := limits.OTelConvertHistogramsToNHCB(tenantID)
 		promoteScopeMetadata := limits.OTelPromoteScopeMetadata(tenantID)
+		allowDeltaTemporality := limits.OTelNativeDeltaIngestion(tenantID)

 		pushMetrics.IncOTLPRequest(tenantID)
 		pushMetrics.ObserveUncompressedBodySize(tenantID, float64(uncompressedBodySize))
@@ -299,6 +301,7 @@ func newOTLPParser(
 				convertHistogramsToNHCB:   convertHistogramsToNHCB,
 				promoteScopeMetadata:      promoteScopeMetadata,
 				promoteResourceAttributes: promoteResourceAttributes,
+				allowDeltaTemporality:     allowDeltaTemporality,
 			},
 			spanLogger,
 		)
@@ -509,6 +512,7 @@ type conversionOptions struct {
 	convertHistogramsToNHCB   bool
 	promoteScopeMetadata      bool
 	promoteResourceAttributes []string
+	allowDeltaTemporality     bool
 }

 func otelMetricsToTimeseries(
@@ -526,6 +530,7 @@ func otelMetricsToTimeseries(
 		KeepIdentifyingResourceAttributes: opts.keepIdentifyingResourceAttributes,
 		ConvertHistogramsToNHCB:           opts.convertHistogramsToNHCB,
 		PromoteScopeMetadata:              opts.promoteScopeMetadata,
+		AllowDeltaTemporality:             opts.allowDeltaTemporality,
 	}
 	mimirTS := converter.ToTimeseries(ctx, md, settings, logger)
pkg/distributor/otel_test.go

Lines changed: 196 additions & 0 deletions
@@ -508,6 +508,202 @@ func TestConvertOTelHistograms(t *testing.T) {
 	}
 }

+func TestOTelDeltaIngestion(t *testing.T) {
+	ts := time.Unix(100, 0)
+
+	testCases := []struct {
+		name        string
+		allowDelta  bool
+		input       pmetric.Metrics
+		expected    mimirpb.TimeSeries
+		expectedErr string
+	}{
+		{
+			name:       "delta counter not allowed",
+			allowDelta: false,
+			input: func() pmetric.Metrics {
+				md := pmetric.NewMetrics()
+				rm := md.ResourceMetrics().AppendEmpty()
+				il := rm.ScopeMetrics().AppendEmpty()
+				m := il.Metrics().AppendEmpty()
+				m.SetName("test_metric")
+				sum := m.SetEmptySum()
+				sum.SetAggregationTemporality(pmetric.AggregationTemporalityDelta)
+				dp := sum.DataPoints().AppendEmpty()
+				dp.SetTimestamp(pcommon.NewTimestampFromTime(time.Now()))
+				dp.Attributes().PutStr("metric-attr", "metric value")
+				return md
+			}(),
+			expectedErr: `otlp parse error: invalid temporality and type combination for metric "test_metric"`,
+		},
+		{
+			name:       "delta counter allowed",
+			allowDelta: true,
+			input: func() pmetric.Metrics {
+				md := pmetric.NewMetrics()
+				rm := md.ResourceMetrics().AppendEmpty()
+				il := rm.ScopeMetrics().AppendEmpty()
+				m := il.Metrics().AppendEmpty()
+				m.SetName("test_metric")
+				sum := m.SetEmptySum()
+				sum.SetAggregationTemporality(pmetric.AggregationTemporalityDelta)
+				dp := sum.DataPoints().AppendEmpty()
+				dp.SetTimestamp(pcommon.NewTimestampFromTime(ts))
+				dp.SetIntValue(5)
+				dp.Attributes().PutStr("metric-attr", "metric value")
+				return md
+			}(),
+			expected: mimirpb.TimeSeries{
+				Labels:  []mimirpb.LabelAdapter{{Name: "__name__", Value: "test_metric"}, {Name: "metric_attr", Value: "metric value"}},
+				Samples: []mimirpb.Sample{{TimestampMs: ts.UnixMilli(), Value: 5}},
+			},
+		},
+		{
+			name:       "delta exponential histogram not allowed",
+			allowDelta: false,
+			input: func() pmetric.Metrics {
+				md := pmetric.NewMetrics()
+				rm := md.ResourceMetrics().AppendEmpty()
+				il := rm.ScopeMetrics().AppendEmpty()
+				m := il.Metrics().AppendEmpty()
+				m.SetName("test_metric")
+				sum := m.SetEmptyExponentialHistogram()
+				sum.SetAggregationTemporality(pmetric.AggregationTemporalityDelta)
+				dp := sum.DataPoints().AppendEmpty()
+				dp.SetCount(1)
+				dp.SetSum(5)
+				dp.SetTimestamp(pcommon.NewTimestampFromTime(time.Now()))
+				dp.Attributes().PutStr("metric-attr", "metric value")
+				return md
+			}(),
+			expectedErr: `otlp parse error: invalid temporality and type combination for metric "test_metric"`,
+		},
+		{
+			name:       "delta exponential histogram allowed",
+			allowDelta: true,
+			input: func() pmetric.Metrics {
+				md := pmetric.NewMetrics()
+				rm := md.ResourceMetrics().AppendEmpty()
+				il := rm.ScopeMetrics().AppendEmpty()
+				m := il.Metrics().AppendEmpty()
+				m.SetName("test_metric")
+				sum := m.SetEmptyExponentialHistogram()
+				sum.SetAggregationTemporality(pmetric.AggregationTemporalityDelta)
+				dp := sum.DataPoints().AppendEmpty()
+				dp.SetCount(1)
+				dp.SetSum(5)
+				dp.SetTimestamp(pcommon.NewTimestampFromTime(ts))
+				dp.Attributes().PutStr("metric-attr", "metric value")
+				return md
+			}(),
+			expected: mimirpb.TimeSeries{
+				Labels: []mimirpb.LabelAdapter{{Name: "__name__", Value: "test_metric"}, {Name: "metric_attr", Value: "metric value"}},
+				Histograms: []mimirpb.Histogram{
+					{
+						Count:         &mimirpb.Histogram_CountInt{CountInt: 1},
+						Sum:           5,
+						Schema:        0,
+						ZeroThreshold: 1e-128,
+						ZeroCount:     &mimirpb.Histogram_ZeroCountInt{ZeroCountInt: 0},
+						Timestamp:     ts.UnixMilli(),
+						ResetHint:     mimirpb.Histogram_GAUGE,
+					},
+				},
+			},
+		},
+		{
+			name:       "delta histogram as nhcb not allowed",
+			allowDelta: false,
+			input: func() pmetric.Metrics {
+				md := pmetric.NewMetrics()
+				rm := md.ResourceMetrics().AppendEmpty()
+				il := rm.ScopeMetrics().AppendEmpty()
+				m := il.Metrics().AppendEmpty()
+				m.SetName("test_metric")
+				sum := m.SetEmptyHistogram()
+				sum.SetAggregationTemporality(pmetric.AggregationTemporalityDelta)
+				dp := sum.DataPoints().AppendEmpty()
+				dp.SetCount(20)
+				dp.SetSum(30)
+				dp.BucketCounts().FromRaw([]uint64{10, 10, 0})
+				dp.ExplicitBounds().FromRaw([]float64{1, 2})
+				dp.SetTimestamp(pcommon.NewTimestampFromTime(time.Now()))
+				dp.Attributes().PutStr("metric-attr", "metric value")
+				return md
+			}(),
+			expectedErr: `otlp parse error: invalid temporality and type combination for metric "test_metric"`,
+		},
+		{
+			name:       "delta histogram as nhcb allowed",
+			allowDelta: true,
+			input: func() pmetric.Metrics {
+				md := pmetric.NewMetrics()
+				rm := md.ResourceMetrics().AppendEmpty()
+				il := rm.ScopeMetrics().AppendEmpty()
+				m := il.Metrics().AppendEmpty()
+				m.SetName("test_metric")
+				sum := m.SetEmptyHistogram()
+				sum.SetAggregationTemporality(pmetric.AggregationTemporalityDelta)
+				dp := sum.DataPoints().AppendEmpty()
+				dp.SetCount(20)
+				dp.SetSum(30)
+				dp.BucketCounts().FromRaw([]uint64{10, 10, 0})
+				dp.ExplicitBounds().FromRaw([]float64{1, 2})
+				dp.SetTimestamp(pcommon.NewTimestampFromTime(ts))
+				dp.Attributes().PutStr("metric-attr", "metric value")
+				return md
+			}(),
+			expected: mimirpb.TimeSeries{
+				Labels: []mimirpb.LabelAdapter{{Name: "__name__", Value: "test_metric"}, {Name: "metric_attr", Value: "metric value"}},
+				Histograms: []mimirpb.Histogram{
+					{
+						Count:         &mimirpb.Histogram_CountInt{CountInt: 20},
+						Sum:           30,
+						Schema:        -53,
+						ZeroThreshold: 0,
+						ZeroCount:     nil,
+						PositiveSpans: []mimirpb.BucketSpan{
+							{
+								Length: 3,
+							},
+						},
+						PositiveDeltas: []int64{10, 0, -10},
+						CustomValues:   []float64{1, 2},
+						Timestamp:      ts.UnixMilli(),
+						ResetHint:      mimirpb.Histogram_GAUGE,
+					},
+				},
+			},
+		},
+	}
+
+	for _, tc := range testCases {
+		t.Run(tc.name, func(t *testing.T) {
+			converter := newOTLPMimirConverter()
+			mimirTS, dropped, err := otelMetricsToTimeseries(
+				context.Background(),
+				converter,
+				tc.input,
+				conversionOptions{
+					convertHistogramsToNHCB: true,
+					allowDeltaTemporality:   tc.allowDelta,
+				},
+				log.NewNopLogger(),
+			)
+			if tc.expectedErr != "" {
+				require.EqualError(t, err, tc.expectedErr)
+				require.Len(t, mimirTS, 0)
+				require.Equal(t, 1, dropped)
+			} else {
+				require.NoError(t, err)
+				require.Len(t, mimirTS, 1)
+				require.Equal(t, 0, dropped)
+				require.Equal(t, tc.expected, *mimirTS[0].TimeSeries)
+			}
+		})
+	}
+}
+
 func BenchmarkOTLPHandler(b *testing.B) {
 	var samples []prompb.Sample
 	for i := 0; i < 1000; i++ {
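
In the "delta histogram as nhcb allowed" case above, the input bucket counts `{10, 10, 0}` appear in the expected output as `PositiveDeltas: []int64{10, 0, -10}`. This follows from how integer native histograms encode buckets: each entry is the difference from the previous bucket's count, with the first entry taken relative to zero. The sketch below illustrates that encoding only. `deltaEncode` is a hypothetical helper written for this note, not the function the converter actually uses.

```go
package main

import "fmt"

// deltaEncode converts absolute per-bucket counts into bucket-to-bucket
// differences. The first difference is taken against an implicit zero.
func deltaEncode(counts []uint64) []int64 {
	deltas := make([]int64, 0, len(counts))
	var prev int64
	for _, c := range counts {
		cur := int64(c)
		deltas = append(deltas, cur-prev)
		prev = cur
	}
	return deltas
}

func main() {
	// Bucket counts from the test case: two explicit bounds (1 and 2)
	// give three buckets, the last one being the overflow bucket.
	fmt.Println(deltaEncode([]uint64{10, 10, 0}))
}
```

Note that this bucket-level delta encoding is unrelated to delta temporality. It applies to cumulative histograms as well. What marks the sample as a delta in the test is the `Histogram_GAUGE` reset hint.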

pkg/distributor/push_test.go

Lines changed: 2 additions & 0 deletions
@@ -1545,6 +1545,8 @@ func (o otlpLimitsMock) OTelPromoteScopeMetadata(string) bool {
 	return false
 }

+func (o otlpLimitsMock) OTelNativeDeltaIngestion(string) bool { return false }
+
 func promToMimirHistogram(h *prompb.Histogram) mimirpb.Histogram {
 	pSpans := make([]mimirpb.BucketSpan, 0, len(h.PositiveSpans))
 	for _, span := range h.PositiveSpans {

pkg/util/validation/limits.go

Lines changed: 6 additions & 0 deletions
@@ -290,6 +290,7 @@ type Limits struct {
 	OTelKeepIdentifyingResourceAttributes bool `yaml:"otel_keep_identifying_resource_attributes" json:"otel_keep_identifying_resource_attributes" category:"experimental"`
 	OTelConvertHistogramsToNHCB bool `yaml:"otel_convert_histograms_to_nhcb" json:"otel_convert_histograms_to_nhcb" category:"experimental"`
 	OTelPromoteScopeMetadata bool `yaml:"otel_promote_scope_metadata" json:"otel_promote_scope_metadata" category:"experimental"`
+	OTelNativeDeltaIngestion bool `yaml:"otel_native_delta_ingestion" json:"otel_native_delta_ingestion" category:"experimental"`

 	// Ingest storage.
 	IngestStorageReadConsistency string `yaml:"ingest_storage_read_consistency" json:"ingest_storage_read_consistency" category:"experimental"`
@@ -337,6 +338,7 @@ func (l *Limits) RegisterFlags(f *flag.FlagSet) {
 	f.BoolVar(&l.OTelKeepIdentifyingResourceAttributes, "distributor.otel-keep-identifying-resource-attributes", false, "Whether to keep identifying OTel resource attributes in the target_info metric on top of converting to job and instance labels.")
 	f.BoolVar(&l.OTelConvertHistogramsToNHCB, "distributor.otel-convert-histograms-to-nhcb", false, "Whether to convert OTel explicit histograms into native histograms with custom buckets.")
 	f.BoolVar(&l.OTelPromoteScopeMetadata, "distributor.otel-promote-scope-metadata", false, "Whether to promote OTel scope metadata (scope name, version, schema URL, attributes) to corresponding metric labels, prefixed with otel_scope_.")
+	f.BoolVar(&l.OTelNativeDeltaIngestion, "distributor.otel-native-delta-ingestion", false, "Whether to enable native ingestion of delta OTLP metrics, which will store the raw delta sample values without conversion. If disabled, delta metrics will be rejected. Delta support is in an early stage of development. The ingestion and querying process is likely to change over time.")

 	f.Var(&l.IngestionArtificialDelay, "distributor.ingestion-artificial-delay", "Target ingestion delay to apply to all tenants. If set to a non-zero value, the distributor will artificially delay ingestion time-frame by the specified duration by computing the difference between actual ingestion and the target. There is no delay on actual ingestion of samples, it is only the response back to the client.")
 	f.IntVar(&l.IngestionArtificialDelayConditionForTenantsWithLessThanMaxSeries, "distributor.ingestion-artificial-delay-condition-for-tenants-with-less-than-max-series", 0, "Condition to select tenants for which -distributor.ingestion-artificial-delay-duration-for-tenants-with-less-than-max-series should be applied.")
@@ -1329,6 +1331,10 @@ func (o *Overrides) OTelPromoteScopeMetadata(tenantID string) bool {
 	return o.getOverridesForUser(tenantID).OTelPromoteScopeMetadata
 }

+func (o *Overrides) OTelNativeDeltaIngestion(tenantID string) bool {
+	return o.getOverridesForUser(tenantID).OTelNativeDeltaIngestion
+}
+
 // DistributorIngestionArtificialDelay returns the artificial ingestion latency for a given user.
 func (o *Overrides) DistributorIngestionArtificialDelay(tenantID string) time.Duration {
 	overrides := o.getOverridesForUser(tenantID)
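
The new `OTelNativeDeltaIngestion` accessor follows the same pattern as the other per-tenant limits in this file: look up the limits for a tenant, then read one field. The body of `getOverridesForUser` is not part of this diff, so the following Go sketch rests on an assumption: that a tenant-specific entry wins when present and the global defaults apply otherwise. All type and method names here are hypothetical simplifications.

```go
package main

import "fmt"

// limits is a hypothetical, trimmed-down version of the Limits struct
// holding only the field added in this commit.
type limits struct {
	otelNativeDeltaIngestion bool
}

// overrides is a hypothetical stand-in for the Overrides type. It assumes
// per-tenant limits take precedence over the defaults.
type overrides struct {
	defaults  limits
	perTenant map[string]limits
}

// forTenant returns the tenant-specific limits if any exist, otherwise
// the defaults.
func (o *overrides) forTenant(tenantID string) limits {
	if l, ok := o.perTenant[tenantID]; ok {
		return l
	}
	return o.defaults
}

// nativeDeltaIngestion mirrors the shape of the accessor added above.
func (o *overrides) nativeDeltaIngestion(tenantID string) bool {
	return o.forTenant(tenantID).otelNativeDeltaIngestion
}

func main() {
	o := &overrides{
		// The flag defaults to false, so delta metrics are rejected
		// unless a tenant opts in.
		defaults: limits{otelNativeDeltaIngestion: false},
		perTenant: map[string]limits{
			"tenant-a": {otelNativeDeltaIngestion: true},
		},
	}

	fmt.Println("tenant-a:", o.nativeDeltaIngestion("tenant-a"))
	fmt.Println("tenant-b:", o.nativeDeltaIngestion("tenant-b"))
}
```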
