Deduplicate identical series labels and track their memory consumption in QueryLimiter.AddSeries #13806
Conversation
pkg/distributor/distributor.go (outdated)
  result := make([]labels.Labels, 0, len(metrics))
  for _, m := range metrics {
-     if err := queryLimiter.AddSeries(m); err != nil {
+     uniqueSeriesLabels, err := queryLimiter.AddSeries(m, mimir_limiter.NoopMemoryTracker{})
This is called from the /series endpoint, where we are not tracking memory consumption, hence we use NoopMemoryTracker here.
We need to use the memory consumption tracker from the context. If the /series endpoint doesn't provide one, then we'll need to add it in for that endpoint, just like we did for the other endpoints.
Updated in f75c488
pkg/util/limiter/query_limiter.go (outdated)
type MemoryTracker interface {
    IncreaseMemoryConsumptionForLabels(labels labels.Labels) error
}

type NoopMemoryTracker struct{}
This isn't needed; we can create a MemoryConsumptionTracker with the limit set to 0 to disable enforcing a memory consumption limit.
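For illustration, a minimal sketch of that convention with hypothetical type and field names (Mimir's actual MemoryConsumptionTracker in pkg/util/limiter differs in detail):

```go
package limiter

import (
	"fmt"

	"github.com/prometheus/prometheus/model/labels"
)

// trackerSketch is a hypothetical, simplified stand-in for the real
// MemoryConsumptionTracker, shown only to illustrate why a limit of 0
// can mean "track but never reject", making a no-op tracker unnecessary.
type trackerSketch struct {
	limitBytes   uint64
	currentBytes uint64
}

func (t *trackerSketch) IncreaseMemoryConsumptionForLabels(ls labels.Labels) error {
	// Estimate the labels' footprint by summing name and value lengths.
	var size uint64
	ls.Range(func(l labels.Label) {
		size += uint64(len(l.Name) + len(l.Value))
	})
	t.currentBytes += size
	// A zero limit disables enforcement entirely.
	if t.limitBytes > 0 && t.currentBytes > t.limitBytes {
		return fmt.Errorf("query exceeded memory limit of %d bytes", t.limitBytes)
	}
	return nil
}
```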
Updated in 3107044
- uniqueSeriesBefore := len(ql.uniqueSeries)
- ql.uniqueSeries[fingerprint] = struct{}{}
- uniqueSeriesAfter := len(ql.uniqueSeries)
+ uniqueSeriesBefore := len(ql.uniqueSeries) + countConflictSeries(ql.conflictSeries)
+ var found bool
+ var existing labels.Labels
+ var newSeriesButHashCollided bool
+ if existing, found = ql.uniqueSeries[fingerprint]; !found || labels.Equal(existing, seriesLabels) {
+     if !found {
+         // This is unique new series.
+         ql.uniqueSeries[fingerprint] = seriesLabels
+         err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels)
+         if err != nil {
+             return labels.EmptyLabels(), err
+         }
+     }
+ } else {
+     // Conflicted hash is found.
+     if ql.conflictSeries == nil {
+         ql.conflictSeries = make(map[uint64][]labels.Labels)
+     }
+     l := ql.conflictSeries[fingerprint]
+     for _, prev := range l {
+         // Labels matches with previous series, return the same labels instance.
+         if labels.Equal(prev, seriesLabels) {
+             return prev, nil
+         }
+     }
+     newSeriesButHashCollided = true
+     ql.conflictSeries[fingerprint] = append(l, seriesLabels)
+     err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels)
+     if err != nil {
+         return labels.EmptyLabels(), err
+     }
+ }
I think we can simplify this a bit given hash conflicts are rare and we were previously OK with undercounting in the event of a hash conflict:
- if the hash has never been seen before: add it to uniqueSeries and return the labels passed in
- if the hash has been seen before:
  - check if the corresponding labels in uniqueSeries are the same, and if so, return the labels from uniqueSeries
  - if they're not the same (i.e. this is a conflict where two different sets of labels return the same hash), return the labels passed in
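A minimal sketch of the simplified flow described above (not the PR's final code; it assumes the surrounding query_limiter.go definitions such as QueryLimiter, MemoryTracker, and NewMaxSeriesHitLimitError, and that the fingerprint is the labels hash):

```go
func (ql *QueryLimiter) AddSeries(seriesLabels labels.Labels, tracker MemoryTracker) (labels.Labels, error) {
	// Per the later discussion, the PR tracks memory before the duplicate
	// check on purpose, so the matching decrease during iteration stays balanced.
	if err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels); err != nil {
		return labels.EmptyLabels(), err
	}

	fingerprint := seriesLabels.Hash()
	if existing, found := ql.uniqueSeries[fingerprint]; found {
		if labels.Equal(existing, seriesLabels) {
			// Duplicate: return the canonical labels so callers share one instance.
			return existing, nil
		}
		// Hash conflict: two different label sets share one hash. Conflicts are
		// rare and slight undercounting is acceptable, so return the input labels.
		return seriesLabels, nil
	}

	// First time this hash is seen: remember the labels as the canonical instance.
	ql.uniqueSeries[fingerprint] = seriesLabels
	if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
		return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
	}
	return seriesLabels, nil
}
```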
Applied as suggested 370860f
// Manually set up the collision state in the limiter
limiter.conflictSeries = make(map[uint64][]labels.Labels)
// The first series is in uniqueSeries map.
limiter.uniqueSeries[collisionHash] = series1
// If another series hash is colliding, AddSeries will add it to conflictSeries map.
limiter.conflictSeries[collisionHash] = []labels.Labels{series2}
Why do we need to do this? Couldn't we call limiter.AddSeries(series1), limiter.AddSeries(series2) etc.?
In that case we won't be testing hash collisions, will we? Nevertheless, I refactored the hash check implementation to simplify it as suggested here.
// Try stringlabels implementation (default)
// stringlabels stores data in a "data" field of type string
if aData := aVal.FieldByName("data"); aData.IsValid() && aData.Kind() == reflect.String {
Rather than doing this runtime detection of which implementation is in use, what if we used the slicelabels build tag to include a different implementation based on which build tag is enabled?
This is how we handle the different implementations in other places.
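For context, Go build tags select which file gets compiled; a minimal sketch (the file name and package are illustrative):

```go
//go:build !slicelabels && !dedupelabels

package limiter

// This file is compiled by default, i.e. for the stringlabels implementation.
// A sibling file guarded by "//go:build slicelabels" (and another by
// "//go:build dedupelabels") would hold the implementation-specific variant
// of the same helper, so no runtime detection is needed.
```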
Refactored this into separate build-tagged files in 885b3d1. However, I still use reflection to get the internal field so that we can compare whether the internal references are the same.
Force-pushed from d0b33ee to ac6c728.
if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
    return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil
Memory consumption tracked before duplicate check, causing over-counting
The IncreaseMemoryConsumptionForLabels call at line 87 occurs unconditionally BEFORE the duplicate label check at lines 94-102. The comment on lines 95-96 states this branch is "not counting up the memory consumption" for duplicates, but memory has already been counted. This causes duplicate series labels to be counted multiple times in the memory tracker, potentially causing queries to be rejected with memory limit errors even when actual unique memory usage is within limits. The memory consumption increase belongs after the duplicate check, only for newly-seen series.
This is intentional. If we tracked only the unique labels, there would be no easy way to reduce memory consumption as we later iterate over the series, since we couldn't tell which series correspond to duplicated labels.
Could you please add a comment near the IncreaseMemoryConsumptionForLabels call above explaining why we always call it, even if the series is a duplicate, and why this is important for handling hash collisions?
Alternatively, I think we can solve this by moving the wrapping of the SeriesSet in MemoryTrackingSeriesSet so that a single MemoryTrackingSeriesSet wraps the merged result from all ingesters and store-gateways, rather than separately wrapping the SeriesSet for all ingesters with a MemoryTrackingSeriesSet and then wrapping the SeriesSet for all store-gateways in a separate MemoryTrackingSeriesSet. If we do this, then we'll need to think about how we handle hash conflicts.
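For orientation, a rough sketch of what such a series-set wrapper could look like, inferred from this discussion rather than taken from the PR (the Decrease method and the exact types are assumptions):

```go
import (
	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/storage"
	"github.com/prometheus/prometheus/util/annotations"
)

// memoryTrackingSeriesSet releases the memory accounted for each series'
// labels as the wrapped set is iterated, balancing the earlier increase
// made by QueryLimiter.AddSeries.
type memoryTrackingSeriesSet struct {
	inner   storage.SeriesSet
	tracker interface {
		// Assumed counterpart to IncreaseMemoryConsumptionForLabels.
		DecreaseMemoryConsumptionForLabels(labels.Labels)
	}
}

func (s *memoryTrackingSeriesSet) Next() bool { return s.inner.Next() }

func (s *memoryTrackingSeriesSet) At() storage.Series {
	series := s.inner.At()
	s.tracker.DecreaseMemoryConsumptionForLabels(series.Labels())
	return series
}

func (s *memoryTrackingSeriesSet) Err() error                        { return s.inner.Err() }
func (s *memoryTrackingSeriesSet) Warnings() annotations.Annotations { return s.inner.Warnings() }
```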
pkg/util/limiter/query_limiter.go (outdated)
// labels and not counting up the memory consumption.
if found && labels.Equal(existing, seriesLabels) {
    // Still return error if the duplicated labels had been exceeding the limit.
    if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
[nit] You could re-use the uniqueSeriesBefore rather than call len(ql.uniqueSeries) again.
Updated in 00b9117
    aPtr := aData.Pointer()
    bPtr := bData.Pointer()
    assert.Equal(t, aPtr, bPtr, "labels should share the same data slice (dedupelabels)")
}
Is it worth adding a failure assertion to the end of each of the assertSameLabels functions, so that if another user calls these expecting a full label comparison the function will fail if the given labels have different lengths/values etc?
Replaced assert with require.
if aData.Len() > 0 && bData.Len() > 0 && aData.Len() == bData.Len() {
    aPtr := aData.Pointer()
    bPtr := bData.Pointer()
    assert.Equal(t, aPtr, bPtr, "labels should share the same data slice (dedupelabels)")
Should this be a require.Equal ?
if aSlice.Len() > 0 && bSlice.Len() > 0 && aSlice.Len() == bSlice.Len() {
    aPtr := aSlice.Pointer()
    bPtr := bSlice.Pointer()
    assert.Equal(t, aPtr, bPtr, "labels should share the same slice backing array (slicelabels)")
As above
Yes, I think require is the better choice here. Updated in 4ebb43e
if len(aStr) > 0 && len(bStr) > 0 && len(aStr) == len(bStr) {
    aPtr := unsafe.Pointer(unsafe.StringData(aStr))
    bPtr := unsafe.Pointer(unsafe.StringData(bStr))
    assert.Equal(t, aPtr, bPtr, "labels should share the same internal data pointer (stringlabels)")
As above
Yes, I think require is the better choice here. Updated in 4ebb43e
returnedSeries1Dup, err := limiter.AddSeries(series1, memoryTracker)
assert.NoError(t, err)
assertSameLabels(t, returnedSeries1Dup, series1)
assert.Equal(t, 2, limiter.uniqueSeriesCount())
Should these be require. ?
Yes, I think require is the better choice here. Updated in 4ebb43e
Hey @lamida - A few comments and questions added ... also note the CHANGELOG.md conflict.
if aData.Len() == 0 && bData.Len() == 0 {
    return
}
If we do this, then we can drop the if condition on line 24 below, and it also covers Andrew's suggestion:
if aData.Len() == 0 && bData.Len() == 0 {
    return
}
require.Equal(t, aData.Len(), bData.Len())
(similar feedback applies for other implementations for slicelabels and stringlabels)
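Putting this suggestion together, the stringlabels variant of the helper could end up looking roughly like this (a sketch, not the PR's exact code):

```go
import (
	"reflect"
	"testing"
	"unsafe"

	"github.com/prometheus/prometheus/model/labels"
	"github.com/stretchr/testify/require"
)

func requireSameLabels(t *testing.T, a, b labels.Labels) {
	// stringlabels stores all data in an unexported "data" string field;
	// reflect.Value.String works even on unexported string fields.
	aStr := reflect.ValueOf(a).FieldByName("data").String()
	bStr := reflect.ValueOf(b).FieldByName("data").String()

	if len(aStr) == 0 && len(bStr) == 0 {
		return
	}
	// Fails (and stops the test) if the labels differ in length, so callers
	// expecting a full comparison don't silently pass.
	require.Equal(t, len(aStr), len(bStr))

	// Same length and same backing pointer means the same string instance.
	aPtr := unsafe.Pointer(unsafe.StringData(aStr))
	bPtr := unsafe.Pointer(unsafe.StringData(bStr))
	require.Equal(t, aPtr, bPtr, "labels should share the same internal data pointer (stringlabels)")
}
```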
Updated in 547b4e9
pkg/util/limiter/query_limiter.go (outdated)
| "github.com/grafana/mimir/pkg/util/validation" | ||
| ) | ||
|
|
||
| type LabelsMemoryTrackerIncreaser interface { |
Is this interface still needed?
Removed in 074265a. I also removed the other memoryTracker interfaces, which I found unnecessary.
| "github.com/prometheus/prometheus/model/labels" | ||
| ) | ||
|
|
||
| func assertSameLabels(t *testing.T, a, b labels.Labels) { |
Do we need to check the value of the syms field as well?
Just removed compat_dedupelabels_test.go and compat_slicelabels_test.go as suggested in #13806 (comment)
aSlice := aVal.FieldByName("labels")
bSlice := bVal.FieldByName("labels")
Does this work? labels.Labels is a slice for slicelabels.
Given we only ever use stringlabels for Mimir now, we could also just drop this file and compat_dedupelabels_test.go - if we ever build Mimir with the other implementations, we can add these methods then.
Let's just remove those files for now 4956d88.
- return series.LabelsToSeriesSet(ms)
+ return series.NewMemoryTrackingSeriesSet(series.LabelsToSeriesSet(ms), memoryTracker)
Is this change needed? Don't /series requests always use an unlimited memory consumption tracker?
if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
    return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil
Memory consumption tracked before duplicate check causes over-counting (High Severity)
In AddSeries, tracker.IncreaseMemoryConsumptionForLabels(seriesLabels) is called at line 87 before checking if the series is a duplicate at lines 94-102. This means memory consumption is increased for every call, even when the series is a duplicate and the existing labels are returned. The comment at line 95-96 states "not counting up the memory consumption" for duplicates, but that's incorrect since memory was already tracked before the check. This causes memory to be over-counted for duplicate series, potentially causing queries to fail prematurely with memory limit errors when they shouldn't.
if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
    return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil
Memory tracked for duplicates despite comment saying otherwise (High Severity)
The tracker.IncreaseMemoryConsumptionForLabels(seriesLabels) call at line 87 is executed before the duplicate check at lines 94-102. The comment at lines 95-96 states duplicates should not count toward memory consumption, but the memory has already been tracked by that point. This causes memory to be counted for every call including duplicates, potentially triggering premature memory limit errors when the goal was to deduplicate and save memory. The memory tracking call needs to happen only for non-duplicate series.
aStr := aVal.FieldByName("data").String()
bStr := bVal.FieldByName("data").String()

require.Equal(t, len(aStr), len(aStr))
Test compares string length to itself instead of other (Low Severity)
The assertion require.Equal(t, len(aStr), len(aStr)) compares aStr length to itself rather than comparing len(aStr) to len(bStr). This appears to be a copy-paste error that causes the test to always pass regardless of whether the two label strings have the same length, making this assertion ineffective at catching length mismatches.
if ql.maxSeriesPerQuery != 0 && uniqueSeriesBefore > ql.maxSeriesPerQuery {
    return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil
Memory consumption increased before duplicate check, causing over-counting (High Severity)
In AddSeries, tracker.IncreaseMemoryConsumptionForLabels(seriesLabels) is called at line 83 before checking if the labels are duplicates at lines 90-93. For duplicates, the function returns existing labels (not seriesLabels), but memory was already increased for seriesLabels. The comment at lines 91-92 explicitly states duplicates should NOT count memory consumption, but the code contradicts this. When MemoryTrackingSeriesSet.At() later decreases memory for the returned existing labels, the increase for the unused seriesLabels is never offset. This causes memory over-counting proportional to duplicate series count, potentially causing queries to be rejected prematurely for exceeding memory limits.
aStr := aVal.FieldByName("data").String()
bStr := bVal.FieldByName("data").String()

require.Equal(t, len(aStr), len(aStr))
Test assertion compares variable to itself (Medium Severity)
The requireSameLabels test helper function has a typo at line 25: require.Equal(t, len(aStr), len(aStr)) compares len(aStr) to itself instead of comparing len(aStr) to len(bStr). This assertion always passes, which means the test will never catch cases where two labels have different lengths. Combined with the conditional check on line 27, if labels have different lengths, the function silently returns without any assertion failure, allowing incorrect deduplication behavior to go undetected.
# Conflicts:
#	CHANGELOG.md
It will be difficult, in the decrease call, to find which labels are the duplicated ones.
…bels memory consumption is decreased as the series iterated
…ll memory tracking cases
Force-pushed from 00b9117 to 3534363.
err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels)
if err != nil {
    return labels.EmptyLabels(), err
}
Memory consumption counted for duplicate series despite comment (Medium Severity)
The IncreaseMemoryConsumptionForLabels call on line 83 runs unconditionally before checking if the series is a duplicate. However, the comment on lines 91-92 explicitly states the intent is "not counting up the memory consumption" for duplicates. Since duplicate series reuse existing labels rather than allocating new ones, memory consumption is over-counted when duplicates are encountered. The IncreaseMemoryConsumptionForLabels call needs to be moved after the duplicate check to only count memory for new unique series.
Additional Locations (1)
aStr := aVal.FieldByName("data").String()
bStr := bVal.FieldByName("data").String()

require.Equal(t, len(aStr), len(aStr))
Test assertion compares variable to itself, always passes (Low Severity)
The assertion require.Equal(t, len(aStr), len(aStr)) compares len(aStr) to itself rather than comparing len(aStr) to len(bStr). This check always passes regardless of whether the two labels have the same length, making it a meaningless assertion. The requireSameLabels helper function is meant to verify two labels share the same internal data, but this broken assertion won't catch length mismatches between a and b.
if ql.maxSeriesPerQuery != 0 && uniqueSeriesBefore > ql.maxSeriesPerQuery {
    return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil
Memory tracked for duplicate labels before duplicate check (High Severity)
The IncreaseMemoryConsumptionForLabels call on line 83 happens unconditionally before checking if the labels are duplicates (lines 90-98). When a duplicate is found, the function returns the existing labels but memory was already tracked for the input labels. The comment on line 91-92 states "not counting up the memory consumption" for duplicates, but the code has already counted it. This causes memory over-counting for duplicate series, defeating the PR's optimization goal of properly tracking memory consumption for deduplicated labels. The IncreaseMemoryConsumptionForLabels call needs to be moved after the duplicate check, only for newly-seen labels.
https://github.com/grafana/mimir-squad/issues/3280
Reuse the labels when a new series uses the same labels seen before.
Benchmark

The benchmark shows that with more duplicated labels, the B/op metric decreases.
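For reference, a sketch of the kind of benchmark that demonstrates this (all names, the QueryLimiter constructor signature, and the no-op tracker are assumptions, not the PR's actual benchmark code):

```go
// noopTracker satisfies the MemoryTracker interface without enforcing a limit.
type noopTracker struct{}

func (noopTracker) IncreaseMemoryConsumptionForLabels(labels.Labels) error { return nil }

func BenchmarkAddSeriesDuplicates(b *testing.B) {
	for _, dupPercent := range []int{0, 50, 90} {
		b.Run(fmt.Sprintf("dup=%d%%", dupPercent), func(b *testing.B) {
			const total = 1000
			unique := total * (100 - dupPercent) / 100
			if unique == 0 {
				unique = 1
			}
			// Build a series list where only `unique` distinct label sets exist.
			input := make([]labels.Labels, total)
			for i := range input {
				input[i] = labels.FromStrings("__name__", "metric", "idx", strconv.Itoa(i%unique))
			}
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				limiter := NewQueryLimiter(0, 0, 0, 0, nil) // assumed signature; zero values disable the limits
				for _, s := range input {
					_, _ = limiter.AddSeries(s, noopTracker{})
				}
			}
		})
	}
}
```

With deduplication, duplicate series return the canonical labels instance instead of retaining their own copy, which is what drives B/op down as the duplicate percentage rises.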
What this PR does
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If a changelog entry is not needed, please add the changelog-not-needed label to the PR.
- about-versioning.md updated with experimental features.

Note
Introduces label de-duplication and memory tracking in query limiting and propagates it through the read path.
- Changes QueryLimiter.AddSeries to AddSeries(lbls, tracker) (labels.Labels, error), storing canonical labels per fingerprint, enforcing limits, and accounting memory via MemoryConsumptionTracker
- Propagates MemoryConsumptionTracker through contexts and callers: distributor (MetricsForLabelMatchers, QueryStream), querier (distributor and store-gateway streaming, series set wrapping), and ingester/store-gateway stream readers (now take *limiter.MemoryConsumptionTracker)
- compactor.upload-sparse-index-headers

Written by Cursor Bugbot for commit dcf8840. This will update automatically on new commits.