Conversation

@lamida
Contributor

@lamida lamida commented Dec 10, 2025

https://github.com/grafana/mimir-squad/issues/3280

Reuse the existing label instances when a newly created series uses the same labels as one seen before.

Benchmark

Details
 go test -bench='BenchmarkQueryLimiter_AddSeries_WithCallerDedup_(NoDuplicates|50pct|90pct)$' \
    -benchmem \
    -benchtime=5000x \
    -count=5 \
    -run=^$ \
    ./pkg/util/limiter/
goos: darwin
goarch: arm64
pkg: github.com/grafana/mimir/pkg/util/limiter
cpu: Apple M4 Pro
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_NoDuplicates-14    	    5000	     22913 ns/op	      21 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_NoDuplicates-14    	    5000	     22727 ns/op	      21 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_NoDuplicates-14    	    5000	     22658 ns/op	      21 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_NoDuplicates-14    	    5000	     22818 ns/op	      21 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_NoDuplicates-14    	    5000	     22935 ns/op	      21 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_90pct-14           	    5000	     25453 ns/op	       1 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_90pct-14           	    5000	     25028 ns/op	       1 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_90pct-14           	    5000	     24982 ns/op	       1 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_90pct-14           	    5000	     25298 ns/op	       1 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_90pct-14           	    5000	     25941 ns/op	       1 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_50pct-14           	    5000	     33348 ns/op	      10 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_50pct-14           	    5000	     22389 ns/op	      10 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_50pct-14           	    5000	     23021 ns/op	      10 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_50pct-14           	    5000	     22539 ns/op	      10 B/op	       0 allocs/op
BenchmarkQueryLimiter_AddSeries_WithCallerDedup_50pct-14           	    5000	     22392 ns/op	      10 B/op	       0 allocs/op
PASS
ok  	github.com/grafana/mimir/pkg/util/limiter	2.534s

The benchmark shows that B/op drops as the proportion of duplicated labels increases.

What this PR does

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

Note

Introduces label de-duplication and memory tracking in query limiting and propagates it through the read path.

  • Refactors QueryLimiter.AddSeries to AddSeries(lbls, tracker) (labels.Labels, error), storing canonical labels per fingerprint, enforcing limits, and accounting memory via MemoryConsumptionTracker
  • Threads a MemoryConsumptionTracker through contexts and callers: distributor (MetricsForLabelMatchers, QueryStream), querier (distributor and store-gateway streaming, series set wrapping), and ingester/store-gateway stream readers (now take *limiter.MemoryConsumptionTracker)
  • Callers updated to use returned canonical labels; tests adjusted to add trackers to context and assert label instance reuse; adds benchmarks for duplicate/unique series and a compatibility test for stringlabels
  • CHANGELOG: adds ENHANCEMENT for label de-dup in querier limiter and a default note for -compactor.upload-sparse-index-headers

Written by Cursor Bugbot for commit dcf8840.
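To make the new `AddSeries(lbls, tracker) (labels.Labels, error)` shape concrete, here is a minimal, self-contained sketch of the idea. Everything here is a toy stand-in (not the actual Mimir code): `Labels` is a string alias in place of `labels.Labels`, and `memoryTracker` stands in for `MemoryConsumptionTracker`. Note this sketch charges the tracker only for first-seen labels, which is one of the two behaviors debated later in the review.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// Labels is a stand-in for labels.Labels (assumption for illustration).
type Labels = string

func hash(l Labels) uint64 {
	h := fnv.New64a()
	h.Write([]byte(l))
	return h.Sum64()
}

// memoryTracker models MemoryConsumptionTracker; a zero limit means
// "track but don't enforce".
type memoryTracker struct {
	limit, used uint64
}

func (t *memoryTracker) IncreaseMemoryConsumptionForLabels(l Labels) error {
	t.used += uint64(len(l))
	if t.limit > 0 && t.used > t.limit {
		return errors.New("memory consumption limit exceeded")
	}
	return nil
}

type queryLimiter struct {
	maxSeriesPerQuery int
	uniqueSeries      map[uint64]*Labels
}

// AddSeries stores the canonical labels per fingerprint, enforces the
// series limit, and charges the tracker only for first-seen labels.
func (ql *queryLimiter) AddSeries(lbls *Labels, t *memoryTracker) (*Labels, error) {
	fp := hash(*lbls)
	if existing, ok := ql.uniqueSeries[fp]; ok && *existing == *lbls {
		return existing, nil // duplicate: reuse the canonical instance
	}
	ql.uniqueSeries[fp] = lbls
	if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
		return nil, fmt.Errorf("max series limit %d hit", ql.maxSeriesPerQuery)
	}
	if err := t.IncreaseMemoryConsumptionForLabels(*lbls); err != nil {
		return nil, err
	}
	return lbls, nil
}

func main() {
	ql := &queryLimiter{maxSeriesPerQuery: 10, uniqueSeries: map[uint64]*Labels{}}
	tr := &memoryTracker{} // limit 0: tracking without enforcement
	a, b := Labels(`{job="api"}`), Labels(`{job="api"}`)
	ra, _ := ql.AddSeries(&a, tr)
	rb, _ := ql.AddSeries(&b, tr)
	fmt.Println(ra == rb) // prints "true": the duplicate returns the same instance
}
```

Because callers replace their own copy with the returned canonical labels, duplicate copies become unreferenced and can be garbage-collected, which is where the B/op savings in the benchmark come from.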

result := make([]labels.Labels, 0, len(metrics))
for _, m := range metrics {
if err := queryLimiter.AddSeries(m); err != nil {
uniqueSeriesLabels, err := queryLimiter.AddSeries(m, mimir_limiter.NoopMemoryTracker{})
Contributor Author

This is called from the /series endpoint, where we are not tracking memory consumption, hence we use NoopMemoryTracker here.

Contributor

We need to use the memory consumption tracker from the context. If the /series endpoint doesn't provide one, then we'll need to add it in for that endpoint, just like we did for the other endpoints.

Contributor Author

Updated in f75c488

Comment on lines 19 to 23
type MemoryTracker interface {
IncreaseMemoryConsumptionForLabels(labels labels.Labels) error
}

type NoopMemoryTracker struct{}
Contributor

This isn't needed, we can create a MemoryConsumptionTracker with limit set to 0 to disable enforcing a memory consumption limit.

Contributor Author

Updated in 3107044


Comment on lines 87 to 99
uniqueSeriesBefore := len(ql.uniqueSeries)
ql.uniqueSeries[fingerprint] = struct{}{}
uniqueSeriesAfter := len(ql.uniqueSeries)
uniqueSeriesBefore := len(ql.uniqueSeries) + countConflictSeries(ql.conflictSeries)
var found bool
var existing labels.Labels
var newSeriesButHashCollided bool
if existing, found = ql.uniqueSeries[fingerprint]; !found || labels.Equal(existing, seriesLabels) {
if !found {
// This is unique new series.
ql.uniqueSeries[fingerprint] = seriesLabels
err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels)
if err != nil {
return labels.EmptyLabels(), err
}
}
} else {
// Conflicted hash is found.
if ql.conflictSeries == nil {
ql.conflictSeries = make(map[uint64][]labels.Labels)
}
l := ql.conflictSeries[fingerprint]
for _, prev := range l {
// Labels matches with previous series, return the same labels instance.
if labels.Equal(prev, seriesLabels) {
return prev, nil
}
}
newSeriesButHashCollided = true
ql.conflictSeries[fingerprint] = append(l, seriesLabels)
err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels)
if err != nil {
return labels.EmptyLabels(), err
}
}
Contributor

I think we can simplify this a bit given hash conflicts are rare and we were previously OK with undercounting in the event of a hash conflict:

  • if the hash has never been seen before: add it to uniqueSeries and return the labels passed in
  • if the hash has been seen before:
    • check if the corresponding labels in uniqueSeries are the same, and if so, return the labels from uniqueSeries
    • if they're not the same (ie. this is a conflict where two different sets of labels return the same hash), return the labels passed in
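The suggested decision tree could be rendered roughly as follows. This is a toy sketch under stated assumptions: `Labels` is a stand-in type rather than the Prometheus one, and the hash function is deliberately collision-prone (length only) so the conflict branch can actually be exercised.

```go
package main

import "fmt"

// Labels stands in for labels.Labels (assumption for illustration).
type Labels = string

// hash is intentionally collision-prone so the example can hit the
// conflict branch; the real code uses labels.Labels.Hash().
func hash(l Labels) uint64 { return uint64(len(l)) }

type queryLimiter struct {
	uniqueSeries map[uint64]*Labels
}

// addSeries follows the suggested simplification: on a hash conflict with
// different labels, just return the input and accept a small undercount.
func (ql *queryLimiter) addSeries(lbls *Labels) *Labels {
	existing, seen := ql.uniqueSeries[hash(*lbls)]
	switch {
	case !seen:
		// Hash never seen before: remember it and return the input labels.
		ql.uniqueSeries[hash(*lbls)] = lbls
		return lbls
	case *existing == *lbls:
		// Same labels: return the canonical stored instance.
		return existing
	default:
		// Conflict: same hash, different labels. Return the input.
		return lbls
	}
}

func main() {
	ql := &queryLimiter{uniqueSeries: map[uint64]*Labels{}}
	a, b, c := Labels("job=api"), Labels("job=api"), Labels("job=web")
	fmt.Println(ql.addSeries(&a) == &a) // prints "true": first seen
	fmt.Println(ql.addSeries(&b) == &a) // prints "true": duplicate reuses a's instance
	fmt.Println(ql.addSeries(&c) == &c) // prints "true": conflict returns the input
}
```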

Contributor Author

Applied as suggested 370860f

Comment on lines 163 to 168
// Manually set up the collision state in the limiter
limiter.conflictSeries = make(map[uint64][]labels.Labels)
// The first series is in uniqueSeries map.
limiter.uniqueSeries[collisionHash] = series1
// If another series hash is colliding, AddSeries will add it to conflictSeries map.
limiter.conflictSeries[collisionHash] = []labels.Labels{series2}
Contributor

Why do we need to do this? Couldn't we call limiter.AddSeries(series1), limiter.AddSeries(series2) etc.?

Contributor Author

In that case we wouldn't be testing a hash collision, would we? Nevertheless, I refactored the hash-check implementation to simplify it as suggested here.

Comment on lines 410 to 412
// Try stringlabels implementation (default)
// stringlabels stores data in a "data" field of type string
if aData := aVal.FieldByName("data"); aData.IsValid() && aData.Kind() == reflect.String {
Contributor

Rather than doing this runtime detection of which implementation is in use, what if we used the slicelabels build tag to include a different implementation based on which build tag is enabled?

This is how we handle the different implementations in other places.
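For illustration, a build-tag split might look like the sketch below. The file names and the exact tag expressions are assumptions for this example, not the actual layout in the repository; the idea is simply that each labels implementation gets its own file selected by build constraints.

```go
// requiresamelabels_stringlabels_test.go — compiled for the default build
//go:build !slicelabels && !dedupelabels

// requiresamelabels_slicelabels_test.go — compiled only with -tags slicelabels
//go:build slicelabels
```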

Contributor Author

Refactored this into separate build-tagged files in 885b3d1. However, I still use reflection to get the internal field so that we can compare whether the internal references are identical.

@lamida lamida force-pushed the lamida/dedupSeries branch from d0b33ee to ac6c728 Compare December 22, 2025 00:59
@lamida lamida marked this pull request as ready for review December 22, 2025 15:22
@lamida lamida requested a review from a team as a code owner December 22, 2025 15:22
if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil

Memory consumption tracked before duplicate check, causing over-counting

The IncreaseMemoryConsumptionForLabels call at line 87 occurs unconditionally BEFORE the duplicate label check at lines 94-102. The comment on lines 95-96 states this branch is "not counting up the memory consumption" for duplicates, but memory has already been counted. This causes duplicate series labels to be counted multiple times in the memory tracker, potentially causing queries to be rejected with memory limit errors even when actual unique memory usage is within limits. The memory consumption increase belongs after the duplicate check, only for newly-seen series.


Contributor Author

This is intentional. If we tracked memory only for the unique labels, there would be no easy way to decrease memory consumption later: as we iterate over the series during processing, we can't tell which ones correspond to those unique labels.

Contributor

Could you please add a comment near the IncreaseMemoryConsumptionForLabels call above explaining why we always call it, even if the series is a duplicate, and why this is important for handling hash collisions?

Alternatively, I think we can solve this by moving the wrapping of the SeriesSet in MemoryTrackingSeriesSet so that a single MemoryTrackingSeriesSet wraps the merged result from all ingesters and store-gateways, rather than separately wrapping the SeriesSet for all ingesters with a MemoryTrackingSeriesSet and then wrapping the SeriesSet for all store-gateways in a separate MemoryTrackingSeriesSet. If we do this, then we'll need to think about how we handle hash conflicts.

// labels and not counting up the memory consumption.
if found && labels.Equal(existing, seriesLabels) {
// Still return error if the duplicated labels had been exceeding the limit.
if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
Contributor

[nit] You could re-use uniqueSeriesBefore rather than calling len(ql.uniqueSeries) again.

Contributor Author

Updated in 00b9117

aPtr := aData.Pointer()
bPtr := bData.Pointer()
assert.Equal(t, aPtr, bPtr, "labels should share the same data slice (dedupelabels)")
}
Contributor

Is it worth adding a failure assertion to the end of each of the assertSameLabels functions, so that if another user calls these expecting a full label comparison, the function fails when the given labels have different lengths/values, etc.?

Contributor Author

Replaced assert with require.

if aData.Len() > 0 && bData.Len() > 0 && aData.Len() == bData.Len() {
aPtr := aData.Pointer()
bPtr := bData.Pointer()
assert.Equal(t, aPtr, bPtr, "labels should share the same data slice (dedupelabels)")
Contributor

Should this be a require.Equal ?

if aSlice.Len() > 0 && bSlice.Len() > 0 && aSlice.Len() == bSlice.Len() {
aPtr := aSlice.Pointer()
bPtr := bSlice.Pointer()
assert.Equal(t, aPtr, bPtr, "labels should share the same slice backing array (slicelabels)")
Contributor

As above

Contributor Author

Yes I think for this require is the better use. Updated in 4ebb43e

if len(aStr) > 0 && len(bStr) > 0 && len(aStr) == len(bStr) {
aPtr := unsafe.Pointer(unsafe.StringData(aStr))
bPtr := unsafe.Pointer(unsafe.StringData(bStr))
assert.Equal(t, aPtr, bPtr, "labels should share the same internal data pointer (stringlabels)")
Contributor

As above

Contributor Author

Yes I think for this require is the better use. Updated in 4ebb43e

returnedSeries1Dup, err := limiter.AddSeries(series1, memoryTracker)
assert.NoError(t, err)
assertSameLabels(t, returnedSeries1Dup, series1)
assert.Equal(t, 2, limiter.uniqueSeriesCount())
Contributor

Should these be require. ?

Contributor Author

Yes I think for this require is the better use. Updated in 4ebb43e

Contributor

@tcp13equals2 tcp13equals2 left a comment

Hey @lamida - A few comments and questions added ... also note the CHANGELOG.md conflict.

Comment on lines 21 to 23
if aData.Len() == 0 && bData.Len() == 0 {
return
}
Contributor

If we do this, then we can drop the if condition on line 24 below, and it also covers Andrew's suggestion:

Suggested change
if aData.Len() == 0 && bData.Len() == 0 {
return
}
require.Equal(t, aData.Len(), bData.Len())

(similar feedback applies for other implementations for slicelabels and stringlabels)

Contributor Author

Updated in 547b4e9

"github.com/grafana/mimir/pkg/util/validation"
)

type LabelsMemoryTrackerIncreaser interface {
Contributor

Is this interface still needed?

Contributor Author

Removed in 074265a. I also removed the other memoryTracker interfaces, which I found unnecessary.

"github.com/prometheus/prometheus/model/labels"
)

func assertSameLabels(t *testing.T, a, b labels.Labels) {
Contributor

Do we need to check the value of the syms field as well?

Contributor Author

Just removed compat_dedupelabels_test.go and compat_slicelabels_test.go as suggested in #13806 (comment)

Comment on lines 19 to 20
aSlice := aVal.FieldByName("labels")
bSlice := bVal.FieldByName("labels")
Contributor

Does this work? labels.Labels is a slice for slicelabels.

Given we only ever use stringlabels for Mimir now, we could also just drop this file and compat_dedupelabels_test.go - if we ever build Mimir with the other implementations, we can add these methods then.

Contributor Author

Let's just remove those files for now 4956d88.

Comment on lines -123 to +127
return series.LabelsToSeriesSet(ms)
return series.NewMemoryTrackingSeriesSet(series.LabelsToSeriesSet(ms), memoryTracker)
Contributor

Is this change needed? Don't /series requests always use an unlimited memory consumption tracker?


if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil

Memory consumption tracked before duplicate check causes over-counting

High Severity

In AddSeries, tracker.IncreaseMemoryConsumptionForLabels(seriesLabels) is called at line 87 before checking if the series is a duplicate at lines 94-102. This means memory consumption is increased for every call, even when the series is a duplicate and the existing labels are returned. The comment at line 95-96 states "not counting up the memory consumption" for duplicates, but that's incorrect since memory was already tracked before the check. This causes memory to be over-counted for duplicate series, potentially causing queries to fail prematurely with memory limit errors when they shouldn't.


if ql.maxSeriesPerQuery != 0 && len(ql.uniqueSeries) > ql.maxSeriesPerQuery {
return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil

Memory tracked for duplicates despite comment saying otherwise

High Severity

The tracker.IncreaseMemoryConsumptionForLabels(seriesLabels) call at line 87 is executed before the duplicate check at lines 94-102. The comment at lines 95-96 states duplicates should not count toward memory consumption, but the memory has already been tracked by that point. This causes memory to be counted for every call including duplicates, potentially triggering premature memory limit errors when the goal was to deduplicate and save memory. The memory tracking call needs to happen only for non-duplicate series.


aStr := aVal.FieldByName("data").String()
bStr := bVal.FieldByName("data").String()

require.Equal(t, len(aStr), len(aStr))

Test compares string length to itself instead of other

Low Severity

The assertion require.Equal(t, len(aStr), len(aStr)) compares aStr length to itself rather than comparing len(aStr) to len(bStr). This appears to be a copy-paste error that causes the test to always pass regardless of whether the two label strings have the same length, making this assertion ineffective at catching length mismatches.


if ql.maxSeriesPerQuery != 0 && uniqueSeriesBefore > ql.maxSeriesPerQuery {
return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil

Memory consumption increased before duplicate check, causing over-counting

High Severity

In AddSeries, tracker.IncreaseMemoryConsumptionForLabels(seriesLabels) is called at line 83 before checking if the labels are duplicates at lines 90-93. For duplicates, the function returns existing labels (not seriesLabels), but memory was already increased for seriesLabels. The comment at lines 91-92 explicitly states duplicates should NOT count memory consumption, but the code contradicts this. When MemoryTrackingSeriesSet.At() later decreases memory for the returned existing labels, the increase for the unused seriesLabels is never offset. This causes memory over-counting proportional to duplicate series count, potentially causing queries to be rejected prematurely for exceeding memory limits.


aStr := aVal.FieldByName("data").String()
bStr := bVal.FieldByName("data").String()

require.Equal(t, len(aStr), len(aStr))

Test assertion compares variable to itself

Medium Severity

The requireSameLabels test helper function has a typo at line 25: require.Equal(t, len(aStr), len(aStr)) compares len(aStr) to itself instead of comparing len(aStr) to len(bStr). This assertion always passes, which means the test will never catch cases where two labels have different lengths. Combined with the conditional check on line 27, if labels have different lengths, the function silently returns without any assertion failure, allowing incorrect deduplication behavior to go undetected.


@lamida lamida force-pushed the lamida/dedupSeries branch from 00b9117 to 3534363 Compare January 7, 2026 10:41
err := tracker.IncreaseMemoryConsumptionForLabels(seriesLabels)
if err != nil {
return labels.EmptyLabels(), err
}

Memory consumption counted for duplicate series despite comment

Medium Severity

The IncreaseMemoryConsumptionForLabels call on line 83 runs unconditionally before checking if the series is a duplicate. However, the comment on lines 91-92 explicitly states the intent is "not counting up the memory consumption" for duplicates. Since duplicate series reuse existing labels rather than allocating new ones, memory consumption is over-counted when duplicates are encountered. The IncreaseMemoryConsumptionForLabels call needs to be moved after the duplicate check to only count memory for new unique series.


aStr := aVal.FieldByName("data").String()
bStr := bVal.FieldByName("data").String()

require.Equal(t, len(aStr), len(aStr))

Test assertion compares variable to itself, always passes

Low Severity

The assertion require.Equal(t, len(aStr), len(aStr)) compares len(aStr) to itself rather than comparing len(aStr) to len(bStr). This check always passes regardless of whether the two labels have the same length, making it a meaningless assertion. The requireSameLabels helper function is meant to verify two labels share the same internal data, but this broken assertion won't catch length mismatches between a and b.


if ql.maxSeriesPerQuery != 0 && uniqueSeriesBefore > ql.maxSeriesPerQuery {
return labels.EmptyLabels(), NewMaxSeriesHitLimitError(uint64(ql.maxSeriesPerQuery))
}
return existing, nil

Memory tracked for duplicate labels before duplicate check

High Severity

The IncreaseMemoryConsumptionForLabels call on line 83 happens unconditionally before checking if the labels are duplicates (lines 90-98). When a duplicate is found, the function returns the existing labels but memory was already tracked for the input labels. The comment on line 91-92 states "not counting up the memory consumption" for duplicates, but the code has already counted it. This causes memory over-counting for duplicate series, defeating the PR's optimization goal of properly tracking memory consumption for deduplicated labels. The IncreaseMemoryConsumptionForLabels call needs to be moved after the duplicate check, only for newly-seen labels.

