@colega colega commented Dec 10, 2025

What this PR does

Builds a validationConfig struct that is populated once per request, instead of calling the validation.Overrides methods with the user ID repeatedly for every sample, series, and metadata entry.

We've seen that we're doing millions of these calls per distributor. @seizethedave made a huge improvement in grafana/dskit#843 that made the underlying GetConfig call cheaper; however, I wanted to go deeper and avoid the atomic call and the map lookup altogether.
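
To illustrate the idea, here is a minimal sketch, not the PR's actual code. `limitsSource`, `tenantLimits`, the `label` type, and the two limit fields are hypothetical stand-ins for `validation.Overrides` and the real limits; only the names `validationConfig`, `newValidationConfig`, and `validateLabels` come from the PR, and their signatures here are assumptions.

```go
package main

import "fmt"

// tenantLimits is a hypothetical, reduced set of per-tenant limits.
type tenantLimits struct {
	maxLabelNameLength  int
	maxLabelValueLength int
}

// limitsSource is a hypothetical stand-in for validation.Overrides:
// every method call resolves the tenant's limits again.
type limitsSource struct {
	perTenant map[string]tenantLimits
	lookups   int // counts per-tenant lookups so the saving is visible
}

func (s *limitsSource) MaxLabelNameLength(userID string) int {
	s.lookups++
	return s.perTenant[userID].maxLabelNameLength
}

func (s *limitsSource) MaxLabelValueLength(userID string) int {
	s.lookups++
	return s.perTenant[userID].maxLabelValueLength
}

// validationConfig holds limits that were resolved once for a request.
type validationConfig struct {
	maxLabelNameLength  int
	maxLabelValueLength int
}

// newValidationConfig performs all per-tenant lookups up front.
func newValidationConfig(userID string, src *limitsSource) validationConfig {
	return validationConfig{
		maxLabelNameLength:  src.MaxLabelNameLength(userID),
		maxLabelValueLength: src.MaxLabelValueLength(userID),
	}
}

type label struct{ name, value string }

// validateLabels reads plain struct fields; it never touches limitsSource.
func validateLabels(cfg validationConfig, labels []label) error {
	for _, l := range labels {
		if len(l.name) > cfg.maxLabelNameLength {
			return fmt.Errorf("label name %q too long", l.name)
		}
		if len(l.value) > cfg.maxLabelValueLength {
			return fmt.Errorf("label value too long for label %q", l.name)
		}
	}
	return nil
}

func main() {
	src := &limitsSource{perTenant: map[string]tenantLimits{
		"tenant-a": {maxLabelNameLength: 8, maxLabelValueLength: 16},
	}}

	// Built once per request, then reused for every series in it.
	cfg := newValidationConfig("tenant-a", src)
	series := [][]label{
		{{"job", "api"}},
		{{"instance", "host-1"}},
		{{"a_very_long_label_name", "x"}},
	}
	for _, s := range series {
		fmt.Println(validateLabels(cfg, s))
	}
	fmt.Println("lookups:", src.lookups)
}
```

In this sketch the lookup counter stays at 2 no matter how many series the request carries, because the validators only read fields of the pre-built struct.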

Which issue(s) this PR fixes or relates to

Fixes N/A

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

Note

Build and reuse a per-request validation config to avoid repeated overrides lookups, refactoring distributor validation paths and tests accordingly.

  • Performance/Distributor:
    • Build validationConfig once per request via newValidationConfig(userID, overrides) and reuse across series/metadata validation.
    • Refactor validation flow to pass config through validateSeries → validateSamples/validateHistograms/validateLabels/cleanAndValidateMetadata.
  • Refactor:
    • Replace interface-based configs with concrete structs: sampleValidationConfig, labelValidationConfig, metadataValidationConfig using pre-resolved fields.
    • Update internal validators (validateSample, validateSampleHistogram, validateLabels, cleanAndValidateMetadata) to read fields directly from struct.
  • Tests/Benchmarks:
    • Update tests/benchmarks to create and pass validationConfig.
    • Add TestNewValidationConfigFieldCompleteness to ensure all validationConfig fields are populated.
  • Changelog:
    • Add enhancement: Distributor performance improvement for config retrieval in validation middleware.

Written by Cursor Bugbot for commit 8e46084. This will update automatically on new commits.

@colega colega marked this pull request as ready for review December 10, 2025 17:20
@colega colega requested a review from a team as a code owner December 10, 2025 17:20

@tacole02 tacole02 left a comment

Changelog LGTM

Signed-off-by: Oleg Zaytsev <[email protected]>
@colega colega changed the title from "Retrieve validation config once per request, not per sample" to "distributor: retrieve validation config once per request, not per sample" Dec 11, 2025

@jesusvazquez jesusvazquez left a comment

LGTM

}

// newValidationConfig builds a validationConfig based on the passed overrides.
// TODO: This could still be more efficient, as each overrides call performs an atomic pointer retrieval and a lookup in a map,
Contributor

What are your thoughts on possibly making this more efficient in the future? Are you thinking about a low TTL cache to avoid the map calls? I was thinking about this but it could get tricky with invalidations if one of the fields changes.

Contributor Author

I was thinking about adding an Overrides.Visit(userID string, func(limits *Limits)) that would retrieve the *Limits from runtime config once, and call the callback which would be used to fill our config.

However, some configs have some extra logic in Overrides, and letting this code access fields of *Limits seems like leaking responsibilities.

It would make sense to refactor the entire Overrides so methods would be on the Limits struct, not on the overrides itself, and Overrides would delegate the calls to *Limits, then we use that here.

Anyway, this is good enough, let's revisit in 2032.
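
For readers following the thread, the Visit idea described above could look roughly like the sketch below. Nothing here exists in Mimir or dskit as written: the `Overrides` and `Limits` types are reduced hypothetical versions, the atomic pointer to a per-tenant map is an assumption about how runtime config is held, and the callback parameter is named `fn` for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Limits is a hypothetical, reduced version of the per-tenant limits.
type Limits struct {
	MaxLabelNameLength  int
	MaxLabelValueLength int
}

// Overrides is a hypothetical stand-in: defaults plus a runtime-reloadable
// per-tenant map held behind an atomic pointer.
type Overrides struct {
	defaults  *Limits
	perTenant atomic.Pointer[map[string]*Limits]
}

func NewOverrides(defaults *Limits, perTenant map[string]*Limits) *Overrides {
	o := &Overrides{defaults: defaults}
	o.perTenant.Store(&perTenant)
	return o
}

// Visit resolves the tenant's limits with one atomic load and one map
// lookup, then hands them to the callback.
func (o *Overrides) Visit(userID string, fn func(limits *Limits)) {
	limits := o.defaults
	if m := o.perTenant.Load(); m != nil {
		if l, ok := (*m)[userID]; ok {
			limits = l
		}
	}
	fn(limits)
}

type validationConfig struct {
	maxLabelNameLength  int
	maxLabelValueLength int
}

// newValidationConfig fills every field from a single Visit call.
func newValidationConfig(userID string, o *Overrides) validationConfig {
	var cfg validationConfig
	o.Visit(userID, func(l *Limits) {
		cfg.maxLabelNameLength = l.MaxLabelNameLength
		cfg.maxLabelValueLength = l.MaxLabelValueLength
	})
	return cfg
}

func main() {
	o := NewOverrides(
		&Limits{MaxLabelNameLength: 1024, MaxLabelValueLength: 2048},
		map[string]*Limits{"tenant-a": {MaxLabelNameLength: 64, MaxLabelValueLength: 128}},
	)
	fmt.Printf("%+v\n", newValidationConfig("tenant-a", o))
	fmt.Printf("%+v\n", newValidationConfig("tenant-b", o))
}
```

The sketch also makes the stated concern concrete: the callback reads `*Limits` fields directly, so any extra logic that currently lives in Overrides methods would be bypassed unless it moved onto `Limits` itself.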

3 participants