Add Extended range selectors proposal by roidelapluie · Pull Request #52 · prometheus/proposals

roidelapluie · 2025-05-02T10:36:44Z

No description provided.

free · 2025-05-02T15:04:00Z

proposals/2025-04-04_extended-range-selectors-semantics.md

+compromise aggregability. Instead, by duplicating the first data point at the
+beginning of the range, we maintain accuracy without introducing artificial data calculations.
+
+For queries at the current time ("now"), we also deliberately avoid extrapolation at the end of the range. While this **will always** show a slight drop for metrics with constant rates and "smoothed", this accurately reflects that the data is incomplete at the range boundary. In future iterations, we could introduce a special annotation to indicate this boundary condition if users find it problematic in practice.


smoothed works fine in the middle of a time series (i.e. for historical queries). But the fact that it will produce different results in a recording rule vs later on (beyond the usual race conditions that may lead to a couple of missing samples every now and then at the time of rule evaluation) means that it is really only useful for charts and such.

I suppose that's fine, since it's an optional modifier, but I would consider strongly discouraging its use in rules. E.g. if you scrape instance="A" with an offset of 1 second and instance="B" with an offset of 14 seconds (on a scrape interval of 15 seconds), then increase(foo{instance="A"}) will be consistently ~30% higher than increase(foo{instance="B"}) for a foo that increases at the same rate on both instances.

Also, metric smoothed won't even produce an output in real time, since there's never a next sample at the time of evaluation.

Yes, you are correct. In fact, rules would need a offset $scrape_interval to work as correctly as possible, which is something we already see at some locations.

But interpolating would also mean that if we have no new samples and the target is down, we would again extrapolate "wrongly" and invent data that will never happen.

In the same way, I did not want to add heuristics around staleness marker because: 1. They do not mean that counter has stopped, just that we can't get data. 2. Push use cases (like otel or other monitoring systems) will not produce staleness markers.

Leaving aside the fact that it shifts the resulting timestamps, AFAICT the offset approach is only a best-effort hack. In order for it to work reliably it would have to be something like offset <lookback_window>. An offset <scrape_interval> (or a multiple thereof; also a bit of a hack, BTW, because you'd be hardcoding the scrape interval in your query) breaks if you miss a scrape (or more). Whereas, even with a missing sample(s), the metric did actually experience an increase. So not producing an output under those conditions is very much incorrect behavior.

As for interpolation / extrapolation, I agree with you that they only make up misleading data. E.g. why should it be that a query like increase(foo[1ms]) would produce a non-zero output when foo has orders of magnitude lower resolution? How can anyone tell that the underlying metric actually increased smoothly within that 1 ms? (Vs it staying flat or experiencing the full increase all at once.)

IMHO, one should embrace the fact that the data has the resolution that it does, not whitewash it for the sake of smooth lines on a chart (while at the same time, not generating weird artifacts at the edges or around step increases, as the current implementation does).

proposals/2025-04-04_extended-range-selectors-semantics.md

sathieu · 2025-05-06T12:51:43Z

Hi @roidelapluie, I fail to see how this handle the following usecase (see prometheus/prometheus#3806): trivy operator provides a metric for vulnerabilities of a given Kubernetes pod in a given namespace by severity. For a new pod with vulnerabilities, the metric appears with the namespace,pod,severity labels and a non-zero value. How to have an alert for new vulnerabilities (probably per-namespace) with the new semantics?

beorn7

Thank you very much for doing this.

I generally like it. I'm not so sure about some edge cases, but we can go forward with an implementation and see how it works in practice. (It will be an experimental feature for a while, I assume, so we can iterate on it.)

I see in your Q&A that you decided to ignore staleness markers. Maybe you are right. I'm not so sure, but see above. I would still recommend to mention the fact that we plan to ignore staleness markers more prominently (where I added comments asking what to do with staleness markers).

The big remaining question is if anchored should work if there is no sample before the range. I was originally thinking it should only return anything if there is actually a sample before the range (or precisely at the beginning of the range). After reading everything, I'm not so sure anymore. I can imagine that the use case I'm often mentioning ("Give my disk usage growth over the last month" – and then I get the growth over the last hour without noticing) will mostly affect gauges, so foo - foo offset 30d will work just fine.

Essentially, I'm willing to give it a go pretty much precisely as you propose it here.

proposals/2025-04-04_extended-range-selectors-semantics.md

roidelapluie · 2025-05-13T13:46:55Z

The big remaining question is if anchored should work if there is no sample before the range. I was originally thinking it should only return anything if there is actually a sample before the range (or precisely at the beginning of the range). After reading everything, I'm not so sure anymore. I can imagine that the use case I'm often mentioning ("Give my disk usage growth over the last month" – and then I get the growth over the last hour without noticing) will mostly affect gauges, so foo - foo offset 30d will work just fine.

If you want to know how many HTTP requests you have over 1h:

sum(increase(http_requests_total[1h] anchored))

ignoring metrics without samples before the range would drop a lot of results (all newly created labels, which could be e.g. paths. so it would completely underestimate the result.

The ideal thing is CT: we insert a 0 and so the 0 for newly created labels is reported at the beginning of the range, so we do not miss anything.

beorn7 · 2025-05-29T21:15:38Z

Thanks for addressing my comments. I'm still caught between several events at work and in my usual life, but I will finally be able to tackle my review backlog next week, which will allow me to address the open questions.

In the meantime, it would be good to get some feedback from other interested Prometheus developers. I would much prefer more approvals on this than just mine.

beorn7

A few comments are still unaddressed, see my follow-ups.

proposals/2025-04-04_extended-range-selectors-semantics.md

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

beorn7

I would like to ask for a two-word addition, but otherwise, this LGTM.

Still waiting for the addition about minimum range that you talked about.

proposals/2025-04-04_extended-range-selectors-semantics.md

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

proposals/2025-04-04_extended-range-selectors-semantics.md

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

beorn7

I'm fine with the current state. As a lot of work is already ongoing and none of the team members have raised concerns so far, I would recommend merging this rather sooner than later to get out of this "formally still not approved" state. (The design doc does not have to represent the final implementation. In fact, most non-trivial design docs don't. We can still iterate on implementation details.)

bwplotka

Thanks for this; I'm supportive, LGTM

Thanks for considering the CreatedTimestamp feature in the design!

@realschwa would like to double check additionally from the GCM standpoint?

juliusv

Just took a first look at this as well. Overall the approach seems well thought through and reasonable, good work! A few editorial comments...

proposals/2025-04-04_extended-range-selectors-semantics.md

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

roidelapluie · 2025-07-03T14:33:29Z

@juliusv I have updated the proposal based on your comments.

juliusv · 2025-07-03T16:17:38Z

Thanks, looks good now regarding my comments :) No idea if it makes sense to include the anchoring for instant vector selectors, but fine either way.

roidelapluie · 2025-07-04T08:02:44Z

Thank you all for this. Based on the three approvals, I will merge this. We will probably update this or revisit it later, based on the results we gather from the field.

ringerc · 2025-08-06T02:01:51Z

Regarding

What about influence from datapoints far outside the query range?

Are ranges always both left-anchored and right-anchored with the anchored keyword? My reading suggests so.

How far will an anchored range look back and/or forward for the anchoring point(s)? The delta lookback (staleness threshold)?

What happens if no anchoring point exists in the series? Will the behaviour modifications discussed for anchored series still apply when no anchoring point is found?

frittentheke · 2025-08-06T09:54:43Z

Am I right in reading that the observations about handling of e.g. a single missing data point (prometheus/prometheus#16891) is also covered in this proposal here?

…#13398) #### What this PR does This PR adds support for the `anchored` and `smoothed` range selector modifiers into the Mimir query engine. ie `eval instant at 60s metric[1m] anchored` or `eval instant at 60s metric[1m] smoothed` This is the equivalent implementation to [prometheus/pull/16457](prometheus/prometheus#16457 ) and the original proposal for this extension can be [found here](prometheus/proposals#52). A new CLI argument has been added to enable these extensions; ```-query-frontend.enabled-promql-extended-range-selectors=smoothed,anchored``` The `experimental_functions.go` middleware has been extended (and renamed) to also manage the tenant access to use these extensions. As part of adding support for this, the error messages when a function/aggregate/modifier is not enabled has been updated to correctly identify if the denied keyword is a function, aggregate or extended range modifier. #### Which issue(s) this PR fixes or relates to Fixes #3260 #### Checklist - [ x] Tests updated. - [ ] Documentation added. - [x ] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry is not needed, please add the `changelog-not-needed` label to the PR. - [x ] [`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md) updated with experimental features. Notes to reviewers; * there is a prometheus PR which will need to be merged and vendored into mimir - prometheus/prometheus#17479. This is not a blocker, but once merged will enable full use of upstream test files. The other note is to call attention the oddity in the use of `smoothed` ranges in rate/increase functions. Prometheus calculates the range boundary values differently when the range is used in one of these functions. In the prometheus implementation, the entire matrix is re-walked to do these different calculations. See https://github.com/prometheus/prometheus/blob/main/promql/functions.go#L128-L173 To avoid this, you will see that I pre-calculate these alternate values on the range boundary and pass these along with the step data. These are interpolated calculations and it would seem more efficient to pre-calculate these and not use them rather then to have to later re-walk the data. I also did not attempt to determine if the parent of the range vector was a rate/increase function and use this to decide which boundary point calculation to use. The reason for this was in case this range vector result was ever cached / re-used then this implementation may be confusing.  --- > [!NOTE] > Adds MQE support for experimental `smoothed` and `anchored` range selector modifiers with per-tenant enablement, updated planning/execution, and docs/flags. > > - **MQE/Execution**: > - Implement extended range selector modifiers `smoothed` and `anchored` in range and instant selectors, including extended-boundary handling and smoothed basis points for rate/increase. > - Update rate/increase/delta logic to honor modifiers; disallow with native histograms; validate modifier-function compatibility. > - Enable parser extended-range selectors in frontend/querier; clone `Anchored`/`Smoothed` in AST. > - **Planning/Protocol**: > - Propagate modifier flags through planning (core proto), raise minimum plan version to `V4`. > - Add planner errors for unsupported functions with these modifiers. > - **Middleware/Gating**: > - Replace experimental functions middleware with unified experimental features middleware to gate functions, aggregations, and extended range modifiers; improve error messages per feature type. > - **Config/CLI/Docs**: > - Add `-query-frontend.enabled-promql-extended-range-selectors` (per-tenant) with defaults, help text, config descriptor, and operations defaults. > - Update docs (`about-versioning`, configuration parameters) and CHANGELOG entries. > - **Tests/Benchmarks**: > - Add extensive unit/plan tests, testdata, and benchmark cases for new modifiers. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 49b1131. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>  --------- Co-authored-by: Taylor C <41653732+tacole02@users.noreply.github.com> Co-authored-by: Charles Korn <charles.korn@grafana.com> Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com>

roidelapluie added the proposal label May 2, 2025

This was referenced May 2, 2025

Proposal for improving rate/increase prometheus/prometheus#3806

Closed

Add anchored and smoothed to vector selectors. prometheus/prometheus#16457

Merged

free reviewed May 2, 2025

View reviewed changes

charleskorn reviewed May 4, 2025

View reviewed changes

proposals/2025-04-04_extended-range-selectors-semantics.md Show resolved Hide resolved

beorn7 requested changes May 11, 2025

View reviewed changes

beorn7 requested changes Jun 5, 2025

View reviewed changes

roidelapluie and others added 13 commits June 10, 2025 12:30

Add Extended range selectors proposal

fc26880

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Link to Proof of Concept

e6ea535

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Formating

b25ec46

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Improve rate issue

08731a8

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Make path consistent

5dbf737

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Add note about recording rules

165de73

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Remove smoothed and instant selectors

71f5548

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

Update proposals/2025-04-04_extended-range-selectors-semantics.md

edf2fbd

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Update proposals/2025-04-04_extended-range-selectors-semantics.md

ac21c9e

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Update proposals/2025-04-04_extended-range-selectors-semantics.md

79b412f

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Update proposals/2025-04-04_extended-range-selectors-semantics.md

3bea948

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Update proposals/2025-04-04_extended-range-selectors-semantics.md

0366017

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Apply feedback

65df650

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

roidelapluie force-pushed the extended-range-selectors branch from f93cfae to 65df650 Compare June 10, 2025 10:31

wilfriedroset mentioned this pull request Jun 16, 2025

New proposal: Add support for x-functions in MQE grafana/mimir#10305

Closed

4 tasks

beorn7 mentioned this pull request Jun 17, 2025

PromQL: Warn about rate and similar calculations that fail for a lack of samples covered by the used range. prometheus/prometheus#14818

Open

beorn7 requested changes Jun 18, 2025

View reviewed changes

proposals/2025-04-04_extended-range-selectors-semantics.md Outdated Show resolved Hide resolved

roidelapluie and others added 2 commits June 18, 2025 15:53

Update proposals/2025-04-04_extended-range-selectors-semantics.md

6ec0739

Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>

Add a note about range interval shorter than step for smoothed

18a64c9

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

beorn7 reviewed Jun 18, 2025

View reviewed changes

proposals/2025-04-04_extended-range-selectors-semantics.md Outdated Show resolved Hide resolved

Remove note about step.

7f65192

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

This was referenced Jun 27, 2025

promql: implement integral() function prometheus/prometheus#14440

Open

Add step(), min() and max() in promql duration expressions prometheus/prometheus#16777

Merged

beorn7 approved these changes Jul 1, 2025

View reviewed changes

bwplotka approved these changes Jul 2, 2025

View reviewed changes

juliusv reviewed Jul 3, 2025

View reviewed changes

xrate: Update based on comments

cda3a9c

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

roidelapluie merged commit 9f74f25 into prometheus:main Jul 4, 2025
2 checks passed

tcp13equals2 mentioned this pull request Nov 7, 2025

Add support for extended range selector modifiers smoothed & anchored grafana/mimir#13398

Merged

1 task

Comments

Conversation

roidelapluie commented May 2, 2025

Uh oh!

free May 2, 2025

Choose a reason for hiding this comment

Uh oh!

roidelapluie May 2, 2025

Choose a reason for hiding this comment

Uh oh!

free May 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

sathieu commented May 6, 2025

Uh oh!

beorn7 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

roidelapluie commented May 13, 2025

Uh oh!

beorn7 commented May 29, 2025

Uh oh!

beorn7 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

beorn7 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

beorn7 left a comment

Choose a reason for hiding this comment

Uh oh!

bwplotka left a comment

Choose a reason for hiding this comment

Uh oh!

juliusv left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

roidelapluie commented Jul 3, 2025

Uh oh!

juliusv commented Jul 3, 2025

Uh oh!

roidelapluie commented Jul 4, 2025

Uh oh!

Uh oh!

ringerc commented Aug 6, 2025

Uh oh!

frittentheke commented Aug 6, 2025

Uh oh!

free May 5, 2025 •

edited

Loading