# Proposal: OTEL delta temporality support #48

Conversation

@fionaliao commented on Mar 27, 2025

@fionaliao force-pushed the fionaliao/delta-proposal branch from 1c3370c to 3c9ea52 on March 27, 2025 16:34
@fionaliao marked this pull request as ready for review on March 27, 2025 16:35
@fionaliao force-pushed the fionaliao/delta-proposal branch from 3c9ea52 to 8094034 on March 27, 2025 16:36
@fionaliao (author) commented:

I'm planning to make some updates to the proposal after the dev summit (the consensus was that we want to support push-based use cases and continue to explore delta support) and some additional discussion with @beorn7.

Based on these discussions, I also have a PR up for basic native delta support (behind a feature flag) in Prometheus: prometheus/prometheus#16360. It just stores the sample value at TimeUnixNano without additional labels, and for querying, we'd advise people to use sum_over_time() / interval for now. The idea is to ship this simple case first, without changing any PromQL functions, and get some feedback that could help figure out how to go forward with temporality-aware functions. I'll update the proposal with this.
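
For concreteness, the interim advice amounts to a query like the following (a sketch with an invented metric name and window, not something from the PR itself):

```go
// Hypothetical interim query for a native delta series: approximate a
// per-second rate by summing raw deltas over a window and dividing by
// the window length in seconds (5m = 300s here).
const interimDeltaRateQuery = `sum_over_time(http_requests_delta[5m]) / 300`
```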

Additional updates to make:

  • Flesh out the pros/cons of a `__temporality__` label vs `delta_` types (I am actually leaning more towards the temporality label now)
  • Add an example of a query interval smaller than the collection interval, which could mess up rate calculations
  • Add a section on the serverless/ephemeral jobs use case. This is not specific to OTEL per se, but it was the use case that kept coming up when talking about deltas with various people/users during KubeCon.

I'm out for the next week, but will apply the updates after that.

@fionaliao (author) commented:

Updates:

  • Simplified proposal - moved CT-per-sample to a possible future extension instead of embedding it within the proposal
  • Changed the proposal to have a new `__temporality__` label instead of extending `__type__` - probably better to keep the metric type concept distinct from metric temporality. This also aligns with how OTEL models it.
  • Updated the remote-write section - delta ingestion will actually be fully supported via remote write (since CT-per-sample is moved out of the main proposal for now)
  • Moved the temporary `delta_rate()` and `delta_increase()` functions suggestion to discarded alternatives - not sure this is actually necessary if we have a feature flag for temporality-aware functions anyway
  • Fleshed out the implementation plan


#### rate() calculation

In general: `(sum of the values of the second through last samples / (last sample ts - first sample ts)) * range`. We skip the value of the first sample as we do not know its interval.
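
A minimal Go sketch of this formula, with hypothetical types and names (not actual Prometheus code):

```go
// sample is a hypothetical stand-in for a Prometheus sample.
type sample struct {
	ts    float64 // timestamp in seconds
	value float64
}

func proposedRateCalc(samples []sample, rangeSeconds float64) (float64, bool) {
	if len(samples) < 2 {
		return 0, false // need at least two samples to get a time span
	}
	var sum float64
	for _, s := range samples[1:] { // skip the first sample: unknown interval
		sum += s.value
	}
	span := samples[len(samples)-1].ts - samples[0].ts
	if span <= 0 {
		return 0, false
	}
	// sum / span is the observed per-second rate; multiplying by the range
	// extrapolates over the whole window, as in the formula above.
	return sum / span * rangeSeconds, true
}
```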
@enisoc commented on Apr 17, 2025:

> We skip the value of the first sample as we do not know its interval.

Perhaps we could get utility out of the first sample's value by guessing that each sample's interval is equal to the average time between samples in the window. One motivation for this is a case that we see often with non-sparse deltas produced by an intermediate processor.

Suppose the actual instrumented app is sending delta samples at a regular 60s interval. We'll assume for simplicity that these deltas are always greater than zero. Then there is an intermediate processor that's configured to accumulate data and flush every 30s. To avoid sparseness, it's configured to flush a 0 value if nothing was seen.

The data stream will look like this, with a sample every 30s:

`5 0 2 0 10 0 8 0`

Note that every other value is 0 because of the mismatch between the flush intervals of the instrumented app and the intermediate processor.

If we then do a `rate(...[1m])` on this time series, with the current proposal, we might end up with the 1m windows broken up like this:

`5 0 | 2 0 | 10 0 | 8 0`

If we can't make use of the value from the first sample in each window, we will end up computing a rate of 0 for all of these windows. That feels like it fails to make use of all the available information, since as humans we can clearly see that the rate was not zero.

If instead we guess that each sample represents the delta for a 30s interval, because that's the average distance between the two datapoints in the window, then we will compute the correct rates. Of course it was only a guess and you could contrive a scenario that would fool the guess, but the idea would be to assume that the kind of scenario described here is more likely than those weird ones.

@fionaliao (author) replied:

A cumulative series could generate the same values (with the zeroes being where the series resets), and in that case rate() would return no results. So although this doesn't accurately capture the rate, the behaviour would be consistent for cumulative and delta metrics.

However, the first point in a window is more meaningful in the delta case - you know it's a delta from the preceding sample, while in the cumulative case you have to look outside the window to get the same information - so maybe we should do better because of that. That example leans me more towards "just do sum_over_time() / range for delta rate()" - in this case that would probably give more useful information. Or at least do that until CT-per-sample is available, at which point we'd have more accurate interval data.
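
A sketch of that simpler option, again reusing the hypothetical `sample` type: no extrapolation, every sample in the window counts in full, equivalent to `sum_over_time(...[range]) / range`:

```go
// Sum all deltas in the window, including the first sample, and divide
// by the window length rather than the observed span.
func deltaRateSumOverTime(samples []sample, rangeSeconds float64) (float64, bool) {
	if len(samples) == 0 || rangeSeconds <= 0 {
		return 0, false
	}
	var sum float64
	for _, s := range samples {
		sum += s.value
	}
	return sum / rangeSeconds, true
}
```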


Downsides:

* This will not work if there is only a single sample in the range, which is more likely with delta metrics (due to sparseness, or their use in short-lived jobs).
@enisoc replied:

For cumulative samples, it makes sense that with a single sample in the window you can't guess anything at all about how much increase happened in that window. With a single delta sample, even if we don't know the start time, we should be able to make a better guess than "no increase happened".

For example, we could guess that the interval is equal to the window size - in other words, return the single delta value as-is with no extrapolation. The argument would be that you picked an arbitrary window of size T and found 1 sample, so the best guess for the frequency of the samples is 1/T. This seems like it would be more useful on average than returning no value in the case of a single sample.
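
A sketch of that single-sample guess, hypothetical and following the same conventions as the sketches above:

```go
// Treat a lone delta sample as covering the whole window: the guessed
// increase is the sample value itself, so the rate is value / window.
func deltaRateSingleSample(s sample, rangeSeconds float64) float64 {
	return s.value / rangeSeconds
}
```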

@fionaliao (author) replied:

My concern is with mixing extrapolation and non-extrapolation logic, because that might end up surprising users.

If we do decide to generally extrapolate to fill the whole window, but have this special case for a single datapoint, someone might rely on the non-extrapolation behaviour and be surprised when there are two points and the behaviour changes.

@fionaliao force-pushed the fionaliao/delta-proposal branch from d0474da to f2433c8 on April 18, 2025 16:08
@fionaliao (author) commented:

Next steps for this proposal:

  • Wait for the type and unit metadata proposal to be finalised, which might result in updates to how exactly the temporality label will be implemented
  • Get the primitive OTEL delta support PR merged - hopefully having that out will help get some feedback on how querying should be done
  • Write some code to experiment with delta rate implementations and see what edge cases there are for each option
