jlambatl
Contributor

Summary

This change adds a new AggregateMode, distribution, that aggregates individual metrics into a distribution. This doesn't reduce the accuracy of the metrics; instead, it reduces the volume of payloads emitted from Vector by trading the overhead of duplicate tags and other fields for a single, larger payload.
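
To illustrate the idea, here is a minimal, self-contained sketch of grouping raw samples by series identity so that the name and tags are carried once per group rather than once per sample. The types `RawMetric` and `AggregatedDistribution` and the function `aggregate_to_distributions` are hypothetical stand-ins invented for this example; they are not Vector's actual types or the code in this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a single raw metric event (not Vector's `Metric`).
#[derive(Debug, Clone)]
pub struct RawMetric {
    pub name: String,
    pub tags: BTreeMap<String, String>,
    pub value: f64,
}

/// Hypothetical aggregated output: one series identity plus every raw value seen for it.
#[derive(Debug, Clone, PartialEq)]
pub struct AggregatedDistribution {
    pub name: String,
    pub tags: BTreeMap<String, String>,
    pub samples: Vec<f64>,
}

/// Groups metrics by (name, tags) and keeps all raw values, sorted ascending.
/// No value is dropped or bucketed, so no accuracy is lost.
pub fn aggregate_to_distributions(metrics: &[RawMetric]) -> Vec<AggregatedDistribution> {
    let mut groups: BTreeMap<(String, BTreeMap<String, String>), Vec<f64>> = BTreeMap::new();
    for m in metrics {
        groups
            .entry((m.name.clone(), m.tags.clone()))
            .or_default()
            .push(m.value);
    }
    groups
        .into_iter()
        .map(|((name, tags), mut samples)| {
            // Sorting mirrors the behaviour described in this PR; total_cmp gives f64 a total order.
            samples.sort_by(|a, b| a.total_cmp(b));
            AggregatedDistribution { name, tags, samples }
        })
        .collect()
}
```

With three samples sharing the same name and tags, the sketch yields one output carrying three values, so the tag set is serialized once instead of three times.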

This pattern suits real-time analytics databases, where histograms and other statistical functions can be calculated on the fly. It reduces the need to manually codify histogram buckets, which can become an onerous task on systems dealing with large metric volumes.

The distribution is sorted in the current implementation. I'm unsure whether that is a good idea, and I'm happy to change it based on your feedback.
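
One possible argument for keeping the samples sorted is that a consumer can read order statistics directly by index. The sketch below shows a nearest-rank percentile over an already-sorted slice; `percentile_sorted` is a hypothetical helper written for illustration and is not part of this PR or of Vector.

```rust
/// Nearest-rank percentile over a slice that is already sorted ascending.
/// Returns `None` for an empty slice or when `p` is outside (0, 100] (including NaN).
pub fn percentile_sorted(sorted: &[f64], p: f64) -> Option<f64> {
    if sorted.is_empty() || !(p > 0.0 && p <= 100.0) {
        return None;
    }
    // Nearest-rank: the smallest 1-based rank r such that r / n >= p / 100.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    // Clamp guards against floating-point rounding pushing the rank out of range.
    let rank = rank.clamp(1, sorted.len());
    Some(sorted[rank - 1])
}
```

If the samples were emitted unsorted, the consumer would need to sort (or use a selection algorithm) before a lookup like this, which is the trade-off in question.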

Vector configuration

[transforms.my_transform_id]
type = "aggregate"
inputs = [ "my-source-or-transform-id" ]
interval_ms = 10_000
mode = "distribution"

How did you test this PR?

I have added unit tests that cover the primary aggregation types, histograms, distributions, and so on.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user-facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

Notes

N/A

@jlambatl jlambatl requested review from a team as code owners October 13, 2025 04:04
@github-actions github-actions bot added the labels "domain: transforms" (Anything related to Vector's transform components) and "domain: external docs" (Anything related to Vector's external, public documentation) Oct 13, 2025
@jlambatl jlambatl changed the title from "enhancement(transform aggregate): Add Distribution as an aggregation mode" to "enhancement(aggregate): Add Distribution as an aggregation mode" Oct 13, 2025
kind: MetricKind::Absolute,
value: MetricValue::Distribution {
samples,
statistic: StatisticKind::Histogram,
Member

Would it make more sense for this to produce a Summary, since we are aggregating raw metrics?

…improve the documentation that is generated for the distribution aggregation mode in the configuration parameter
@jlambatl
Contributor Author

I've pushed up a commit that should resolve the comments/feedback provided. It's hard to find the balance between being too verbose and not providing enough clarity.
