Aggregates computation #463

Open
@DifferentialOrange

Description

See Notion RFC (private) for more detailed info.

Since aggregates are related to the core package, it is proposed to implement them there. 'aggregates' is a new metrics section enabled through metrics.cfg/metrics.enable_default_metrics. Since all sections are enabled by default with metrics.cfg{include='all'}, it will be enabled by default too.

Enabling 'aggregates' adds a new callback to the callback registry. The callback iterates through the existing collectors and computes their aggregates. The following aggregates will be computed:

  • rate for counter collectors: per-second rate of value change between the last two observations;
  • min for gauge collectors: minimum value over the whole history of observations;
  • max for gauge collectors: maximum value over the whole history of observations;
  • average for histogram and summary collectors: average value of observations (over the whole history of observations).
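
The aggregates listed above can be sketched in plain Lua as follows. This is only an illustration under stated assumptions, not the proposed implementation: the function names (rate, update_min_max, average) and the shape of the observation tables are hypothetical, and the average assumes that a histogram or summary collector exposes a cumulative sum and count.

```lua
-- Hypothetical helpers; none of these names come from the metrics module.

-- Per-second rate between two observations of a counter.
-- prev and curr are tables of the form {value = number, time = seconds}.
local function rate(prev, curr)
    local dt = curr.time - prev.time
    if dt <= 0 then
        return nil -- no time elapsed, the rate is undefined
    end
    return (curr.value - prev.value) / dt
end

-- Running min/max of a gauge over every observation seen so far.
-- state is a table that keeps the current min and max between calls.
local function update_min_max(state, value)
    if state.min == nil or value < state.min then
        state.min = value
    end
    if state.max == nil or value > state.max then
        state.max = value
    end
    return state
end

-- Average over the whole history, computed from a cumulative sum and count
-- (assumption: the histogram/summary collector exposes both).
local function average(sum, count)
    if count == 0 then
        return nil -- nothing observed yet
    end
    return sum / count
end
```

Only the previous observation is needed for the rate, and only the running min/max pair is needed for gauges, so no full history has to be kept.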

See [1] for implementation example.

The results are stored in corresponding gauge collectors (in the common registry). The collector names are:

  • <base_name> .. '_per_second' for rate (where the counter name is <base_name> .. '_count');
  • <base_name> .. '_min' and <base_name> .. '_max' for min and max (where the gauge name is <base_name>);
  • <base_name> .. '_average' for average (where the histogram/summary name is <base_name>).

(Names for the v2 naming policy will be discussed as part of the v2 naming policy issue.) Each collector is labeled with metainfo.aggregate = true. The module stores a single copy of the previous observations to compute aggregates.
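
The naming rules above could be expressed as a small helper. This is a sketch only: aggregate_names is a hypothetical function, and returning an empty list for a counter whose name does not end in '_count' is an assumption, since the proposal does not cover that case.

```lua
-- Hypothetical helper: derive aggregate gauge names from a collector's
-- kind and name, following the naming rules described in this issue.
local function aggregate_names(kind, name)
    if kind == 'counter' then
        -- The counter name is expected to be <base_name> .. '_count'.
        local base = name:match('^(.*)_count$')
        if base == nil then
            return {} -- assumption: skip counters without the suffix
        end
        return { base .. '_per_second' }
    elseif kind == 'gauge' then
        return { name .. '_min', name .. '_max' }
    elseif kind == 'histogram' or kind == 'summary' then
        return { name .. '_average' }
    end
    return {} -- unknown collector kind: no aggregates
end
```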

Each callback is triggered on export collect, and two collects may happen back to back. If aggregates are recomputed on every callback trigger, rate values can be confusing: for two consecutive collects the counter has most likely not changed in between, so the reported rate will be zero even though it is non-zero overall. The callback therefore needs some kind of rate limiter: do not recompute aggregates (or at least the rate) if they were last computed less than a given interval (1 second, 1 minute, etc.) ago.
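
One possible shape for such a rate limiter is a wrapper that skips recomputation when called too soon. The names rate_limited, compute, min_interval and clock are hypothetical. The clock is injected so the sketch runs in plain Lua; in Tarantool a monotonic clock such as fiber.clock() would be a natural choice.

```lua
-- Hypothetical wrapper: run compute() at most once per min_interval seconds.
-- clock is any function that returns the current time in seconds.
local function rate_limited(compute, min_interval, clock)
    local last_run = nil
    return function()
        local now = clock()
        if last_run ~= nil and now - last_run < min_interval then
            return false -- too soon: keep the previously computed aggregates
        end
        last_run = now
        compute()
        return true
    end
end

-- Usage with a fake clock: the second call is skipped, the third one runs.
local fake_time, calls = 0, 0
local callback = rate_limited(function() calls = calls + 1 end, 60,
                              function() return fake_time end)
callback()          -- first call always computes
fake_time = 1
callback()          -- 1 second later: skipped
fake_time = 60
callback()          -- 60 seconds after the first run: computes again
```

With this approach two back-to-back collects report the same aggregate values instead of a misleading zero rate.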

To preserve backward compatibility, existing export handles (prometheus plugin, graphite plugin, json plugin, metrics.collect) ignore collectors with metainfo.aggregate = true by default. This seems reasonable for export plugins, since such aggregates are mostly computed at the backend. A new API is added to return a plugin handle, or to collect with metrics.collect, that includes aggregates if someone wants them. Similarly, the new OTLP collect will have a with_aggregates option to include aggregate values, similar to the already proposed defauts_only option.
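
The default filtering could look roughly like this. It is a sketch under assumptions: filter_observations is a hypothetical function, each observation is assumed to carry its collector's metainfo table, and the with_aggregates option name is borrowed from the OTLP part of this proposal rather than from any existing API.

```lua
-- Hypothetical filter: drop observations of aggregate collectors unless
-- the caller explicitly asks for them with opts.with_aggregates = true.
local function filter_observations(observations, opts)
    local with_aggregates = opts ~= nil and opts.with_aggregates == true
    local result = {}
    for _, obs in ipairs(observations) do
        local is_aggregate = obs.metainfo ~= nil
            and obs.metainfo.aggregate == true
        if with_aggregates or not is_aggregate then
            table.insert(result, obs)
        end
    end
    return result
end
```

Keeping the default as "exclude" means existing exporters see exactly the same set of metrics as before the 'aggregates' section is enabled.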

  1. plugin: flight recorder export #437
