[FEATURE] Start GetMetricData calls before all ListMetrics calls finish #1094

Open
@kgeckhart

Is there an existing issue for this?

  • I have searched the existing issues

Feature description

After the most recent batch of performance fixes, discovery jobs that pull metrics from large AWS environments have very reasonable resource utilization 🎉. The current limitation that can still cause longer scrape times is that all ListMetrics calls must complete before any GetMetricData calls are made. Since the APIs have independent rate limits (ListMetrics is 25 TPS and GetMetricData is ~50 TPS), we can safely start calling GetMetricData before ListMetrics completes.

I think the most idiomatic Go way of going about this is to introduce channels to runDiscoveryJob. The challenge is that the current code is really complex and relatively untestable. I think it needs to be refactored first to dramatically reduce the risk of such an impactful change. I would like to start by decomposing the main steps of runDiscoveryJob into smaller composable/testable "dataflows", listed below:

  1. GetResources
  2. ListMetrics
  3. AssociateMetricsToResources
  4. GetMetricData

At this point we should have solid test coverage across the complex logic used by each flow, and confidence that runDiscoveryJob flows the data between them appropriately. After this, introducing channels should hopefully be as simple as introducing a new strategy for how runDiscoveryJob composes the flow of data, which can be gated behind a feature flag. This level of decoupling will make it much easier to test the complex cases channels require, like shutdown and error propagation.
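As a rough illustration of what a channel-based strategy could look like, the sketch below streams each ListMetrics page to a GetMetricData consumer as soon as it arrives, and handles shutdown and error propagation with a cancellable context. It uses only the standard library. `RunStreaming`, `ListPageFunc`, `GetDataFunc`, and the paging model are assumptions made up for this example, not existing code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical types and signatures, for illustration only.
type Metric struct{ Name string }

type Datapoint struct {
	Metric Metric
	Value  float64
}

// ListPageFunc returns one page of metrics and whether more pages remain.
type ListPageFunc func(ctx context.Context, page int) ([]Metric, bool, error)

// GetDataFunc fetches data for one batch of metrics.
type GetDataFunc func(ctx context.Context, batch []Metric) ([]Datapoint, error)

// RunStreaming starts consuming batches before listing has finished.
func RunStreaming(ctx context.Context, list ListPageFunc, get GetDataFunc) ([]Datapoint, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	batches := make(chan []Metric)
	var listErr error // written by the producer before it closes batches

	// Producer: lists page by page and stops on error or cancellation.
	go func() {
		defer close(batches)
		for page := 0; ctx.Err() == nil; page++ {
			metrics, more, err := list(ctx, page)
			if err != nil {
				listErr = err
				return
			}
			select {
			case batches <- metrics:
			case <-ctx.Done():
				return
			}
			if !more {
				return
			}
		}
	}()

	// Consumer: runs in the calling goroutine, one batch at a time.
	var out []Datapoint
	var getErr error
	for batch := range batches {
		if getErr != nil {
			continue // keep draining so the producer can exit
		}
		points, err := get(ctx, batch)
		if err != nil {
			getErr = err
			cancel() // tell the producer to stop listing
			continue
		}
		out = append(out, points...)
	}

	// The channel close orders the producer's write to listErr before this read.
	if getErr != nil {
		return nil, fmt.Errorf("get metric data: %w", getErr)
	}
	if listErr != nil {
		return nil, fmt.Errorf("list metrics: %w", listErr)
	}
	if err := ctx.Err(); err != nil {
		return nil, err // the caller's context was cancelled
	}
	return out, nil
}

// Fakes: three pages of two metrics each, one datapoint per metric.
func fakeList(_ context.Context, page int) ([]Metric, bool, error) {
	batch := []Metric{
		{Name: fmt.Sprintf("m%d-a", page)},
		{Name: fmt.Sprintf("m%d-b", page)},
	}
	return batch, page < 2, nil
}

func fakeGet(_ context.Context, batch []Metric) ([]Datapoint, error) {
	out := make([]Datapoint, 0, len(batch))
	for _, m := range batch {
		out = append(out, Datapoint{Metric: m, Value: 1})
	}
	return out, nil
}

// failingGet fails on the second page to exercise error propagation.
func failingGet(ctx context.Context, batch []Metric) ([]Datapoint, error) {
	if batch[0].Name == "m1-a" {
		return nil, errors.New("throttled")
	}
	return fakeGet(ctx, batch)
}

func runDemo() ([]Datapoint, error) {
	return RunStreaming(context.Background(), fakeList, fakeGet)
}

func runFailingDemo() ([]Datapoint, error) {
	return RunStreaming(context.Background(), fakeList, failingGet)
}

func main() {
	points, err := runDemo()
	fmt.Println(len(points), err)
	_, err = runFailingDemo()
	fmt.Println(err)
}
```

The unbuffered channel keeps the producer at most one page ahead of the consumer. A buffered channel or several consumer goroutines would be natural next steps, each adding shutdown cases that need their own tests.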

If this pattern works out well, I think it should be adaptable to reduce the amount of code duplication that CustomNamespace jobs require. A CustomNamespace job should be a composition of the ListMetrics and GetMetricData dataflows.
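A CustomNamespace job built this way might be as small as the sketch below, which reuses only the listing and data-fetching flows and skips resource discovery and association. The names `RunCustomNamespaceJob`, `ListMetricsFunc`, and `GetMetricDataFunc` are hypothetical and only illustrate the composition.

```go
package main

import (
	"context"
	"fmt"
)

// Hypothetical types and signatures, for illustration only.
type Metric struct{ Namespace, Name string }

type Datapoint struct {
	Metric Metric
	Value  float64
}

type ListMetricsFunc func(ctx context.Context, namespace string) ([]Metric, error)

type GetMetricDataFunc func(ctx context.Context, metrics []Metric) ([]Datapoint, error)

// RunCustomNamespaceJob composes just two flows: list, then fetch.
func RunCustomNamespaceJob(ctx context.Context, namespace string, list ListMetricsFunc, get GetMetricDataFunc) ([]Datapoint, error) {
	metrics, err := list(ctx, namespace)
	if err != nil {
		return nil, fmt.Errorf("list metrics: %w", err)
	}
	if len(metrics) == 0 {
		return nil, nil // nothing to fetch
	}
	data, err := get(ctx, metrics)
	if err != nil {
		return nil, fmt.Errorf("get metric data: %w", err)
	}
	return data, nil
}

// runDemo wires the job with in-memory fakes.
func runDemo(namespace string) ([]Datapoint, error) {
	list := func(_ context.Context, ns string) ([]Metric, error) {
		if ns != "MyApp" {
			return nil, nil
		}
		return []Metric{{Namespace: ns, Name: "Requests"}}, nil
	}
	get := func(_ context.Context, ms []Metric) ([]Datapoint, error) {
		out := make([]Datapoint, 0, len(ms))
		for _, m := range ms {
			out = append(out, Datapoint{Metric: m, Value: 1})
		}
		return out, nil
	}
	return RunCustomNamespaceJob(context.Background(), namespace, list, get)
}

func main() {
	points, err := runDemo("MyApp")
	fmt.Println(len(points), err)
}
```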

What might the configuration look like?

Ideally, no configuration changes are required

Labels: enhancement (New feature or request)
