Skip to content

fix: Peon metrics emit same value#19484

Open
GWphua wants to merge 2 commits into
apache:masterfrom
GWphua:peon-metrics
Open

fix: Peon metrics emit same value#19484
GWphua wants to merge 2 commits into
apache:masterfrom
GWphua:peon-metrics

Conversation

@GWphua
Copy link
Copy Markdown
Contributor

@GWphua GWphua commented May 20, 2026

Description

This PR fixes incorrect Peon sink-level query metric emission in SinkQuerySegmentWalker. This bug exists since v32, introduced by #17170

The Bug

Currently, the following Peon metrics will be emitted with misleading values, making the three metrics appear identical:

  • query/segment/time
  • query/segmentAndCache/time
  • query/wait/time
image-20260519165853832

Cause

The existing code iterated over METRICS_TO_REPORT and used a switch without break statements. Because reportMetric.getValue() is bound to the current metric reporter, fallthrough caused later accumulator values to be reported through the wrong reporter. For example, the query/segment/time reporter could be called with segment time, then wait time, then segment-and-cache time before emit(). Since DefaultQueryMetrics stores metric values by metric name before emission, the last value overwrote earlier ones.

Release note

Fix incorrect reporting of query/segment/time, query/wait/time and query/segmentAndCache/time on Peon.


Key changed/added classes in this PR
  • SinkQuerySegmentWalker

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@jtuglu1 jtuglu1 self-requested a review May 20, 2026 04:41
Copy link
Copy Markdown
Contributor

@jtuglu1 jtuglu1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will this test change catch any regressions as well?

@GWphua
Copy link
Copy Markdown
Contributor Author

GWphua commented May 20, 2026

Unfortunately, no.

Will find some time to add that

@GWphua
Copy link
Copy Markdown
Contributor Author

GWphua commented May 20, 2026

Hi @jtuglu1

I looked at trying to test the regression, but this may break abstraction. I am looking to bring these 3 lines of code into a function reportSegmentMetrics(QueryMetrics<?>, SegmentMetrics).

queryMetrics.reportSegmentTime(segmentMetrics.getSegmentTime());
queryMetrics.reportWaitTime(segmentMetrics.getWaitTime());
queryMetrics.reportSegmentAndCacheTime(segmentMetrics.getSegmentAndCacheTime());

However, the SegmentMetrics class is nested within the private query runner.

Given that the fix removes the fall-through switch and replaces it with direct reporter-to-value calls, I think that it is ok to omit the tests. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants