@banmoy banmoy commented Dec 24, 2025

Why I'm doing:

The merge commit latency metrics use bvar LatencyRecorder with nanoseconds as the unit. If a latency exceeds about 2.1 seconds, bvar reports an overflow, as shown below:

image

The reason is that although LatencyRecorder accepts int64_t as input (https://github.com/apache/brpc/blob/master/src/bvar/latency_recorder.h#L100), it actually uses IntRecorder to store the latency as int (https://github.com/apache/brpc/blob/master/src/bvar/latency_recorder.h#L54), so the maximum latency it can record in nanoseconds is 2147483647 (about 2.1s).

// src/bvar/latency_recorder.h (excerpt)
class LatencyRecorder : public detail::LatencyRecorderBase {
public:
    // Record the latency.
    LatencyRecorder& operator<<(int64_t latency);
    // ...
};

class LatencyRecorderBase {
public:
    explicit LatencyRecorderBase(time_t window_size);
    time_t window_size() const { return _latency_window.window_size(); }
protected:
    // Samples are stored as int, not int64_t.
    IntRecorder _latency;
    // ...
};
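
To make the limit concrete, here is a minimal sketch of the arithmetic, assuming int is 32 bits wide. The helper fits_in_int and the constants are hypothetical names used only for illustration; they are not part of bvar.

```cpp
#include <cstdint>
#include <limits>

// Assumption: the recorder stores each sample in a 32-bit int.
constexpr int64_t kMaxSample = std::numeric_limits<int32_t>::max();  // 2147483647

constexpr int64_t kNsPerSecond = 1000LL * 1000 * 1000;

// Hypothetical helper (not a bvar API): does a sample fit in a 32-bit int?
constexpr bool fits_in_int(int64_t sample) {
    return sample >= std::numeric_limits<int32_t>::min() && sample <= kMaxSample;
}

// A 2-second latency expressed in nanoseconds still fits...
static_assert(fits_in_int(2 * kNsPerSecond), "2 s in ns fits in int");
// ...but a 3-second latency does not.
static_assert(!fits_in_int(3 * kNsPerSecond), "3 s in ns exceeds int range");

// Only 2 whole seconds are representable when the unit is nanoseconds.
static_assert(kMaxSample / kNsPerSecond == 2, "limit is about 2.1 s");
```

The checks run at compile time, so the snippet only needs to compile to confirm the numbers.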

What I'm doing:

bvar does not provide another recorder that supports int64_t. Merge commit latency does not need nanosecond precision, so this PR changes the latency unit to microseconds. The maximum recordable latency then becomes about 35.8 minutes, which should be enough.
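
The effect of the unit change can be sketched as follows, again assuming a 32-bit int. The function ns_to_us is a hypothetical stand-in for the NS_TO_US macro that this PR adds; the macro's actual definition is not reproduced here, and integer division (which truncates sub-microsecond remainders) is an assumption.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the PR's NS_TO_US macro (real definition not shown here).
// Assumption: plain integer division, so sub-microsecond remainders are dropped.
constexpr int64_t ns_to_us(int64_t ns) { return ns / 1000; }

// Assumption: the recorder stores each sample in a 32-bit int.
constexpr int64_t kMaxSample = std::numeric_limits<int32_t>::max();  // 2147483647

constexpr int64_t kNsPerSecond = 1000LL * 1000 * 1000;
constexpr int64_t kUsPerMinute = 60LL * 1000 * 1000;

// A 3-second latency overflows as nanoseconds but fits after conversion.
static_assert(3 * kNsPerSecond > kMaxSample, "3 s in ns exceeds int range");
static_assert(ns_to_us(3 * kNsPerSecond) == 3000000, "3 s is 3,000,000 us");
static_assert(ns_to_us(3 * kNsPerSecond) <= kMaxSample, "3 s in us fits in int");

// With microseconds, 35 whole minutes fit (about 35.8 min); 36 minutes do not.
static_assert(kMaxSample / kUsPerMinute == 35, "limit is about 35.8 min");
static_assert(36 * kUsPerMinute > kMaxSample, "36 min in us exceeds int range");
```

The trade-off is a 1000x coarser resolution in exchange for a 1000x larger range, which matches the reasoning above that nanosecond precision is not needed for this metric.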

Note that changing the metric unit is a behavior change, so this PR also adds documentation to explain it.

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Note

Fixes overflow in merge-commit latency metrics by switching from nanoseconds to microseconds and documenting the change.

  • Replace g_mc_*_latency_ns with g_mc_*_latency_us in isomorphic_batch_write.cpp, add NS_TO_US conversions, and update logs/recorders accordingly
  • No functional logic changes to write flow; only metric units and names updated
  • Add "Merge Commit BE Metrics" docs (EN/JA/ZH): new counters described; latency metrics now reported in microseconds with note about pre-v3.4.11/v3.5.12/v4.0.4 behavior

Written by Cursor Bugbot for commit 126d8f7. This will update automatically on new commits.

Signed-off-by: PengFei Li <[email protected]>
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 24, 2025
@mergify mergify bot assigned banmoy Dec 24, 2025
Signed-off-by: PengFei Li <[email protected]>
@banmoy banmoy force-pushed the fix_merge_commit_unit branch from 5f0a6d6 to 041ac9d Compare December 24, 2025 02:31
@banmoy banmoy marked this pull request as ready for review December 24, 2025 02:31
@banmoy banmoy requested a review from a team as a code owner December 24, 2025 02:31
@kevincai kevincai requested review from Copilot and kevincai December 24, 2025 04:03

Copilot AI left a comment

Pull request overview

This PR fixes an overflow issue in merge commit latency metrics by converting the unit from nanoseconds to microseconds. The bvar LatencyRecorder uses an IntRecorder internally which can only store values up to ~2.1 seconds when using nanoseconds, causing overflows for longer operations. By switching to microseconds, the maximum supported latency increases to ~35.8 minutes.

Key changes:

  • Changed metric unit from nanoseconds to microseconds with a NS_TO_US macro conversion
  • Renamed all latency recorder variables from _ns suffix to _us suffix
  • Updated documentation in English, Chinese, and Japanese to reflect the unit change
  • Fixed a typo in metric name from "reqeust" to "request"

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| be/src/runtime/batch_write/isomorphic_batch_write.cpp | Updated bvar latency recorders to use microseconds, added NS_TO_US conversion macro, and applied conversions throughout the code |
| docs/en/administration/management/monitoring/metrics.md | Added documentation for merge commit metrics with microsecond units and version note |
| docs/zh/administration/management/monitoring/metrics.md | Added Chinese documentation for merge commit metrics with microsecond units and version note |
| docs/ja/administration/management/monitoring/metrics.md | Added Japanese documentation for merge commit metrics with microsecond units and version note |

Signed-off-by: PengFei Li <[email protected]>
@alvin-celerdata

@cursor review

@cursor cursor bot left a comment

✅ Bugbot reviewed your changes and found no bugs!

Signed-off-by: PengFei Li <[email protected]>
@github-actions

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions

[BE Incremental Coverage Report]

pass : 21 / 22 (95.45%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 be/src/runtime/batch_write/isomorphic_batch_write.cpp | 21 | 22 | 95.45% | [469] |

@alvin-celerdata

@cursor review

@cursor cursor bot left a comment

✅ Bugbot reviewed your changes and found no bugs!

@alvin-celerdata alvin-celerdata merged commit 39bc4ea into StarRocks:main Dec 27, 2025
81 of 82 checks passed
@github-actions

@Mergifyio backport branch-3.4

@github-actions

@Mergifyio backport branch-3.5

@github-actions github-actions bot removed the 3.4 label Dec 27, 2025
@github-actions

@Mergifyio backport branch-4.0

@mergify

mergify bot commented Dec 27, 2025

backport branch-3.4

✅ Backports have been created


@mergify

mergify bot commented Dec 27, 2025

backport branch-3.5

✅ Backports have been created


@mergify

mergify bot commented Dec 27, 2025

backport branch-4.0

✅ Backports have been created


mergify bot pushed a commit that referenced this pull request Dec 27, 2025
Signed-off-by: PengFei Li <[email protected]>
(cherry picked from commit 39bc4ea)

# Conflicts:
#	docs/en/administration/management/monitoring/metrics.md
#	docs/ja/administration/management/monitoring/metrics.md
#	docs/zh/administration/management/monitoring/metrics.md
mergify bot pushed a commit that referenced this pull request Dec 27, 2025
Signed-off-by: PengFei Li <[email protected]>
(cherry picked from commit 39bc4ea)
mergify bot pushed a commit that referenced this pull request Dec 27, 2025
Signed-off-by: PengFei Li <[email protected]>
(cherry picked from commit 39bc4ea)
wanpengfei-git pushed a commit that referenced this pull request Dec 27, 2025
wanpengfei-git pushed a commit that referenced this pull request Dec 27, 2025
banmoy added a commit to banmoy/starrocks that referenced this pull request Dec 29, 2025
Labels

3.5-merged 4.0-merged behavior_changed documentation Improvements or additions to documentation

3 participants