[CELEBORN-2318] Miss increment to WRITE_DATA_HARD_SPLIT_COUNT on returning HARD_SPLIT in handlePushData #3676

Closed
kaybhutani wants to merge 1 commit into apache:main from kaybhutani:kartikay/missing-hard-split-metric

@kaybhutani
Contributor

What changes were proposed in this pull request?

Add the missing increment of WRITE_DATA_HARD_SPLIT_COUNT on the handlePushData branch that returns HARD_SPLIT without counting it.

Why are the changes needed?

  • The post-restart detection branch in handlePushData (Case2: shuffleKey in storageManager but not in shuffleMapperAttempts) returns HARD_SPLIT without incrementing WRITE_DATA_HARD_SPLIT_COUNT
  • The sibling Case1 branch (line 398) and all other HARD_SPLIT return paths already increment it
  • This makes Case2 invisible to monitoring during rolling restarts
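
As a rough illustration of the gap described above, here is a simplified Python model of the two branches. It is not the Celeborn source (which is Scala). The class `ToyWorker`, the method name, the placeholder status `PARTITION_NOT_FOUND`, and the metric key string are all assumptions made for this sketch. Case1 is reduced to "the shuffle key is present in shuffleMapperAttempts", which may be coarser than the real condition.

```python
from collections import Counter

# Hypothetical names for illustration; only the identifiers quoted in the
# PR text (WRITE_DATA_HARD_SPLIT_COUNT, HARD_SPLIT) come from the source.
WRITE_DATA_HARD_SPLIT_COUNT = "WriteDataHardSplitCount"
HARD_SPLIT = "HARD_SPLIT"
PARTITION_NOT_FOUND = "PARTITION_NOT_FOUND"


class ToyWorker:
    """Toy stand-in for the worker state consulted by the push-data handler."""

    def __init__(self, shuffle_mapper_attempts, storage_shuffle_keys):
        self.shuffle_mapper_attempts = set(shuffle_mapper_attempts)
        self.storage_shuffle_keys = set(storage_shuffle_keys)
        self.counters = Counter()

    def handle_push_for_missing_location(self, shuffle_key):
        """Decide the reply when no partition location is found for a push."""
        if shuffle_key in self.shuffle_mapper_attempts:
            # Case1 (simplified): this path already incremented the metric.
            self.counters[WRITE_DATA_HARD_SPLIT_COUNT] += 1
            return HARD_SPLIT
        if shuffle_key in self.storage_shuffle_keys:
            # Case2: key known to storage only, e.g. after a worker restart.
            # This increment models the one the PR adds.
            self.counters[WRITE_DATA_HARD_SPLIT_COUNT] += 1
            return HARD_SPLIT
        return PARTITION_NOT_FOUND


if __name__ == "__main__":
    worker = ToyWorker({"app-1"}, {"app-1", "app-2"})
    print(worker.handle_push_for_missing_location("app-2"))
    print(worker.counters[WRITE_DATA_HARD_SPLIT_COUNT])
```

In this model, dropping the increment in the Case2 branch leaves the returned status unchanged but keeps the counter flat. That is the monitoring blind spot the bullets above describe.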

Does this PR resolve a correctness bug?

No

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing UTs

@kaybhutani changed the title from "add missing metric" to "Missing increment to WRITE_DATA_HARD_SPLIT_COUNT on returning HARD_SPLIT in handlePushData" on May 7, 2026
@kaybhutani changed the title from "Missing increment to WRITE_DATA_HARD_SPLIT_COUNT on returning HARD_SPLIT in handlePushData" to "CELEBORN-2318 Missing increment to WRITE_DATA_HARD_SPLIT_COUNT on returning HARD_SPLIT in handlePushData" on May 7, 2026
@SteNicholas changed the title from "CELEBORN-2318 Missing increment to WRITE_DATA_HARD_SPLIT_COUNT on returning HARD_SPLIT in handlePushData" to "[CELEBORN-2318] Miss increment to WRITE_DATA_HARD_SPLIT_COUNT on returning HARD_SPLIT in handlePushData" on May 8, 2026
@SteNicholas
Member

@kaybhutani, thanks for fixing. LGTM.

@SteNicholas
Member

Thanks. Merged to main (v0.7.0).

akpatnam25 pushed a commit to akpatnam25/incubator-celeborn that referenced this pull request May 15, 2026
…rning HARD_SPLIT in handlePushData

### What changes were proposed in this pull request?
Add the missing increment of `WRITE_DATA_HARD_SPLIT_COUNT` on the `handlePushData` branch that returns HARD_SPLIT without counting it.

### Why are the changes needed?
- The post-restart detection branch in `handlePushData` (Case2: shuffleKey in storageManager but not in shuffleMapperAttempts) returns HARD_SPLIT without incrementing `WRITE_DATA_HARD_SPLIT_COUNT`
- The sibling Case1 branch (line 398) and all other HARD_SPLIT return paths already increment it
- This makes Case2 invisible to monitoring during rolling restarts

### Does this PR resolve a correctness bug?
No

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing UTs

Closes apache#3676 from kaybhutani/kartikay/missing-hard-split-metric.

Authored-by: Kartikay Bhutani <kbhutani0001@gmail.com>
Signed-off-by: SteNicholas <programgeek@163.com>