-
Notifications
You must be signed in to change notification settings - Fork 3.3k
HBASE-29356 Incorrect split behavior when region information is missing #7035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the incorrect region split behavior when region size information is missing by skipping regions with unknown size during average size computation and preventing operations on them.
- Updated test mocks and added unit tests to verify that regions with unknown size are skipped.
- Modified the region size averaging logic in SimpleRegionNormalizer to account only for regions with known sizes.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java | Updated test mocks and added new test methods to handle regions with unknown sizes |
hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java | Revised average region size calculation and added checks to skip regions with an unknown size |
Comments suppressed due to low confidence (1)
hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java:778
- [nitpick] The test method name 'testSplitOfLargeRegionIfOneIsNotKnow' is unclear; it likely should be 'testSplitOfLargeRegionIfOneIsNotKnown' for better readability.
@Test public void testSplitOfLargeRegionIfOneIsNotKnow() {
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Ping @ndimiduk. Please take a look at this one? Seems reasonable. |
Heya @qqvpp thanks for the contribution. Can you please create for yourself a Jira account? We use Jira for project tracking. |
if (targetRegionCount > 0) { | ||
avgRegionSize = totalSizeMb / (double) targetRegionCount; | ||
avgRegionSize = | ||
totalSizeMb / (double) targetRegionCount - (regionCount - regionCountKnownSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to adjust the denominator here, I think that you also need to protect against a value <= 0. In that case, you can throw, like we do on entry into the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a condition to verify the denominator and a test.
} else { | ||
avgRegionSize = totalSizeMb / (double) regionCount; | ||
avgRegionSize = totalSizeMb / (double) regionCountKnownSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you also need to protect against a 0 value here, in the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the condition, including the test modification.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Description
We have identified a bug in the
SimpleRegionNormalizer
logic that leads to incorrect region splits when region size information is missing. If the size cannot be determined for one or more regions (e.g. due to unavailable metrics fromRegionServers
), the average region size calculation becomes incorrect. This results in a scenario where all regions may be considered too large and get split unintentionally.Observed Behavior:
When region size data is not available (e.g.,
getRegionSizeMB()
returns -1), the computed average size does not account for that, and regions with valid size may appear excessively large compared to the average — resulting in multiple unnecessary splits.Expected Behavior:
If region size is unknown for some regions, those regions should be skipped during normalization. The average region size should be computed only from the regions for which the size is known. No region should be split or merged unless its size is known.
Patch:
Skips regions with unknown size from average size computation.
Prevents split and merge operations on regions with unknown size.
Adds unit tests for scenarios with partial or total absence of size data.
Patch author: Milan Vymazal [email protected]
Tests:
testSplitOfLargeRegionIfOneIsNotKnow
verifies correct behavior when one region has unknown size.testSplitOfAllUnknownSize
ensures that no split happens if size data is missing for all regions.Reproduction:
Unfortunately, we are unable to reliably reproduce this bug in a live environment, since we cannot easily simulate the condition where RegionServer metrics are missing. However, we have confirmed the behavior through code analysis and the added unit tests.