HBASE-29006. The region assignment retry logic is unconstrained and may cause workload amplification #6983
base: branch-2.6
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
int retryCount = retryCountMap.getOrDefault(hri, 0) + 1;
retryCountMap.put(hri, retryCount);

if (retryCount > MAX_RETRY_LIMIT) {
Would it be better to add a backoff here? If we go with this PR, the region will stay in RIT state forever once the limit is hit, and only restarting the master can trigger the reassignment...
Right, we can consider using backoff to break the feedback loop as well. Simple blocking may be too harsh on the system.
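For discussion, here is a minimal sketch of what a capped exponential backoff could look like in place of hard blocking. It is not code from this PR or from HBase: the class `AssignRetryBackoff`, the constants `BASE_DELAY_MS` and `MAX_DELAY_MS`, and the use of a `String` region name as the key are all hypothetical, chosen only to illustrate the idea.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of HBase): tracks failed assignment attempts
 * per region and returns a capped exponential delay before the next retry.
 */
class AssignRetryBackoff {
  // Assumed values for illustration; real values would need tuning.
  static final long BASE_DELAY_MS = 1_000L;
  static final long MAX_DELAY_MS = 60_000L;

  private final Map<String, Integer> retryCountMap = new HashMap<>();

  /** Records one more failed attempt and returns the delay before the next retry. */
  long nextDelayMs(String regionName) {
    int retryCount = retryCountMap.merge(regionName, 1, Integer::sum);
    return delayForAttempt(retryCount);
  }

  /** Delay doubles with each attempt (1s, 2s, 4s, ...) and is capped at MAX_DELAY_MS. */
  static long delayForAttempt(int attempt) {
    // Clamp the shift so the left shift cannot overflow a long.
    int shift = Math.min(Math.max(attempt - 1, 0), 30);
    return Math.min(MAX_DELAY_MS, BASE_DELAY_MS << shift);
  }

  /** Clears the counter, e.g. once the region has been assigned successfully. */
  void reset(String regionName) {
    retryCountMap.remove(regionName);
  }
}
```

With this shape the region is never blocked permanently: retries keep happening, only less and less often, so a master restart would not be needed to get the region out of RIT.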
See HBASE-29006
We add a map to record the number of retries for each region assignment. If the retry count exceeds the predefined threshold (MAX_RETRY_LIMIT), the request will be blocked, which prevents the retries from overloading the cluster.
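The counting scheme described above can be sketched in isolation as follows. This is an illustrative, standalone version, not the code in this PR: the class `AssignRetryLimiter`, its generic key type, and the `onAssigned` reset hook are assumptions. It uses `ConcurrentHashMap.merge` so that the increment is atomic, whereas a separate `getOrDefault` followed by `put` could lose updates if two threads retried the same region at once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical sketch (not the PR's actual class): counts assignment retries
 * per region and refuses further retries once a limit is exceeded.
 */
class AssignRetryLimiter<K> {
  private final int maxRetryLimit;
  private final ConcurrentMap<K, Integer> retryCountMap = new ConcurrentHashMap<>();

  AssignRetryLimiter(int maxRetryLimit) {
    this.maxRetryLimit = maxRetryLimit;
  }

  /**
   * Atomically increments the retry count for the region.
   * Returns true if the retry may proceed, false if the limit was exceeded.
   */
  boolean tryRetry(K region) {
    int retryCount = retryCountMap.merge(region, 1, Integer::sum);
    return retryCount <= maxRetryLimit;
  }

  /** Assumed hook: forget the count once the region is assigned successfully. */
  void onAssigned(K region) {
    retryCountMap.remove(region);
  }
}
```

For example, with `new AssignRetryLimiter<String>(3)`, the first three calls to `tryRetry("region-a")` return true and the fourth returns false until `onAssigned("region-a")` is called.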