HBASE-29006. The region assignment retry logic is unconstrained and may cause workload amplification #6983

Open
wants to merge 1 commit into
base: branch-2.6

shangshu-qian

See HBASE-29006

We add a map that records the number of retries for each region assignment. If a region's retry count exceeds the predefined threshold (MAX_RETRY_LIMIT), further assignment requests for it are blocked, which prevents the retries from overloading the cluster.
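
For readers who want to see the idea in isolation, here is a minimal sketch of a per-key retry cap. It is not the code from this patch: the class name `RetryLimiter`, the method names, and the `reset` step are assumptions made for illustration. Only the map of counters and the `retryCount > MAX_RETRY_LIMIT` check come from the description above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not the class from the patch): caps retries per key. */
class RetryLimiter<K> {
  // Plays the role of MAX_RETRY_LIMIT in the PR description.
  private final int maxRetryLimit;
  // One counter per key, e.g. per region.
  private final Map<K, Integer> retryCountMap = new ConcurrentHashMap<>();

  RetryLimiter(int maxRetryLimit) {
    this.maxRetryLimit = maxRetryLimit;
  }

  /** Records one more attempt; returns false once the count exceeds the limit. */
  boolean tryRetry(K key) {
    // merge() increments atomically on a ConcurrentHashMap.
    int retryCount = retryCountMap.merge(key, 1, Integer::sum);
    return retryCount <= maxRetryLimit;
  }

  /** Assumed cleanup step: forget the counter, e.g. after a successful assignment. */
  void reset(K key) {
    retryCountMap.remove(key);
  }
}
```

With a limit of 2, the first two calls to `tryRetry` for a key return true and the third returns false. Without something like `reset`, a key that hits the limit stays blocked for the lifetime of the map.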

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 3m 10s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ branch-2.6 Compile Tests _
+1 💚 mvninstall 3m 19s branch-2.6 passed
+1 💚 compile 2m 59s branch-2.6 passed
+1 💚 checkstyle 0m 38s branch-2.6 passed
+1 💚 spotbugs 1m 33s branch-2.6 passed
+1 💚 spotless 0m 46s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 5s the patch passed
+1 💚 compile 2m 53s the patch passed
+1 💚 javac 2m 53s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 38s the patch passed
+1 💚 spotbugs 1m 42s the patch passed
+1 💚 hadoopcheck 16m 58s Patch does not cause any errors with Hadoop 2.10.2 or 3.3.6 3.4.0.
-1 ❌ spotless 0m 39s patch has 34 errors when running spotless:check, run spotless:apply to fix.
_ Other Tests _
+1 💚 asflicense 0m 11s The patch does not generate ASF License warnings.
40m 26s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #6983
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux fef176c4734a 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2.6 / cbc9126
Default Java Eclipse Adoptium-11.0.23+9
spotless https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/artifact/yetus-general-check/output/patch-spotless.txt
Max. process+thread count 77 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 41s Docker mode activated.
-0 ⚠️ yetus 0m 6s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ branch-2.6 Compile Tests _
+1 💚 mvninstall 3m 16s branch-2.6 passed
+1 💚 compile 1m 0s branch-2.6 passed
+1 💚 javadoc 0m 28s branch-2.6 passed
+1 💚 shadedjars 6m 14s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 9s the patch passed
+1 💚 compile 0m 57s the patch passed
+1 💚 javac 0m 57s the patch passed
+1 💚 javadoc 0m 26s the patch passed
+1 💚 shadedjars 6m 35s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
-1 ❌ unit 214m 10s /patch-unit-hbase-server.txt hbase-server in the patch failed.
241m 40s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6983
Optional Tests javac javadoc unit compile shadedjars
uname Linux 22be28260db7 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2.6 / cbc9126
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/testReport/
Max. process+thread count 4507 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 43s Docker mode activated.
-0 ⚠️ yetus 0m 6s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ branch-2.6 Compile Tests _
+1 💚 mvninstall 3m 27s branch-2.6 passed
+1 💚 compile 0m 51s branch-2.6 passed
+1 💚 javadoc 0m 26s branch-2.6 passed
+1 💚 shadedjars 6m 26s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 7s the patch passed
+1 💚 compile 0m 51s the patch passed
+1 💚 javac 0m 51s the patch passed
+1 💚 javadoc 0m 26s the patch passed
+1 💚 shadedjars 6m 25s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 215m 21s hbase-server in the patch passed.
243m 10s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #6983
Optional Tests javac javadoc unit compile shadedjars
uname Linux a86a4318e65f 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2.6 / cbc9126
Default Java Eclipse Adoptium-11.0.23+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/testReport/
Max. process+thread count 4361 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
-0 ⚠️ yetus 0m 5s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ branch-2.6 Compile Tests _
+1 💚 mvninstall 2m 40s branch-2.6 passed
+1 💚 compile 0m 42s branch-2.6 passed
+1 💚 javadoc 0m 25s branch-2.6 passed
+1 💚 shadedjars 5m 17s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 2m 28s the patch passed
+1 💚 compile 0m 39s the patch passed
+1 💚 javac 0m 39s the patch passed
+1 💚 javadoc 0m 24s the patch passed
+1 💚 shadedjars 5m 23s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 228m 54s hbase-server in the patch passed.
252m 36s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile
GITHUB PR #6983
Optional Tests javac javadoc unit compile shadedjars
uname Linux 2d54f3610178 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2.6 / cbc9126
Default Java Temurin-1.8.0_412-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/testReport/
Max. process+thread count 4235 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6983/1/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

int retryCount = retryCountMap.getOrDefault(hri,0) + 1;
retryCountMap.put(hri, retryCount);

if(retryCount > MAX_RETRY_LIMIT) {
Contributor

Would it be better to add a backoff here? If we go with this PR, the region will stay in RIT (region-in-transition) state forever, and only restarting the master can trigger the reassignment...

Author

Right, we can consider using backoff to break the feedback loop as well. Simply blocking may be too harsh on the system.
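
To make the backoff alternative concrete, here is a minimal sketch of capped exponential backoff. It is an illustration only, not HBase code and not part of this PR: the class name `AssignBackoff`, the method `delayMs`, and the base and cap values in the usage note are all assumptions. The point is that retries never stop, so the region is not stuck in RIT, but each retry waits longer than the last, which limits the load the retries can add.

```java
/** Hypothetical sketch of capped exponential backoff; not HBase code. */
final class AssignBackoff {
  private AssignBackoff() {}

  /**
   * Delay before the given retry attempt (1-based):
   * baseMs * 2^(attempt - 1), never more than maxMs.
   */
  static long delayMs(int attempt, long baseMs, long maxMs) {
    if (attempt < 1 || baseMs < 1 || maxMs < baseMs) {
      throw new IllegalArgumentException("need attempt >= 1 and 1 <= baseMs <= maxMs");
    }
    // Bound the shift so the computation cannot overflow a long.
    int shift = Math.min(attempt - 1, 62);
    // If baseMs * 2^shift would exceed maxMs, return the cap directly.
    if (baseMs > (maxMs >> shift)) {
      return maxMs;
    }
    return baseMs << shift;
  }
}
```

For example, with an assumed base of 100 ms and cap of 10,000 ms, `delayMs` yields 100, 200, 400, 800 ms for attempts 1 through 4 and stays at 10,000 ms from attempt 8 onward. Adding random jitter to each delay is a common refinement when many regions retry at once.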
