HBASE-29231 Throttles should support limits based on handler thread usage time #7000

Merged

Contributor

@ajkh88 ajkh88 commented May 19, 2025

Upstream Issue

This PR implements thread handler usage throttling support in HBase, enabling administrators to limit the amount of thread handler time that can be consumed across all threads.

Features

  • New throttle type: REQUEST_HANDLER_USAGE_MS that limits thread handler time usage
  • Default throttle configuration via hbase.quota.default.user.machine.request_handler_usage_ms
  • Throttling integration with existing quota infrastructure
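
To make the new throttle type concrete, here is a minimal sketch of what a limit expressed in handler milliseconds per time window could look like. It is an illustration only: `HandlerTimeBudget`, its fixed-window refill, and the injected clock are assumptions made for this example and are not the `TimeBasedLimiter` or `RateLimiter` code touched by this PR.

```java
import java.util.function.LongSupplier;

/** Hypothetical fixed-window budget of handler-thread milliseconds (not HBase code). */
final class HandlerTimeBudget {
  private final long limitMs;         // handler milliseconds allowed per window
  private final long windowMs;        // window length in wall-clock milliseconds
  private final LongSupplier clockMs; // injectable clock, so the sketch is testable
  private long windowStart;
  private long available;

  HandlerTimeBudget(long limitMs, long windowMs, LongSupplier clockMs) {
    this.limitMs = limitMs;
    this.windowMs = windowMs;
    this.clockMs = clockMs;
    this.windowStart = clockMs.getAsLong();
    this.available = limitMs;
  }

  /** Tries to reserve handler time; false means the caller would be throttled. */
  synchronized boolean tryAcquire(long handlerMs) {
    long now = clockMs.getAsLong();
    if (now - windowStart >= windowMs) {
      // A full window has elapsed: start a new one with a fresh budget.
      windowStart = now;
      available = limitMs;
    }
    if (handlerMs > available) {
      return false;
    }
    available -= handlerMs;
    return true;
  }

  synchronized long available() {
    return available;
  }
}

final class HandlerTimeBudgetDemo {
  public static void main(String[] args) {
    long[] now = {0L}; // fake clock, advanced by hand
    HandlerTimeBudget budget = new HandlerTimeBudget(100, 1000, () -> now[0]);
    System.out.println(budget.tryAcquire(60)); // fits in the 100 ms budget
    System.out.println(budget.tryAcquire(60)); // only 40 ms left in this window
    now[0] = 1000;                             // next window begins
    System.out.println(budget.tryAcquire(60)); // budget has been refilled
  }
}
```

With a budget of 100 handler milliseconds per second, the second 60 ms request in the same window is rejected, and the same request succeeds once the window rolls over.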

Implementation Details

  • Enhanced TimeBasedLimiter to track and limit thread handler usage time
  • Added handler usage time tracking to DefaultOperationQuota
  • Added test cases to verify throttling behaviour for both reads and writes

This throttling capability helps prevent individual users or applications from monopolising RegionServer handler threads, improving overall service stability and responsiveness for HBase deployments.

@rmdmattingly
Contributor

lucky number 7000

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds thread handler usage throttling support to HBase by introducing a new throttle type (REQUEST_HANDLER_USAGE_MS) and integrating it with the quota infrastructure. Key changes include adding new parameters to quota‐checking methods, updating quota state and operation quota calculations, and extending proto definitions and client utilities to support the new throttle type.

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| TestThreadHandlerUsageQuota.java | Added tests to verify throttling for read and write operations using handler usage time. |
| TestQuotaState.java & TestDefaultOperationQuota.java | Updated tests and quota constructors to include the new parameters. |
| TestDefaultHandlerUsageQuota.java | Added tests for default handler usage quotas. |
| TimeBasedLimiter.java | Introduced a new RateLimiter for request handler usage time and updated quota checking methods. |
| RegionServerRpcQuotaManager.java | Integrated new requests-per-second supplier into quota instantiation. |
| QuotaUtil.java, QuotaLimiter.java, NoopQuotaLimiter.java, GlobalQuotaSettingsImpl.java | Updated quota configuration and proto mappings to handle the new throttle type. |
| ExceedOperationQuota.java & DefaultOperationQuota.java | Modified quota consumption logic to account for handler usage time and updated diff/consume methods. |
| Protobuf files and client quota classes | Extended definitions and conversions to support REQUEST_HANDLER_USAGE_MS. |

Comments suppressed due to low confidence (1)

hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/ExceedOperationQuota.java:104

  • Passing 0L for the estimated handler usage time when grabbing quota for secondary limiters may be a placeholder; please verify if a more representative value should be propagated to better reflect actual usage.
limiter.grabQuota(numWrites, writeConsumed, numReads + numScans, readConsumed, writeCapacityUnitConsumed, writeCapacityUnitConsumed, isAtomic, 0L);

// low.
return numHandlerThreads;
} else {
double requestsPerMillisecond = Math.ceil(requestsPerSecond / 1000);
Copilot AI May 19, 2025

[nitpick] Consider adding documentation or comments to clarify the rationale for using Math.ceil(requestsPerSecond/1000) for estimating per-request handler usage time, ensuring it aligns with the intended behavior under low request rates.
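
As a worked example of the arithmetic in question, the sketch below evaluates only the expression visible in the diff, `Math.ceil(requestsPerSecond / 1000)`, for a few request rates. The wrapper class and method names are hypothetical, and the rest of `calculateHandlerUsageTimeEstimate` is not reproduced because it is not shown in this excerpt.

```java
/** Hypothetical wrapper around the single expression shown in the diff. */
final class CeilEstimateDemo {
  static double requestsPerMillisecond(double requestsPerSecond) {
    // Same expression as the diff line: divide by 1000, then round up.
    return Math.ceil(requestsPerSecond / 1000);
  }

  public static void main(String[] args) {
    double[] rates = {0.5, 1, 999, 1000, 1001, 2500};
    for (double rate : rates) {
      System.out.println(rate + " req/s -> " + requestsPerMillisecond(rate));
    }
  }
}
```

Because of the rounding up, every non-zero rate up to and including 1000 requests per second yields 1.0, 1001 yields 2.0, and 2500 yields 3.0. That coarse granularity at low rates is what a clarifying comment in the code could usefully spell out.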

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 30s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 🆗 | buf | 0m 0s | | buf was not available. |
| +0 🆗 | buf | 0m 0s | | buf was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 15s | | master passed |
| +1 💚 | compile | 4m 30s | | master passed |
| +1 💚 | checkstyle | 0m 58s | | master passed |
| +1 💚 | spotbugs | 4m 35s | | master passed |
| +1 💚 | spotless | 0m 46s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 10s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 3s | | the patch passed |
| +1 💚 | compile | 4m 24s | | the patch passed |
| +1 💚 | cc | 4m 24s | | the patch passed |
| +1 💚 | javac | 4m 24s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 56s | | the patch passed |
| +1 💚 | spotbugs | 4m 48s | | the patch passed |
| +1 💚 | hadoopcheck | 12m 2s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | hbaseprotoc | 1m 34s | | the patch passed |
| +1 💚 | spotless | 0m 44s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 27s | | The patch does not generate ASF License warnings. |
| | | 50m 56s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7000/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #7000 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless cc buflint bufcompat hbaseprotoc |
| uname | Linux 16300a5b6f0b 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 549e244 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 86 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7000/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 28s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 2s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 10s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 14s | | master passed |
| +1 💚 | compile | 1m 51s | | master passed |
| +1 💚 | javadoc | 0m 54s | | master passed |
| +1 💚 | shadedjars | 5m 57s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 7s | | the patch passed |
| +1 💚 | compile | 1m 51s | | the patch passed |
| +1 💚 | javac | 1m 51s | | the patch passed |
| +1 💚 | javadoc | 0m 52s | | the patch passed |
| +1 💚 | shadedjars | 5m 57s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 0m 33s | | hbase-protocol-shaded in the patch passed. |
| +1 💚 | unit | 1m 34s | | hbase-client in the patch passed. |
| +1 💚 | unit | 210m 35s | | hbase-server in the patch passed. |
| | | 242m 7s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7000/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #7000 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 07b6ca684b0f 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 549e244 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7000/1/testReport/ |
| Max. process+thread count | 4882 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7000/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@@ -288,4 +309,25 @@ private long calculateWriteCapacityUnitDiff(final long actualSize, final long es
private long calculateReadCapacityUnitDiff(final long actualSize, final long estimateSize) {
return calculateReadCapacityUnit(actualSize) - calculateReadCapacityUnit(estimateSize);
}

private long calculateHandlerUsageTimeEstimate(final double requestsPerSecond,
Contributor

Mind explaining a bit more about the algorithm here? I do not fully understand what 'handler usage time' means...

Contributor Author

In this context, handler usage time refers to the amount of time (in milliseconds) a handler thread will use.

We estimate the number of milliseconds we expect the thread to take to complete the request and deduct that amount from the quota. If there are enough milliseconds left in the quota, we proceed by taking that amount from the quota and continuing with the request. Otherwise, we throw a throttling exception.

When the request is complete, we calculate the actual amount of time used in the close() method and adjust the quota by either adding back time or further subtracting it, as appropriate.

Does this make things clearer?
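
The flow described in this reply (deduct an estimate up front, throttle if too little is left, reconcile against the actual time on close) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `DefaultOperationQuota`: the names `HandlerUsageQuota`, `Reservation`, and `ThrottledException` are hypothetical, and the sketch leaves out quota refill over time.

```java
/** Hypothetical stand-in for a throttling exception (not an HBase class). */
final class ThrottledException extends RuntimeException {
  ThrottledException(String message) {
    super(message);
  }
}

/** Hypothetical quota of handler-thread milliseconds. */
final class HandlerUsageQuota {
  private long availableMs;

  HandlerUsageQuota(long availableMs) {
    this.availableMs = availableMs;
  }

  /** Deducts the estimate up front, or throws if too little quota is left. */
  synchronized Reservation check(long estimateMs) {
    if (estimateMs > availableMs) {
      throw new ThrottledException(
        "need " + estimateMs + " ms but only " + availableMs + " ms available");
    }
    availableMs -= estimateMs;
    return new Reservation(estimateMs);
  }

  synchronized long availableMs() {
    return availableMs;
  }

  // A positive diff takes more from the quota, a negative diff gives time back.
  // The balance may go negative when a request badly overruns its estimate.
  private synchronized void adjust(long diffMs) {
    availableMs -= diffMs;
  }

  /** Tracks one request's estimate so it can be reconciled when the request ends. */
  final class Reservation {
    private final long estimateMs;

    Reservation(long estimateMs) {
      this.estimateMs = estimateMs;
    }

    /** Reconciles the up-front estimate with the time the request actually took. */
    void close(long actualMs) {
      adjust(actualMs - estimateMs);
    }
  }
}

final class HandlerUsageQuotaDemo {
  public static void main(String[] args) {
    HandlerUsageQuota quota = new HandlerUsageQuota(50);
    HandlerUsageQuota.Reservation slow = quota.check(10); // 40 ms left
    slow.close(25);                                       // took 15 ms more: 25 ms left
    HandlerUsageQuota.Reservation fast = quota.check(10); // 15 ms left
    fast.close(4);                                        // 6 ms returned: 21 ms left
    System.out.println(quota.availableMs());
    try {
      quota.check(30);                                    // more than the 21 ms remaining
    } catch (ThrottledException e) {
      System.out.println("throttled: " + e.getMessage());
    }
  }
}
```

The key design point is that the estimate only gates admission. The final charge is always the measured time, so a poor estimate is corrected as soon as the request finishes.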

Contributor

I'm not very familiar with the quota implementation. So for every operation, we create an OperationQuota object, calculate the estimated time, and then compare it with the actual time usage to decide whether we should stop it?

@ajkh88
Contributor Author

ajkh88 commented Jun 3, 2025

In addition to the tests in this PR, I have deployed this to a test cluster and run a load test against it. Everything is functioning as expected.

Contributor

@rmdmattingly rmdmattingly left a comment

This looks good to me — this should be a solid last line of defense for us when designing default throttles that will keep a wide variety of users in check

@rmdmattingly rmdmattingly changed the title from "HBASE-29231 - Thread handler usage throttling" to "HBASE-29231 Throttles should support limits based on handler thread usage time" Jun 3, 2025
@rmdmattingly rmdmattingly merged commit d82a591 into apache:master Jun 3, 2025
1 check passed
@rmdmattingly rmdmattingly deleted the HBASE-29231/thread-handler-usage-throttling branch June 3, 2025 20:04
rmdmattingly pushed a commit that referenced this pull request Jun 3, 2025
…sage time (#7000)

Co-authored-by: Alex Hughes <[email protected]>
Signed-off-by: Ray Mattingly <[email protected]>
rmdmattingly pushed a commit to HubSpot/hbase that referenced this pull request Jun 3, 2025
…n handler thread usage time (apache#7000) (will be in 2.6.3)

Co-authored-by: Alex Hughes <[email protected]>
Signed-off-by: Ray Mattingly <[email protected]>
rmdmattingly added a commit that referenced this pull request Jun 4, 2025
…sage time (#7000) (#7064)

Signed-off-by: Ray Mattingly <[email protected]>
Co-authored-by: Alex Hughes <[email protected]>
Co-authored-by: Alex Hughes <[email protected]>