Make throttling on tlog disk space more gradual #12807

Open
tclinkenbeard-oai wants to merge 3 commits into apple:main from
tclinkenbeard-oai:dev/tclinkenbeard/gradual-tlog-space-throttle

@tclinkenbeard-oai
Collaborator

Currently, when a tlog runs low on disk space, throttling ramps up extremely quickly once available space falls to MIN_AVAILABLE_SPACE_RATIO. This PR introduces a new TLOG_THROTTLE_START_AVAILABLE_SPACE_RATIO knob that supports more gradual throttling.

By default the knob has no effect, but if set to e.g. 0.2 (with MIN_AVAILABLE_SPACE_RATIO at the default 0.05), the target tlog queue size (used for the ratekeeper's tpsLimit calculation) scales linearly downward as available space decreases from 20% to 5%.
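
The linear scaling described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `scaledTlogTargetBytes` and its parameters are hypothetical, and the sketch assumes the target queue size falls linearly from its full value at the start ratio to zero at the minimum ratio.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not taken from the PR: scales the target tlog queue
// size linearly as the available-space ratio falls from startRatio
// (TLOG_THROTTLE_START_AVAILABLE_SPACE_RATIO) to minRatio
// (MIN_AVAILABLE_SPACE_RATIO).
int64_t scaledTlogTargetBytes(double availableRatio,
                              double minRatio,
                              double startRatio,
                              int64_t maxTargetBytes) {
    // Default configuration (startRatio == minRatio): there is no ramp, so
    // this sketch returns the full target and leaves any throttling below
    // minRatio to the pre-existing logic.
    if (startRatio <= minRatio) {
        return maxTargetBytes;
    }
    // Fraction of the ramp remaining: 1.0 at or above startRatio,
    // 0.0 at or below minRatio, linear in between.
    double fraction = (availableRatio - minRatio) / (startRatio - minRatio);
    fraction = std::max(0.0, std::min(1.0, fraction));
    return static_cast<int64_t>(fraction * static_cast<double>(maxTargetBytes));
}
```

Under these assumptions, with the knob at 0.2 and MIN_AVAILABLE_SPACE_RATIO at 0.05, a tlog with 12.5% of its space available would get roughly half of the full target, and one at or below 5% would get a target of zero.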

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 2b70f7e
  • Duration 0:22:32
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 2b70f7e
  • Duration 0:35:06
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 2b70f7e
  • Duration 0:40:19
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 2b70f7e
  • Duration 0:48:46
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 2b70f7e
  • Duration 0:59:23
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 2b70f7e
  • Duration 1:17:14
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 2b70f7e
  • Duration 2:11:19
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

init( TARGET_BYTES_PER_TLOG_BATCH, 1400e6 ); if( smallTlogTarget ) TARGET_BYTES_PER_TLOG_BATCH = 1400e3;
init( SPRING_BYTES_TLOG_BATCH, 300e6 ); if( smallTlogTarget ) SPRING_BYTES_TLOG_BATCH = 150e3;
// Match MIN_AVAILABLE_SPACE_RATIO by default; buggified simulations exercise the earlier ramp.
init( TLOG_THROTTLE_START_AVAILABLE_SPACE_RATIO, 0.05 ); if( randomize && isSimulated && BUGGIFY ) TLOG_THROTTLE_START_AVAILABLE_SPACE_RATIO = 0.20;

Do we want to have START and STOP as configurable params?

Collaborator Author

STOP is effectively MIN_AVAILABLE_SPACE_RATIO. There isn't a separate knob for it yet, but I don't think we need one.
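
To make the START/STOP relationship concrete, here is a small sketch, assuming the ramp is only active when START is strictly above STOP. The `TlogThrottleRamp` struct and its members are hypothetical names for illustration and do not appear in the PR.

```cpp
// Hypothetical illustration, not code from the PR.
struct TlogThrottleRamp {
    double startRatio; // would come from TLOG_THROTTLE_START_AVAILABLE_SPACE_RATIO
    double stopRatio;  // would come from MIN_AVAILABLE_SPACE_RATIO (no separate STOP knob)

    // The ramp is only active when START is strictly above STOP.
    bool enabled() const { return startRatio > stopRatio; }

    // Width of the available-space band over which throttling ramps.
    double width() const { return enabled() ? startRatio - stopRatio : 0.0; }
};
```

With both values at the default 0.05 the ramp is disabled, which matches the "no effect by default" behavior in the description. The buggified value of 0.20 would enable a ramp across the band between 20% and 5% available space.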

@gridem-openai left a comment

lgtm!

@gxglass
Contributor

gxglass commented Mar 20, 2026

@tclinkenbeard-oai I'll have a look. I haven't looked at any of this stuff before, so I need a little time.
