
Increase majc priority for tablets over file size threshold #5026


Merged (8 commits) on Nov 8, 2024

Conversation

dlmarion
Contributor

CompactionManager.mainLoop runs in the TabletServer looking for tablets that need to be compacted. At some interval it calls CompactionService.submitCompaction for each hosted tablet. CompactionService.getCompactionPlan makes a CompactionPlan for the tablet and logs a warning if no CompactionPlan is created but the number of files is larger than TSERV_SCAN_MAX_OPENFILES. When no compactions are running for the tablet and no plan is calculated, DefaultCompactionPlanner.makePlan takes TABLE_FILE_MAX and TSERV_SCAN_MAX_OPENFILES into account and creates a system compaction that considers all of the files and calculates which ones need to be compacted to get below the limit. Finally, a priority is calculated by calling CompactionJobPrioritizer.createPriority. However, because this is a SYSTEM compaction it will have a lower priority than all current USER compactions, and its priority will still be based on the total number of files. Since TABLE_FILE_MAX is per-table, it is possible for two tablets from different tables to be queued such that the tablet that is over the threshold has a lower priority than the tablet that is not. This change modifies CompactionJobPrioritizer.createPriority to take into account whether the tablet is over the threshold and give it a higher priority.

Closes #4610
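The boost described above can be sketched roughly as follows. The class name, method signature, and base-priority scheme here are illustrative stand-ins, not the actual CompactionJobPrioritizer API.

```java
// Illustrative sketch of the priority boost; names and the base-priority
// scheme are hypothetical, not the real Accumulo code.
public class PrioritySketch {

    public enum Kind { SYSTEM, USER }

    public static short createPriority(Kind kind, int totalFiles, int maxFilesPerTablet) {
        // Tablets over the per-table file limit get the maximum priority,
        // so they outrank every ordinary USER compaction.
        if (totalFiles > maxFilesPerTablet) {
            return Short.MAX_VALUE;
        }
        // Otherwise USER jobs rank above SYSTEM jobs, and more files means
        // a higher priority within each kind.
        int base = (kind == Kind.USER ? 16_384 : 0) + Math.min(totalFiles, 16_383);
        return (short) base;
    }
}
```

With a scheme like this, a SYSTEM job for a tablet over its limit outranks a USER job for a tablet under its limit, which addresses the cross-table ordering problem described above.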

@dlmarion dlmarion added this to the 2.1.4 milestone Oct 31, 2024
@dlmarion dlmarion self-assigned this Oct 31, 2024
// Given that tablets with too many files cause several problems,
// boost their priority to the maximum allowed value.
if (condition == Condition.TABLET_OVER_SIZE) {
  return Short.MAX_VALUE;
}
Contributor Author (dlmarion):
Could possibly increment a metric for this condition, although I'm not sure how useful it would be. I think knowing that all of the external compactors and compaction threads in the tservers are fully utilized, along with a backlog of waiting compactions, is a more useful indication that more capacity is needed.

Contributor:
In your changes to the monitor, it's now scanning the metadata table periodically. In that scan it could compute the number of tablets over the max per table and display that on the monitor.
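The suggestion above could look something like this. The types are simplified, hypothetical stand-ins for what the monitor's periodic metadata scan would actually produce.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the monitor-side computation: during the periodic
// metadata scan, count tablets whose file count exceeds their table's limit.
public class MonitorCountSketch {

    public record TabletInfo(String tableId, int fileCount) {}

    public static Map<String, Long> tabletsOverMax(List<TabletInfo> tablets,
            Map<String, Integer> tableFileMax) {
        return tablets.stream()
            // A tablet counts only if its table has a limit and it exceeds it.
            .filter(t -> t.fileCount() > tableFileMax.getOrDefault(t.tableId(), Integer.MAX_VALUE))
            .collect(Collectors.groupingBy(TabletInfo::tableId, Collectors.counting()));
    }
}
```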

@keith-turner
Contributor

May still want to leave #4610 open after merging this. This change is nice in that it helps compactions run faster for tablets over the threshold. There is still the problem that scan servers cache the file list, so if a scan server caches the list of files of a tablet with too many files for 5 minutes, then that tablet cannot be scanned for that 5 minute period even if it does compact. Not sure of the best way to handle this; scan servers could possibly poll the tablet's files a bit more frequently when a tablet is in this condition.
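One way the polling idea might be sketched is below. The names, the cache API, and the shorter interval are all hypothetical; only the 5 minute figure comes from the comment above.

```java
// Hedged sketch: shorten the scan server's file-list cache TTL for tablets
// over the file limit, so a completed compaction becomes visible sooner.
public class AdaptiveTtlSketch {

    static final long NORMAL_TTL_MS = 5 * 60 * 1000;  // the 5 min cache noted above
    static final long OVER_LIMIT_TTL_MS = 30 * 1000;  // hypothetical faster poll

    public static long cacheTtlMs(int fileCount, int maxFilesPerTablet) {
        return fileCount > maxFilesPerTablet ? OVER_LIMIT_TTL_MS : NORMAL_TTL_MS;
    }
}
```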

@dlmarion
Contributor Author

dlmarion commented Nov 5, 2024

@keith-turner - I think I addressed all of your suggestions in the latest commit.

Range<Short> range = null;
Function<Range<Short>,Short> func = normalPriorityFunction;
if (Namespace.ACCUMULO.id() == nsId) {
// Handle system tables
Contributor:

There could be chop compactions when merging the metadata table, and these may end with a null range, causing an exception. Need to add handling for the CHOP kind. Will have to drop this in the 3.1 code.

Contributor Author:
In 74d66b2 I added code to treat CHOP as USER compaction kind.
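A minimal sketch of that mapping (the enum and method here are illustrative, not the exact code in 74d66b2):

```java
// Hedged sketch: fold CHOP into the USER range so chop compactions (e.g.
// during a metadata-table merge) match a priority range instead of
// falling through to a null range and an exception.
public class KindRangeSketch {

    public enum CompactionKind { SYSTEM, USER, CHOP, SELECTOR }

    public static CompactionKind normalize(CompactionKind kind) {
        return kind == CompactionKind.CHOP ? CompactionKind.USER : kind;
    }
}
```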

@dlmarion dlmarion merged commit ba9c376 into apache:2.1 Nov 8, 2024
8 checks passed
@dlmarion dlmarion deleted the 4610-increase-majc-prio branch November 8, 2024 19:55
dlmarion added a commit to dlmarion/accumulo that referenced this pull request Nov 12, 2024
The test was timing out because it never started running compactions
created by the test method. Instead, the compactor process was
running compactions created by the previous test method because
the previous test created a table with a lot of files, started a
user compaction, then cancelled the user compaction. The recent
changes in apache#5026 caused a bunch of system compactions to be
generated for the table. The two test methods share the same
compaction queue, so the compactor was busy running the system
compactions.

To fix this issue I backported a property added in apache#3955 that
makes the compactor cancel check method time configurable and
I deleted the table in the test method that created a lot of files.

Closes apache#5052
dlmarion added a commit that referenced this pull request Nov 14, 2024
Successfully merging this pull request may close these issues.

Tablet with lots of files may not be readable on scan servers for long periods of time.
3 participants