Log more information about why compaction can not be planned #5532

Merged
merged 2 commits into from
May 20, 2025
@@ -26,6 +26,7 @@
import org.apache.accumulo.core.client.admin.compaction.CompactableFile;
import org.apache.accumulo.core.data.NamespaceId;
import org.apache.accumulo.core.data.TableId;
+ import org.apache.accumulo.core.data.TabletId;
import org.apache.accumulo.core.spi.common.ServiceEnvironment;

/**
@@ -94,6 +95,12 @@ public interface PlanningParameters {
*/
TableId getTableId();

+ /**
+ * @return the tablet for which a compaction is being planned
+ * @since 2.1.4
+ */
+ TabletId getTabletId();
Contributor

Is there an issue with adding this method in a bugfix release? This would cause a build and runtime issue for any user that has their own CompactionPlanner implementation written against 2.1.4 and deploys it to a 2.1.3 instance, right? I'm wondering if this should be a default method that returns null (not throws an UnsupportedOperationException).

Contributor Author

> I'm wondering if this should be a default method that returns null (not throws an UnsupportedOperationException).

Returning null would not be the best long-term behavior (it is not the behavior we would want in 4.0). I was worried about adding this method, but chose this approach because a few lines up there is the following declaration, and I followed its precedent:

    /**
     * @return The id of the namespace that the table is assigned to
     * @throws TableNotFoundException thrown when the namespace for a table cannot be calculated
     * @since 2.1.4
     */
    NamespaceId getNamespaceId() throws TableNotFoundException;

Whatever we do here, we probably need to do that consistently for all changes like this in 2.1.4.

Contributor

I think for 3.1 and 4.0 we could remove the default method that returns null and just leave the method declaration in the interface. I think this change (and the other one that you pointed out) potentially makes a 2.1.4 API implementation not backwards compatible.

Contributor Author

I think leaving these methods without a default implementation is best, because that is the most straightforward way for a developer to learn they need to do something if they implemented this interface. If a user were to implement this interface, it would most likely be for testing, and it would be easy for a developer to deal with test code not implementing a method. If we want to add a default method, I think it would be best to throw UnsupportedOperationException instead of returning null. Returning null has the potential to cause runtime exceptions far from the method that returned null, making them harder to track down; the UnsupportedOperationException makes the issue easy to track down. We have thrown UnsupportedOperationException in other places in the past and that turned out to be bad, but that was a bit different. In this case it's a completely new method that no 2.1.3 code could have called; we are not suddenly throwing that exception from an existing method.

Contributor

I agree that leaving the methods unimplemented is better since a user would run into a compile time error vs a runtime error in a production deployment.
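
The three options weighed in this thread can be sketched side by side. The interfaces below are hypothetical stand-ins, not the Accumulo `PlanningParameters` interface, and use `String` ids so the example stays self-contained. It only aims to show where each option surfaces the problem for an implementation written before the new method existed.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; these are not the Accumulo interfaces.
final class InterfaceEvolutionSketch {

  /** Obtains a tablet id and then uses it, reporting which failure (if any) occurs. */
  static String describe(Supplier<String> tabletIdSource) {
    try {
      String tabletId = tabletIdSource.get(); // option 3 fails here
      return "tablet id length " + tabletId.length(); // option 2 fails here, on first use
    } catch (NullPointerException e) {
      return "NullPointerException at the use site";
    } catch (UnsupportedOperationException e) {
      return "UnsupportedOperationException at the call site";
    }
  }

  public static void main(String[] args) {
    // Implementations written before getTabletId() existed only supply getTableId().
    ParamsDefaultNull oldImplNull = () -> "42";
    ParamsDefaultThrows oldImplThrows = () -> "42";
    // ParamsAbstract oldImplAbstract = () -> "42"; // does not compile: two abstract methods

    System.out.println(describe(oldImplNull::getTabletId));
    System.out.println(describe(oldImplThrows::getTabletId));
  }
}

/** Option 1: new abstract method. Existing implementers fail to compile until they add it. */
interface ParamsAbstract {
  String getTableId();

  String getTabletId();
}

/** Option 2: default method returning null. Old implementers compile; callers may see null. */
interface ParamsDefaultNull {
  String getTableId();

  default String getTabletId() {
    return null;
  }
}

/** Option 3: default method that throws. Old implementers compile; callers fail fast. */
interface ParamsDefaultThrows {
  String getTableId();

  default String getTabletId() {
    throw new UnsupportedOperationException("getTabletId() is not implemented");
  }
}
```

In this sketch the null from option 2 only causes a failure when the value is first dereferenced, which in real code may be far from the call that produced it, while option 3 fails at the call itself and option 1 fails at compile time.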


ServiceEnvironment getServiceEnvironment();

CompactionKind getKind();
@@ -28,12 +28,14 @@
import java.util.List;
import java.util.Objects;
import java.util.Set;
+ import java.util.stream.Collectors;

import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.admin.compaction.CompactableFile;
import org.apache.accumulo.core.conf.ConfigurationTypeHelper;
import org.apache.accumulo.core.conf.Property;
import org.apache.accumulo.core.spi.common.ServiceEnvironment;
+ import org.apache.accumulo.core.util.NumUtil;
import org.apache.accumulo.core.util.compaction.CompactionJobPrioritizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -402,18 +404,34 @@ private Collection<CompactableFile> findFilesToCompactWithLowerRatio(PlanningPar
}

if (found.isEmpty() && lowRatio == 1.0) {
// in this case the data must be really skewed, operator intervention may be needed.
+ var examinedFiles = sortAndLimitByMaxSize(candidates, maxSizeToCompact);
+ var excludedBecauseMaxSize = candidates.size() - examinedFiles.size();
+ var tabletId = params.getTabletId();

log.warn(
- "Attempted to lower compaction ration from {} to {} for {} because there are {} files "
- + "and the max tablet files is {}, however no set of files to compact were found.",
- params.getRatio(), highRatio, params.getTableId(), params.getCandidates().size(),
- maxTabletFiles);
+ "Unable to plan compaction for {} that has too many files. {}:{} num_files:{} "
+ + "excluded_large_files:{} max_compaction_size:{} ratio_search_range:{},{} ",
+ tabletId, Property.TABLE_FILE_MAX.getKey(), maxTabletFiles, candidates.size(),
+ excludedBecauseMaxSize, NumUtil.bigNumberForSize(maxSizeToCompact), highRatio,
+ params.getRatio());
+ if (log.isDebugEnabled()) {
+ var sizesOfExamined = examinedFiles.stream()
+ .map(compactableFile -> NumUtil.bigNumberForSize(compactableFile.getEstimatedSize()))
+ .collect(Collectors.toList());
+ HashSet<CompactableFile> excludedFiles = new HashSet<>(candidates);
+ examinedFiles.forEach(excludedFiles::remove);
+ var sizesOfExcluded = excludedFiles.stream()
+ .map(compactableFile -> NumUtil.bigNumberForSize(compactableFile.getEstimatedSize()))
+ .collect(Collectors.toList());
+ log.debug("Failed planning details for {} examined_file_sizes:{} excluded_file_sizes:{}",
+ tabletId, sizesOfExamined, sizesOfExcluded);
+ }
}

log.info(
"For {} found {} files to compact lowering compaction ratio from {} to {} because the tablet "
+ "exceeded {} files, it had {}",
- params.getTableId(), found.size(), params.getRatio(), lowRatio, maxTabletFiles,
+ params.getTabletId(), found.size(), params.getRatio(), lowRatio, maxTabletFiles,
params.getCandidates().size());

return found;
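
The logging pattern in this hunk (a short warning with `key:value` fields, plus per-file detail that is only computed when debug logging is on) can be sketched in isolation. This is a minimal illustration, not the planner's code: it uses `java.util.logging` as a stand-in for SLF4J so it needs no dependencies, represents files as plain `String` names, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;

// Hypothetical sketch; java.util.logging stands in for the SLF4J logger used in the diff.
final class PlanningFailureLogSketch {
  private static final Logger LOG = Logger.getLogger(PlanningFailureLogSketch.class.getName());

  /** Returns the candidates that were not examined, using the same set difference as the diff. */
  static Set<String> excluded(Set<String> candidates, List<String> examined) {
    Set<String> result = new HashSet<>(candidates);
    examined.forEach(result::remove);
    return result;
  }

  /** Builds the cheap summary line; only counts are needed, so it is always computed. */
  static String summary(String tabletId, int maxTabletFiles, Set<String> candidates,
      List<String> examined) {
    int excludedCount = candidates.size() - examined.size();
    return String.format(
        "Unable to plan compaction for %s that has too many files. max_files:%d num_files:%d "
            + "excluded_large_files:%d",
        tabletId, maxTabletFiles, candidates.size(), excludedCount);
  }

  public static void main(String[] args) {
    Set<String> candidates = Set.of("f1", "f2", "f3", "f4");
    List<String> examined = List.of("f1", "f2");

    LOG.warning(summary("tablet-1", 3, candidates, examined));

    // The per-file listing is more expensive, so it is only built when it will be logged.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("excluded_files:"
          + excluded(candidates, examined).stream().sorted().collect(Collectors.toList()));
    }
  }
}
```

Guarding the detailed message keeps the stream and set work off the normal path, which matters if this condition is hit repeatedly for many tablets.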
@@ -482,15 +500,18 @@ private Set<CompactableFile> getExpected(Collection<CompactionJob> compacting) {
return sortedFiles.subList(0, numToCompact);
}

- static Collection<CompactableFile> findDataFilesToCompact(Set<CompactableFile> files,
- double ratio, int maxFilesToCompact, long maxSizeToCompact) {
- if (files.size() <= 1) {
- return Collections.emptySet();
- }

+ /**
+ * @return a list of the smallest files where the sum of the sizes is less than maxSizeToCompact
+ */
+ static List<CompactableFile> sortAndLimitByMaxSize(Set<CompactableFile> files,
+ long maxSizeToCompact) {
// sort files from smallest to largest. So position 0 has the smallest file.
List<CompactableFile> sortedFiles = sortByFileSize(files);

+ if (maxSizeToCompact == Long.MAX_VALUE) {
+ return sortedFiles;
+ }

int maxSizeIndex = sortedFiles.size();
long sum = 0;
for (int i = 0; i < sortedFiles.size(); i++) {
@@ -502,10 +523,22 @@ static Collection<CompactableFile> findDataFilesToCompact(Set<CompactableFile> f
}

if (maxSizeIndex < sortedFiles.size()) {
- sortedFiles = sortedFiles.subList(0, maxSizeIndex);
- if (sortedFiles.size() <= 1) {
- return Collections.emptySet();
- }
+ return sortedFiles.subList(0, maxSizeIndex);
+ } else {
+ return sortedFiles;
+ }
+ }

+ static Collection<CompactableFile> findDataFilesToCompact(Set<CompactableFile> files,
+ double ratio, int maxFilesToCompact, long maxSizeToCompact) {

+ if (files.size() <= 1) {
+ return Collections.emptySet();
+ }

+ List<CompactableFile> sortedFiles = sortAndLimitByMaxSize(files, maxSizeToCompact);
+ if (sortedFiles.size() <= 1) {
}

int windowStart = 0;
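
The extracted helper sorts files from smallest to largest and keeps the longest prefix that fits under the size limit. The body of its loop is collapsed in this view, so the following is only a sketch of that idea under stated assumptions: it works on plain `long` sizes rather than `CompactableFile`, and it assumes the cut happens at the first file whose addition pushes the running sum above the limit (the real code may use a different boundary condition).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch operating on sizes only; not the DefaultCompactionPlanner code.
final class SizeLimitSketch {

  /** Returns the smallest sizes, in ascending order, whose running sum stays within maxSize. */
  static List<Long> sortAndLimitByMaxSize(Collection<Long> sizes, long maxSize) {
    // Sort from smallest to largest so position 0 holds the smallest size.
    List<Long> sorted = new ArrayList<>(sizes);
    Collections.sort(sorted);

    // No limit configured, so every file is eligible.
    if (maxSize == Long.MAX_VALUE) {
      return sorted;
    }

    int end = sorted.size();
    long sum = 0;
    for (int i = 0; i < sorted.size(); i++) {
      sum += sorted.get(i);
      // Assumed boundary: stop at the first file that pushes the sum past the limit.
      if (sum > maxSize) {
        end = i;
        break;
      }
    }
    return sorted.subList(0, end);
  }

  public static void main(String[] args) {
    List<Long> sizes = List.of(50L, 10L, 30L, 100L);
    System.out.println(sortAndLimitByMaxSize(sizes, 90L));
    System.out.println(sortAndLimitByMaxSize(sizes, Long.MAX_VALUE));
  }
}
```

Returning the limited list instead of an empty set when one or zero files remain lets the caller decide what to do, which is what allows the warning path to count how many candidates were excluded by size.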
@@ -45,6 +45,9 @@
import org.apache.accumulo.core.conf.SiteConfiguration;
import org.apache.accumulo.core.data.NamespaceId;
import org.apache.accumulo.core.data.TableId;
+ import org.apache.accumulo.core.data.TabletId;
+ import org.apache.accumulo.core.dataImpl.KeyExtent;
+ import org.apache.accumulo.core.dataImpl.TabletIdImpl;
import org.apache.accumulo.core.spi.common.ServiceEnvironment;
import org.apache.accumulo.core.spi.common.ServiceEnvironment.Configuration;
import org.apache.accumulo.core.spi.compaction.CompactionPlan.Builder;
@@ -754,6 +757,11 @@ public TableId getTableId() {
return TableId.of("42");
}

+ @Override
+ public TabletId getTabletId() {
+ return new TabletIdImpl(new KeyExtent(getTableId(), null, null));
+ }

@Override
public ServiceEnvironment getServiceEnvironment() {
ServiceEnvironment senv = EasyMock.createMock(ServiceEnvironment.class);
@@ -27,7 +27,6 @@
import org.junit.jupiter.api.Test;

public class NumUtilTest {

@Test
public void testBigNumberForSize() {
Locale.setDefault(Locale.US);
@@ -47,7 +47,9 @@
import org.apache.accumulo.core.conf.Property;
import org.apache.accumulo.core.data.NamespaceId;
import org.apache.accumulo.core.data.TableId;
+ import org.apache.accumulo.core.data.TabletId;
import org.apache.accumulo.core.dataImpl.KeyExtent;
+ import org.apache.accumulo.core.dataImpl.TabletIdImpl;
import org.apache.accumulo.core.spi.common.ServiceEnvironment;
import org.apache.accumulo.core.spi.compaction.CompactionExecutorId;
import org.apache.accumulo.core.spi.compaction.CompactionJob;
@@ -247,6 +249,11 @@ public TableId getTableId() {
return comp.getTableId();
}

+ @Override
+ public TabletId getTabletId() {
+ return new TabletIdImpl(comp.getExtent());
+ }

@Override
public ServiceEnvironment getServiceEnvironment() {
return senv;