bryndenZh:

Purpose

In high-concurrency point-query scenarios on primary key tables, we observed high CPU usage caused mainly by the deserialization overhead of DV metadata. Currently, reading the deletion vector metadata for a single bucket requires reading and deserializing a large number of entries from the index manifest if the table has many partitions and buckets.

This PR introduces a bucket-level DV meta cache, which reduces CPU load and significantly improves QPS for single-bucket query scenarios on primary key tables.
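For readers skimming the thread, a minimal sketch of the mechanism (CacheKey, maxNum, and readDeletionFilesFromIndexManifest are hypothetical names for illustration, and Paimon imports are omitted; the PR's actual cache class, DVMetaCache, appears in the review snippets below):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.util.Map;

// Illustrative only: cache the DataFile -> DeletionFile mapping per
// (indexManifestName, partition, bucket). A point query touching a single
// bucket then skips deserializing every entry of the index manifest.
Cache<CacheKey, Map<String, DeletionFile>> cache =
        Caffeine.newBuilder().maximumSize(maxNum).build();

Map<String, DeletionFile> lookup(String indexManifestName, BinaryRow partition, int bucket) {
    CacheKey key = new CacheKey(indexManifestName, partition, bucket);
    Map<String, DeletionFile> files = cache.getIfPresent(key);
    if (files == null) {
        // Miss: deserialize the relevant index manifest entries once, then
        // remember the result for subsequent point queries on this bucket.
        files = readDeletionFilesFromIndexManifest(indexManifestName, partition, bucket);
        cache.put(key, files);
    }
    return files;
}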

Tests

API and Format

Documentation

public static final ConfigOption<Boolean> CACHE_DV_ENABLED =
key("cache.dv.enabled")
.booleanType()
.defaultValue(false)
Contributor:
Maybe we should enable this by default.
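If it is enabled by default, the change is just the default value; a sketch combining the quoted lines with the description shown further down:

public static final ConfigOption<Boolean> CACHE_DV_ENABLED =
        key("cache.dv.enabled")
                .booleanType()
                .defaultValue(true)
                .withDescription("Whether to enable deletion vector meta cache.");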

"Controls the max number for snapshots per table in the catalog are cached.");

public static final ConfigOption<Boolean> CACHE_DV_ENABLED =
key("cache.dv.enabled")
Contributor:

We don't need this one; just use the max-num option to decide whether the DV cache is enabled (if max num is zero, disable it).
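A sketch of that suggestion (the options handle and call site are illustrative, not the PR's code):

// Derive enablement from the max-num option alone: zero disables the cache.
int maxNum = options.get(CACHE_DV_MAX_NUM);
DVMetaCache dvMetaCache = maxNum > 0 ? new DVMetaCache(maxNum) : null;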

.withDescription("Whether to enable deletion vector meta cache.");

public static final ConfigOption<Integer> CACHE_DV_MAX_NUM =
key("cache.dv.max-num")
Contributor:

cache.deletion-vectors.max-num

result.put(
dvMeta.dataFileName(),
new DeletionFile(
dvIndex(partition, bucket).path(fileMeta).toString(),
Contributor:

Extract DeletionVectorsIndexFile dvIndex = dvIndex(partition, bucket); once, to avoid creating it every time.
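A sketch of that refactor; the loop shape and the DeletionVectorMeta/DeletionFile accessors beyond the quoted lines are assumptions:

// Create the index file handle and its path once per (partition, bucket)
// instead of once per deletion vector entry.
DeletionVectorsIndexFile dvIndex = dvIndex(partition, bucket);
String indexFilePath = dvIndex.path(fileMeta).toString();
for (DeletionVectorMeta dvMeta : fileMeta.dvRanges().values()) {
    result.put(
            dvMeta.dataFileName(),
            new DeletionFile(
                    indexFilePath, dvMeta.offset(), dvMeta.length(), dvMeta.cardinality()));
}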

// Construct DataFile -> DeletionFile based on IndexFileMeta
public Map<String, DeletionFile> extractDeletionFileByMeta(
BinaryRow partition, Integer bucket, IndexFileMeta fileMeta) {
if (fileMeta.dvRanges() != null && fileMeta.dvRanges().size() > 0) {
Contributor:

Extract a local variable for fileMeta.dvRanges().
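That is, something like this (the exact generic type of dvRanges() is an assumption):

// Read dvRanges() once and reuse the local for both the check and the loop.
LinkedHashMap<String, DeletionVectorMeta> dvRanges = fileMeta.dvRanges();
if (dvRanges != null && !dvRanges.isEmpty()) {
    // ... build the DataFile -> DeletionFile map from dvRanges ...
}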


@Nullable
// Construct DataFile -> DeletionFile based on IndexFileMeta
public Map<String, DeletionFile> extractDeletionFileByMeta(
Contributor:

Remove public and add @VisibleForTesting.
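A signature-only sketch (assuming Paimon's own @VisibleForTesting annotation):

// Package-private, but still reachable from tests in the same package.
@VisibleForTesting
@Nullable
Map<String, DeletionFile> extractDeletionFileByMeta(
        BinaryRow partition, Integer bucket, IndexFileMeta fileMeta)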

DELETION_VECTORS_INDEX,
partitionBuckets.stream().map(Pair::getLeft).collect(Collectors.toSet()));
Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> result = new HashMap<>();
partitionBuckets.forEach(
Contributor:

Just use partitionFileMetas.forEach?
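A sketch, assuming the scan above produces Map<Pair<BinaryRow, Integer>, List<IndexFileMeta>> partitionFileMetas: iterating what was actually read (rather than the requested set) skips buckets that have no index files at all.

partitionFileMetas.forEach(
        (partitionBucket, fileMetas) -> {
            Map<String, DeletionFile> files = new HashMap<>();
            for (IndexFileMeta fileMeta : fileMetas) {
                Map<String, DeletionFile> extracted =
                        extractDeletionFileByMeta(
                                partitionBucket.getLeft(), partitionBucket.getRight(), fileMeta);
                if (extracted != null) {
                    files.putAll(extracted);
                }
            }
            result.put(partitionBucket, files);
        });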

MemorySize dvTargetFileSize,
boolean dvBitmap64) {
boolean dvBitmap64,
boolean enableDVMetaCache) {
JingsongLi (Contributor), Oct 16, 2025:

Why not just pass DVMetaCache here? You don't need to use IndexManifestFile for the cache at all.
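A sketch of the suggested injection; the enclosing constructor here is hypothetical, only the final parameter is the point:

// Inject the cache itself (null meaning "disabled") rather than a boolean
// flag plus cache state living inside IndexManifestFile.
HypotheticalStoreRead(
        MemorySize dvTargetFileSize,
        boolean dvBitmap64,
        @Nullable DVMetaCache dvMetaCache) {
    this.dvMetaCache = dvMetaCache;
}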


// Scan the DV meta cache first; if absent, scan the DV index file. Returns the exact
// deletion files of the specified partition/buckets
public Map<String, DeletionFile> scanDVIndexWithCache(
Contributor:

We can just have one method, Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> scanDVIndex(Snapshot snapshot, Set<Pair<BinaryRow, Integer>> partitionBuckets), and deal with the cache inside it.
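A sketch of that shape; dvMetaCache's get/put signatures and readDVIndex (the manifest-reading path for misses) are assumptions, not the PR's exact code:

public Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> scanDVIndex(
        Snapshot snapshot, Set<Pair<BinaryRow, Integer>> partitionBuckets) {
    Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> result = new HashMap<>();
    Set<Pair<BinaryRow, Integer>> misses = new HashSet<>();
    String indexManifest = snapshot.indexManifest();
    for (Pair<BinaryRow, Integer> pb : partitionBuckets) {
        Map<String, DeletionFile> cached =
                dvMetaCache == null
                        ? null
                        : dvMetaCache.get(indexManifest, pb.getLeft(), pb.getRight());
        if (cached != null) {
            result.put(pb, cached);
        } else {
            misses.add(pb);
        }
    }
    // Read the index manifest only for cache misses, then back-fill the cache.
    readDVIndex(snapshot, misses)
            .forEach(
                    (pb, files) -> {
                        result.put(pb, files);
                        if (dvMetaCache != null) {
                            dvMetaCache.put(indexManifest, pb.getLeft(), pb.getRight(), files);
                        }
                    });
    return result;
}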

// read from cache
String indexManifestName = snapshot.indexManifest();
Map<String, DeletionFile> result =
indexManifestFile.readFromDVMetaCache(indexManifestName, partition, bucket);
Contributor:

Inline indexManifestFile.readFromDVMetaCache.
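That is, roughly (the cache accessor signature is assumed):

// Hit the cache directly; IndexManifestFile then needs no cache state.
Map<String, DeletionFile> result =
        dvMetaCache.get(snapshot.indexManifest(), partition, bucket);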

}
});
// bucketDeletionFiles can be empty
indexManifestFile.fillDVMetaCache(
Contributor:

Inline indexManifestFile.fillDVMetaCache.
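And the write side, correspondingly (signature assumed):

// Populate the cache directly; an empty bucketDeletionFiles map is a valid
// entry, so "no deletion vectors in this bucket" is cached as well.
dvMetaCache.put(snapshot.indexManifest(), partition, bucket, bucketDeletionFiles);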

this.cache.put(key, cacheValue);
}

private static class DVMetaCacheValue extends DeletionVectorMeta {
JingsongLi (Contributor), Oct 16, 2025:

Don't extend DeletionVectorMeta.


public DVMetaCacheValue(
String fileName,
String dataFileName,
Contributor:

It is deletionFilePath.

private final String fileName;

public DVMetaCacheValue(
String fileName,
Contributor:

It is dataFileName.
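Putting the three comments above together, the value class might look like this (the offset/length fields are an assumption, mirroring what a DeletionFile needs):

// Standalone value class: composition instead of extending DeletionVectorMeta,
// using the field names the reviewer proposes.
private static class DVMetaCacheValue {
    private final String dataFileName;     // was "fileName"
    private final String deletionFilePath; // was "dataFileName"
    private final long offset;
    private final long length;

    DVMetaCacheValue(String dataFileName, String deletionFilePath, long offset, long length) {
        this.dataFileName = dataFileName;
        this.deletionFilePath = deletionFilePath;
        this.offset = offset;
        this.length = length;
    }
}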


public DVMetaCache(long maxElementSize) {
this.cache =
Caffeine.newBuilder().maximumSize(maxElementSize).executor(Runnable::run).build();
JingsongLi (Contributor), Oct 16, 2025:

Should it be set to the max number of DVMetaCacheValue entries? And use softValues?
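A sketch of both suggestions together, assuming the cached value is the per-bucket map of DVMetaCacheValues (CacheKey is a hypothetical key type):

this.cache =
        Caffeine.newBuilder()
                // Bound the total number of DVMetaCacheValues, not the number
                // of cached buckets.
                .maximumWeight(maxElementSize)
                .weigher((CacheKey key, Map<String, DVMetaCacheValue> value) -> value.size())
                // Let the GC reclaim entries under memory pressure.
                .softValues()
                .executor(Runnable::run)
                .build();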
