[core] Introduce deletion vector meta cache at bucket level #6407
base: master
Conversation
public static final ConfigOption<Boolean> CACHE_DV_ENABLED =
        key("cache.dv.enabled")
                .booleanType()
                .defaultValue(false)
Maybe we should enable this by default.
"Controls the max number for snapshots per table in the catalog are cached."); | ||
|
||
public static final ConfigOption<Boolean> CACHE_DV_ENABLED = | ||
key("cache.dv.enabled") |
We don't need this one; just use the max-num option to decide whether the DV cache is enabled (if max num is zero, disable it).
.withDescription("Whether to enable deletion vector meta cache."); | ||
|
||
public static final ConfigOption<Integer> CACHE_DV_MAX_NUM = | ||
key("cache.dv.max-num") |
Rename the key to `cache.deletion-vectors.max-num`.
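Combining the two suggestions in these threads, the boolean option could be dropped and the max-num option alone could control enablement. A rough sketch, assuming the renamed key; the default value and description text are placeholders, not necessarily what the PR ends up with:

```java
// Sketch only: a single option; a value of 0 disables the deletion vector meta cache.
public static final ConfigOption<Integer> CACHE_DV_MAX_NUM =
        key("cache.deletion-vectors.max-num")
                .intType()
                .defaultValue(0)
                .withDescription(
                        "Controls the max number of deletion vector metas cached in the catalog. "
                                + "Set to 0 to disable the deletion vector meta cache.");

// Enablement is then derived from the value instead of a separate boolean option:
boolean dvMetaCacheEnabled = options.get(CACHE_DV_MAX_NUM) > 0;
```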
result.put(
        dvMeta.dataFileName(),
        new DeletionFile(
                dvIndex(partition, bucket).path(fileMeta).toString(),
Extract a local variable `DeletionVectorsIndexFile dvIndex = dvIndex(partition, bucket);` to avoid creating it every time.
// Construct DataFile -> DeletionFile based on IndexFileMeta
public Map<String, DeletionFile> extractDeletionFileByMeta(
        BinaryRow partition, Integer bucket, IndexFileMeta fileMeta) {
    if (fileMeta.dvRanges() != null && fileMeta.dvRanges().size() > 0) {
Extract a local variable for `fileMeta.dvRanges()`.
@Nullable
// Construct DataFile -> DeletionFile based on IndexFileMeta
public Map<String, DeletionFile> extractDeletionFileByMeta(
Remove `public` and add `@VisibleForTesting`.
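Taken together, the suggestions in these threads (local `dvIndex`, local `dvRanges`, package-private visibility with `@VisibleForTesting`) might reshape the method roughly like this. This is only a sketch reconstructed from the diff fragments above; the declared type of `dvRanges()` and the remaining `DeletionFile` constructor arguments are assumptions:

```java
@Nullable
@VisibleForTesting
// Construct DataFile -> DeletionFile based on IndexFileMeta
Map<String, DeletionFile> extractDeletionFileByMeta(
        BinaryRow partition, Integer bucket, IndexFileMeta fileMeta) {
    Map<String, DeletionVectorMeta> dvRanges = fileMeta.dvRanges(); // assumed declared type
    if (dvRanges == null || dvRanges.isEmpty()) {
        return null;
    }
    // Create the index file handle and its path once instead of per entry.
    DeletionVectorsIndexFile dvIndex = dvIndex(partition, bucket);
    String indexPath = dvIndex.path(fileMeta).toString();
    Map<String, DeletionFile> result = new HashMap<>();
    for (DeletionVectorMeta dvMeta : dvRanges.values()) {
        result.put(
                dvMeta.dataFileName(),
                // Remaining constructor arguments assumed (offset/length/cardinality layout).
                new DeletionFile(
                        indexPath, dvMeta.offset(), dvMeta.length(), dvMeta.cardinality()));
    }
    return result;
}
```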
DELETION_VECTORS_INDEX,
partitionBuckets.stream().map(Pair::getLeft).collect(Collectors.toSet()));
Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> result = new HashMap<>();
partitionBuckets.forEach(
Just use `partitionFileMetas.forEach`?
MemorySize dvTargetFileSize,
boolean dvBitmap64) {
boolean dvBitmap64,
boolean enableDVMetaCache) {
Why not just pass `DVMetaCache` here? You don't need to use `IndexManifestFile` for the cache at all.
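Following this suggestion, the constructor would take the cache object itself (with null meaning disabled) instead of a boolean flag plus `IndexManifestFile` plumbing. A minimal sketch, with the enclosing class name and other parameters purely illustrative:

```java
// Sketch: accept the cache directly; a null cache means the feature is disabled.
public DvAwareComponent(          // illustrative name for the class that currently takes the flag
        MemorySize dvTargetFileSize,
        boolean dvBitmap64,
        @Nullable DVMetaCache dvMetaCache) {
    this.dvTargetFileSize = dvTargetFileSize;
    this.dvBitmap64 = dvBitmap64;
    this.dvMetaCache = dvMetaCache;
}
```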
// Scan DV Meta Cache first, if not exist, scan DV index file, returns the exact deletion file
// of the specified partition/buckets
public Map<String, DeletionFile> scanDVIndexWithCache(
We can just have one method, `Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> scanDVIndex(Snapshot snapshot, Set<Pair<BinaryRow, Integer>> partitionBuckets)`, and deal with the cache inside it.
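A sketch of the single method suggested here, handling the cache internally. The cache accessors (`read`/`put`) and the fallback loader `scanDVIndexFromFiles` are assumed names, not the PR's actual API:

```java
// Sketch: one entry point; consult the cache first, then fall back to the DV index files.
public Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> scanDVIndex(
        Snapshot snapshot, Set<Pair<BinaryRow, Integer>> partitionBuckets) {
    Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> result = new HashMap<>();
    Set<Pair<BinaryRow, Integer>> misses = new HashSet<>();
    String indexManifest = snapshot.indexManifest();
    for (Pair<BinaryRow, Integer> pb : partitionBuckets) {
        Map<String, DeletionFile> cached =
                dvMetaCache == null
                        ? null
                        : dvMetaCache.read(indexManifest, pb.getLeft(), pb.getRight());
        if (cached != null) {
            result.put(pb, cached);
        } else {
            misses.add(pb);
        }
    }
    if (!misses.isEmpty()) {
        // Read the DV index files only for the misses, then populate the cache.
        scanDVIndexFromFiles(snapshot, misses)
                .forEach(
                        (pb, files) -> {
                            if (dvMetaCache != null) {
                                dvMetaCache.put(indexManifest, pb.getLeft(), pb.getRight(), files);
                            }
                            result.put(pb, files);
                        });
    }
    return result;
}
```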
// read from cache
String indexManifestName = snapshot.indexManifest();
Map<String, DeletionFile> result =
        indexManifestFile.readFromDVMetaCache(indexManifestName, partition, bucket);
Inline `indexManifestFile.readFromDVMetaCache`.
}
});
// bucketDeletionFiles can be empty
indexManifestFile.fillDVMetaCache(
Inline `indexManifestFile.fillDVMetaCache`.
    this.cache.put(key, cacheValue);
}

private static class DVMetaCacheValue extends DeletionVectorMeta {
Don't extend `DeletionVectorMeta`.
public DVMetaCacheValue(
        String fileName,
        String dataFileName,
It should be `deletionFilePath`.
private final String fileName;

public DVMetaCacheValue(
        String fileName,
It should be `dataFileName`.
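Applying the comments above (do not extend `DeletionVectorMeta`, and name the fields `deletionFilePath` and `dataFileName`), the cache value could be a plain holder like the sketch below. The `offset`/`length` fields are assumptions about what else needs to be cached per entry:

```java
// Sketch: a standalone cache value instead of a DeletionVectorMeta subclass.
private static class DVMetaCacheValue {

    private final String deletionFilePath; // path of the deletion vectors index file
    private final String dataFileName;     // data file the deletion vector belongs to
    private final int offset;               // assumed: position of the DV inside the index file
    private final int length;               // assumed: serialized size of the DV

    public DVMetaCacheValue(
            String deletionFilePath, String dataFileName, int offset, int length) {
        this.deletionFilePath = deletionFilePath;
        this.dataFileName = dataFileName;
        this.offset = offset;
        this.length = length;
    }
}
```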
public DVMetaCache(long maxElementSize) {
    this.cache =
            Caffeine.newBuilder().maximumSize(maxElementSize).executor(Runnable::run).build();
Should it be set to the max number of `DVMetaCacheValue` entries? And should we use `softValues`?
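If the maximum is interpreted as a count of cached `DVMetaCacheValue` entries and soft references are desired, the builder might look like this. A sketch only; whether `softValues` is actually appropriate here is exactly the open question above:

```java
// Sketch: bound the cache by entry count and let the GC reclaim values under memory pressure.
public DVMetaCache(long maxElementSize) {
    this.cache =
            Caffeine.newBuilder()
                    .maximumSize(maxElementSize)
                    .softValues()
                    .executor(Runnable::run)
                    .build();
}
```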
Purpose
In high-concurrency point query scenarios on primary key tables, we observed high CPU usage, mainly caused by the deserialization overhead of DV metadata. Currently, reading deletion vector metadata for a single bucket requires reading and deserializing a large number of entries from the index manifest if the table has many partitions and buckets.

This PR introduces a bucket-level DV meta cache, which reduces CPU load and significantly improves QPS for single-bucket query scenarios on primary key tables.
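Conceptually, the cache keys DV metadata at (index manifest, partition, bucket) granularity, so a point query that hits the cache resolves its deletion files with a single lookup instead of re-reading the whole index manifest. An illustrative (not exact) shape of the read path, with the cache accessors and the per-bucket loader as assumed names:

```java
// Illustrative only: a bucket-level lookup replaces a full index-manifest scan on cache hits.
Map<String, DeletionFile> deletionFiles =
        dvMetaCache.read(snapshot.indexManifest(), partition, bucket);
if (deletionFiles == null) {
    // Cache miss: read the DV index entries for this bucket once, then cache them.
    deletionFiles = scanDVIndexForBucket(snapshot, partition, bucket);
    dvMetaCache.put(snapshot.indexManifest(), partition, bucket, deletionFiles);
}
```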
Tests
API and Format
Documentation