[core] Introduce deletion vector meta cache at bucket level #6407
base: master
Conversation
public static final ConfigOption<Boolean> CACHE_DV_ENABLED =
        key("cache.dv.enabled")
                .booleanType()
                .defaultValue(false)
Maybe we should enable this by default.
"Controls the max number for snapshots per table in the catalog are cached."); | ||
|
||
public static final ConfigOption<Boolean> CACHE_DV_ENABLED = | ||
key("cache.dv.enabled") |
We don't need this one; just use the max-num option to decide whether the DV cache is enabled (if max num is zero, disable it).
.withDescription("Whether to enable deletion vector meta cache."); | ||
|
||
public static final ConfigOption<Integer> CACHE_DV_MAX_NUM = | ||
key("cache.dv.max-num") |
Rename the key to `cache.deletion-vectors.max-num`.
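Combining the two suggestions in these threads, the boolean option could be dropped and the max-num option alone could control enablement. A rough sketch, assuming the renamed key; the default value and description text are placeholders, not necessarily what the PR ends up with:

```java
// Sketch only: a single option; a value of 0 disables the deletion vector meta cache.
public static final ConfigOption<Integer> CACHE_DV_MAX_NUM =
        key("cache.deletion-vectors.max-num")
                .intType()
                .defaultValue(0)
                .withDescription(
                        "Controls the max number of deletion vector metas cached in the catalog. "
                                + "Set to 0 to disable the deletion vector meta cache.");

// Enablement is then derived from the value instead of a separate boolean option:
boolean dvMetaCacheEnabled = options.get(CACHE_DV_MAX_NUM) > 0;
```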
result.put(
        dvMeta.dataFileName(),
        new DeletionFile(
                dvIndex(partition, bucket).path(fileMeta).toString(),
Extract a local variable `DeletionVectorsIndexFile dvIndex = dvIndex(partition, bucket);` to avoid creating it every time.
// Construct DataFile -> DeletionFile based on IndexFileMeta
public Map<String, DeletionFile> extractDeletionFileByMeta(
        BinaryRow partition, Integer bucket, IndexFileMeta fileMeta) {
    if (fileMeta.dvRanges() != null && fileMeta.dvRanges().size() > 0) {
Extract a local variable for `fileMeta.dvRanges()`.
@Nullable
// Construct DataFile -> DeletionFile based on IndexFileMeta
public Map<String, DeletionFile> extractDeletionFileByMeta(
Remove `public` and add `@VisibleForTesting`.
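Taken together, the suggestions in these threads (local `dvIndex`, local `dvRanges`, package-private visibility with `@VisibleForTesting`) might reshape the method roughly like this. This is only a sketch reconstructed from the diff fragments above; the declared type of `dvRanges()` and the remaining `DeletionFile` constructor arguments are assumptions:

```java
@Nullable
@VisibleForTesting
// Construct DataFile -> DeletionFile based on IndexFileMeta
Map<String, DeletionFile> extractDeletionFileByMeta(
        BinaryRow partition, Integer bucket, IndexFileMeta fileMeta) {
    Map<String, DeletionVectorMeta> dvRanges = fileMeta.dvRanges(); // assumed declared type
    if (dvRanges == null || dvRanges.isEmpty()) {
        return null;
    }
    // Create the index file handle and its path once instead of per entry.
    DeletionVectorsIndexFile dvIndex = dvIndex(partition, bucket);
    String indexPath = dvIndex.path(fileMeta).toString();
    Map<String, DeletionFile> result = new HashMap<>();
    for (DeletionVectorMeta dvMeta : dvRanges.values()) {
        result.put(
                dvMeta.dataFileName(),
                // Remaining constructor arguments assumed (offset/length/cardinality layout).
                new DeletionFile(
                        indexPath, dvMeta.offset(), dvMeta.length(), dvMeta.cardinality()));
    }
    return result;
}
```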
DELETION_VECTORS_INDEX,
partitionBuckets.stream().map(Pair::getLeft).collect(Collectors.toSet()));
Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> result = new HashMap<>();
partitionBuckets.forEach(
Just use `partitionFileMetas.forEach`?
MemorySize dvTargetFileSize,
boolean dvBitmap64) {
boolean dvBitmap64,
boolean enableDVMetaCache) {
Why not just pass `DVMetaCache` here? You don't need to use `IndexManifestFile` for the cache at all.
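Following this suggestion, the constructor would take the cache object itself (with null meaning disabled) instead of a boolean flag plus `IndexManifestFile` plumbing. A minimal sketch, with the enclosing class name and other parameters purely illustrative:

```java
// Sketch: accept the cache directly; a null cache means the feature is disabled.
public DvAwareComponent(          // illustrative name for the class that currently takes the flag
        MemorySize dvTargetFileSize,
        boolean dvBitmap64,
        @Nullable DVMetaCache dvMetaCache) {
    this.dvTargetFileSize = dvTargetFileSize;
    this.dvBitmap64 = dvBitmap64;
    this.dvMetaCache = dvMetaCache;
}
```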
// Scan DV Meta Cache first, if not exist, scan DV index file, returns the exact deletion file
// of the specified partition/buckets
public Map<String, DeletionFile> scanDVIndexWithCache(
We can just have one method, `Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> scanDVIndex(Snapshot snapshot, Set<Pair<BinaryRow, Integer>> partitionBuckets)`, and deal with the cache inside it.
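A sketch of the single method suggested here, handling the cache internally. The cache accessors (`read`/`put`) and the fallback loader `scanDVIndexFromFiles` are assumed names, not the PR's actual API:

```java
// Sketch: one entry point; consult the cache first, then fall back to the DV index files.
public Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> scanDVIndex(
        Snapshot snapshot, Set<Pair<BinaryRow, Integer>> partitionBuckets) {
    Map<Pair<BinaryRow, Integer>, Map<String, DeletionFile>> result = new HashMap<>();
    Set<Pair<BinaryRow, Integer>> misses = new HashSet<>();
    String indexManifest = snapshot.indexManifest();
    for (Pair<BinaryRow, Integer> pb : partitionBuckets) {
        Map<String, DeletionFile> cached =
                dvMetaCache == null
                        ? null
                        : dvMetaCache.read(indexManifest, pb.getLeft(), pb.getRight());
        if (cached != null) {
            result.put(pb, cached);
        } else {
            misses.add(pb);
        }
    }
    if (!misses.isEmpty()) {
        // Read the DV index files only for the misses, then populate the cache.
        scanDVIndexFromFiles(snapshot, misses)
                .forEach(
                        (pb, files) -> {
                            if (dvMetaCache != null) {
                                dvMetaCache.put(indexManifest, pb.getLeft(), pb.getRight(), files);
                            }
                            result.put(pb, files);
                        });
    }
    return result;
}
```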
// read from cache
String indexManifestName = snapshot.indexManifest();
Map<String, DeletionFile> result =
        indexManifestFile.readFromDVMetaCache(indexManifestName, partition, bucket);
Inline `indexManifestFile.readFromDVMetaCache`.
}
});
// bucketDeletionFiles can be empty
indexManifestFile.fillDVMetaCache(
Inline `indexManifestFile.fillDVMetaCache`.
    this.cache.put(key, cacheValue);
}

private static class DVMetaCacheValue extends DeletionVectorMeta {
Don't extend `DeletionVectorMeta`.
public DVMetaCacheValue(
        String fileName,
        String dataFileName,
It should be `deletionFilePath`.
private final String fileName;

public DVMetaCacheValue(
        String fileName,
It should be `dataFileName`.
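Applying the comments above (do not extend `DeletionVectorMeta`, and name the fields `deletionFilePath` and `dataFileName`), the cache value could be a plain holder like the sketch below. The `offset`/`length` fields are assumptions about what else needs to be cached per entry:

```java
// Sketch: a standalone cache value instead of a DeletionVectorMeta subclass.
private static class DVMetaCacheValue {

    private final String deletionFilePath; // path of the deletion vectors index file
    private final String dataFileName;     // data file the deletion vector belongs to
    private final int offset;               // assumed: position of the DV inside the index file
    private final int length;               // assumed: serialized size of the DV

    public DVMetaCacheValue(
            String deletionFilePath, String dataFileName, int offset, int length) {
        this.deletionFilePath = deletionFilePath;
        this.dataFileName = dataFileName;
        this.offset = offset;
        this.length = length;
    }
}
```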
public DVMetaCache(long maxElementSize) {
    this.cache =
            Caffeine.newBuilder().maximumSize(maxElementSize).executor(Runnable::run).build();
Should it be set to the max number of `DVMetaCacheValue` entries? And should we use `softValues`?
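If the maximum is interpreted as a count of cached `DVMetaCacheValue` entries and soft references are desired, the builder might look like this. A sketch only; whether `softValues` is actually appropriate here is exactly the open question above:

```java
// Sketch: bound the cache by entry count and let the GC reclaim values under memory pressure.
public DVMetaCache(long maxElementSize) {
    this.cache =
            Caffeine.newBuilder()
                    .maximumSize(maxElementSize)
                    .softValues()
                    .executor(Runnable::run)
                    .build();
}
```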
Purpose
In high-concurrency point query scenarios on primary key tables, we observed high CPU usage, mainly caused by the deserialization overhead of DV metadata. Currently, reading deletion vector metadata for a single bucket requires reading and deserializing a large number of entries from the index manifest if the table has many partitions and buckets.

This PR introduces a bucket-level DV meta cache, which reduces CPU load and significantly improves QPS for single-bucket query scenarios on primary key tables.
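Conceptually, the cache keys DV metadata at (index manifest, partition, bucket) granularity, so a point query that hits the cache resolves its deletion files with a single lookup instead of re-reading the whole index manifest. An illustrative (not exact) shape of the read path, with the cache accessors and the per-bucket loader as assumed names:

```java
// Illustrative only: a bucket-level lookup replaces a full index-manifest scan on cache hits.
Map<String, DeletionFile> deletionFiles =
        dvMetaCache.read(snapshot.indexManifest(), partition, bucket);
if (deletionFiles == null) {
    // Cache miss: read the DV index entries for this bucket once, then cache them.
    deletionFiles = scanDVIndexForBucket(snapshot, partition, bucket);
    dvMetaCache.put(snapshot.indexManifest(), partition, bucket, deletionFiles);
}
```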
Tests
API and Format
Documentation