Commit 51dc38c

committed
fix
1 parent 62c39c6 commit 51dc38c

3 files changed

Lines changed: 65 additions & 3 deletions

Lines changed: 34 additions & 2 deletions
@@ -1,8 +1,9 @@
 ---
 title: "Local Disk Cache"
-weight: 7
+weight: 8
 type: docs
 aliases:
+- /program-api/file-cache.html
 - /pypaimon/file-cache.html
 ---
 <!--
@@ -26,7 +27,7 @@ under the License.
 
 # Local Disk Cache
 
-When reading files from remote storage (S3, OSS, HDFS, etc.), each seek+read goes over the network. PyPaimon provides a block-level local disk cache that transparently caches file reads on local disk, significantly reducing remote I/O for repeated access patterns.
+When reading files from remote storage (S3, OSS, HDFS, etc.), each seek+read goes over the network. Paimon provides a block-level local disk cache that transparently caches file reads on local disk, significantly reducing remote I/O for repeated access patterns.
 
 ## Cached File Types
 
@@ -46,6 +47,33 @@ All file types can be added to the whitelist. The default whitelist is `meta,glo
 
 Use `table.copy()` to pass cache options as dynamic parameters:
 
+{{< tabs "enable-cache" >}}
+
+{{< tab "Java" >}}
+
+```java
+import org.apache.paimon.table.Table;
+
+import java.util.HashMap;
+import java.util.Map;
+
+Table table = catalog.getTable(Identifier.create("my_db", "my_table"));
+
+Map<String, String> options = new HashMap<>();
+options.put("file-cache.enabled", "true");
+// optional: customize cache directory and limits
+options.put("file-cache.dir", "/tmp/paimon-file-cache");
+options.put("file-cache.max-size", "2gb");
+options.put("file-cache.block-size", "1mb");
+
+// All subsequent reads on this table instance will use the cache
+table = table.copy(options);
+```
+
+{{< /tab >}}
+
+{{< tab "Python" >}}
+
 ```python
 table = catalog.get_table("db.my_table")
 
@@ -61,6 +89,10 @@ table = table.copy({
 # All subsequent reads on this table instance will use the cache
 ```
 
+{{< /tab >}}
+
+{{< /tabs >}}
+
 ## Cache Options
 
 | Option | Type | Default | Description |

docs/content/pypaimon/global-index.md

Lines changed: 1 addition & 1 deletion
@@ -133,4 +133,4 @@ read = read_builder.new_read()
 data = read.to_arrow(scan.plan().splits)
 ```
 
-For better performance when reading from remote storage, consider enabling the [Local Disk Cache]({{< ref "pypaimon/file-cache" >}}).
+For better performance when reading from remote storage, consider enabling the [Local Disk Cache]({{< ref "program-api/file-cache" >}}).

docs/layouts/shortcodes/generated/core_configuration.html

Lines changed: 30 additions & 0 deletions
@@ -566,6 +566,36 @@
 <td>String</td>
 <td>Default aggregate function of all fields for partial-update and aggregate merge function.</td>
 </tr>
+<tr>
+<td><h5>file-cache.block-size</h5></td>
+<td style="word-wrap: break-word;">1 mb</td>
+<td>MemorySize</td>
+<td>Block size for local disk cache.</td>
+</tr>
+<tr>
+<td><h5>file-cache.dir</h5></td>
+<td style="word-wrap: break-word;">(none)</td>
+<td>String</td>
+<td>Directory for file block cache. Defaults to a 'paimon-file-cache' subdirectory under the system temp directory.</td>
+</tr>
+<tr>
+<td><h5>file-cache.enabled</h5></td>
+<td style="word-wrap: break-word;">false</td>
+<td>Boolean</td>
+<td>Whether to enable local disk block cache for file reads.</td>
+</tr>
+<tr>
+<td><h5>file-cache.max-size</h5></td>
+<td style="word-wrap: break-word;">9223372036854775807 bytes</td>
+<td>MemorySize</td>
+<td>Maximum total size of the local disk block cache. Unlimited by default.</td>
+</tr>
+<tr>
+<td><h5>file-cache.whitelist</h5></td>
+<td style="word-wrap: break-word;">"meta,global-index"</td>
+<td>String</td>
+<td>Comma-separated list of file types to cache. Supported values: meta, global-index, bucket-index, data, file-index.</td>
+</tr>
 <tr>
 <td><h5>file-index.in-manifest-threshold</h5></td>
 <td style="word-wrap: break-word;">500 bytes</td>
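
Note on `file-cache.block-size`: the option descriptions in this commit describe a block-level cache, which suggests that a read is served from fixed-size, block-aligned chunks of the remote file. The sketch below only illustrates that idea and is not taken from Paimon's implementation. The function name `blocks_for_read` is hypothetical, and the default of 1 MiB is an assumption based on the documented `1 mb` default.

```python
from typing import List, Tuple

ONE_MIB = 1024 * 1024  # assumed to correspond to the documented "1 mb" default


def blocks_for_read(offset: int, length: int,
                    block_size: int = ONE_MIB) -> List[Tuple[int, int, int]]:
    """Return (block_index, block_start, block_end) for each block a read touches.

    Hypothetical helper: block_end is exclusive and is not clamped to the
    file length, since the file size is not known here.
    """
    if offset < 0 or length < 0 or block_size <= 0:
        raise ValueError("offset and length must be >= 0 and block_size > 0")
    if length == 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size  # block holding the last byte
    return [(i, i * block_size, (i + 1) * block_size)
            for i in range(first, last + 1)]


if __name__ == "__main__":
    # A read of 1,000,000 bytes starting at byte 1,500,000 spans two blocks.
    for index, start, end in blocks_for_read(1_500_000, 1_000_000):
        print(index, start, end)
```

Under this model, a repeated read inside an already cached block would need no remote request, while a smaller block size would trade finer-grained caching for more block files on disk.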
