Skip to content

Commit 44c4c45

Browse files
committed
add docs
1 parent d91ba56 commit 44c4c45

File tree

1 file changed

+108
-0
lines changed

1 file changed

+108
-0
lines changed
Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,108 @@
1+
---
2+
title: "PK Clustering Override"
3+
weight: 10
4+
type: docs
5+
aliases:
6+
- /primary-key-table/pk-clustering-override.html
7+
---
8+
<!--
9+
Licensed to the Apache Software Foundation (ASF) under one
10+
or more contributor license agreements. See the NOTICE file
11+
distributed with this work for additional information
12+
regarding copyright ownership. The ASF licenses this file
13+
to you under the Apache License, Version 2.0 (the
14+
"License"); you may not use this file except in compliance
15+
with the License. You may obtain a copy of the License at
16+
17+
http://www.apache.org/licenses/LICENSE-2.0
18+
19+
Unless required by applicable law or agreed to in writing,
20+
software distributed under the License is distributed on an
21+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
22+
KIND, either express or implied. See the License for the
23+
specific language governing permissions and limitations
24+
under the License.
25+
-->
26+
27+
# PK Clustering Override
28+
29+
By default, data files in a primary key table are physically sorted by the primary key. This is optimal for point
30+
lookups but can hurt scan performance when queries filter on non-primary-key columns.
31+
32+
**PK Clustering Override** mode changes the physical sort order of data files from the primary key to user-specified
33+
clustering columns. This significantly improves scan performance for queries that filter or group by clustering columns,
34+
while still maintaining primary key uniqueness through deletion vectors.
35+
36+
## Quick Start
37+
38+
```sql
39+
CREATE TABLE my_table (
40+
id BIGINT,
41+
dt STRING,
42+
city STRING,
43+
amount DOUBLE,
44+
PRIMARY KEY (id) NOT ENFORCED
45+
) WITH (
46+
'pk-clustering-override' = 'true',
47+
'clustering.columns' = 'city',
48+
'deletion-vectors.enabled' = 'true',
49+
'bucket' = '4'
50+
);
51+
```
52+
53+
After this, data files within each bucket will be physically sorted by `city` instead of `id`. Queries like
54+
`SELECT * FROM my_table WHERE city = 'Beijing'` can skip irrelevant data files by checking their min/max statistics
55+
on the clustering column.
56+
57+
## How It Works
58+
59+
PK Clustering Override replaces the default LSM compaction with a two-phase clustering compaction:
60+
61+
**Phase 1 — Sort by Clustering Columns**: Newly flushed (level 0) files are read, sorted by the configured clustering
62+
columns, and rewritten as sorted (level 1) files. A key index tracks each primary key's file and row position to
63+
maintain uniqueness.
64+
65+
**Phase 2 — Merge Overlapping Sections**: Sorted files are grouped into sections based on clustering column range
66+
overlap. Overlapping sections are merged together. Adjacent small sections are also consolidated to reduce file count
67+
and IO amplification. Non-overlapping large files are left untouched.
68+
69+
During both phases, deduplication is handled via deletion vectors:
70+
71+
- **Deduplicate mode**: When a key already exists in an older file, the old row is marked as deleted.
72+
- **First-row mode**: When a key already exists, the new row is marked as deleted, keeping the first-seen value.
73+
74+
When the number of files to merge exceeds `sort-spill-threshold`, smaller files are first spilled to row-based
75+
temporary files to reduce memory consumption, preventing OOM during multi-way merge.
76+
77+
## Requirements
78+
79+
| Option | Requirement |
80+
|--------|-------------|
81+
| `pk-clustering-override` | `true` |
82+
| `clustering.columns` | Must be set (one or more non-primary-key columns) |
83+
| `deletion-vectors.enabled` | Must be `true` |
84+
| `merge-engine` | `deduplicate` (default) or `first-row` only |
85+
| `sequence.fields` | Must **not** be set |
86+
| `record-level.expire-time` | Must **not** be set |
87+
88+
## Related Options
89+
90+
| Option | Default | Description |
91+
|--------|---------|-------------|
92+
| `clustering.columns` | (none) | Comma-separated column names used as the physical sort order for data files. |
93+
| `sort-spill-threshold` | (auto) | When the number of merge readers exceeds this value, smaller files are spilled to row-based temp files to reduce memory usage. |
94+
| `sort-spill-buffer-size` | `64 mb` | Buffer size used for external sort during Phase 1 rewrite. |
95+
96+
## When to Use
97+
98+
PK Clustering Override is beneficial when:
99+
100+
- Analytical queries frequently filter or aggregate on non-primary-key columns (e.g., `WHERE city = 'Beijing'`).
101+
- The table uses `deduplicate` or `first-row` merge engine.
102+
- You want data files physically co-located by a business dimension rather than the primary key.
103+
104+
It is **not** suitable when:
105+
106+
- Point lookups by primary key are the dominant access pattern (default LSM sort is already optimal).
107+
- You need `partial-update` or `aggregation` merge engine.
108+
- `sequence.fields` or `record-level.expire-time` is required.

0 commit comments

Comments
 (0)