For more details about Auto Partitioned (PrimaryKey/Log) Table, refer to [Auto Partitioning Options](table-design/data-distribution/partitioning.md#auto-partitioning-options).
177
+
For more details about Auto Partitioned (PrimaryKey/Log) Table, refer to [Auto Partitioning](table-design/data-distribution/partitioning.md#auto-partitioning).
website/docs/engine-flink/options.md
Lines changed: 1 addition & 2 deletions
@@ -81,7 +81,6 @@ ALTER TABLE log_table SET ('table.log.ttl' = '7d');
81
81
| bucket.num | int | The bucket number of Fluss cluster. | The number of buckets of a Fluss table. |
82
82
| bucket.key | String | (None) | Specifies the distribution policy of the Fluss table. Data is distributed to each bucket according to the hash value of the bucket key, which must be a subset of the primary keys, excluding partition keys, of the primary key table. If you specify multiple fields, use `,` as the delimiter. If the table has a primary key and a bucket key is not specified, the primary key (excluding the partition key) is used as the bucket key. If the table has no primary key and the bucket key is not specified, the data is distributed to each bucket randomly. |
83
83
| table.log.ttl | Duration | 7 days | The time to live for log segments. The configuration controls the maximum time we will retain a log before we will delete old segments to free up space. If set to -1, the log will not be deleted. |
84
-
| table.dynamic-partition.enabled | Boolean | true | Whether enable dynamic partition for the table. Enable by default. Dynamic partition strategy refers to creating partitions based on the data being written, if the partition does not exist while write data to the table. |
85
84
| table.auto-partition.enabled | Boolean | false | Whether to enable auto partitioning for the table. Disabled by default. When auto partitioning is enabled, the partitions of the table are created automatically. |
86
85
| table.auto-partition.key | String | (None) | This configuration defines the time-based partition key to be used for auto-partitioning when a table is partitioned with multiple keys. Auto-partitioning utilizes a time-based partition key to handle partitions automatically, including creating new ones and removing outdated ones, by comparing the time value of the partition with the current system time. In the case of a table using multiple partition keys (such as a composite partitioning strategy), this feature determines which key should serve as the primary time dimension for making auto-partitioning decisions. And If the table has only one partition key, this config is not necessary. Otherwise, it must be specified. |
87
86
| table.auto-partition.time-unit | ENUM | DAY | The time granularity for auto created partitions. The default value is `DAY`. Valid values are `HOUR`, `DAY`, `MONTH`, `QUARTER`, `YEAR`. If the value is `HOUR`, the partition format for auto created is yyyyMMddHH. If the value is `DAY`, the partition format for auto created is yyyyMMdd. If the value is `MONTH`, the partition format for auto created is yyyyMM. If the value is `QUARTER`, the partition format for auto created is yyyyQ. If the value is `YEAR`, the partition format for auto created is yyyy. |
@@ -152,7 +151,7 @@ ALTER TABLE log_table SET ('table.log.ttl' = '7d');
152
151
| client.writer.retries | Integer | Integer.MAX_VALUE | Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. |
153
152
| client.writer.enable-idempotence | Boolean | true | Writer idempotence is enabled by default if no conflicting configs are set. If conflicting configs are set and writer idempotence is not explicitly enabled, idempotence is disabled. If idempotence is explicitly enabled and conflicting configs are set, a ConfigException is thrown. |
154
153
| client.writer.max-inflight-requests-per-bucket | Integer | 5 | The maximum number of unacknowledged requests per bucket for writer. This configuration can work only if `client.writer.enable-idempotence` is set to true. When the number of inflight requests per bucket exceeds this setting, the writer will wait for the inflight requests to complete before sending out new requests. |
155
-
| client.writer.dynamic-create-partition.enabled | Boolean | true | Whether enable dynamic create partition for client writer. Enable by default. Dynamic partition strategy refers to creating partitions based on the data being written for partitioned table if the wrote partition don't exists. |
154
+
| client.writer.dynamic-create-partition.enabled | Boolean | true | Whether to enable dynamic partition creation for the client writer. When enabled, new partitions are automatically created if they don't already exist during data writes. |
website/docs/table-design/data-distribution/partitioning.md
Lines changed: 15 additions & 8 deletions
@@ -24,10 +24,12 @@ sidebar_position: 2
24
24
## Partitioned Tables
25
25
In Fluss, a **Partitioned Table** organizes data based on one or more partition keys, providing a way to improve query performance and manageability for large datasets. Partitions allow the system to divide data into distinct segments, each corresponding to specific values of the partition keys.
26
26
27
-
For partitioned tables, Fluss not only supports manage partitions by users, like create/drop partitions, but also supports automatic manage partitions.
28
-
- For manually managing partitions, user can create new partitions or drop exists partitions. Learn how to create or drop partitions please refer to [Add Partition](engine-flink/ddl.md#add-partition) and [Drop Partition](engine-flink/ddl.md#drop-partition).
29
-
- For automatically managing partitions, the partitions will be created based on the auto partitioning rules configured at the time of table creation, and expired partitions are automatically removed, ensuring data not expanding unlimited. See [Auto Partitioning Options](table-design/data-distribution/partitioning.md#auto-partitioning-options).
30
-
- Manual management and automated management are orthogonal and can coexist on the same table
27
+
For partitioned tables, Fluss supports three strategies for managing partitions:
28
+
- **Manual partition management**: users can create new partitions or drop existing ones. To learn how to create or drop partitions, refer to [Add Partition](engine-flink/ddl.md#add-partition) and [Drop Partition](engine-flink/ddl.md#drop-partition).
29
+
- **Automatic partition management**: partitions are created based on the auto-partitioning rules configured at table creation time, and expired partitions are removed automatically so that data does not grow without bound. See [Auto Partitioning](table-design/data-distribution/partitioning.md#auto-partitioning).
30
+
- **Dynamic partition creation**: partitions are created automatically based on the data being written to the table. See [Dynamic Partitioning](table-design/data-distribution/partitioning.md#dynamic-partitioning).
31
+
32
+
These three strategies are orthogonal and can coexist on the same table.
31
33
32
34
### Key Benefits of Partitioned Tables
33
35
- **Improved Query Performance:** By narrowing down the query scope to specific partitions, the system reads less data, reducing query execution time.
@@ -40,7 +42,7 @@ For partitioned tables, Fluss not only supports manage partitions by users, like
40
42
- If the table is a primary key table, the partition key must be a subset of the primary key.
41
43
- Auto-partitioning rules can only be configured at the time of creating the partitioned table; modifying the auto-partitioning rules after table creation is not supported.
42
44
43
-
## Auto Partitioning Options
45
+
## Auto Partitioning
44
46
### Example
45
47
The auto-partitioning rules are configured through table options. The following example demonstrates creating a table named `site_access` that supports automatic partitioning using Flink SQL.
46
48
```sql title="Flink SQL"
@@ -62,7 +64,7 @@ CREATE TABLE site_access(
62
64
In this case, when automatic partitioning occurs (Fluss will periodically operate on all tables in the background), four partitions are pre-created with a partition granularity of YEAR, retaining two historical partitions. The time zone is set to Asia/Shanghai.
| table.auto-partition.enabled | Boolean | no | false | Whether to enable auto partitioning for the table. Disabled by default. When auto partitioning is enabled, the partitions of the table are created automatically. |
@@ -90,8 +92,13 @@ Below are the configuration items related to Fluss cluster and automatic partiti
| auto-partition.check.interval | Duration | 10 minutes | The interval of auto partition check. The time interval for automatic partition checking is set to 10 minutes by default, meaning that it checks the table partition status every 10 minutes to see if it meets the automatic partitioning criteria. If it does not meet the criteria, partitions will be automatically created or deleted. |
92
94
95
+
## Dynamic Partitioning
93
96
97
+
**Dynamic partitioning** is a client-side feature, enabled by default, that lets the client automatically create partitions based on the data being written to the table. It is especially valuable when the set of partitions is not known in advance, because it removes the need to create partitions manually. It is also useful for tables partitioned by multiple fields, as auto-partitioning currently supports partition creation only for single-field partitioning.
94
98
99
+
Please note that the number of dynamically created partitions is also subject to the `max.partition.num` and `max.bucket.num` limits configured on the Fluss cluster.
| client.writer.dynamic-create-partition.enabled | Boolean | no | true | Whether to enable dynamic partition creation for the client writer. When enabled, new partitions are automatically created if they don't already exist during data writes. |
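
The following is a minimal sketch of how dynamic partitioning might look from Flink SQL. It assumes the statements run inside a Fluss catalog; the table `site_access_by_region`, its columns, and the inserted values are hypothetical names chosen for illustration. The table is partitioned by two fields, which is the case where auto-partitioning does not apply and partitions are expected to be created by the writer on demand.

```sql title="Flink SQL"
-- Hypothetical table partitioned by two fields.
-- The partition keys are part of the primary key, as required for primary key tables.
CREATE TABLE site_access_by_region (
  event_day STRING,
  region STRING,
  site_id INT,
  pv BIGINT,
  PRIMARY KEY (event_day, region, site_id) NOT ENFORCED
) PARTITIONED BY (event_day, region);

-- With client.writer.dynamic-create-partition.enabled left at its default (true),
-- this write is expected to create the partition for
-- (event_day = '20250101', region = 'eu') if it does not exist yet.
INSERT INTO site_access_by_region VALUES ('20250101', 'eu', 1, 100);
```

If the option is set to `false`, the partition would presumably have to be created beforehand, for example with the statement described in [Add Partition](engine-flink/ddl.md#add-partition).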