Commit 73d1e0f

[docs] Improve the doc structure of Merge Engines (#241)
1 parent 9e6db34 commit 73d1e0f

8 files changed: +81 -40 lines changed

website/docs/table-design/table-types/pk-table/overview.md renamed to website/docs/table-design/table-types/pk-table/index.md

Lines changed: 15 additions & 3 deletions
@@ -2,7 +2,7 @@
 sidebar_position: 1
 ---

-# Overview
+# PrimaryKey Table

 ## Basic Concept

@@ -22,8 +22,8 @@ CREATE TABLE pk_table
 total_amount INT,
 PRIMARY KEY (shop_id, user_id) NOT ENFORCED
 ) WITH (
-'bucket.num' = '4'
-);
+'bucket.num' = '4'
+);
 ```

 In Fluss primary key table, each row of data has a unique primary key.
@@ -73,6 +73,18 @@ follows:
 | 1 | 2.0 | t1 |
 | 2 | 3.0 | t2 |

+## Merge Engines
+
+The **Merge Engine** in Fluss is a core component designed to efficiently handle and consolidate data updates for PrimaryKey Tables.
+It offers users the flexibility to define how incoming data records are merged with existing records sharing the same primary key.
+The default merge engine in Fluss retains the latest record for a given primary key.
+However, users can specify a different merge engine to customize the merging behavior according to their specific use cases.
+
+The following merge engines are supported:
+
+1. [FirstRow Merge Engine](/docs/table-design/table-types/pk-table/merge-engines/first-row)
+2. [Versioned Merge Engine](/docs/table-design/table-types/pk-table/merge-engines/versioned)
+
 ## Data Queries

 For primary key tables, Fluss supports querying data directly based on the key. Please refer to

website/docs/table-design/table-types/pk-table/merge-engine/_category_.json

Lines changed: 0 additions & 4 deletions
This file was deleted.

website/docs/table-design/table-types/pk-table/merge-engine/first-row.md

Lines changed: 0 additions & 16 deletions
This file was deleted.

website/docs/table-design/table-types/pk-table/merge-engine/overview.md

Lines changed: 0 additions & 16 deletions
This file was deleted.
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+{
+  "label": "Merge Engines",
+  "position": 2
+}
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+---
+sidebar_label: FirstRow
+sidebar_position: 2
+---
+
+# FirstRow Merge Engine
+
+By setting `'table.merge-engine' = 'first_row'` in the table properties, users can retain the first record for each primary key.
+This configuration generates an insert-only changelog, allowing downstream Flink jobs to treat the PrimaryKey Table as an append-only Log Table.
+As a result, downstream transformations that do not support retractions/changelogs, such as [Window Aggregations](https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/table/sql/queries/window-agg/)
+and [Interval Joins](https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/table/sql/queries/joins/#interval-joins), can be applied seamlessly.
+
+This feature is particularly valuable for replacing log deduplication in streaming computations, reducing complexity and improving overall efficiency.
+
+:::note
+The `first_row` merge engine has the following limitations:
+
+- `UPDATE` and `DELETE` SQL statements are not supported
+- Partial Update is not supported
+- `UPDATE_BEFORE` and `DELETE` changelog events are ignored automatically
+:::
+
+## Example
+
+```sql title="Flink SQL"
+CREATE TABLE T (
+    k INT,
+    v1 DOUBLE,
+    v2 STRING,
+    PRIMARY KEY (k) NOT ENFORCED
+) WITH (
+    'table.merge-engine' = 'first_row'
+);
+
+INSERT INTO T VALUES (1, 2.0, 't1');
+INSERT INTO T VALUES (1, 3.0, 't2');
+
+SELECT * FROM T;
+
+-- Output
+-- +---+-----+------+
+-- | k | v1  | v2   |
+-- +---+-----+------+
+-- | 1 | 2.0 | t1   |
+-- +---+-----+------+
+```
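
The behaviour in the example above can be modelled outside Fluss. The following Python sketch is an illustration only and not Fluss code: `FirstRowTable` and `write` are hypothetical names, and the event kinds are plain strings borrowed from Flink's `RowKind` short names (`+I`, `-U`, `+U`, `-D`). It assumes that any non-retraction event for an unseen key is recorded as an insert, which is one plausible reading of the description above.

```python
# Illustrative model of first_row semantics; not part of Fluss.
# FirstRowTable and write() are hypothetical names.

class FirstRowTable:
    def __init__(self):
        self.rows = {}        # primary key -> first row seen
        self.changelog = []   # emitted (kind, key, row) events

    def write(self, kind, key, row):
        # UPDATE_BEFORE (-U) and DELETE (-D) events are ignored.
        if kind in ("-U", "-D"):
            return
        # A later row for an existing key is dropped.
        if key in self.rows:
            return
        self.rows[key] = row
        # Only inserts are ever emitted, so the changelog is insert-only.
        self.changelog.append(("+I", key, row))


table = FirstRowTable()
table.write("+I", 1, (2.0, "t1"))
table.write("+I", 1, (3.0, "t2"))   # same key: dropped
table.write("-D", 1, (2.0, "t1"))   # delete: ignored
print(table.rows)        # {1: (2.0, 't1')}
print(table.changelog)   # [('+I', 1, (2.0, 't1'))]
```

Because the changelog only ever contains `+I` events, a consumer never has to handle retractions, which is the property the page relies on for window aggregations and interval joins.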
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+sidebar_position: 1
+---
+
+# Merge Engines
+
+The **Merge Engine** in Fluss is a core component designed to efficiently handle and consolidate data updates for PrimaryKey Tables.
+It offers users the flexibility to define how incoming data records are merged with existing records sharing the same primary key.
+The default merge engine in Fluss retains the latest record for a given primary key.
+However, users can specify a different merge engine to customize the merging behavior according to their specific use cases.
+
+The following merge engines are supported:
+
+1. [FirstRow Merge Engine](/docs/table-design/table-types/pk-table/merge-engines/first-row)
+2. [Versioned Merge Engine](/docs/table-design/table-types/pk-table/merge-engines/versioned)
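
One way to picture a merge engine is as a function that decides which of two records with the same primary key survives. The Python sketch below is a conceptual model under that assumption, not Fluss's implementation; `merge_latest`, `merge_first_row`, and `apply` are hypothetical names. The versioned engine is left out because its page is still a TODO in this commit.

```python
# Conceptual model only: merge engines as plain functions (hypothetical names).

def merge_latest(existing, incoming):
    """Default behaviour: the latest record for a key wins."""
    return incoming


def merge_first_row(existing, incoming):
    """first_row behaviour: the first record for a key wins."""
    return incoming if existing is None else existing


def apply(records, merge):
    """Fold (key, value) records into a table using the given merge function."""
    table = {}
    for key, value in records:
        table[key] = merge(table.get(key), value)
    return table


records = [(1, "t1"), (1, "t2"), (2, "t3")]
print(apply(records, merge_latest))     # {1: 't2', 2: 't3'}
print(apply(records, merge_first_row))  # {1: 't1', 2: 't3'}
```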

website/docs/table-design/table-types/pk-table/merge-engine/versioned.md renamed to website/docs/table-design/table-types/pk-table/merge-engines/versioned.md

Lines changed: 1 addition & 1 deletion
@@ -3,6 +3,6 @@ sidebar_label: Versioned
 sidebar_position: 3
 ---

-# Versioned
+# Versioned Merge Engine

 TODO: Fill me #459
