Commit 02a8f55

[docs] Update datalake related doc for support iceberg and lance (#1686)
1 parent 8b9d293 commit 02a8f55

7 files changed: 40 additions, 34 deletions

fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java

Lines changed: 4 additions & 3 deletions
@@ -1299,7 +1299,8 @@ public class ConfigOptions {
                     .enumType(DataLakeFormat.class)
                     .noDefaultValue()
                     .withDescription(
-                            "The data lake format of the table specifies the tiered Lakehouse storage format, such as Paimon, Iceberg, DeltaLake, or Hudi. Currently, only `paimon` is supported. "
+                            "The data lake format of the table specifies the tiered Lakehouse storage format. Currently, supported formats are `paimon`, `iceberg`, and `lance`. "
+                                    + "In the future, more kinds of data lake format will be supported, such as DeltaLake or Hudi. "
                                     + "Once the `table.datalake.format` property is configured, Fluss adopts the key encoding and bucketing strategy used by the corresponding data lake format. "
                                     + "This ensures consistency in key encoding and bucketing, enabling seamless **Union Read** functionality across Fluss and Lakehouse. "
                                     + "The `table.datalake.format` can be pre-defined before enabling `table.datalake.enabled`. This allows the data lake feature to be dynamically enabled on the table without requiring table recreation. "
@@ -1646,8 +1647,8 @@ public class ConfigOptions {
                     .enumType(DataLakeFormat.class)
                     .noDefaultValue()
                     .withDescription(
-                            "The datalake format used by Fluss to be as lake storage, such as Paimon, Iceberg, Hudi. "
-                                    + "Now, only support Paimon.");
+                            "The datalake format used by of Fluss to be as lakehouse storage. Currently, supported formats are Paimon, Iceberg, and Lance. "
+                                    + "In the future, more kinds of data lake format will be supported, such as DeltaLake or Hudi.");

     // ------------------------------------------------------------------------
     //  ConfigOptions for fluss kafka
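
The option description above says `table.datalake.format` can be set before `table.datalake.enabled`, so that the lake feature can be switched on later without recreating the table. A minimal sketch of that ordering, using plain Java maps as a stand-in for table properties, might look like the following. The class and method names are hypothetical and this is not the Fluss client API; only the two property keys come from the diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: plain maps stand in for table properties.
public class DatalakeTablePropertiesSketch {
    // Property keys quoted from the option description in the diff.
    static final String FORMAT_KEY = "table.datalake.format";
    static final String ENABLED_KEY = "table.datalake.enabled";

    /** Properties at table creation: the format is fixed up front, the lake feature is not yet enabled. */
    static Map<String, String> atCreation(String format) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(FORMAT_KEY, format);
        return props;
    }

    /** A later alteration: only the enabled flag is added, the format is left untouched. */
    static Map<String, String> enableLater(Map<String, String> existing) {
        Map<String, String> props = new LinkedHashMap<>(existing);
        props.put(ENABLED_KEY, "true");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> created = atCreation("paimon");
        Map<String, String> enabled = enableLater(created);
        System.out.println(created);
        System.out.println(enabled);
    }
}
```

The point of the ordering is that the key encoding and bucketing strategy depend on the format, so the format has to be known when the table is created even if the lake feature is enabled later.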

website/docs/engine-flink/options.md

Lines changed: 23 additions & 23 deletions
Large diffs are not rendered by default.

website/docs/install-deploy/overview.md

Lines changed: 3 additions & 2 deletions
@@ -116,8 +116,9 @@ We have listed them in the table below the figure.
         by query engines such as Flink, Spark, StarRocks, Trino.
       </td>
       <td>
-        <li>[Paimon](maintenance/tiered-storage/lakehouse-storage.md)</li>
-        <li>[Iceberg (Roadmap)](/roadmap/)</li>
+        <li>[Paimon](streaming-lakehouse/integrate-data-lakes/paimon.md)</li>
+        <li>[Iceberg](streaming-lakehouse/integrate-data-lakes/iceberg.md)</li>
+        <li>[Lance](streaming-lakehouse/integrate-data-lakes/lance.md)</li>
       </td>
     </tr>
     <tr>

website/docs/maintenance/configuration.md

Lines changed: 3 additions & 3 deletions
@@ -164,9 +164,9 @@ during the Fluss cluster working.

 ## Lakehouse

-| Option | Type | Default | Description |
-|-----------------|------|---------|---------------------------------------------------------------------------------------------------------------------------|
-| datalake.format | Enum | (None) | The datalake format used by of Fluss to be as lakehouse storage, such as Paimon, Iceberg, Hudi. Now, only support Paimon. |
+| Option | Type | Default | Description |
+|-----------------|------|---------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| datalake.format | Enum | (None) | The datalake format used by of Fluss to be as lakehouse storage. Currently, supported formats are Paimon, Iceberg, and Lance. In the future, more kinds of data lake format will be supported, such as DeltaLake or Hudi. |

 ## Kafka

website/docs/maintenance/tiered-storage/overview.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ Fluss organizes data into different storage layers based on its access patterns,
 Fluss ensures the recent data is stored in local for higher write/read performance and the historical data is stored in [remote storage](remote-storage.md) for lower cost.

 What's more, since the native format of Fluss's data is optimized for real-time write/read which is inevitable unfriendly to batch analytics, Fluss also introduces a [lakehouse storage](lakehouse-storage.md) which stores the data
-in the well-known open data lake format for better analytics performance. Currently, only Paimon is supported, but more kinds of data lake support are on the way. Keep eyes on us!
+in the well-known open data lake format for better analytics performance. Currently, supported formats are Paimon, Iceberg, and Lance. In the future, more kinds of data lake support are on the way. Keep eyes on us!

 The overall tiered storage architecture is shown in the following diagram:

website/docs/streaming-lakehouse/integrate-data-lakes/paimon.md

Lines changed: 5 additions & 1 deletion
@@ -5,10 +5,14 @@ sidebar_position: 1

 # Paimon

+## Introduction
+
 [Apache Paimon](https://paimon.apache.org/) innovatively combines a lake format with an LSM (Log-Structured Merge-tree) structure, bringing efficient updates into the lake architecture.
 To integrate Fluss with Paimon, you must enable lakehouse storage and configure Paimon as the lakehouse storage. For more details, see [Enable Lakehouse Storage](maintenance/tiered-storage/lakehouse-storage.md#enable-lakehouse-storage).

-## Introduction
+## Configure Paimon as LakeHouse Storage
+
+For general guidance on configuring Paimon as the lakehouse storage, you can refer to [Lakehouse Storage](maintenance/tiered-storage/lakehouse-storage.md) documentation. When starting the tiering service, make sure to use Paimon-specific configurations as parameters.

 When a table is created or altered with the option `'table.datalake.enabled' = 'true'`, Fluss will automatically create a corresponding Paimon table with the same table path.
 The schema of the Paimon table matches that of the Fluss table, except for the addition of three system columns at the end: `__bucket`, `__offset`, and `__timestamp`.
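
The schema rule in the last context line of this hunk (Fluss columns first, then the three system columns) can be illustrated with a small sketch. The class and method names below are hypothetical, and the sketch works on column names only; it does not use the Fluss or Paimon APIs and makes no claim about the column types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: derives lake table column names from Fluss column names.
public class LakeSchemaSketch {
    // System column names as listed in the paimon.md text; types are deliberately omitted.
    static final List<String> SYSTEM_COLUMNS = List.of("__bucket", "__offset", "__timestamp");

    /** Returns the Fluss columns in their original order, followed by the system columns. */
    static List<String> lakeColumns(List<String> flussColumns) {
        List<String> columns = new ArrayList<>(flussColumns);
        columns.addAll(SYSTEM_COLUMNS);
        return columns;
    }

    public static void main(String[] args) {
        // Example Fluss table with two user-defined columns (names are made up).
        System.out.println(lakeColumns(List.of("order_id", "amount")));
    }
}
```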

website/docs/streaming-lakehouse/overview.md

Lines changed: 1 addition & 1 deletion
@@ -44,4 +44,4 @@ Some powerful features it provided are:
 - **Analytical Streams**: The union reads help data streams to have the powerful analytics capabilities. This reduces complexity when developing streaming applications, simplifies debugging, and allows for immediate access to live data insights.
 - **Connect to Lakehouse Ecosystem**: Fluss keeps the table metadata in sync with data lake catalogs while compacting data into Lakehouse. This allows external engines like Spark, StarRocks, Flink, Trino to read the data directly by connecting to the data lake catalog.

-Currently, Fluss supports [Paimon](integrate-data-lakes/paimon.md) and [Lance](integrate-data-lakes/lance.md) as Lakehouse Storage, more kinds of data lake formats are on the roadmap.
+Currently, Fluss supports [Paimon](integrate-data-lakes/paimon.md), [Iceberg](integrate-data-lakes/iceberg.md), and [Lance](integrate-data-lakes/lance.md) as Lakehouse Storage, more kinds of data lake formats are on the roadmap.
