Commit 9479eb6

[docs] Minor improvement on the Lakehouse Storage page
(cherry picked from commit d04e840)
1 parent cbf2b18 commit 9479eb6

1 file changed: +6 -7 lines changed

website/docs/maintenance/tiered-storage/lakehouse-storage.md

Lines changed: 6 additions & 7 deletions
@@ -48,7 +48,7 @@ datalake.paimon.metastore: filesystem
 datalake.paimon.warehouse: /tmp/paimon_data_warehouse
 ```

-Fluss processes Paimon configurations by removing the `datalake.paimon.` prefix and then use the remaining configuration (without the prefix `datalake.paimon.`) to create the Paimon catalog.
+Fluss processes Paimon configurations by removing the `datalake.paimon.` prefix and then use the remaining configuration (without the prefix `datalake.paimon.`) to create the Paimon catalog. Checkout the [Paimon documentation](https://paimon.apache.org/docs/1.1/maintenance/configurations/) for more details on the available configurations.

 For example, if you want to configure to use Hive catalog, you can configure like following:
 ```yaml
@@ -59,7 +59,7 @@ datalake.paimon.warehouse: hdfs:///path/to/warehouse
 ```
 #### Add other jars required by datalake
 While Fluss includes the core Paimon library, additional jars may still need to be manually added to `${FLUSS_HOME}/plugins/paimon/` according to your needs.
-For example, for OSS filesystem support, you need to put `paimon-oss`.jar into directory `${FLUSS_HOME}/plugins/paimon/`.
+For example, for OSS filesystem support, you need to put `paimon-oss-<paimon_version>.jar` into directory `${FLUSS_HOME}/plugins/paimon/`.

 ### Start The Datalake Tiering Service
 Then, you must start the datalake tiering service to tier Fluss's data to the lakehouse storage.
@@ -68,12 +68,12 @@ Then, you must start the datalake tiering service to tier Fluss's data to the la
 - Download [fluss-flink-tiering-$FLUSS_VERSION$.jar](https://repo1.maven.org/maven2/com/alibaba/fluss/fluss-flink-tiering/$FLUSS_VERSION$/fluss-flink-tiering-$FLUSS_VERSION$.jar)

 #### Prepare required jars
-- Put [fluss-flink connector jar](/downloads) into `${FLINK_HOME}/lib`, you should choose a connector version matching your Flink version. If you're using Flink 1.20, please use `fluss-flink-1.20-$FLUSS_VERSION$.jar`
-- If you use [Amazon S3](http://aws.amazon.com/s3/), [Aliyun OSS](https://www.aliyun.com/product/oss) or [HDFS(Hadoop Distributed File System)](https://hadoop.apache.org/docs/stable/) as Fluss's [remote storage](maintenance/tiered-storage/remote-storage.md),
+- Put [fluss-flink connector jar](/downloads) into `${FLINK_HOME}/lib`, you should choose a connector version matching your Flink version. If you're using Flink 1.20, please use [fluss-flink-1.20-$FLUSS_VERSION$.jar](https://repo1.maven.org/maven2/com/alibaba/fluss/fluss-flink-1.20/$FLUSS_VERSION$/fluss-flink-1.20-$FLUSS_VERSION$.jar)
+- If you are using [Amazon S3](http://aws.amazon.com/s3/), [Aliyun OSS](https://www.aliyun.com/product/oss) or [HDFS(Hadoop Distributed File System)](https://hadoop.apache.org/docs/stable/) as Fluss's [remote storage](maintenance/tiered-storage/remote-storage.md),
 you should download the corresponding [Fluss filesystem jar](/downloads#filesystem-jars) and also put it into `${FLINK_HOME}/lib`
 - Put [fluss-lake-paimon jar](https://repo1.maven.org/maven2/com/alibaba/fluss/fluss-lake-paimon/$FLUSS_VERSION$/fluss-lake-paimon-$FLUSS_VERSION$.jar) into `${FLINK_HOME}/lib`, currently only paimon is supported, so you can only choose `fluss-lake-paimon`
 - [Download](https://flink.apache.org/downloads/) pre-bundled Hadoop jar `flink-shaded-hadoop-2-uber-*.jar` and put into `${FLINK_HOME}/lib`
-- Put Paimon's filesystem jar into `${FLINK_HOME}/lib`, if you use s3 to store paimon data, please put `paimon-s3` jar into `${FLINK_HOME}/lib`
+- Put Paimon's [filesystem jar](https://paimon.apache.org/docs/1.1/project/download/) into `${FLINK_HOME}/lib`, if you use s3 to store paimon data, please put `paimon-s3` jar into `${FLINK_HOME}/lib`
 - The other jars that Paimon may require, for example, if you use HiveCatalog, you will need to put hive related jars


@@ -88,8 +88,7 @@ After the Flink Cluster has been started, you can execute the `fluss-flink-tieri
 ```

 **Note:**
-- The `fluss.bootstrap.servers` should be the bootstrap server address of your Fluss cluster. You must configure all options with the `datalake.` prefix in the [server.yaml](#modify-serveryaml) file to run the tiering service.
-to run the tiering service. In this case, these parameters are `--datalake.format`, `--datalake.paimon.metastore`, and `--datalake.paimon.warehouse`.
+- The `fluss.bootstrap.servers` should be the bootstrap server address of your Fluss cluster. You must configure all options with the `datalake.` prefix in the [server.yaml](#modify-serveryaml) file to run the tiering service. In this case, these parameters are `--datalake.format`, `--datalake.paimon.metastore`, and `--datalake.paimon.warehouse`.
 - The Flink tiering service is stateless, and you can run multiple tiering services simultaneously to tier tables in Fluss.
 These tiering services are coordinated by the Fluss cluster to ensure exactly-once semantics when tiering data to the lake storage. This means you can freely scale the service up or down according to your workload.
 - This follows the standard practice for [submitting jobs to Flink](https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/deployment/cli/), where you can use the `-D` parameter to specify Flink-related configurations.
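
Reviewer note: the prefix handling described in the first hunk, and the mapping from `datalake.`-prefixed `server.yaml` options to tiering-job parameters described in the last hunk, can be illustrated with a minimal Python sketch. This is not Fluss's implementation. The function names `paimon_catalog_options` and `tiering_job_args` are hypothetical, the `--key value` argument layout and the `--fluss.bootstrap.servers` spelling are assumptions inferred from the parameter names quoted in the note, and `localhost:9123` is a placeholder address.

```python
"""Illustrative sketch only; not Fluss's actual implementation."""

PAIMON_PREFIX = "datalake.paimon."
DATALAKE_PREFIX = "datalake."


def paimon_catalog_options(server_conf):
    """Keep only `datalake.paimon.*` entries and strip that prefix.

    Hypothetical helper: mirrors the documented behaviour of passing the
    remaining keys to the Paimon catalog.
    """
    return {
        key[len(PAIMON_PREFIX):]: value
        for key, value in server_conf.items()
        if key.startswith(PAIMON_PREFIX)
    }


def tiering_job_args(server_conf, bootstrap_servers):
    """Build `--key value` arguments for the tiering job.

    Hypothetical helper: the argument layout is an assumption, not a
    confirmed CLI contract.
    """
    args = ["--fluss.bootstrap.servers", bootstrap_servers]
    for key, value in server_conf.items():
        if key.startswith(DATALAKE_PREFIX):
            args += [f"--{key}", str(value)]
    return args


# Values taken from the server.yaml example shown in the diff context.
server_conf = {
    "datalake.format": "paimon",
    "datalake.paimon.metastore": "filesystem",
    "datalake.paimon.warehouse": "/tmp/paimon_data_warehouse",
}

print(paimon_catalog_options(server_conf))
print(" ".join(tiering_job_args(server_conf, "localhost:9123")))
```

Note that `datalake.format` is forwarded to the tiering job but is not part of the Paimon catalog options, because it does not carry the `datalake.paimon.` prefix.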
