---
title: Paimon
sidebar_position: 1
---

# Paimon

[Apache Paimon](https://paimon.apache.org/) innovatively combines a lake format with an LSM (Log-Structured Merge-tree) structure, bringing efficient updates into the lake architecture.
To integrate Fluss with Paimon, you must enable lakehouse storage and configure Paimon as the lakehouse storage. For more details, see [Enable Lakehouse Storage](maintenance/tiered-storage/lakehouse-storage.md#enable-lakehouse-storage).

## Introduction

When a table is created or altered with the option `'table.datalake.enabled' = 'true'`, Fluss automatically creates a corresponding Paimon table with the same table path.
The schema of the Paimon table matches that of the Fluss table.

```sql title="Flink SQL"
USE CATALOG fluss_catalog;

CREATE TABLE fluss_order_with_lake (
    `order_key` BIGINT,
    `cust_key` INT NOT NULL,
    `total_price` DECIMAL(15, 2),
    `order_date` DATE,
    `order_priority` STRING,
    `clerk` STRING,
    `ptime` AS PROCTIME(),
    PRIMARY KEY (`order_key`) NOT ENFORCED
) WITH (
    'table.datalake.enabled' = 'true',
    'table.datalake.freshness' = '30s'
);
```
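
As noted above, lake tiering can also be switched on for an existing table by altering it. A minimal sketch, assuming an already-created Fluss table (the name `existing_orders` is a placeholder):

```sql title="Flink SQL"
-- Enable lake tiering on an existing Fluss table and set its freshness target.
-- `existing_orders` is a placeholder; substitute your own table name.
ALTER TABLE existing_orders SET (
    'table.datalake.enabled' = 'true',
    'table.datalake.freshness' = '30s'
);
```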

Then, the datalake tiering service continuously tiers data from Fluss to Paimon. The parameter `table.datalake.freshness` controls how frequently Fluss writes data to the Paimon table; by default, the data freshness is 3 minutes.
For primary key tables, changelogs are also generated in the Paimon format, enabling stream-based consumption via Paimon APIs.
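
For illustration, here is a minimal sketch of such a stream read from Flink SQL. It assumes the Paimon Flink connector is available, a Flink job running in streaming mode, and the warehouse path and `fluss` database used later on this page; the catalog name is arbitrary:

```sql title="Flink SQL"
-- Point a Paimon catalog at the warehouse that Fluss tiers into,
-- then continuously read the changelog of the tiered table.
CREATE CATALOG paimon_catalog WITH (
    'type' = 'paimon',
    'warehouse' = '/tmp/paimon_data_warehouse'
);

SELECT * FROM paimon_catalog.fluss.fluss_order_with_lake
    /*+ OPTIONS('scan.mode' = 'latest') */;
```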

Since Fluss 0.7, you can also specify Paimon table properties when creating a datalake-enabled Fluss table by adding them with the `paimon.` prefix in the table's `WITH` clause.

```sql title="Flink SQL"
CREATE TABLE fluss_order_with_lake (
    `order_key` BIGINT,
    `cust_key` INT NOT NULL,
    `total_price` DECIMAL(15, 2),
    `order_date` DATE,
    `order_priority` STRING,
    `clerk` STRING,
    `ptime` AS PROCTIME(),
    PRIMARY KEY (`order_key`) NOT ENFORCED
) WITH (
    'table.datalake.enabled' = 'true',
    'table.datalake.freshness' = '30s',
    'paimon.file.format' = 'orc',
    'paimon.deletion-vectors.enabled' = 'true'
);
```

For example, you can specify the Paimon property `file.format` to change the file format of the Paimon table, or set `deletion-vectors.enabled` to enable or disable deletion vectors for it.

### Reading with other Engines

Since the data tiered to Paimon from Fluss is stored as a standard Paimon table, you can use any engine that supports Paimon to read it. Below is an example using [StarRocks](https://paimon.apache.org/docs/master/engines/starrocks/):

First, create a Paimon catalog in StarRocks:

```sql title="StarRocks SQL"
CREATE EXTERNAL CATALOG paimon_catalog
PROPERTIES (
    "type" = "paimon",
    "paimon.catalog.type" = "filesystem",
    "paimon.catalog.warehouse" = "/tmp/paimon_data_warehouse"
);
```
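
To sanity-check the catalog, you can list the databases it exposes (a small, hedged verification step; the output depends on what has already been tiered):

```sql title="StarRocks SQL"
-- List the databases visible through the newly created Paimon catalog.
SHOW DATABASES FROM paimon_catalog;
```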

> **NOTE**: The configuration values for `paimon.catalog.type` and `paimon.catalog.warehouse` must match those used when configuring Paimon as the lakehouse storage for Fluss in `server.yaml`.

Then, you can query the `orders` table using StarRocks:

```sql title="StarRocks SQL"
-- The table is in the database `fluss`
SELECT COUNT(*) FROM paimon_catalog.fluss.orders;
```
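
The datalake-enabled table created earlier on this page can be queried in the same way. A hedged example, assuming it lives in the default `fluss` database and using the column names from the Flink SQL example above:

```sql title="StarRocks SQL"
-- Aggregate the tiered data of the table defined earlier on this page.
SELECT order_priority, SUM(total_price) AS total_price
FROM paimon_catalog.fluss.fluss_order_with_lake
GROUP BY order_priority;
```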

```sql title="StarRocks SQL"
-- Query the system tables to view snapshots of the table
SELECT * FROM paimon_catalog.fluss.orders$snapshots;
```

## Data Type Mapping

When integrating with Paimon, Fluss automatically converts between Fluss data types and Paimon data types.
The following table shows the mapping between [Fluss data types](table-design/data-types.md) and Paimon data types; a short example follows the table.

| Fluss Data Type               | Paimon Data Type              |
|-------------------------------|-------------------------------|
| BOOLEAN                       | BOOLEAN                       |
| TINYINT                       | TINYINT                       |
| SMALLINT                      | SMALLINT                      |
| INT                           | INT                           |
| BIGINT                        | BIGINT                        |
| FLOAT                         | FLOAT                         |
| DOUBLE                        | DOUBLE                        |
| DECIMAL                       | DECIMAL                       |
| STRING                        | STRING                        |
| CHAR                          | CHAR                          |
| DATE                          | DATE                          |
| TIME                          | TIME                          |
| TIMESTAMP                     | TIMESTAMP                     |
| TIMESTAMP WITH LOCAL TIMEZONE | TIMESTAMP WITH LOCAL TIMEZONE |
| BINARY                        | BINARY                        |
| BYTES                         | BYTES                         |
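
To make the mapping concrete, here is a small hypothetical example (table and column names are illustrative only); the comment on each column shows the Paimon type it is converted to when the table is tiered:

```sql title="Flink SQL"
-- Hypothetical table: each comment shows the Paimon type produced for the column
-- when this Fluss table is tiered to Paimon.
CREATE TABLE type_mapping_demo (
    `id` BIGINT,               -- Paimon BIGINT
    `amount` DECIMAL(15, 2),   -- Paimon DECIMAL(15, 2)
    `created_at` TIMESTAMP(3), -- Paimon TIMESTAMP(3)
    `payload` BYTES,           -- Paimon BYTES
    PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
    'table.datalake.enabled' = 'true'
);
```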