Description
Motivation
Currently, although we have implemented lakehouse storage and support Paimon as a lake storage, there are some flaws in the implementation:
- The lakehouse storage is strongly coupled to Paimon, which makes it hard to support other data lake formats.
- The implementation is not efficient: it leverages a Flink job to compact Fluss's data into Paimon's data, reading Fluss's data as Flink row data and writing that row data to Paimon. This requires row conversions between Fluss, Flink, and Paimon, as well as a data shuffle. Ideally, no shuffle is needed: since we keep the same data distribution between Fluss and Paimon, the files from a Fluss bucket can be compacted directly into the corresponding Paimon bucket (see the first sketch below).
What's more, we expect to write Paimon's Parquet/ORC files directly and commit manifests ourselves to speed up the compaction. Considering that Fluss uses Arrow (by default) as its log format and there is an efficient conversion from Arrow to Parquet, the compaction can be made even more efficient (see the second sketch below).
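To make the no-shuffle argument concrete, here is a minimal sketch. Everything in it is illustrative: `bucket_of` is a hypothetical stand-in for the deterministic bucketing hash the two systems would share, not Fluss's or Paimon's real function.

```python
NUM_BUCKETS = 4

def bucket_of(key: bytes, num_buckets: int = NUM_BUCKETS) -> int:
    # Hypothetical stand-in for a deterministic bucketing hash that both
    # Fluss and Paimon apply to the same bucket key.
    return sum(key) % num_buckets

# Because the writer (Fluss) and the lake table (Paimon) bucket records the
# same way, every record in Fluss bucket i belongs to Paimon bucket i, so
# tiering can rewrite files bucket-by-bucket with no shuffle in between.
for key in (b"user-1", b"user-2", b"user-3"):
    i = bucket_of(key)
    print(f"key={key!r} -> Fluss bucket {i} -> Paimon bucket {i}")
```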
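Likewise, a minimal sketch of the Arrow-to-Parquet path, using pyarrow purely for illustration (the column names, values, and file name are made up): a columnar Arrow batch can be encoded into Parquet column chunks directly, without materializing rows.

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical Arrow batch, standing in for a Fluss log segment that is
# already in Arrow format.
table = pa.table({
    "user_id": pa.array([1, 2, 3], type=pa.int64()),
    "event": ["click", "view", "click"],
})

# Columnar write: the Arrow buffers are encoded straight into Parquet
# column chunks, with no per-row conversion and no shuffle.
pq.write_table(table, "bucket-0.parquet")
```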
Solution
- Design docs(EN): https://docs.google.com/document/d/1Ghw_Jb-yHztgGvO5OpRWgibmPClDivejp7UyLUgKxOc/edit?pli=1&tab=t.0
- Design docs(ZH): https://drive.google.com/file/d/1qzM2HYRVb-Z6uMlOjeP6ywFSriVLINy7/view?usp=drive_link