website/blog/releases/0.7.md
These enhancements make Fluss 0.7 ready for most production use cases.
In Fluss 0.5, we first introduced the Streaming Lakehouse feature. With the powerful Union Read capability, we can significantly reduce the cost of streaming storage and improve the data freshness of the Lakehouse.
> Note: Union Read allows querying and combining results from both Lakehouse (historical) and Fluss (real-time) data.
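Conceptually, Union Read stitches the two storage layers together at the tiered offset: rows at or below that offset come from the lakehouse snapshot, and fresher rows come from Fluss's real-time log, so each record is read exactly once. A minimal sketch of that semantics (the function and data shapes are illustrative, not the actual Fluss API):

```python
# Illustrative sketch of Union Read semantics, NOT Fluss's real API:
# lake_rows / log_rows are (offset, payload) pairs; tiered_offset is the
# point up to which the log has already been synced to the lakehouse.
def union_read(lake_rows, log_rows, tiered_offset):
    historical = [r for r in lake_rows if r[0] < tiered_offset]  # from the lake
    realtime = [r for r in log_rows if r[0] >= tiered_offset]    # from Fluss
    return historical + realtime  # no overlap, no gap

lake = [(0, "a"), (1, "b"), (2, "c")]           # already tiered
log = [(1, "b"), (2, "c"), (3, "d"), (4, "e")]  # still retained in Fluss
print(union_read(lake, log, 3))
# → [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e')]
```

The key invariant is the shared offset boundary: the split point that decides "lake vs. log" is the same offset the tiering service committed, which is what makes the combined view consistent.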
However, the initial implementation had architectural limitations affecting scalability and operability in production. In Fluss 0.7, we’ve completely re-architected the Streaming Lakehouse feature to address these challenges.
### Elastic Stateless Service
Previously, the lake tiering service was implemented as a Flink job encapsulating Fluss as a Source and Paimon as a Sink, storing sync offset state in Flink's state, which made it a stateful service.
This optimization significantly reduces the load on Fluss and boosts batch processing.
* **Actionable Offset Visibility:** The offset state is now queryable, providing greater insight and control over data processing.
This design ensures end-to-end consistency across all operations.
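One way to picture this consistency model is that the tiering service commits the synced log offsets together with the lake snapshot they produced, in a single atomic commit, so the offset state lives in the lake's own metadata and can be queried after the fact. A small sketch of that idea (class and method names are invented for illustration, not Fluss's internals):

```python
# Illustrative sketch, NOT Fluss's real internals: the tiering service
# records the synced offsets atomically alongside each lake snapshot,
# so there is no separate (and possibly divergent) state store.
class LakeCommitter:
    def __init__(self):
        self.snapshots = []  # each entry: (snapshot_id, synced_offsets)

    def commit(self, snapshot_id, synced_offsets):
        # one atomic append: data snapshot + offset state together
        self.snapshots.append((snapshot_id, dict(synced_offsets)))

    def latest_synced_offsets(self):
        # "actionable offset visibility": the state is queryable
        return self.snapshots[-1][1] if self.snapshots else {}

committer = LakeCommitter()
committer.commit(1, {"bucket-0": 120, "bucket-1": 98})
committer.commit(2, {"bucket-0": 250, "bucket-1": 230})
print(committer.latest_synced_offsets())
# → {'bucket-0': 250, 'bucket-1': 230}
```

Because the offsets are part of the committed snapshot rather than external job state, a restarted tiering worker can resume from the lake alone, which is what makes the service stateless.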
Furthermore, this stateless design also decouples us from the tight Flink dependency, paving the way for future lightweight execution models, such as running on FaaS (Function as a Service).
### Pluggable Lake Format
The previous implementation had a tight coupling with Apache Paimon, which restricted our ability to integrate with other lake formats, such as Iceberg.
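Decoupling from a single lake format typically means the tiering service writes through an abstract interface, with each format supplying its own implementation. The sketch below shows the shape of such a plugin layer; the interface and class names are invented for illustration and are not Fluss's actual SPI:

```python
# Hypothetical sketch of a pluggable lake-format layer (names invented,
# not Fluss's real SPI): the tiering service depends only on the
# abstract interface, and a format is chosen by configuration.
from abc import ABC, abstractmethod

class LakeTieringPlugin(ABC):
    @abstractmethod
    def format_name(self) -> str: ...

    @abstractmethod
    def write_batch(self, records: list) -> int:
        """Write records to the lake; return the number written."""

class PaimonPlugin(LakeTieringPlugin):
    def format_name(self) -> str:
        return "paimon"

    def write_batch(self, records: list) -> int:
        return len(records)  # a real impl would call Paimon's writer

class IcebergPlugin(LakeTieringPlugin):
    def format_name(self) -> str:
        return "iceberg"

    def write_batch(self, records: list) -> int:
        return len(records)  # a real impl would call Iceberg's writer

# Select the format at runtime instead of hard-coding Paimon:
plugins = {p.format_name(): p for p in (PaimonPlugin(), IcebergPlugin())}
print(plugins["iceberg"].write_batch(["r1", "r2"]))
# → 2
```

The design choice is the usual one for plugin architectures: adding a new lake format means implementing the interface, not modifying the tiering service itself.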
Apart from that, we also support the following advanced partition features:
* **Dynamic Partition Creation:** Automatically creates required partitions based on incoming data; no manual pre-creation is required.
* **Automatic Partition Discovery:** The Fluss source adds matched new partitions to the subscription in real time.
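The two features above can be sketched in a few lines: partitions are derived from each incoming record and created on demand, while a reader periodically re-lists partitions and subscribes to new ones matching its filter. This is an illustrative model only (the helper names are invented, not Fluss's API):

```python
# Illustrative sketch of dynamic partition creation and automatic
# partition discovery; helper names are invented for illustration.
def partition_of(record, keys=("dt",)):
    # derive the (possibly multi-level) partition value from the record
    return tuple(record[k] for k in keys)

existing = set()

def write(record):
    p = partition_of(record)
    if p not in existing:   # dynamic partition creation:
        existing.add(p)     # no manual pre-creation needed
    return p

write({"dt": "2025-01-01", "v": 1})
write({"dt": "2025-01-02", "v": 2})

# automatic partition discovery: the source re-lists partitions and
# subscribes to new ones matching its filter
subscribed = {p for p in existing if p[0].startswith("2025-01")}
print(sorted(subscribed))
# → [('2025-01-01',), ('2025-01-02',)]
```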
Using real business data from **Taobao - the largest online shopping platform in China,** we tested the read and write performance between non-partitioned and partitioned tables (with 20 auto-created partitions). The write results show that the multi-level partition and dynamic partition creation mechanism do not have a significant impact on the write performance.

At the same time, under the same data scale, we tested the streaming read performance of non-partitioned tables and partitioned tables under three partition conditions:
**unconditional**, **medium matching** (hitting five partitions), and **exact matching** (hitting one partition).
From the results, we can observe that when the partition condition only matches 1/20 of the partitions,
**the network traffic is reduced by about 20x** and the **processing time is reduced by nearly 9x**,
demonstrating the huge performance benefit of partition pruning in streaming reads.
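The ~20x traffic reduction follows directly from the pruning ratio: with data spread roughly evenly across 20 partitions, a condition that hits only one partition reads 1/20 of the bytes. A back-of-the-envelope sketch (the byte figure is an arbitrary illustrative unit, not a measured value):

```python
# Back-of-the-envelope arithmetic for partition pruning, assuming data
# is spread evenly across partitions (illustrative numbers only).
total_partitions = 20
bytes_per_partition = 100  # arbitrary unit

unconditional = total_partitions * bytes_per_partition  # reads all 20
exact_match = 1 * bytes_per_partition                   # pruned to 1

print(unconditional // exact_match)
# → 20
```

Processing time improves somewhat less than traffic (about 9x in the test above) because per-job fixed costs are not pruned away along with the data.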