Commit 66d74c3

Mehul Batra authored and committed
address comments & redo diagrams
1 parent 13fd44d commit 66d74c3

18 files changed

+35
-47
lines changed

website/blog/2025-12-02-fluss-x-iceberg-why-your-lakehouse-is-not-streamhouse-yet.md

Lines changed: 35 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -5,14 +5,15 @@ date: 2025-12-02
55
tags: [streaming-lakehouse, apache-iceberg, real-time-analytics]
66
---
77

8-
![Banner](assets/fluss-x-iceberg/fluss-lakehouse-iceberg.png)
98

109
As software/data engineers, we've witnessed Apache Iceberg revolutionize analytical data lakes with ACID transactions, time travel, and schema evolution. Yet when we try to push Iceberg into real-time workloads—sub-second streaming queries, high-frequency CDC updates, and primary key semantics—we hit fundamental architectural walls. This blog explores how Fluss × Iceberg integration works and delivers a true real-time lakehouse.
1110

1211
Apache Fluss represents a new architectural approach: the **Streamhouse** for real-time lakehouses. Instead of stitching together separate streaming and batch systems, the Streamhouse unifies them under a single architecture. In this model, Apache Iceberg continues to serve exactly the role it was designed for: a highly efficient, scalable cold storage layer for analytics, while Fluss fills the missing piece: a hot streaming storage layer with sub-second latency, columnar storage, and built-in primary-key semantics.
1312

1413
After working on the Fluss–Iceberg lakehouse integration and deploying this architecture at massive scale, including Alibaba's 3 PB production deployment processing 40 GB/s, we're ready to share the architectural lessons learned: why existing systems fall short, how Fluss and Iceberg naturally complement each other, and what this means for finally building true real-time lakehouses.
1514

15+
![Banner](assets/fluss-x-iceberg/fluss-realtime-lakehouse.png)
16+
1617
<!-- truncate -->
1718

1819
## The Real-Time Lakehouse Imperative
@@ -29,7 +30,7 @@ Four converging forces are driving the need for sub-second data infrastructure:
2930

3031
**4. Agentic AI Requires Real-Time Context:** AI agents need immediate access to the current system state to make decisions. Whether it's autonomous trading systems, intelligent routing agents, or customer service bots, agents can't operate effectively on stale data.
3132

32-
![Use Cases](assets/fluss-x-iceberg/fluss-lakehouse-usecases.png)
33+
![Use Cases](assets/fluss-x-iceberg/lakehouse_usecases.png)
3334

3435
### The Evolution of Data Freshness
3536

@@ -43,14 +44,15 @@ Four converging forces are driving the need for sub-second data infrastructure:
4344

4445
Yet critical use cases demand sub-second to second-level latency: search and recommendation systems with real-time personalization, advertisement attribution tracking, anomaly detection for fraud and security monitoring, operational intelligence for manufacturing/logistics/ride-sharing, and Gen AI model inference requiring up-to-the-second features. The industry needs a **hot real-time layer** sitting in front of the lakehouse.
4546

46-
![Evolution Timeline](assets/fluss-x-iceberg/evolution-timeline.png)
47+
![Evolution Timeline](assets/fluss-x-iceberg/evolution.png)
4748

4849
## What is Fluss × Iceberg?
4950

5051
### The Core Concept: Hot/Cold Unified Storage
5152

5253
The Fluss architecture delivers millisecond-level end-to-end latency for real-time data writing and reading. Its **Tiering Service** continuously offloads data into standard lakehouse formats like Apache Iceberg, enabling external query engines to analyze data directly. This streaming/lakehouse unification simplifies the ecosystem, ensures data freshness for critical use cases, and combines real-time and historical data seamlessly for comprehensive analytics.
5354

55+
**Unified Data Locality:** Fluss aligns partitions and buckets across both streaming and lakehouse layers, ensuring consistent data layout. This alignment enables direct Arrow-to-Parquet conversion without network shuffling or repartitioning, dramatically reducing I/O overhead and improving pipeline performance.
5456
Think of your data as having two thermal zones:
5557

5658
**Hot Tier (Fluss):** Last 1 hour of data, NVMe/SSD storage, sub-second latency, primary key indexed (RocksDB), streaming APIs, Apache Arrow columnar format. High-velocity writes, frequent updates, sub-second query latency requirements.
@@ -59,9 +61,10 @@ Think of your data as having two thermal zones:
5961

6062
Traditional architectures force you to maintain **separate systems** for these zones: Kafka/Kinesis for streaming (hot), Iceberg for analytics (cold), complex ETL pipelines to move data between them, and applications writing to both systems (dual-write problem).
6163

64+
![Kappa vs Lambda Architecture](assets/fluss-x-iceberg/kappa-vs-lambda.png)
65+
6266
**Fluss × Iceberg unifies these as tiered storage with Kappa architecture:** Applications write once to Fluss. A stateless Tiering Service (Flink job) automatically moves data from hot to cold storage based on configured freshness (e.g., 30 seconds, 5 minutes). Query engines see a single table that seamlessly spans both tiers—eliminating the dual-write complexity of Lambda architecture.
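
The freshness-driven hand-off described above can be pictured with a small simulation. This is only a sketch under stated assumptions, not the actual Tiering Service (which the post describes as a Flink job): the function name `tier_by_freshness`, the in-memory lists, and the commit rule (one cold snapshot each time the freshness interval has elapsed) are all hypothetical simplifications.

```python
def tier_by_freshness(records, freshness_s):
    """Simulate tiering: commit buffered hot records as a cold snapshot
    whenever at least `freshness_s` seconds have passed since the last commit.

    records: list of (timestamp_seconds, payload) tuples in time order.
    Returns (cold_snapshots, hot_buffer).
    """
    cold_snapshots = []   # stand-in for committed lake snapshots (hypothetical)
    hot_buffer = []       # stand-in for data still only in the hot tier
    last_commit = None
    for ts, payload in records:
        if last_commit is None:
            last_commit = ts
        # Freshness interval elapsed: move everything buffered so far to cold.
        if ts - last_commit >= freshness_s and hot_buffer:
            cold_snapshots.append(hot_buffer)
            hot_buffer = []
            last_commit = ts
        hot_buffer.append(payload)
    return cold_snapshots, hot_buffer


events = [(0, "a"), (10, "b"), (20, "c"), (30, "d"), (40, "e"), (65, "f")]
snapshots, still_hot = tier_by_freshness(events, freshness_s=30)
print(snapshots, still_hot)
```

A smaller freshness value produces more frequent, smaller cold commits; a larger one keeps more data in the hot tier between commits.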
6367

64-
![Kappa vs Lambda Architecture](assets/fluss-x-iceberg/kappa-vs-lambda-2.png)
6568

6669
### Why This Architecture Matters
6770

@@ -75,6 +78,8 @@ Traditional architectures force you to maintain **separate systems** for these z
7578

7679
**Query flexibility:** Run streaming queries on hot data (Fluss), analytical queries on cold data (Iceberg), or union queries that transparently span both tiers.
7780

81+
![Tiering Service](assets/fluss-x-iceberg/fluss-tiering.png)
82+
7883
## What Iceberg Misses Today
7984

8085
Apache Iceberg was architected for batch-optimized analytics. While it supports streaming ingestion, fundamental design decisions create unavoidable limitations for real-time workloads.
@@ -282,43 +287,31 @@ While tiering data, the service optionally performs bin-packing compaction:
282287

283288
**Result:** Streaming workloads avoid small file proliferation without separate maintenance jobs.
284289

285-
![Tiering Service](assets/fluss-x-iceberg/fluss-tiering.png)
286-
287290
### Solution 4: Union Read for Seamless Query Across Tiers
288291

289292
**Enables:** Querying hot + cold data as a single logical table
290293

291294
The architectural breakthrough enabling a real-time lakehouse is **client-side stitching with metadata coordination**. This is what makes Fluss truly a **Streaming Lakehouse**: the most recent delta log (minutes of data) held on Fluss is unioned with the lakehouse data, so queries see real-time data.
292295

293-
**How Union Read Works:**
296+
### How Union Read Works
294297

295-
```
296-
┌─────────────────────────────────────────────────┐
297-
│ 1. Client queries Fluss CoordinatorServer │
298-
│ for table metadata │
299-
│ │
300-
│ 2. Coordinator responds: │
301-
│ - Iceberg snapshot ID: snap_12345 │
302-
│ - Fluss offset boundary per bucket: │
303-
│ {bucket_0: offset_5000, │
304-
│ bucket_1: offset_5123, ...} │
305-
│ │
306-
│ 3. Client query execution: │
307-
│ │
308-
│ For streaming queries: │
309-
│ ├─ Read Iceberg (offsets 0 to 5000) │
310-
│ │ - Efficient historical catch-up │
311-
│ │ - Columnar Parquet scans │
312-
│ └─ Transition to Fluss (offset 5001+) │
313-
│ - Real-time streaming │
314-
│ - No overlap, no gaps │
315-
│ │
316-
│ For batch queries: │
317-
│ ├─ Primary read: Iceberg snapshot │
318-
│ └─ Supplement: Last few minutes from Fluss │
319-
│ - Second-level freshness │
320-
└─────────────────────────────────────────────────┘
321-
```
298+
![Union Read Architecture](assets/fluss-x-iceberg/fluss-unionread.png)
299+
300+
Union Read seamlessly combines hot and cold data through intelligent offset coordination, as illustrated above:
301+
302+
**The Example:** Consider a query that needs records for users Jark, Mehul, and Yuxia:
303+
304+
1. **Offset Coordination:** Fluss CoordinatorServer provides Snapshot 06 as the Iceberg boundary. At this snapshot, Iceberg contains `{Jark: 30, Yuxia: 20}`.
305+
306+
2. **Hot Data Supplement:** Fluss's real-time layer holds the latest updates beyond the snapshot: `{Jark: 30, Mehul: 20, Yuxia: 20}` (including Mehul's new record).
307+
308+
3. **Union Read in Action:** The query engine performs a union read:
309+
- Reads `{Jark: 30, Yuxia: 20}` from Iceberg (Snapshot 06)
310+
- Supplements with `{Mehul: 20}` from Fluss (new data after the snapshot)
311+
312+
4. **Sort Merge:** Results are merged and deduplicated, producing the final unified view: `{Jark: 30, Mehul: 20, Yuxia: 20}` (Jark's and Yuxia's records come from Iceberg; Mehul's comes from Fluss).
313+
314+
**Key Benefit:** The application queries a single logical table while the system intelligently routes between Iceberg (historical) and Fluss (real-time) with zero gaps or overlaps.
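
The example can be traced with a few lines of Python. This is a minimal sketch using plain dictionaries, not the Fluss client API: the function name `union_read` is hypothetical, and it assumes a primary-key table where a hot record replaces a cold record with the same key.

```python
def union_read(iceberg_snapshot, fluss_delta):
    """Merge a cold snapshot with hot changes keyed by primary key.

    On a key conflict the hot (Fluss) value wins, because it is newer
    than the snapshot. The result is sorted by key.
    """
    merged = dict(iceberg_snapshot)   # start from the cold tier
    merged.update(fluss_delta)        # overlay changes made after the snapshot
    return dict(sorted(merged.items()))


iceberg_snapshot = {"Jark": 30, "Yuxia": 20}   # contents at Snapshot 06
fluss_delta = {"Mehul": 20}                    # new data after the snapshot
result = union_read(iceberg_snapshot, fluss_delta)
print(result)
```

The merged result contains one record per key for all three users, which is the single logical table the application sees.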
322315

323316
**Union Read Capabilities:**
324317

@@ -353,24 +346,19 @@ Fluss coordinator persists this mapping. When clients query, they receive the ex
353346
- **No gaps:** Fluss handles offsets > boundary
354347
- **Total ordering preserved** per bucket
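
The per-bucket boundary rule can be sketched as follows. This is an illustration under assumptions rather than the real client logic: `Record` and `read_bucket` are hypothetical names, and the sketch assumes the Iceberg side covers offsets up to and including the boundary while Fluss serves everything after it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    offset: int
    value: str


def read_bucket(boundary, iceberg_records, fluss_records):
    """Stitch one bucket: cold records with offset <= boundary,
    then hot records with offset > boundary, in offset order."""
    cold = sorted((r for r in iceberg_records if r.offset <= boundary),
                  key=lambda r: r.offset)
    hot = sorted((r for r in fluss_records if r.offset > boundary),
                 key=lambda r: r.offset)
    return cold + hot


# The hot tier may still retain records that were already tiered (offsets 1-2).
iceberg_records = [Record(0, "a"), Record(1, "b"), Record(2, "c")]
fluss_records = [Record(1, "b"), Record(2, "c"), Record(3, "d"), Record(4, "e")]
offsets = [r.offset for r in read_bucket(2, iceberg_records, fluss_records)]
print(offsets)
```

Because each offset falls on exactly one side of the boundary, every record is read once and the bucket's order is preserved.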
355348

356-
![Union Read](assets/fluss-x-iceberg/fluss-union-read.png)
357349

350+
## Architecture Benefits
358351

352+
### Cost-Efficient Storage
353+
![Historical Analysis](assets/fluss-x-iceberg/fluss-lake-history.png)
359354

360-
## Architecture Benefits
355+
Automatic tiering optimizes storage and analytics: efficient backfill, projection/filter pushdown, high Parquet compression, and S3 throughput.
361356

362-
| Capability | How Fluss × Iceberg Delivers |
363-
|------------|------------------------------|
364-
| Sub-second enrichment | Lookup joins on PK tables (500K+ QPS) |
365-
| Real-time flagging | Streaming INSERT with computed risk scores |
366-
| Historical analysis | Union read combines Fluss (hot) + Iceberg (cold) |
367-
| Fast investigation | Point queries on dimension tables by PK |
368-
| Profile updates | UPDATE on PK tables, immediately reflected |
369-
| Cost-efficient storage | Hot data in Fluss, cold data tiers to Iceberg |
357+
### Real-Time Analytics
358+
![Real-time Analytics](assets/fluss-x-iceberg/fluss-lake-realtime.png)
370359

371-
![Historical Analysis](assets/fluss-x-iceberg/fluss-lakehouse-historical.png)
360+
Union Read delivers sub-second lakehouse freshness: union delta log on Fluss, Arrow-native exchange, and seamless integration with Flink, Spark *, Trino, and StarRocks.
372361

373-
![Real-time Analytics](assets/fluss-x-iceberg/fluss-lakehouse-realtime.png)
374362

375363
### Key Takeaways
376364

@@ -388,7 +376,7 @@ This gives you a working streaming lakehouse environment in minutes. Visit: [htt
388376

389377
Apache Fluss and Apache Iceberg represent a fundamental rethinking of real-time lakehouse architecture. Instead of forcing Iceberg to become a streaming platform (which architecturally it was never designed to be), Fluss embraces Iceberg for its strengths—cost-efficient analytical storage with ACID guarantees—while adding the missing hot streaming layer.
390378

391-
The result is a Kappa architecture that delivers:
379+
The result is a Streamhouse that delivers:
392380

393381
- **Sub-second query latency** for real-time workloads
394382
- **Second-level freshness** for analytical queries (versus T+1 hour)
-731 KB
Binary file not shown.
213 KB
Loading
381 KB
Loading
402 KB
Loading
-66.4 KB
Binary file not shown.
-219 KB
Binary file not shown.
-62.8 KB
Binary file not shown.
-65.5 KB
Binary file not shown.
1010 KB
Loading