@swuferhong
Contributor

Purpose

Linked issue: #749

Currently, Fluss relies on the HighWatermark to keep the leader and followers in sync. However, this approach can cause data loss or data inconsistency during cluster upgrades or when tabletServers crash. Following Kafka's KIP-101, we plan to replace HighWatermark-based synchronization in Fluss with a leader epoch mechanism.

To achieve this goal, we need to build a consistent LeaderEpochCache across tabletServers. The cache is updated on the fetch path: a fetchLogRequest pulls RecordBatches from the leader, and the LeaderEpoch read from each batch is used to update the cache.
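
To illustrate the idea, here is a minimal sketch of such a cache, loosely modeled on the epoch-to-start-offset mapping described in KIP-101. The class name `LeaderEpochCacheSketch`, its methods, and the `-1` sentinel are hypothetical and are not taken from the Fluss code base.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a leader epoch cache; not the Fluss implementation. */
final class LeaderEpochCacheSketch {
    /** Sentinel returned when the end offset of an epoch cannot be determined. */
    static final long UNKNOWN_OFFSET = -1L;

    // Maps each leader epoch to the first offset written under that epoch.
    private final TreeMap<Integer, Long> startOffsets = new TreeMap<>();

    /** Called for each fetched batch; only the first offset of a new, higher epoch is recorded. */
    void assign(int leaderEpoch, long baseOffset) {
        Map.Entry<Integer, Long> last = startOffsets.lastEntry();
        if (last == null || leaderEpoch > last.getKey()) {
            startOffsets.put(leaderEpoch, baseOffset);
        }
    }

    /**
     * End offset of the requested epoch: the start offset of the next higher epoch,
     * or the log end offset if the requested epoch is the latest one.
     * Simplified: epochs newer than the latest known epoch yield UNKNOWN_OFFSET.
     */
    long endOffsetFor(int requestedEpoch, long logEndOffset) {
        Map.Entry<Integer, Long> last = startOffsets.lastEntry();
        if (last == null || requestedEpoch > last.getKey()) {
            return UNKNOWN_OFFSET;
        }
        if (requestedEpoch == last.getKey()) {
            return logEndOffset;
        }
        // requestedEpoch is below the latest epoch, so a higher entry always exists.
        return startOffsets.higherEntry(requestedEpoch).getValue();
    }
}
```

In a KIP-101 style protocol, a restarting follower would ask the leader for the end offset of its own latest epoch and truncate its log to that offset (or to its own log end offset, whichever is smaller) instead of truncating to the HighWatermark.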

Currently, LogRecordBatch does not carry a LeaderEpoch, so the field needs to be introduced and the magic version of LogRecordBatch bumped accordingly.
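
As a rough illustration of why the magic version has to change, the sketch below writes and reads a simplified batch header in which the leader epoch exists only from magic V1 onward. The field layout, the class `BatchHeaderSketch`, and the constants are assumptions made for this example; they do not reflect the actual LogRecordBatch layout in Fluss.

```java
import java.nio.ByteBuffer;

/** Hypothetical header layout: baseOffset (8 bytes), magic (1 byte), leaderEpoch (4 bytes, V1+ only). */
final class BatchHeaderSketch {
    static final byte MAGIC_V0 = 0;
    static final byte MAGIC_V1 = 1;
    static final int NO_LEADER_EPOCH = -1;

    private static final int MAGIC_POSITION = 8;
    private static final int LEADER_EPOCH_POSITION = 9;

    /** Serializes the header; the leaderEpoch argument is ignored for V0 batches. */
    static ByteBuffer writeHeader(byte magic, long baseOffset, int leaderEpoch) {
        checkMagic(magic);
        int size = LEADER_EPOCH_POSITION + (magic >= MAGIC_V1 ? Integer.BYTES : 0);
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putLong(baseOffset);
        buffer.put(magic);
        if (magic >= MAGIC_V1) {
            buffer.putInt(leaderEpoch);
        }
        buffer.flip();
        return buffer;
    }

    /** Reads the leader epoch, falling back to NO_LEADER_EPOCH for batches written with V0. */
    static int readLeaderEpoch(ByteBuffer header) {
        byte magic = header.get(MAGIC_POSITION);
        checkMagic(magic);
        return magic >= MAGIC_V1 ? header.getInt(LEADER_EPOCH_POSITION) : NO_LEADER_EPOCH;
    }

    private static void checkMagic(byte magic) {
        if (magic < MAGIC_V0 || magic > MAGIC_V1) {
            throw new IllegalArgumentException("Unsupported magic value: " + magic);
        }
    }
}
```

Branching on the magic byte lets a reader keep decoding batches written before the upgrade while treating their leader epoch as unknown.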

Brief change log

Tests

API and Format

Documentation

@swuferhong swuferhong force-pushed the logRecordBatch-add-leaderEpoch branch 3 times, most recently from 1a1693f to e8b5338 Compare April 22, 2025 01:42
@swuferhong swuferhong marked this pull request as draft April 22, 2025 03:01
@swuferhong swuferhong linked an issue Apr 24, 2025 that may be closed by this pull request
2 tasks
@swuferhong swuferhong force-pushed the logRecordBatch-add-leaderEpoch branch 2 times, most recently from ea098f3 to 875f12f Compare April 24, 2025 07:43
@swuferhong swuferhong marked this pull request as ready for review April 24, 2025 07:46
@swuferhong
Contributor Author

@wuchong pr ready, could you take a look at this pr? thx.

@swuferhong swuferhong requested a review from wuchong April 25, 2025 02:16
@swuferhong swuferhong force-pushed the logRecordBatch-add-leaderEpoch branch 2 times, most recently from e3dd71e to 03c9bbe Compare August 11, 2025 03:00
Contributor

@platinumhamburg platinumhamburg left a comment

This PR LGTM. However, there are some minor issues: several test cases hardcode the CURRENT_LOG_MAGIC_VALUE magic value. It would be better to parameterize these related test cases.

@swuferhong
Contributor Author

@platinumhamburg comments addressed

@swuferhong swuferhong force-pushed the logRecordBatch-add-leaderEpoch branch 2 times, most recently from 3495486 to 3b33f5f Compare August 29, 2025 03:52
@polyzos polyzos force-pushed the main branch 3 times, most recently from d88c76c to 434a4f4 Compare August 31, 2025 15:13
Member

@wuchong wuchong left a comment

I pushed a commit to address the above comments. Please review the changes. @swuferhong

@wuchong wuchong force-pushed the logRecordBatch-add-leaderEpoch branch from 3b33f5f to b60e2a1 Compare September 9, 2025 12:42
@swuferhong
Contributor Author

@wuchong LGTM, but the CI has failed.

@polyzos
Contributor

polyzos commented Sep 9, 2025

I think this PR also affects this one a lot, right?

@wuchong
Member

wuchong commented Sep 10, 2025

@polyzos no, this PR doesn't affect #1605. We don't need to upgrade the magic version for the new record format.

@wuchong wuchong merged commit a25c64b into apache:main Sep 10, 2025
8 of 9 checks passed
polyzos pushed a commit to polyzos/fluss that referenced this pull request Sep 10, 2025
…leaderEpoch (apache#778)

Co-authored-by: Jark Wu
# Conflicts:
#	fluss-client/src/test/java/org/apache/fluss/client/write/IndexedLogWriteBatchTest.java
#	fluss-common/src/main/java/org/apache/fluss/record/MemoryLogRecordsIndexedBuilder.java
swuferhong added a commit to luoyuxia/fluss that referenced this pull request Sep 17, 2025
polyzos pushed a commit to polyzos/fluss that referenced this pull request Sep 21, 2025
polyzos pushed a commit to polyzos/fluss that referenced this pull request Sep 21, 2025
loserwang1024 added a commit to loserwang1024/fluss that referenced this pull request Sep 26, 2025
leosanqing pushed a commit to leosanqing/fluss that referenced this pull request Sep 29, 2025
@swuferhong swuferhong deleted the logRecordBatch-add-leaderEpoch branch December 18, 2025 11:58

Development

Successfully merging this pull request may close these issues.

Bump LogRecordBatch's CURRENT_LOG_MAGIC_VALUE to V1 to support leaderEpoch

4 participants