@prince286

This pull request adds two fault-tolerance features to MirrorMaker 2's MirrorSourceTask to protect data integrity in mission-critical deployments, plus a small environment fix in MirrorUtils.

The features address two risks: silent data loss when the source log is truncated, and service disruption when the source topic is reset.

1. Log Truncation Detection (Fail-Fast)

Problem: Kafka's retention policies can delete log segments before MM2 has consumed them. The consumer then resumes from a later offset than it expected and silently skips the deleted records.
Solution (in MirrorSourceTask.java):
An in-memory map tracks the last processed offset for each partition. If a newly polled offset is greater than the last processed offset + 1, the task treats the difference as a data gap.

  • Action: Logs a CRITICAL message and throws a ConnectException, so the task fails fast instead of continuing to lose data silently.
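
The check described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code from this PR: the class name OffsetGapDetector, the method check, and the string partition key are hypothetical, and IllegalStateException stands in for Connect's ConnectException so the sketch needs no Kafka dependency.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the PR's actual implementation.
class OffsetGapDetector {
    // Last processed offset per partition. The key is a plain "topic-partition"
    // string here; the real task would more likely key on TopicPartition.
    private final Map<String, Long> lastProcessed = new HashMap<>();

    // Records the polled offset, failing fast if it skips past lastProcessed + 1.
    // IllegalStateException is a stand-in for ConnectException.
    void check(String partition, long polledOffset) {
        Long last = lastProcessed.get(partition);
        if (last != null && polledOffset > last + 1) {
            throw new IllegalStateException(String.format(
                "CRITICAL: gap on %s: expected offset %d, polled %d",
                partition, last + 1, polledOffset));
        }
        // First record for this partition, or no forward gap: remember the offset.
        lastProcessed.put(partition, polledOffset);
    }
}
```

One design point to weigh with a strict `+ 1` comparison: offsets are not always contiguous on a healthy partition (compacted topics and transaction markers both leave gaps), so a check of this shape may also fire in those cases.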

2. Graceful Topic Reset Handling (Auto-Recovery)

Problem: Deleting and recreating the source topic restarts its offsets at 0, so the offset MM2 has stored is no longer valid.
Solution (in MirrorSourceTask.java):
The task detects when a newly polled offset is less than the last processed offset, which indicates a log reset or topic recreation.

  • Action: Logs an AUTO-RECOVERED event and does not throw an exception. The consumer resets to the earliest offset (0), relying on Connect's default behavior, so replication resumes immediately.
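
The reset path can be sketched the same way. Again, this is an assumption-laden illustration rather than the PR's code: TopicResetTracker and recordAndDetectReset are hypothetical names, and System.out stands in for the task's logger.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the PR's actual implementation.
class TopicResetTracker {
    private final Map<String, Long> lastProcessed = new HashMap<>();

    // Stores the polled offset and returns true when it is lower than the
    // previously processed offset, which suggests the topic was recreated.
    boolean recordAndDetectReset(String partition, long polledOffset) {
        // put() returns the previous value, or null for a new partition.
        Long last = lastProcessed.put(partition, polledOffset);
        boolean reset = last != null && polledOffset < last;
        if (reset) {
            // Log and continue; no exception is thrown on this path.
            System.out.printf("AUTO-RECOVERED: %s went from offset %d back to %d%n",
                partition, last, polledOffset);
        }
        return reset;
    }
}
```

Because the tracked offset is overwritten with the lower value, later records from the recreated topic are compared against the new baseline rather than the stale one.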

3. Environment Fix (MirrorUtils.java)

Problem: Single-node Connect deployments fail because Kafka 4.0.0 defaults to a replication factor of 3 for internal topics, which a single broker cannot satisfy.
Solution (in MirrorUtils.java):
The createCompactedTopic method now explicitly sets replicationFactor = 1, which keeps it compatible with single-broker test environments.

@github-actions bot added the labels triage, streams, core, producer, consumer, tools, connect, performance, kraft, mirror-maker-2, dependencies, storage, tiered-storage, KIP-932, build, docker, generator, transactions, clients, group-coordinator on Jan 23, 2026
@prince286 force-pushed the feature/mm2-fault-tolerance branch from 6fcfce7 to c4125f4 on January 29, 2026 18:41
@github-actions bot added the small label on Jan 29, 2026
@github-actions

A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.
