KafkaMirrormaker2 offset management: How does it work? #10146

MandMaric · 2024-05-22T12:29:06Z

MandMaric
May 22, 2024

I would like to understand MirrorMaker 2 (mm2) offset management a bit better.
In our case we have the topic mm2-offset-syncs.{target-cluster}.internal in our source cluster and the topic mirrormaker2-cluster-offsets in our target cluster.

Why are there more commits in the topic that maps the offsets between both clusters? Our mirrormaker2-cluster-offsets saves the offset for each topic-partition approximately once an hour. Wouldn't this cause a lot of re-consuming if our mm2 dies? Is that setting configurable?

Answered by mimaison

May 23, 2024


scholzj · 2024-05-22T15:04:23Z

scholzj
May 22, 2024
Maintainer

I'm not sure if Apache Kafka has any good docs explaining it. @mimaison might know if there is something like that.

0 replies

mimaison · 2024-05-23T19:44:11Z

mimaison
May 23, 2024
Collaborator

Let me start by explaining the role of these 2 topics and the data they each contain.

Let's start with mirrormaker2-cluster-offsets. This is the offsets topic used by the Kafka Connect runtime that runs the MirrorMaker connectors. The Kafka Connect runtime uses it to periodically store offsets from source connectors, so if a source connector is stopped (or crashes) it can resume from its last saved position in the source system. It is created in the cluster that the Kafka Connect runtime is connected to, typically the target cluster. In MirrorMaker, only MirrorSourceConnector uses that mechanism to restore its position when it restarts. If you enable exactly-once semantics, this connector is able to restart exactly where it left off and not duplicate or skip records.
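To make the restart behavior concrete, here is a minimal sketch of a source task that commits its position periodically and resumes from the last committed offset. It is a toy model, not Kafka Connect code: `OffsetStore`, `run_task`, `commit_every` and `crash_at` are hypothetical names, commits are triggered by record count rather than by elapsed time, and exactly-once semantics are assumed to be disabled.

```python
from dataclasses import dataclass, field


@dataclass
class OffsetStore:
    """Stand-in for the Connect offsets topic: last committed position per partition."""
    committed: dict = field(default_factory=dict)

    def commit(self, partition: str, next_offset: int) -> None:
        self.committed[partition] = next_offset

    def resume_position(self, partition: str) -> int:
        # With nothing committed yet, start from the beginning of the partition.
        return self.committed.get(partition, 0)


def run_task(store, partition, log_end, commit_every, crash_at=None):
    """Copy offsets [resume position, log_end) and return the offsets copied.

    commit_every: hypothetical knob, commit after this many copied records.
    crash_at: simulate the task dying right after copying this offset.
    """
    copied = []
    position = store.resume_position(partition)
    since_commit = 0
    while position < log_end:
        copied.append(position)
        position += 1
        since_commit += 1
        if since_commit == commit_every:
            store.commit(partition, position)
            since_commit = 0
        if crash_at is not None and copied[-1] == crash_at:
            return copied  # crash: progress since the last commit is not recorded
    store.commit(partition, position)
    return copied


store = OffsetStore()
first = run_task(store, "orders-0", log_end=100, commit_every=40, crash_at=94)
second = run_task(store, "orders-0", log_end=100, commit_every=40)
duplicates = sorted(set(first) & set(second))
print(second[0], len(duplicates))  # 80 15
```

In this model the number of re-copied records after a crash is bounded by how much was copied since the last commit, which is why a long commit interval translates into more duplicates on restart.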

The other topic, mm2-offset-syncs.{target-cluster}.internal, is specific to MirrorMaker and is used to translate offsets between the source and target clusters. By default this topic is created in the source cluster, but you can opt to put it in the target cluster using offset-syncs.topic.location. MirrorSourceConnector writes data into this topic, and MirrorCheckpointConnector reads it to translate consumer group offsets from the source cluster. The mapping between a source offset and a target offset is called an offset-sync.

The frequency at which MirrorSourceConnector writes offset-syncs has changed several times over the past few releases (and is still being worked on), so the exact behavior you see may vary depending on the version you are currently running. By default a new offset-sync is emitted at least every offset.lag.max records (defaults to 100) or whenever the gap between the source and target offsets changes. In some cases, like many small transactions in the source topics, or topics with high record rates, this can result in a lot of offset-syncs.

This topic uses the compact cleanup policy, so in most cases its size should stay bounded and not grow too large. If the offset-syncs topic grows too large, you can increase offset.lag.max (note, however, that this may reduce the accuracy of the offset translation) or make compaction more aggressive using the min.cleanable.dirty.ratio topic configuration.
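The emission rule described above can be sketched as a small simulation. This is a simplified model of the rule as stated in this reply, not the actual MirrorSourceConnector logic (which has varied between releases); `emit_offset_syncs` and `max_lag` are hypothetical names, and the transactional example assumes each transaction's commit marker occupies one source offset that is not mirrored.

```python
def emit_offset_syncs(records, max_lag=100):
    """Return the (source_offset, target_offset) pairs emitted as offset-syncs.

    records: (source_offset, target_offset) pairs for one partition, in order.
    max_lag: emit a sync once this many source offsets have passed since the last one.
    """
    syncs = []
    last_src = last_tgt = None
    for src, tgt in records:
        if last_src is None:
            emit = True  # first record: there is nothing to translate from yet
        else:
            gap_changed = (tgt - src) != (last_tgt - last_src)
            too_far = (src - last_src) >= max_lag
            emit = gap_changed or too_far
        if emit:
            syncs.append((src, tgt))
            last_src, last_tgt = src, tgt
    return syncs


# Plain topic: source and target offsets advance in lockstep.
plain = [(i, i) for i in range(250)]
print(emit_offset_syncs(plain))  # [(0, 0), (100, 100), (200, 200)]

# Small transactions: every 6th source offset is a commit marker that is
# not mirrored, so the gap changes at the start of each transaction.
transactional = []
tgt = 0
for src in range(60):
    if src % 6 == 5:
        continue  # commit marker, nothing written to the target
    transactional.append((src, tgt))
    tgt += 1
print(len(emit_offset_syncs(transactional)))  # 10
```

The second case shows why many small transactions produce many offset-syncs: 50 mirrored records yield 10 syncs, while 250 records from the plain topic yield only 3.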

8 replies

mimaison Apr 4, 2025
Collaborator

Sorry, I don't understand what you mean. Can you rephrase your question?
Also if you have a different question than the one asked by the original author, it's best to start a new discussion.

czandra Apr 15, 2025

@mimaison could you please advise on offset migration?
I'm mirroring messages in the scheme B -> A, where B (target cluster) is the cluster my producer is writing messages to and A (old source cluster) is the one my consumer is reading from.
Message mirroring B -> A is working, but offset updating A -> B is not.
In addition, I have created my own ReplicationPolicy class, so I'm skipping the topic rename scheme configuration on cluster A.
Is there any option to mirror offsets without sending messages?
I think the problem is that there is no topic mapping, but I don't know how to deliver such information to the checkpoint connector.

mimaison Apr 15, 2025
Collaborator

The checkpoints connector relies on the source connector to build offset mappings. So you can't just mirror offsets from A to B without also mirroring records with the source connector.

Again, as I said in my last reply above, this seems to be a different question from the one asked by the original author, so it's best to start a new discussion if you want more details.

dinesh-murugiah Jun 24, 2025

@mimaison Your explanation provides good insights. I am planning to use MirrorMaker to migrate my Kafka cluster to Strimzi.
I have a question about the accuracy of the offset translation. As per this talk:
https://current.confluent.io/2024-sessions/mirrormaker-2s-offset-translation-isnt-exactly-once-and-thats-okay
my understanding is that the translation will not be accurate, so when I switch over there is a possibility that my consumer will read old data.
If that's a problem for my application, is it advisable to stop the consumer and run scripts to manually sync the offsets?
Do you have any reference to such scripts?

NOTE: I am moving from self-managed Apache Kafka running on AWS EC2 to Strimzi.

mimaison Jun 25, 2025
Collaborator

Offset translation is not perfect but if your mirroring and consumers have low lag its accuracy is usually pretty good. The recommendation is still to be able to handle some reprocessing in your consumers.
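To illustrate why a translated offset can land behind the consumer's true position, here is a minimal sketch of translating a consumer group offset from a list of offset-syncs. The function name `translate_offset` is hypothetical, and the conservative rounding rule (step just past the nearest earlier sync instead of interpolating) is an assumption made for illustration; check the MirrorCheckpointConnector behavior of your Kafka version for the exact rule.

```python
import bisect


def translate_offset(syncs, group_offset):
    """Translate a source-cluster consumer offset to a target-cluster offset.

    syncs: list of (source_offset, target_offset) pairs sorted by source_offset.
    Returns None when no sync exists at or before group_offset.
    """
    source_offsets = [src for src, _ in syncs]
    i = bisect.bisect_right(source_offsets, group_offset) - 1
    if i < 0:
        return None  # nothing to translate from
    src, tgt = syncs[i]
    if group_offset == src:
        return tgt  # exact match: the translation is precise
    # Otherwise only "somewhere after this sync" is known, so resume just
    # past the synced record and accept re-reading what follows it.
    return tgt + 1


syncs = [(0, 0), (100, 100), (200, 200)]
print(translate_offset(syncs, 100))  # 100
print(translate_offset(syncs, 150))  # 101
print(translate_offset([], 5))       # None
```

In this model a consumer that had committed source offset 150 resumes at target offset 101 and re-reads 49 records, which is the kind of reprocessing consumers should be prepared to handle. The translated offset is never ahead of the true position, so records are repeated rather than skipped.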

If you want, you can manually sync the offsets, but I'm afraid neither Strimzi nor Kafka provides scripts to do so. You have to build them yourself and decide on their tradeoffs, such as accuracy or availability.

If you have other questions, I suggest starting a new discussion thread instead of adding comments to an already answered discussion.
