Duplicate dispatcher can be created during maintainer failover before orphan dispatcher drains #5083

During maintainer failover, TiCDC may create duplicate dispatchers for the same table span and `startTs`. One dispatcher is owned by the new maintainer; the other appears to come from a delayed create request issued by the previous maintainer, which leaves it orphaned. The new maintainer removes the orphan once it observes it, but by then the orphan may already have entered the `Working` state and pushed DMLs into the sink, causing downstream write conflicts.

In the incident around `2026-05-11 23:08:45`, new maintainer `fba6e585-1710-4b74-9102-fcae45514fff` bootstrapped with `nodeCount=1`, `spanCount=17`, and checkpoint/startTs `466233824429998495`.

For tableID `121` (`workload.sbtest20`), the current maintainer created dispatcher `36949683455125699276247693423456213681`. Another dispatcher, `106138415981049979221359258994415068339`, was later created for the same full span and the same `startTs`, but no corresponding maintainer span or operator was found. The maintainer then logged `no span found, remove it` while the dispatcher still had `tableProgressLen=378`.

For tableID `142` (`workload.sbtest27`), the same pattern occurred between dispatcher `354956623857882180211263781318113896091` and orphan dispatcher `1172258264747904389658988347929528985`; the orphan had `tableProgressLen=380` when removal started.

Shortly after both duplicate dispatcher pairs completed their handshakes, TiCDC reported downstream `Error 9007 Write conflict` and retried the DMLs.

Expected behavior: for one changefeed/mode/table span, only one active dispatcher should be able to write to the downstream sink during failover.
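The invariant above can be stated as a check over the set of active dispatchers. The following is a minimal sketch, not TiCDC code: `DispatcherInfo`, `SpanKey`, and `FindDuplicateWriters` are hypothetical names, and spans are simplified to string start/end keys.

```go
package main

// SpanKey identifies one writable unit: changefeed, mode, and table span.
// Hypothetical type for illustration; not part of TiCDC.
type SpanKey struct {
	ChangefeedID string
	Mode         int
	TableID      int64
	StartKey     string
	EndKey       string
}

// DispatcherInfo is a simplified, hypothetical view of an active dispatcher.
type DispatcherInfo struct {
	ID   string
	Span SpanKey
}

// FindDuplicateWriters groups dispatchers by span and returns only the spans
// claimed by more than one dispatcher. An empty result means the
// single-writer invariant holds for the given set.
func FindDuplicateWriters(active []DispatcherInfo) map[SpanKey][]string {
	bySpan := make(map[SpanKey][]string)
	for _, d := range active {
		bySpan[d.Span] = append(bySpan[d.Span], d.ID)
	}
	dups := make(map[SpanKey][]string)
	for span, ids := range bySpan {
		if len(ids) > 1 {
			dups[span] = ids
		}
	}
	return dups
}
```

Applied to the incident, the two dispatchers for tableID `121` would share one `SpanKey` and be reported together, as would the pair for tableID `142`.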

Suspected cause: bootstrap only reconstructs state from alive dispatcher managers. A delayed create request from the previous maintainer can still be processed by the dispatcher manager after the new maintainer has already recreated the same table span. There is no global fence that prevents the orphan dispatcher from writing before it is detected and drained.
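One possible direction is an epoch fence on the dispatcher manager side. The sketch below is an assumption about how such a fence could look, not a description of existing TiCDC behavior: `Fence`, `Advance`, `Admit`, and the notion of a per-maintainer epoch number are all hypothetical. It also only protects a dispatcher manager that has already heard from the new maintainer, so it would not by itself be a global fence.

```go
package main

import (
	"errors"
	"sync"
)

// ErrStaleMaintainer is returned for requests tagged with an outdated epoch.
var ErrStaleMaintainer = errors.New("request from stale maintainer epoch")

// Fence remembers the highest maintainer epoch seen for one changefeed.
// Hypothetical type for illustration; not part of TiCDC.
type Fence struct {
	mu    sync.Mutex
	epoch uint64
}

// Advance records a newer epoch, for example when a bootstrap request from a
// newly elected maintainer arrives. Older epochs are ignored.
func (f *Fence) Advance(epoch uint64) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if epoch > f.epoch {
		f.epoch = epoch
	}
}

// Admit decides whether a create-dispatcher request may proceed. Requests
// carrying an epoch lower than the highest one seen are rejected, so a
// delayed request from the previous maintainer cannot start a writer.
func (f *Fence) Admit(reqEpoch uint64) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if reqEpoch < f.epoch {
		return ErrStaleMaintainer
	}
	f.epoch = reqEpoch
	return nil
}
```

With this shape, a create request delayed from the previous maintainer (epoch 1) would be rejected once the manager has seen the new maintainer's bootstrap (epoch 2), instead of creating a dispatcher that reaches `Working` before cleanup.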
