owner: invalid monotonicity assumption on pullerResolvedTs emits false warnings #12598

### What did you do?

I investigated the owner warning below in the current codebase:

```text
the newPullerResolvedTs should not be smaller than c.pullerResolvedTs
```

The warning is emitted in `cdc/owner/changefeed.go` when
`watermark.PullerResolvedTs < c.pullerResolvedTs`.

### What did you expect to see?

Either:

1. `PullerResolvedTs` is defined as a monotonic value end-to-end, so this warning never fires in normal scheduling paths, or
2. `PullerResolvedTs` is a snapshot of the current scheduler state, and the owner does not assume that it can only increase.

### What did you see instead?

The owner keeps `c.pullerResolvedTs` as a "grow-only" value, but the scheduler computes
`watermark.PullerResolvedTs` as the **current minimum** `puller-egress` resolved ts among all replication sets.

That scheduler-side value can legitimately become smaller in normal cases, for example:

1. A new table/span is added and its stage checkpoints are initialized from a lower checkpoint ts.
2. A table is rescheduled or recovered after a capture failure, so the puller subscription is recreated and the new stage stats start from a lower value.
3. A paused/resumed or recovered changefeed reuses the same owner-side `changefeed` instance while the cached `pullerResolvedTs` from the previous run is still kept in memory.

As a result, the warning can be emitted even though there is no real timestamp regression bug in the puller.

### Root cause analysis

There is a semantic mismatch between owner and scheduler:

- In `cdc/owner/changefeed.go`, `c.pullerResolvedTs` is updated as a monotonic cached value:
  - increase: assign
  - decrease: warn only
- In `cdc/scheduler/internal/v3/replication/replication_manager.go`, `watermark.PullerResolvedTs` is recalculated every tick as:
  - `min(table.Stats.StageCheckpoints["puller-egress"].ResolvedTs)` over the **current** table set

That minimum is not monotonic by definition.
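
The mismatch can be reproduced in isolation. The following is a minimal sketch, not the actual TiCDC code: `span`, `minPullerResolvedTs`, and `growOnlyCache` are hypothetical names, the empty-set return value of `0` is an assumption, and the update branches only mirror the "increase: assign, decrease: warn only" behavior described above.

```go
package main

import "fmt"

// span is a hypothetical stand-in for one replication set; only the
// puller-egress resolved ts matters for this sketch.
type span struct {
	name           string
	pullerEgressTs uint64
}

// minPullerResolvedTs mimics the scheduler side: the minimum over the
// current set of spans, recomputed from scratch on every tick.
// Returning 0 for an empty set is an assumption made for this sketch.
func minPullerResolvedTs(spans []span) uint64 {
	if len(spans) == 0 {
		return 0
	}
	m := spans[0].pullerEgressTs
	for _, s := range spans[1:] {
		if s.pullerEgressTs < m {
			m = s.pullerEgressTs
		}
	}
	return m
}

// growOnlyCache mimics the owner side: assign on increase, warn on decrease.
type growOnlyCache struct {
	pullerResolvedTs uint64
	warnings         int // counts how often the warning branch would fire
}

func (c *growOnlyCache) update(newTs uint64) {
	if newTs > c.pullerResolvedTs {
		c.pullerResolvedTs = newTs
	} else if newTs < c.pullerResolvedTs {
		c.warnings++
	}
}

func main() {
	owner := &growOnlyCache{}
	spans := []span{{"t1", 100}, {"t2", 120}}
	owner.update(minPullerResolvedTs(spans)) // minimum is 100

	// A new span starts from a lower checkpoint, so the minimum drops to 80.
	spans = append(spans, span{"t3", 80})
	owner.update(minPullerResolvedTs(spans))

	fmt.Println(owner.pullerResolvedTs, owner.warnings) // prints "100 1"
}
```

No span regressed in this run. The minimum dropped only because the set it is taken over grew, yet the grow-only cache counts it as a regression.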

Two implementation details make the problem easier to hit:

1. `ReplicationSet.Checkpoint` is merged monotonically, but `ReplicationSet.Stats` is replaced as a whole when fresh stats are reported. So a recreated table can keep a non-regressing table checkpoint while still reporting a lower `puller-egress` stage checkpoint.
2. `releaseResources` resets `resolvedTs`, but it does not reset `lastSyncedTs` or `pullerResolvedTs`, even though the `changefeed` struct is reused on restart/resume.
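
The first detail can be illustrated with a small model. This is a sketch under stated assumptions, not the real `ReplicationSet` code: the types, field names, and `applyReport` are hypothetical, and only the merge-versus-replace behavior is taken from the description above.

```go
package main

import "fmt"

// checkpoint is a hypothetical pair of timestamps reported for a table or stage.
type checkpoint struct {
	checkpointTs uint64
	resolvedTs   uint64
}

// stats is a hypothetical container for per-stage checkpoints.
type stats struct {
	stageCheckpoints map[string]checkpoint
}

// replicationSet models the two fields discussed above.
type replicationSet struct {
	checkpoint checkpoint
	stats      stats
}

// applyReport merges the table checkpoint monotonically but replaces the
// stats as a whole, which is the combination that lets a stage value drop.
func (r *replicationSet) applyReport(cp checkpoint, st stats) {
	if cp.checkpointTs > r.checkpoint.checkpointTs {
		r.checkpoint.checkpointTs = cp.checkpointTs
	}
	if cp.resolvedTs > r.checkpoint.resolvedTs {
		r.checkpoint.resolvedTs = cp.resolvedTs
	}
	r.stats = st // wholesale replacement, no monotonic merge
}

func main() {
	rs := &replicationSet{}
	rs.applyReport(checkpoint{90, 100},
		stats{map[string]checkpoint{"puller-egress": {90, 100}}})

	// After rescheduling, the recreated puller reports from a lower value.
	rs.applyReport(checkpoint{85, 95},
		stats{map[string]checkpoint{"puller-egress": {85, 95}}})

	fmt.Println(rs.checkpoint.resolvedTs,
		rs.stats.stageCheckpoints["puller-egress"].resolvedTs) // prints "100 95"
}
```

The table-level resolved ts stays at 100, while the `puller-egress` stage value that feeds the scheduler minimum falls to 95.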

This means the warning is triggered by an invalid monotonicity assumption in owner, not necessarily by corrupted puller progress.

### Why this matters

This is not just a noisy warning:

- it can mislead operators into thinking the puller resolved ts regressed unexpectedly
- `QueryChangeFeedSyncedStatus` also exposes `cfReactor.pullerResolvedTs`, so a stale monotonic cache can report a value that diverges from the current scheduler snapshot

### Suggested direction

One of these semantics should be chosen explicitly:

1. Treat owner-side `pullerResolvedTs` as the latest scheduler snapshot and allow it to go backward.
2. Keep a separate monotonic field for synced-status style reporting, but do not compare it directly with the current scheduler minimum and do not warn on snapshot regression.

Additionally, `pullerResolvedTs` and `lastSyncedTs` should probably be reset when reusing a `changefeed` instance during resume/reinitialize.
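
One possible shape for this is sketched below. It combines both options purely for illustration and is not a proposed patch: `ownerState`, `onWatermark`, `reset`, and the `maxPullerResolvedTs` field are hypothetical names that do not exist in the codebase.

```go
package main

import "fmt"

// ownerState is a hypothetical slice of the owner-side changefeed state.
type ownerState struct {
	pullerResolvedTs    uint64 // latest scheduler snapshot; may go backward
	maxPullerResolvedTs uint64 // hypothetical grow-only high-water mark for reporting
	lastSyncedTs        uint64
}

// onWatermark stores the snapshot as-is and tracks the high-water mark
// separately. Nothing is compared against the snapshot, so nothing warns.
func (s *ownerState) onWatermark(snapshot uint64) {
	s.pullerResolvedTs = snapshot
	if snapshot > s.maxPullerResolvedTs {
		s.maxPullerResolvedTs = snapshot
	}
}

// reset clears all cached progress; it is meant to run whenever the same
// instance is reused for a resumed or reinitialized changefeed.
func (s *ownerState) reset() {
	*s = ownerState{}
}

func main() {
	s := &ownerState{}
	s.onWatermark(100)
	s.onWatermark(80) // snapshot regression is accepted silently
	fmt.Println(s.pullerResolvedTs, s.maxPullerResolvedTs) // prints "80 100"

	s.reset()
	fmt.Println(s.pullerResolvedTs, s.maxPullerResolvedTs, s.lastSyncedTs) // prints "0 0 0"
}
```

Which of the two fields `QueryChangeFeedSyncedStatus` should read is exactly the semantic choice this issue asks to make explicit.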

### Version

Observed by code inspection of a local checkout at `cf888cb0f26881469dd25307dfa721eea91c0c03`, whose logic is equivalent to current `master`.

