
add debug log #4771

Draft

asddongmen wants to merge 1 commit into pingcap:master from asddongmen:0408-fix-stuck

Conversation

@asddongmen
Collaborator

What problem does this PR solve?

Issue Number: close #xxx

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Signed-off-by: dongmen <414110582@qq.com>
@ti-chi-bot ti-chi-bot bot added the do-not-merge/needs-linked-issue, do-not-merge/work-in-progress, and release-note labels Apr 9, 2026
@ti-chi-bot

ti-chi-bot bot commented Apr 9, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot

ti-chi-bot bot commented Apr 9, 2026

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

@ti-chi-bot

ti-chi-bot bot commented Apr 9, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tenfyzhong for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Contributor

coderabbitai bot commented Apr 9, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9958d3bc-3d56-407e-87fe-403f799f84b9

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@ti-chi-bot ti-chi-bot bot added the size/XL label Apr 9, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces comprehensive logging across the syncpoint lifecycle, including dispatcher resets, preparation, and commit phases, to improve system observability. The review feedback highlights opportunities to optimize these changes by reducing log redundancy and preventing unnecessary memory allocations in performance-critical paths. Specifically, the reviewer suggests wrapping debug-level logs in conditional checks to avoid slice allocations when logging is disabled and consolidating overlapping info-level messages.

Comment on lines +227 to +241
if d.target.EnableSyncPoint() {
	req := resetRequest.DispatcherRequest
	log.Info("send reset dispatcher syncpoint request",
		zap.Stringer("changefeedID", d.target.GetChangefeedID()),
		zap.Stringer("dispatcher", d.getDispatcherID()),
		zap.Stringer("eventServiceID", serverID),
		zap.Uint64("epoch", epoch),
		zap.Uint64("resetTs", resetTs),
		zap.Uint64("checkpointTs", d.target.GetCheckpointTs()),
		zap.Uint64("lastEventCommitTs", d.lastEventCommitTs.Load()),
		zap.Uint64("syncPointTs", req.SyncPointTs),
		zap.Uint64("syncPointIntervalSeconds", req.SyncPointInterval),
		zap.Bool("skipSyncpointAtStartTs", d.target.GetSkipSyncpointAtStartTs()),
	)
}

medium

The new Info log added here is redundant with the existing Info log on line 244. Both logs are triggered during a dispatcher reset when syncpoint is enabled, and they share several fields (changefeedID, dispatcher, eventServiceID, epoch, resetTs). It is recommended to merge these logs into a single Info log with optional fields or change this new log to Debug level to reduce log noise.

Suggested change
if d.target.EnableSyncPoint() {
	req := resetRequest.DispatcherRequest
	log.Info("send reset dispatcher syncpoint request",
		zap.Stringer("changefeedID", d.target.GetChangefeedID()),
		zap.Stringer("dispatcher", d.getDispatcherID()),
		zap.Stringer("eventServiceID", serverID),
		zap.Uint64("epoch", epoch),
		zap.Uint64("resetTs", resetTs),
		zap.Uint64("checkpointTs", d.target.GetCheckpointTs()),
		zap.Uint64("lastEventCommitTs", d.lastEventCommitTs.Load()),
		zap.Uint64("syncPointTs", req.SyncPointTs),
		zap.Uint64("syncPointIntervalSeconds", req.SyncPointInterval),
		zap.Bool("skipSyncpointAtStartTs", d.target.GetSkipSyncpointAtStartTs()),
	)
}
if d.target.EnableSyncPoint() {
	req := resetRequest.DispatcherRequest
	log.Debug("send reset dispatcher syncpoint request",
		zap.Stringer("changefeedID", d.target.GetChangefeedID()),
		zap.Stringer("dispatcher", d.getDispatcherID()),
		zap.Stringer("eventServiceID", serverID),
		zap.Uint64("epoch", epoch),
		zap.Uint64("resetTs", resetTs),
		zap.Uint64("checkpointTs", d.target.GetCheckpointTs()),
		zap.Uint64("lastEventCommitTs", d.lastEventCommitTs.Load()),
		zap.Uint64("syncPointTs", req.SyncPointTs),
		zap.Uint64("syncPointIntervalSeconds", req.SyncPointInterval),
		zap.Bool("skipSyncpointAtStartTs", d.target.GetSkipSyncpointAtStartTs()),
	)
}

Comment on lines +573 to 590
if !ready {
	fields := []zap.Field{
		zap.Stringer("changefeedID", c.changefeedID),
		zap.Uint64("preparingTs", preparingTs),
	}
	if blockingFound {
		fields = append(fields,
			zap.Stringer("blockingDispatcherID", blockingDispatcherID),
			zap.Uint64("blockingSentResolvedTs", blockingSentResolvedTs),
			zap.Uint64("blockingCheckpointTs", blockingCheckpointTs),
			zap.Uint64("blockingNextSyncPointTs", blockingNextSyncPoint),
			zap.Uint64("blockingSeq", blockingSeq),
			zap.Uint64("blockingEpoch", blockingEpoch),
		)
	}
	log.Debug("syncpoint prepare stage blocked by dispatcher", fields...)
	return
}

medium

The construction of the fields slice and the subsequent append calls happen on every call to tryPromoteSyncPointToCommitIfReady when the syncpoint is blocked, even if the log level is higher than Debug. Since this function is called periodically (every 1 second) for every changefeed, this can lead to unnecessary allocations and GC pressure. Consider wrapping the logging logic in a check for the Debug level.

Suggested change
if !ready {
	fields := []zap.Field{
		zap.Stringer("changefeedID", c.changefeedID),
		zap.Uint64("preparingTs", preparingTs),
	}
	if blockingFound {
		fields = append(fields,
			zap.Stringer("blockingDispatcherID", blockingDispatcherID),
			zap.Uint64("blockingSentResolvedTs", blockingSentResolvedTs),
			zap.Uint64("blockingCheckpointTs", blockingCheckpointTs),
			zap.Uint64("blockingNextSyncPointTs", blockingNextSyncPoint),
			zap.Uint64("blockingSeq", blockingSeq),
			zap.Uint64("blockingEpoch", blockingEpoch),
		)
	}
	log.Debug("syncpoint prepare stage blocked by dispatcher", fields...)
	return
}
if !ready {
	if log.GetLevel() <= zap.DebugLevel {
		fields := []zap.Field{
			zap.Stringer("changefeedID", c.changefeedID),
			zap.Uint64("preparingTs", preparingTs),
		}
		if blockingFound {
			fields = append(fields,
				zap.Stringer("blockingDispatcherID", blockingDispatcherID),
				zap.Uint64("blockingSentResolvedTs", blockingSentResolvedTs),
				zap.Uint64("blockingCheckpointTs", blockingCheckpointTs),
				zap.Uint64("blockingNextSyncPointTs", blockingNextSyncPoint),
				zap.Uint64("blockingSeq", blockingSeq),
				zap.Uint64("blockingEpoch", blockingEpoch),
			)
		}
		log.Debug("syncpoint prepare stage blocked by dispatcher", fields...)
	}
	return
}

Comment on lines +657 to +676
if !canAdvance {
	fields := []zap.Field{
		zap.Stringer("changefeedID", c.changefeedID),
		zap.Uint64("inFlightTs", inFlightTs),
		zap.Uint64("preparingTs", c.syncPointPreparingTs.Load()),
	}
	if blockingFound {
		fields = append(fields,
			zap.String("blockingReason", blockingReason),
			zap.Stringer("blockingDispatcherID", blockingDispatcherID),
			zap.Uint64("blockingNextSyncPointTs", blockingNextSyncPointTs),
			zap.Uint64("blockingCheckpointTs", blockingCheckpointTs),
			zap.Uint64("blockingSentResolvedTs", blockingSentResolvedTs),
			zap.Uint64("blockingSeq", blockingDispatcherSeq),
			zap.Uint64("blockingEpoch", blockingDispatcherEpoch),
		)
	}
	log.Debug("syncpoint commit stage blocked by dispatcher", fields...)
	return
}

medium

Similar to the previous comment, the fields slice is allocated and populated on every call when the commit stage is blocked, regardless of the active log level. This should be optimized to avoid unnecessary allocations in this periodic task.

	if !canAdvance {
		if log.GetLevel() <= zap.DebugLevel {
			fields := []zap.Field{
				zap.Stringer("changefeedID", c.changefeedID),
				zap.Uint64("inFlightTs", inFlightTs),
				zap.Uint64("preparingTs", c.syncPointPreparingTs.Load()),
			}
			if blockingFound {
				fields = append(fields,
					zap.String("blockingReason", blockingReason),
					zap.Stringer("blockingDispatcherID", blockingDispatcherID),
					zap.Uint64("blockingNextSyncPointTs", blockingNextSyncPointTs),
					zap.Uint64("blockingCheckpointTs", blockingCheckpointTs),
					zap.Uint64("blockingSentResolvedTs", blockingSentResolvedTs),
					zap.Uint64("blockingSeq", blockingDispatcherSeq),
					zap.Uint64("blockingEpoch", blockingDispatcherEpoch),
				)
			}
			log.Debug("syncpoint commit stage blocked by dispatcher", fields...)
		}
		return
	}

Comment on lines +745 to +759
nudged := c.nudgeSyncPointCommitIfNeeded(task)
log.Debug("scan range empty after capping",
	zap.Stringer("changefeedID", task.changefeedStat.changefeedID),
	zap.Stringer("dispatcherID", task.id),
	zap.Uint64("startTs", dataRange.CommitTsStart),
	zap.Uint64("endTs", dataRange.CommitTsEnd),
	zap.Uint64("scanMaxTs", scanMaxTs),
	zap.Uint64("ddlResolvedTs", ddlState.ResolvedTs),
	zap.Uint64("ddlCommitTs", ddlState.MaxEventCommitTs),
	zap.Bool("hasPendingDDLEventInCurrentRange", hasPendingDDLEventInCurrentRange),
	zap.Uint64("nextSyncPointTs", nextSyncPointTs),
	zap.Uint64("preparingTs", preparingTs),
	zap.Uint64("inFlightTs", inFlightTs),
	zap.Bool("nudgedSyncPointCommit", nudged),
)

medium

This log.Debug call is located in the scan path and includes many fields. While zap fields are relatively cheap to construct, the variadic arguments still cause a slice allocation on every call, even when Debug logging is disabled. Given this can be triggered frequently for caught-up or capped dispatchers, it should be wrapped in a level check to improve efficiency.

		nudged := c.nudgeSyncPointCommitIfNeeded(task)
		if log.GetLevel() <= zap.DebugLevel {
			log.Debug("scan range empty after capping",
				zap.Stringer("changefeedID", task.changefeedStat.changefeedID),
				zap.Stringer("dispatcherID", task.id),
				zap.Uint64("startTs", dataRange.CommitTsStart),
				zap.Uint64("endTs", dataRange.CommitTsEnd),
				zap.Uint64("scanMaxTs", scanMaxTs),
				zap.Uint64("ddlResolvedTs", ddlState.ResolvedTs),
				zap.Uint64("ddlCommitTs", ddlState.MaxEventCommitTs),
				zap.Bool("hasPendingDDLEventInCurrentRange", hasPendingDDLEventInCurrentRange),
				zap.Uint64("nextSyncPointTs", nextSyncPointTs),
				zap.Uint64("preparingTs", preparingTs),
				zap.Uint64("inFlightTs", inFlightTs),
				zap.Bool("nudgedSyncPointCommit", nudged),
			)
		}

