txn: fix resolver cache usage for async commit #1629

cfzjywxk · 2025-04-11T10:46:05Z

Do not call checkSecondaries if the transaction status is determined for async commit transactions.
Small refactor on the TxnStatus usages.
Do not use ttl = 0 setting in test cases.

ti-chi-bot · 2025-04-11T10:46:08Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from cfzjywxk, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

txnkv/txnlock/lock_resolver.go:1029

Typo in comment: 'wih' should be 'with'.

// The `checkAllSecondaries` finishes wih no errors.

Copilot

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (3)

txnkv/txnlock/lock_resolver.go:1021

[nitpick] Consider renaming 'toResolveKeys' to 'resolveKeys' for clarity and consistency with its usage in the async commit resolution functions.

var toResolveKeys [][]byte

txnkv/txnlock/lock_resolver.go:1044

[nitpick] Consider adding a clarifying comment to explain the purpose and behavior when the 'resolveAsyncCommitLockReturn' failpoint is triggered to improve code readability.

if _, err := util.EvalFailpoint("resolveAsyncCommitLockReturn"); err == nil {

txnkv/txnlock/lock_resolver_test.go:18

Consider adding a test case that covers the branch for async commit transactions with undetermined status (i.e., where checkAllSecondaries is called) to ensure this logic is fully exercised.

lock := func(key, primary string, startTS uint64, useAsyncCommit bool, secondaries [][]byte) *kvrpcpb.LockInfo {

Signed-off-by: cfzjywxk <[email protected]>

cfzjywxk · 2025-04-14T11:20:06Z

/retest

MyonKeminta

Understanding this part of code has become much more difficult than I would imagine...

MyonKeminta · 2025-04-15T04:47:32Z

txnkv/txnlock/lock_resolver.go

+func (s TxnStatus) IsRolledBack() bool {
+	return s.ttl == 0 && s.commitTS == 0 && (s.action == kvrpcpb.Action_NoAction ||
+		s.action == kvrpcpb.Action_LockNotExistRollback ||
+		s.action == kvrpcpb.Action_TTLExpireRollback)
+}


As I remember, whether a TxnStatus stands for committed or rolled back was strictly defined by the ttl and commitTS fields. Is the previous definition wrong now? Could this change cause other unexpected side effects?

The TxnStatus is an abstraction that introduces some complexity. In the current implementation, TxnStatus is determined by the KV response from CheckTxnStatus. Whether a transaction is rolled back depends on the TTL and the specified action, while whether it's committed depends on whether the commit timestamp is greater than 0.

To make this clear, two separate functions, IsCommitted and IsRolledBack, are abstracted according to the current CheckTxnStatus implementation:

TxnStatus::RolledBack => resp.set_action(Action::NoAction),
TxnStatus::TtlExpire => resp.set_action(Action::TtlExpireRollback),
TxnStatus::LockNotExist => resp.set_action(Action::LockNotExistRollback),
TxnStatus::Committed { commit_ts } => {
    resp.set_commit_version(commit_ts.into_inner())
}

The transaction status is determined only by these four types of responses.
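For illustration, the rule described in this reply can be sketched as a small standalone Go program. This is only a sketch under assumptions: `Action` and its constants are local stand-ins for the `kvrpcpb.Action` values, `TxnStatus` carries only the three fields discussed here, and `IsStatusDetermined` is a hypothetical helper name rather than something confirmed by this thread.

```go
package main

import "fmt"

// Action is a local stand-in for kvrpcpb.Action (assumption for this sketch).
type Action int

const (
	ActionNoAction Action = iota
	ActionTTLExpireRollback
	ActionLockNotExistRollback
	ActionMinCommitTSPushed
)

// TxnStatus mirrors only the fields discussed in this thread.
type TxnStatus struct {
	ttl      uint64
	commitTS uint64
	action   Action
}

// IsCommitted reports whether the response carried a commit timestamp.
func (s TxnStatus) IsCommitted() bool { return s.commitTS > 0 }

// IsRolledBack follows the definition shown in the diff above.
func (s TxnStatus) IsRolledBack() bool {
	return s.ttl == 0 && s.commitTS == 0 &&
		(s.action == ActionNoAction ||
			s.action == ActionLockNotExistRollback ||
			s.action == ActionTTLExpireRollback)
}

// IsStatusDetermined is a hypothetical helper: the status is final only
// when the transaction is known to be committed or rolled back.
func (s TxnStatus) IsStatusDetermined() bool {
	return s.IsCommitted() || s.IsRolledBack()
}

func main() {
	committed := TxnStatus{commitTS: 42}
	rolledBack := TxnStatus{action: ActionTTLExpireRollback}
	alive := TxnStatus{ttl: 3000, action: ActionMinCommitTSPushed}

	fmt.Println(committed.IsStatusDetermined())  // true
	fmt.Println(rolledBack.IsStatusDetermined()) // true
	fmt.Println(alive.IsStatusDetermined())      // false
}
```

A status with a non-zero ttl and no commit timestamp stays undetermined in this sketch, which is the case where checking the secondaries would still be needed.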

MyonKeminta · 2025-04-15T07:58:37Z

txnkv/txnlock/lock_resolver.go

+		toResolveKeys = make([][]byte, 0, len(status.primaryLock.Secondaries)+1)
+		toResolveKeys = append(toResolveKeys, status.primaryLock.Secondaries...)
+		toResolveKeys = append(toResolveKeys, l.Primary)


This might include keys that have already been resolved. But it still seems faster than checking the secondaries again and then resolving them, as 1 RPC is faster than 2.
Have you already considered that? I suggest noting this in the comments.

Yes, I will add some comments. The performance impact should be minimal, as resolving async commit locks is not a common operation.
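As a rough sketch of the trade-off discussed here, the key list can be built from the secondaries recorded on the primary lock plus the primary key itself, with the comment the reviewer asked for. `buildResolveKeys` is a hypothetical helper name used only for this example; in the diff the same three statements are written inline.

```go
package main

import "fmt"

// buildResolveKeys (hypothetical helper) returns every secondary key followed
// by the primary key. Some of these keys may already be resolved; resolving
// them again costs less than an extra RPC to check the secondaries first.
func buildResolveKeys(primary []byte, secondaries [][]byte) [][]byte {
	// Allocate once with room for all secondaries plus the primary, so the
	// appends below never write into the caller's slice.
	keys := make([][]byte, 0, len(secondaries)+1)
	keys = append(keys, secondaries...)
	keys = append(keys, primary)
	return keys
}

func main() {
	secondaries := [][]byte{[]byte("b"), []byte("c")}
	keys := buildResolveKeys([]byte("a"), secondaries)
	for _, k := range keys {
		fmt.Println(string(k))
	}
}
```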

MyonKeminta · 2025-04-15T08:06:53Z

txnkv/txnlock/lock_resolver.go

-				status.ttl = cmdResp.LockTtl
-			}
-		} else if cmdResp.LockTtl != 0 {
+		if cmdResp.LockTtl != 0 {


I can't remember the purpose of this code. Could you explain this change?

The

if status.primaryLock != nil && status.primaryLock.UseAsyncCommit && !forceSyncCommit {
	if !lr.store.GetOracle().IsExpired(txnID, cmdResp.LockTtl, &oracle.Option{TxnScope: oracle.GlobalTxnScope}) {
		status.ttl = cmdResp.LockTtl
	}
}

logic is there to leave TxnStatus.ttl at zero when the lock has expired, so that in

if status.ttl != 0 {
	return
}
if status.primaryLock != nil && status.primaryLock.UseAsyncCommit && !forceSyncCommit {
	resolveAsyncCommit
}

resolveAsyncCommit can be reached in the resolve function.

I’ve moved the TTL expiration check back into the resolve function. This helps avoid modifying the TxnStatus state deep inside the call stack—especially since setting the TTL to 0 can be confusing. It also ensures that the TTL field remains read-only throughout the process.

In actual code, all lock TTLs should be greater than 0. Whether a lock has expired should ideally be determined by the upper layer, rather than by modifying the internal state of TxnStatus IMO.
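To make the idea concrete, here is a minimal sketch of an expiry check done by the caller while the status itself is left untouched. Everything in it is an assumption for illustration: `lockExpired` and `canResolve` are hypothetical names, the reduced `TxnStatus` has only two fields, and the 18-bit shift reflects the assumed TSO layout where the physical time in milliseconds sits above the logical bits. The actual code goes through the oracle's `IsExpired` as quoted above.

```go
package main

import "fmt"

// physicalShiftBits assumes the TSO layout in which the physical time in
// milliseconds occupies the bits above the low 18 logical bits.
const physicalShiftBits = 18

// TxnStatus is a reduced stand-in; its ttl is never written after construction.
type TxnStatus struct {
	ttl      uint64
	commitTS uint64
}

// lockExpired (hypothetical) reports whether a lock written by the transaction
// that started at startTS, with a TTL of ttlMs milliseconds, has expired at nowMs.
func lockExpired(startTS, ttlMs, nowMs uint64) bool {
	return nowMs >= (startTS>>physicalShiftBits)+ttlMs
}

// canResolve (hypothetical) is the upper-layer decision: a committed
// transaction can always be resolved, otherwise the lock must have expired.
// It takes the status by value and only reads it.
func canResolve(s TxnStatus, startTS, nowMs uint64) bool {
	if s.commitTS > 0 {
		return true
	}
	return lockExpired(startTS, s.ttl, nowMs)
}

func main() {
	startTS := uint64(1000) << physicalShiftBits // physical time 1000 ms
	s := TxnStatus{ttl: 3000}

	fmt.Println(canResolve(s, startTS, 3999)) // false: expires at 4000 ms
	fmt.Println(canResolve(s, startTS, 4000)) // true
	fmt.Println(s.ttl)                        // 3000: the status was not modified
}
```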

Signed-off-by: cfzjywxk <[email protected]>

cfzjywxk added the type/bugfix This PR fixes a bug. label Apr 11, 2025

cfzjywxk requested review from zyguan, you06, MyonKeminta, ekexium and Copilot April 11, 2025 10:46

ti-chi-bot bot added the dco-signoff: yes Indicates the PR's author has signed the dco. label Apr 11, 2025

ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 11, 2025

Copilot AI reviewed Apr 11, 2025


cfzjywxk requested a review from Copilot April 11, 2025 10:49

Copilot AI reviewed Apr 11, 2025


fix resolver cache usage for async commit

ff3bb45

Signed-off-by: cfzjywxk <[email protected]>

cfzjywxk force-pushed the fix_resolve_cache branch from 152f757 to ff3bb45 Compare April 14, 2025 07:00

cfzjywxk added 4 commits April 14, 2025 15:48

fix test

65a456d

Signed-off-by: cfzjywxk <[email protected]>

avoid zero ttl usage in test cases

f5a9157

Signed-off-by: cfzjywxk <[email protected]>

refactor case to avoid zero ttl usage

c4dbe5e

Signed-off-by: cfzjywxk <[email protected]>

refactor test case resolving async commit left locks

d22e0e0

Signed-off-by: cfzjywxk <[email protected]>

cfzjywxk added the require-LGT3 Indicates that the PR requires three LGTM. label Apr 14, 2025

MyonKeminta reviewed Apr 15, 2025


panic if saved resolved state is undertemined

1fd932d

Signed-off-by: cfzjywxk <[email protected]>
