Skip to content

Add retry logic of connection lifetime to cluster client and sentinel client#833

Merged
rueian merged 9 commits into
redis:mainfrom
terut:feat/conn-lifetime-cluster-sentinel
May 11, 2025
Merged

Add retry logic of connection lifetime to cluster client and sentinel client#833
rueian merged 9 commits into
redis:mainfrom
terut:feat/conn-lifetime-cluster-sentinel

Conversation

@terut

@terut terut commented Apr 23, 2025

Copy link
Copy Markdown
Contributor

This is follow-up PR regarding #727 .

See #727 (comment).

@rueian

rueian commented Apr 23, 2025

Copy link
Copy Markdown
Collaborator

Wow. You are lightning fast, but we still need some test cases for this.

@terut

terut commented Apr 23, 2025

Copy link
Copy Markdown
Contributor Author

@rueian Sure, I'll add the tests. I’m not familiar to cluster and sentinel client, so I wanted to check whether my understanding is correct about what you said.

@rueian

rueian commented Apr 23, 2025

Copy link
Copy Markdown
Collaborator

Yes, I think you are on the right track.

@rueian rueian force-pushed the main branch 2 times, most recently from db71c0d to 34ba5cf Compare April 28, 2025 17:51
@terut terut force-pushed the feat/conn-lifetime-cluster-sentinel branch 2 times, most recently from e2204df to 232c933 Compare April 30, 2025 07:08
@terut terut force-pushed the feat/conn-lifetime-cluster-sentinel branch from 232c933 to 3c7632f Compare April 30, 2025 12:58
@terut terut marked this pull request as ready for review May 1, 2025 07:55
@terut terut force-pushed the feat/conn-lifetime-cluster-sentinel branch from 2238fd7 to 4f57123 Compare May 1, 2025 11:43
Comment thread cluster.go Outdated
ncc := c.redirectOrNew(addr, cc, cmd.Slot(), mode)
resp = ncc.Do(ctx, cmd)
if resp.Error() == errConnExpired {
goto retry

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go back to L526?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix 3623adc . Is it correct?

Comment thread cluster.go Outdated
case RedirectAsk:
results := c.redirectOrNew(addr, cc, cmd.Slot(), mode).DoMulti(ctx, cmds.AskingCmd, cmd)
ncc := c.redirectOrNew(addr, cc, cmd.Slot(), mode)
results := c.doMulti(ctx, ncc, cmds.AskingCmd, cmd)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASKING can't be handled this way. It must be sent in every retry.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rueian Does it mean that ASKING needs at head of cmds when retrying due to errConnExpired

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ASKING is only effective in the same connection and applied only to the next command; therefore, we need to re-send ASKING before the target cmd if errConnExpired happens.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I will also fix askingMulti and askingMultiCache as same.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@terut terut requested a review from rueian May 2, 2025 12:32
Comment thread cluster.go Outdated
if i > 0 && commands[i-1] == cmds.AskingCmd {
ml = commands[i-1:]
} else {
ml = commands[i:]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me that we need to handle the MULTI EXEC pattern correctly: If errConnExpired occurs during the MULTI EXEC, we should retry the entire MULTI EXEC block. This applies to the single client as well, which we overlooked in the previous PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rueian Would you check this pull-request #837 ? This is draft so I will fix other methods and add tests after checking the implement.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the draft looks good to me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rueian Please review #837 .

@terut terut May 10, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed response of Multi command of test data.

I'm writing tests for askingMulti() but I'm not sure why multi has the following commands when calling that method. Any idea?

// Call for unit testing
multi := []Completed{
	client.B().Get().Key("1{t}").Build(),
	client.B().Multi().Build(),
	client.B().Incr().Key("2{t}").Build(),
	client.B().Incr().Key("3{t}").Build(),
	client.B().Exec().Build(),
	client.B().Get().Key("4{t}").Build(),
}
resps := client.DoMulti(context.Background(), multi...)

// multi in askingMulti(). I don't know why there is no [GET 2{t}] [Get 3{t}].
[ASKING]
[GET 1{t}]
[ASKING]
[MULTI]
[EXEC]
[ASKING]
[GET 4{t}]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix 6362824

Comment thread cluster.go Outdated
var ml []Completed
recover:
ml = ml[:0]
for i, resp := range resps.s {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we can advance the i by 6 in every iteration and only check the Error of the ExecCmd.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix 6362824

@terut terut requested a review from rueian May 10, 2025 07:29
@rueian rueian merged commit 0dc2b40 into redis:main May 11, 2025
34 checks passed
@rueian

rueian commented May 11, 2025

Copy link
Copy Markdown
Collaborator

Thanks @terut!

@terut terut deleted the feat/conn-lifetime-cluster-sentinel branch May 12, 2025 03:00
@terut

terut commented May 12, 2025

Copy link
Copy Markdown
Contributor Author

@rueian Thank you for great help for couple of months 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants