server, config: support updating leader lease online #10631
JmPotato wants to merge 1 commit into tikv:master
[APPROVALNOTIFIER] This PR is NOT APPROVED
📝 Walkthrough
Adds persisted leader-lease state and API handling: a new top-level
Changes
Sequence Diagram
sequenceDiagram
participant Client
participant API as Config API
participant Server as PD Server
participant Persist as PersistOptions
participant Storage as Config Storage
Client->>API: POST /pd/api/v1/config {"lease": 3000}
API->>API: Detect top-level "lease" key
API->>Server: SetLeaderLease(3000)
Server->>Server: Validate lease > 0
Server->>Persist: SetLeaderLease(3000)
Persist->>Persist: Update atomic leaderLease
Server->>Storage: Persist updated config
Storage-->>Server: OK
Server-->>API: 200 OK
API-->>Client: 200 OK
Note over Client,Storage: On restart / reload
Server->>Storage: Load persisted config
Storage-->>Server: Config{..., lease:3000}
Server->>Persist: ReloadLeaderLease(...)
Persist->>Persist: IsValidLeaderLease -> true
Persist-->>Server: leaderLease=3000
Server->>Server: Use persisted leaderLease for campaigning
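The update path in the diagram (validate, update the atomic value, persist) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `leaseOptions`, `memStorage`, `setLeaderLease`, and `errInvalidLease` are hypothetical names, and rolling back the in-memory value when persistence fails is an assumption of this sketch rather than something the diagram states.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errInvalidLease is a hypothetical error value used only in this sketch.
var errInvalidLease = errors.New("leader lease must be greater than 0")

// leaseOptions is a simplified, hypothetical stand-in for the persisted options object.
type leaseOptions struct {
	leaderLease atomic.Int64
}

// memStorage is a hypothetical in-memory replacement for the config storage.
type memStorage struct {
	lease int64
}

// save records the lease; a real storage backend could fail here.
func (s *memStorage) save(lease int64) error {
	s.lease = lease
	return nil
}

// setLeaderLease validates the value, swaps it into the atomic field,
// and then persists it. Restoring the old value on a persistence error
// is an assumption of this sketch.
func setLeaderLease(opt *leaseOptions, st *memStorage, lease int64) error {
	if lease <= 0 {
		return errInvalidLease
	}
	old := opt.leaderLease.Swap(lease)
	if err := st.save(lease); err != nil {
		opt.leaderLease.Store(old)
		return err
	}
	return nil
}

func main() {
	opt, st := &leaseOptions{}, &memStorage{}
	opt.leaderLease.Store(3) // hypothetical startup lease

	// A non-positive lease is rejected and leaves the current value untouched.
	fmt.Println(setLeaderLease(opt, st, 0), opt.leaderLease.Load())
	// A positive lease is applied in memory and persisted.
	fmt.Println(setLeaderLease(opt, st, 3000), opt.leaderLease.Load(), st.lease)
}
```

Validating before touching the atomic value keeps concurrent readers from ever observing a rejected lease.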
Estimated Code Review Effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/config/persist_options.go`:
- Around line 829-831: Reload currently treats a default-initialized
cfg.LeaderLease as a present, valid value and overwrites any startup lease; fix
this by making the persisted presence explicit (either change cfg.LeaderLease to
a pointer type or add a boolean flag set by LoadConfig that indicates the field
was present) and only call IsValidLeaderLease and o.SetLeaderLease when that
presence indicator shows the field existed in the persisted blob; apply the same
change/guard for the other lease-related block at the 838-840 region so you only
apply persisted leases if the field was actually present in the loaded config.
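The pointer-based presence check described above can be illustrated with a small standalone sketch. It assumes the persisted blob is JSON with a top-level `lease` key; `leaseProbe` and `persistedLease` are hypothetical names for this example and are not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// leaseProbe decodes only the top-level "lease" key. A pointer field stays
// nil when the key is absent, which distinguishes "missing" from "zero".
type leaseProbe struct {
	LeaderLease *int64 `json:"lease"`
}

// persistedLease returns the stored lease and whether it should be applied.
// It reports false when the key is missing or the value is not positive.
func persistedLease(blob []byte) (int64, bool, error) {
	var probe leaseProbe
	if err := json.Unmarshal(blob, &probe); err != nil {
		return 0, false, err
	}
	if probe.LeaderLease == nil || *probe.LeaderLease <= 0 {
		return 0, false, nil
	}
	return *probe.LeaderLease, true, nil
}

func main() {
	blobs := []string{
		`{"schedule":{}}`, // older blob without a lease key
		`{"lease":0}`,     // key present but invalid
		`{"lease":3000}`,  // key present and valid
	}
	for _, blob := range blobs {
		lease, ok, err := persistedLease([]byte(blob))
		fmt.Println(blob, lease, ok, err)
	}
}
```

A caller would only overwrite the startup lease when the second return value is true, so older blobs without the key leave it unchanged.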
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b9f77e78-96a8-454f-a2b0-f0b11648ce41
📒 Files selected for processing (5)
- server/api/config.go
- server/config/config_test.go
- server/config/persist_options.go
- server/server.go
- tests/server/api/api_test.go
c8a100b to f66c676
♻️ Duplicate comments (2)
server/config/persist_options.go (1)
829-831: ⚠️ Potential issue | 🟠 Major
Do not apply `LeaderLease` unless the persisted `lease` field actually exists.
On Line 829, `IsValidLeaderLease(cfg.LeaderLease)` is not enough to distinguish “field missing” vs “field present.” For older stored blobs without `lease`, reload can still overwrite a custom startup lease.
💡 Suggested fix (presence check before applying)

     func (o *PersistOptions) Reload(storage endpoint.ConfigStorage) error {
     	cfg := &persistedConfig{Config: &Config{}}
    +	leaseProbe := struct {
    +		LeaderLease *int64 `json:"lease"`
    +	}{}
     	// Pass nil to initialize cfg to default values (all items undefined)
     	if err := cfg.Adjust(nil, true); err != nil {
     		return err
     	}
     	isExist, err := storage.LoadConfig(cfg)
     	if err != nil {
     		return err
     	}
    +	_, _ = storage.LoadConfig(&leaseProbe)
     	adjustScheduleCfg(&cfg.Schedule)
     	...
     	if isExist {
     		...
    -		if IsValidLeaderLease(cfg.LeaderLease) {
    -			o.SetLeaderLease(cfg.LeaderLease)
    +		if leaseProbe.LeaderLease != nil && IsValidLeaderLease(*leaseProbe.LeaderLease) {
    +			o.SetLeaderLease(*leaseProbe.LeaderLease)
     		}
     		...
     	}
     	return nil
     }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/config/persist_options.go` around lines 829 - 831, The current code calls o.SetLeaderLease when IsValidLeaderLease(cfg.LeaderLease) is true, which overwrites a startup lease even if the persisted blob lacked a `lease` field; change the condition to require the persisted field's presence as well (e.g., check the config wrapper/flag that indicates the `lease` field was present or use a nil/presence check on cfg.LeaderLease) before calling o.SetLeaderLease; update the if around IsValidLeaderLease(cfg.LeaderLease) to something like "if cfg.HasLeaderLease() && IsValidLeaderLease(cfg.LeaderLease) { o.SetLeaderLease(cfg.LeaderLease) }" (replace HasLeaderLease with the actual presence indicator in the config struct).
server/api/config.go (1)
218-219: ⚠️ Potential issue | 🟠 Major
`lease` is updateable, but follower `GET /config` can still serve stale `lease`.
With Line 218 enabling dynamic `lease`, follower reads need to merge lease from leader too. Today follower `GetConfig` only syncs schedule/replication (Lines 81-82), so `lease` may lag until reload/leadership change.
💡 Suggested fix in follower merge path

     mergedCfg := localCfg
     mergedCfg.Replication = leaderCfg.Replication
     mergedCfg.Schedule = leaderCfg.Schedule
    +mergedCfg.LeaderLease = leaderCfg.LeaderLease
     h.rd.JSON(w, http.StatusOK, mergedCfg)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/api/config.go` around lines 218 - 219, GetConfig on followers currently syncs only schedule/replication and can return a stale lease; update the follower merge path to include the leader's lease value as well. In the follower GET /config handling (where schedule/replication are merged) add logic to fetch and merge the leader's lease into the response so that the value updated via updateLeaderLease is reflected to followers; ensure you reuse the same merge/order semantics as schedule/replication so updateLeaderLease (case "lease") changes propagate immediately to follower GetConfig responses.
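The follower merge path discussed in this comment can be pictured with a self-contained sketch. The struct shapes below are hypothetical simplifications, not PD's real `Config` type; only the idea of overlaying `Replication`, `Schedule`, and `LeaderLease` from the leader onto the local copy comes from the review.

```go
package main

import "fmt"

// Simplified, hypothetical config shapes used only for this illustration.
type scheduleConfig struct{ LeaderScheduleLimit uint64 }
type replicationConfig struct{ MaxReplicas uint64 }

type config struct {
	Name        string // node-local field that must not be overwritten
	Schedule    scheduleConfig
	Replication replicationConfig
	LeaderLease int64
}

// mergeFollowerConfig starts from the follower's local config and overlays
// the fields that can change at runtime on the leader, including the lease.
func mergeFollowerConfig(local, leader config) config {
	merged := local
	merged.Replication = leader.Replication
	merged.Schedule = leader.Schedule
	merged.LeaderLease = leader.LeaderLease
	return merged
}

func main() {
	local := config{Name: "pd-2", LeaderLease: 3}
	leader := config{
		Name:        "pd-1",
		Schedule:    scheduleConfig{LeaderScheduleLimit: 8},
		Replication: replicationConfig{MaxReplicas: 5},
		LeaderLease: 3000,
	}
	fmt.Printf("%+v\n", mergeFollowerConfig(local, leader))
}
```

Copying by value keeps node-local fields such as `Name` from the follower while the dynamically updatable fields track the leader.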
🧹 Nitpick comments (1)
server/config/config_test.go (1)
131-148: Add one backward-compat test for missing `lease` key.
Please add a case where stored config omits `lease` entirely and assert reload keeps the in-memory startup lease unchanged.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/config/config_test.go` around lines 131 - 148, Extend TestLeaderLeaseReloadIgnoresNonPositivePersistedValues to include a case where the persisted config omits the lease key entirely: call storage.SaveConfig with an empty value (e.g., struct{}{} or a map with no "lease" key), then run the same reload flow (NewConfig, cfg.Adjust, set cfg.LeaderLease to 7, NewPersistOptions, opt.Reload) and assert opt.GetLeaderLease() remains 7; reference TestLeaderLeaseReloadIgnoresNonPositivePersistedValues, storage.SaveConfig, NewConfig, NewPersistOptions, opt.Reload, and opt.GetLeaderLease to locate the code to change.
Signed-off-by: JmPotato <github@ipotato.me>
f66c676 to 6168127
♻️ Duplicate comments (1)
server/server.go (1)
1013-1025: ⚠️ Potential issue | 🟠 Major
Keep `lease` in the follower merge path.
`GetConfig()` still only overlays `Schedule` and `Replication` from `leaderCfg`. After a runtime lease update, a follower can keep serving an old `lease` until its local cache reloads, so `/pd/api/v1/config` can disagree with the leader. Please copy `LeaderLease` from `leaderCfg` here too.
Suggested patch

     mergedCfg := localCfg
     mergedCfg.Replication = leaderCfg.Replication
     mergedCfg.Schedule = leaderCfg.Schedule
    +mergedCfg.LeaderLease = leaderCfg.LeaderLease

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/server.go` around lines 1013 - 1025, GetConfig currently doesn't copy the leader's runtime lease into the returned cfg causing followers to serve stale LeaderLease; in GetConfig ensure you overlay LeaderLease from the persisted leader config (use the value returned by persistOptions.GetLeaderLease / the leaderCfg's lease) into cfg so the returned config reflects the leader's lease update (update the LeaderLease field in GetConfig to copy/clone the leader's lease value rather than leaving a stale local one).
🚧 Files skipped from review as they are similar to previous changes (1)
- server/config/persist_options.go
@JmPotato: The following tests failed, say
What problem does this PR solve?
Issue Number: ref #10630
What is changed and how does it work?
Check List
Tests
Code changes
Side effects
Related changes
pingcap/docs/pingcap/docs-cn: TBD
Manual test
go test ./server/config -count=1
go test -tags without_dashboard ./tests/server/api -run TestLeaderLeaseConfigAPI -count=1
go test -tags without_dashboard ./server/... -count=0
make build
git diff --check
Release note
Summary by CodeRabbit
New Features
Bug Fixes
Tests