fix(operator): apply Configuration.ClientConnection QPS/Burst to the manager rest.Config #3446
…Burst to the manager rest.Config

cfg.ClientConnection is validated and defaulted by the Configuration API (QPS=50, Burst=100) but never wired into the manager's rest.Config, so the manager ran with the client-go defaults (QPS=5, Burst=10). Operators tuning clientConnection saw the values take effect in the status server (createClient already does the rest.CopyConfig + override dance) but silently ignored for the main reconciler (#3431).

Mirror the status-server pattern: copy from ctrl.GetConfigOrDie(), overlay cfg.ClientConnection.{QPS,Burst} if non-nil, hand that config to ctrl.NewManager. No behaviour change when ClientConnection is unset; new defaults now flow through when it is.

Fixes #3431

Signed-off-by: SAY-5 <SAY-5@users.noreply.github.com>
Pull request overview
Wires Configuration.ClientConnection QPS/Burst into the controller-manager’s rest.Config so the manager’s main Kubernetes client uses the validated/defaulted rate limits from the Configuration API (fixing the previously ignored settings described in #3431).
Changes:
- Create a rest.Config variable for the manager and apply cfg.ClientConnection.{QPS,Burst} before constructing the controller-runtime manager.
    restCfg := ctrl.GetConfigOrDie()
    // Apply the Configuration.ClientConnection QPS/Burst to the
    // manager's rest.Config. Without this, the documented defaults
    // (QPS=50, Burst=100) from the Configuration API were validated
    // and defaulted but never wired through, so the manager ran with
    // the client-go defaults (QPS=5, Burst=10). The status server's
    // createClient already uses this pattern (#3431).
    if cfg.ClientConnection != nil {
        if cfg.ClientConnection.QPS != nil {
            restCfg.QPS = *cfg.ClientConnection.QPS
        }
        if cfg.ClientConnection.Burst != nil {
            restCfg.Burst = int(*cfg.ClientConnection.Burst)
        }
    }
The PR description mentions copying the config (like the status server does), but this code mutates the *rest.Config returned by ctrl.GetConfigOrDie() in place; either update the description or switch to rest.CopyConfig(...) before overriding QPS/Burst for consistency.
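The difference the review comment points at can be sketched without pulling in client-go. In the block below, RestConfig and ClientConnection are hypothetical stand-ins (not the real rest.Config or Configuration API types), their field types are assumed from the diff above, and withClientConnection is an illustrative helper name. It shows the copy-then-overlay shape: the caller's config is left untouched and only non-nil overrides are applied.

```go
package main

// RestConfig is a hypothetical stand-in for the two rest.Config fields the
// PR touches. The real client-go struct has many more fields.
type RestConfig struct {
	QPS   float32
	Burst int
}

// ClientConnection is a hypothetical stand-in for cfg.ClientConnection.
// A nil pointer means the field was not set; the pointer types are assumed.
type ClientConnection struct {
	QPS   *float32
	Burst *int32
}

// withClientConnection returns a copy of base with any non-nil QPS/Burst
// overrides applied. base itself is never modified.
func withClientConnection(base *RestConfig, cc *ClientConnection) *RestConfig {
	// A plain struct copy is sufficient here because the stand-in holds
	// only value fields.
	out := *base
	if cc == nil {
		return &out
	}
	if cc.QPS != nil {
		out.QPS = *cc.QPS
	}
	if cc.Burst != nil {
		out.Burst = int(*cc.Burst)
	}
	return &out
}
```

Against the real types, the copy step would presumably be rest.CopyConfig(ctrl.GetConfigOrDie()) rather than a struct assignment. Whether the copy matters in main.go depends on whether anything else reuses the pointer returned by ctrl.GetConfigOrDie(); the sketch only illustrates the consistency argument made in the review.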
Hey @SAY-5, thanks for jumping in on this! It looks like @abhijeet-dhumal is already working on this issue and has opened #3432 for it. Would you be up for taking a look there and helping with review or testing? That’d be a great way to contribute here.
Thanks @robert-bell, I didn't realize #3432 was already in flight. Closing this in favor of @abhijeet-dhumal's PR. I'll take a look at #3432 to help with review/testing. Sorry for the noise!
Summary
Fixes #3431.
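cfg.ClientConnection is validated and defaulted by the Configuration API (QPS=50, Burst=100) but never wired into the manager's rest.Config, so the manager ran with the client-go defaults (QPS=5, Burst=10). Operators tuning clientConnection saw the values take effect in the status server (pkg/statusserver/setup.go's createClient already does the rest.CopyConfig + override dance) but silently ignored for the main reconciler.
Fix
Mirror the status-server pattern in cmd/trainer-controller-manager/main.go: copy from ctrl.GetConfigOrDie(), overlay cfg.ClientConnection.{QPS,Burst} when non-nil, and hand that config to ctrl.NewManager. No behaviour change when ClientConnection is unset; new defaults now flow through when it is.
Test plan
- go build ./cmd/trainer-controller-manager/... clean
- go vet ./cmd/trainer-controller-manager/... clean
- With clientConnection.qps = 25, clientConnection.burst = 50 in the Configuration CR, check the manager's actual rest client rate limits with --v=6 logging or the workqueue_*_requests_total metrics: they should scale with the new values instead of topping out at QPS=5/Burst=10.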
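To get a rough feel for what "scale with the new values" should look like in the last test-plan item, the arithmetic can be sketched with an idealized token-bucket model. Client-side rate limiting in client-go is token-bucket based, but the function below is only a simplified model, not client-go code: minSeconds is a hypothetical helper, the bucket is assumed to start full, and server latency, retries, and watch traffic are ignored.

```go
package main

// minSeconds returns an idealized lower bound, in seconds, on the time needed
// to issue n requests through a token bucket that starts full with burst
// tokens and refills at qps tokens per second. qps must be positive.
func minSeconds(n int, qps float64, burst int) float64 {
	if n <= burst {
		// The whole batch fits in the initial burst.
		return 0
	}
	// Every request beyond the burst has to wait for a refilled token.
	return float64(n-burst) / qps
}
```

Under this model, a batch of 300 requests needs at least 58 seconds at QPS=5/Burst=10, 10 seconds at QPS=25/Burst=50, and 4 seconds at QPS=50/Burst=100. Real timings will differ, but a manager that still takes close to the first figure after the Configuration change would suggest the override is not reaching the manager's client.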
go build ./cmd/trainer-controller-manager/...cleango vet ./cmd/trainer-controller-manager/...cleanclientConnection.qps = 25,clientConnection.burst = 50in the Configuration CR, check the manager's actual rest client rate limits with--v=6logging or theworkqueue_*_requests_totalmetrics — should scale with the new values instead of topping out at QPS=5/Burst=10.