Reduce mist client timeout for catabalancer stats sending #1400

Merged: 7 commits, Feb 26, 2025

Conversation

mjh1
Contributor

@mjh1 mjh1 commented Feb 5, 2025

I saw occasional slowness from the mist GetState function. The mist client has a 1 minute timeout, but the stats loop runs every 5 seconds, so we need a much shorter timeout here.

@mjh1 mjh1 requested a review from leszko February 5, 2025 09:55
Contributor Author

mjh1 commented Feb 5, 2025

The diff is best viewed with whitespace hidden

Contributor

@leszko leszko left a comment

Added a suggestion. Either way, LGTM

sysusage, err := GetSystemUsage()
if err != nil {
log.LogNoRequestID("catabalancer failed to get sys usage", "err", err)
done := make(chan bool)
Contributor

I think the more idiomatic Go way of doing this would be to pass a context with a timeout into sendMetrics(). So you could have something like this:

ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel() // release the context's timer once sendMetrics returns
sendMetrics(ctx, nodeName, latitude, longitude, mist, nodeStatsDB)

And then inside sendMetrics() you would call:

GetSystemUsage(ctx)

And so on. The advantage of this approach is that when the timeout fires, every system and network call that received the context is cancelled.

Contributor Author

Yeah, I see what you mean. Unfortunately I couldn't find any funcs within GetSystemUsage that take a context, so it really is just the mist client that needs the timeout. You've made me realise the proper solution is to reduce the timeout for the mist client.

Contributor Author

@leszko I've pretty much redone this PR. Please could you take a look? Thanks.

@mjh1 mjh1 force-pushed the mh/catabalancer branch 3 times, most recently from 91129cb to 7c9b496 Compare February 5, 2025 11:25
@mjh1 mjh1 requested a review from leszko February 24, 2025 12:01
Contributor

@leszko leszko left a comment

Added one inline comment. Other than that, LGTM

main.go Outdated
@@ -267,6 +267,7 @@ func main() {

	if catabalancerEnabled && nodeStatsDB != nil {
		if cli.Tags["node"] == "media" { // don't announce load balancing availability for testing nodes
			mist := clients.NewMistAPIClient(cli.MistUser, cli.MistPassword, cli.MistHost, cli.MistPort, catabalancer.StatsUpdateInterval-time.Second)
Contributor

Could you add a comment with the reasoning why it's catabalancer.StatsUpdateInterval-time.Second? It's not clear to me.

@mjh1 mjh1 merged commit 95d4007 into main Feb 26, 2025
7 of 9 checks passed
@mjh1 mjh1 deleted the mh/catabalancer branch February 26, 2025 12:08
@mjh1 mjh1 changed the title from "Implement timeout for whole metrics send function" to "Reduce mist client timeout for catabalancer stats sending" Feb 26, 2025