@julianguinard
Contributor

@julianguinard julianguinard commented Jan 5, 2026

This relates to the discussion in #6565 (comment) on PR 6565.

The final semaphore acquisition was still missing, so the loop did not wait until all metrics had been gathered from each metrics API server.

@julianguinard julianguinard requested a review from a team as a code owner January 5, 2026 17:17
@github-actions

github-actions bot commented Jan 5, 2026

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

  • Add an entry in our changelog in alphabetical order and link related issue
  • Update the documentation, if needed
  • Add unit & e2e tests for your changes
  • GitHub checks are passing
  • Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

@keda-automation keda-automation requested a review from a team January 5, 2026 17:17
@snyk-io

snyk-io bot commented Jan 5, 2026

Snyk checks have passed. No issues have been found so far.

Scanner: Open Source Security | Critical: 0 | High: 0 | Medium: 0 | Low: 0 | Total: 0 issues

the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
@julianguinard julianguinard force-pushed the chore-fix-add-missing-final-semaphore-acquisition branch from b868224 to 791b8f5 Compare January 5, 2026 17:33
@rickbrouwer
Member

rickbrouwer commented Jan 5, 2026

/run-e2e metrics_api
Update: You can check the progress here

@julianguinard
Contributor Author

Thanks @rickbrouwer, e2e passed successfully.

Member

@rickbrouwer rickbrouwer left a comment

Thanks! lgtm!

@julianguinard julianguinard changed the title from "Metrics API / aggregate from kubernetes service endpoint slices : add missing final semaphore acquisition" to "Bugfix - Metrics API / aggregate from kubernetes service endpoint slices : add missing final semaphore acquisition" Jan 6, 2026
@wozniakjan wozniakjan requested a review from Copilot January 6, 2026 10:13
Member

@wozniakjan wozniakjan left a comment

lgtm, thanks!

@wozniakjan wozniakjan merged commit 1c4ceae into kedacore:main Jan 6, 2026
28 checks passed
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a synchronization bug in the metrics API scaler where the function would not wait for all goroutines to complete before aggregating results from multiple Kubernetes service endpoints. The fix adds a final semaphore acquisition to ensure all metrics have been gathered before proceeding.

Key Changes:

  • Added final semaphore acquisition to wait for all goroutines to complete their work
  • Added debug logging to track the start and end of endpoint iteration

Comments suppressed due to low confidence (1)

pkg/scalers/metrics_api_scaler.go:453

  • The nbErrors and expectedNbMetrics variables are accessed without mutex protection after the semaphore acquisition. While the semaphore should ensure goroutines have completed, if the semaphore acquisition fails (line 442-444), these variables may still be modified by running goroutines, creating a race condition. Consider protecting these reads with the mutex or returning early if semaphore acquisition fails.
	if nbErrors > 0 && nbErrors == len(endpointsUrls) {
		err = fmt.Errorf("could not get any metric successfully from the %d provided endpoints", len(endpointsUrls))
	}
	if s.metadata.AggregationType == AverageAggregationType {
		aggregation /= float64(expectedNbMetrics)
	}
	s.logger.V(1).Info(fmt.Sprintf("fetched %d metrics out of %d endpoints from kubernetes service : %s is %v\n", expectedNbMetrics, len(endpointsUrls), s.metadata.AggregationType, aggregation))

alt-dima pushed a commit to alt-dima/keda that referenced this pull request Jan 9, 2026
the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
Signed-off-by: Dima Altukhov <[email protected]>
3 participants