@kgeckhart
Contributor

PR Description

This PR switches labelstore to use an RWMutex, taking an RLock in labelstore.GetLocalLink. I tried to add a concurrency test to demonstrate the impact, but locally either I can't generate enough concurrency or the change doesn't help, so I'll do some testing internally while this is in draft. The full rationale is in Notes to the Reviewer.

Which issue(s) this PR fixes

Related to:

Notes to the Reviewer

With very high append concurrency we have hit scrape timeouts which correlate with high mutex contention.

image (1)

The mutex contention is largely driven by labelstore.GetLocalLink, which is called on every single append call. During the time period above, that function was responsible for over 50% of our mutex wait.
image

Since it's a pure read operation, switching to an RWMutex and taking only an RLock there should help.
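
A minimal sketch of the kind of change described above, assuming a hypothetical `store` type whose map is guarded by a single lock. The type, field, and method names (`store`, `links`, `getLocal`, `addLink`) are illustrative assumptions and are not the actual labelstore code; only `sync.RWMutex` and its methods come from the Go standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a label store; it is not the
// real labelstore type.
type store struct {
	mut   sync.RWMutex      // a plain sync.Mutex before the change in this sketch
	links map[uint64]uint64 // global ref ID -> local ref ID (assumed layout)
}

func newStore() *store {
	return &store{links: make(map[uint64]uint64)}
}

// getLocal only reads the map, so it takes the shared read lock.
func (s *store) getLocal(global uint64) (uint64, bool) {
	s.mut.RLock()
	defer s.mut.RUnlock()
	local, ok := s.links[global]
	return local, ok
}

// addLink mutates the map, so it still takes the exclusive lock.
func (s *store) addLink(global, local uint64) {
	s.mut.Lock()
	defer s.mut.Unlock()
	s.links[global] = local
}

func main() {
	s := newStore()
	s.addLink(1, 42)

	// Several readers may hold the read lock at the same time.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.getLocal(1)
		}()
	}
	wg.Wait()

	local, ok := s.getLocal(1)
	fmt.Println(local, ok)
}
```

Readers no longer exclude each other in this sketch, but every reader still excludes writers and vice versa, so the benefit depends on how the read and write paths interleave under real load.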

PR Checklist

  • CHANGELOG.md updated
  • Tests updated

@kgeckhart kgeckhart added the publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository label Dec 11, 2025
@kgeckhart
Contributor Author

Closing. Internal tests showed that the reader GetLocalRefID appeared to starve the main writer GetOrAddGlobalRefID enough that there was no reduction in contention, and in some cases an increase. I'll focus on the improvements from #5062 instead, since this isn't as quick a win as expected.

@kgeckhart kgeckhart closed this Dec 11, 2025
