Skip to content

[Framework] Resolve race condition in cache during parallel set operations#72

Merged
TomerShor merged 1 commit intov3io:developmentfrom
weilerN:nuc-494-fix-cache-set-race-condition
Jun 25, 2025
Merged

[Framework] Resolve race condition in cache during parallel set operations#72
TomerShor merged 1 commit intov3io:developmentfrom
weilerN:nuc-494-fix-cache-set-race-condition

Conversation

@weilerN
Copy link
Collaborator

@weilerN weilerN commented Jun 25, 2025

Motivation

Described in the jira - https://iguazio.atlassian.net/browse/NUC-494

Root Cause

The problem happens when multiple goroutines run Load()NewSafeTrie()Store() at the same time. Each one might create its own trie, and since Store() always overwrites the value, one goroutine can replace the result of another—causing entries to be lost.

Description

Replace Load/Store pattern with the atomic LoadOrStore() to ensure only one goroutine creates the initial trie per host.

Affected Areas

No affected areas, internal package changes only

Testing

Changes Made

  • Changed the way the cache create an initial trie

@weilerN weilerN requested review from TomerShor and rokatyy June 25, 2025 11:10
Copy link
Collaborator

@rokatyy rokatyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

Please add test to verify that :)

@weilerN weilerN marked this pull request as ready for review June 25, 2025 11:28
@weilerN
Copy link
Collaborator Author

weilerN commented Jun 25, 2025

Nice catch!

Please add test to verify that :)

I updated the Description and linked the unit tests PR.
I agree with you, this should have been done earlier =]

@weilerN weilerN requested a review from rokatyy June 25, 2025 11:34
@TomerShor TomerShor merged commit d97ecfc into v3io:development Jun 25, 2025
3 checks passed
TomerShor pushed a commit that referenced this pull request Jun 29, 2025
This PR adds unit tests for the flow described in the related Jira
ticket: https://iguazio.atlassian.net/browse/NUC-455 .
The tests ensure coverage of the newly implemented logic and validate
its behavior under expected conditions.

In addition, this PR includes test coverage for the changes introduced
in [#72](#72), providing
verification for that logic as well.

The unit tests failure are fixed in a separated PR -
#72
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants