Relax memory barrier from org.elasticsearch.action.search.AbstractSearchAsyncAction#hasShardResponse #124888

Open
wants to merge 1 commit into
base: main

original-brownbear
Member

Make hasShardResponse introduce less ordering by writing only when needed. While this may be a tiny performance win, the main motivation is to remove one source of unpredictable happens-before edges across transport and search threads, so that bugs are more likely to surface despite low thread counts in tests.

Contributor

@andreidan left a comment

Thanks for working on this, Armin.

Can you please update the PR description to reflect the workflow you're trying to modify and which happens-before boundaries you're looking to change? For example, what kind of bugs do you think the current implementation is not surfacing at the moment in the async search flow?

I think I understand the purpose of this "one-liner" change; however, I'd like it to be immediately obvious when looking over the PR (I don't think it currently is).

@@ -494,7 +494,7 @@ private static boolean isTaskCancelledException(Exception e) {
protected void onShardResult(Result result) {
assert result.getShardIndex() != -1 : "shard index is not set";
assert result.getSearchShardTarget() != null : "search shard target must not be null";
- hasShardResponse.set(true);
+ hasShardResponse.compareAndExchangeRelease(false, true);
Contributor

Can you please document why acquire semantics are not needed here?

Member Author

Hmm, same question here: does it really make sense to document this? There is no data dependency between this field and any other field, and it only ever goes false -> true, so a plain read is good enough; a potentially redundant write is still an improvement over the previous version. Maybe we should come up with a general list of what to use for what instead?
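
For readers following the thread, here is a minimal sketch of the one-way flag pattern under discussion. It is not the Elasticsearch code: the class OneWayFlag and its methods markSet and isSet are hypothetical names, and the sketch only assumes the standard java.util.concurrent.atomic.AtomicBoolean API (Java 9+).

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a field like hasShardResponse: starts false, only ever becomes true.
final class OneWayFlag {
    private final AtomicBoolean flag = new AtomicBoolean(false);

    /** Returns true only for the call that performed the false -> true transition. */
    boolean markSet() {
        // compareAndExchangeRelease returns the witness value; a witness of false
        // means the expected value matched and true was stored with release ordering.
        return flag.compareAndExchangeRelease(false, true) == false;
    }

    /** Reads the flag with acquire ordering, pairing with the release store above. */
    boolean isSet() {
        return flag.getAcquire();
    }

    public static void main(String[] args) throws InterruptedException {
        OneWayFlag f = new OneWayFlag();
        AtomicInteger transitions = new AtomicInteger();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                if (f.markSet()) {
                    transitions.incrementAndGet();
                }
            });
            writers[i].start();
        }
        for (Thread t : writers) {
            t.join();
        }
        System.out.println("set=" + f.isSet() + ", transitions=" + transitions.get());
    }
}
```

Because the flag never goes back to false, a caller that loses the race simply observes true and leaves the value unchanged, which is the "only writing as needed" behaviour described in the PR.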

@@ -734,7 +734,7 @@ protected final ShardSearchRequest buildShardSearchRequest(SearchShardIterator s
// can return a null response if the request rewrites to match none rather
// than creating an empty response in the search thread pool.
// Note that, we have to disable this shortcut for queries that create a context (scroll and search context).
- shardRequest.canReturnNullResponseIfMatchNoDocs(hasShardResponse.get() && shardRequest.scroll() == null);
+ shardRequest.canReturnNullResponseIfMatchNoDocs(shardRequest.scroll() == null && hasShardResponse.getAcquire());
Contributor

Is the order swap of the acquire call and the shardRequest.scroll() call on purpose? If so, can you document why?

Member Author

Hmm, not sure it makes sense to add a comment for this. A potentially shared read is more expensive than a plain read, so I figured I'd just switch these two around. No magic here :)
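
The effect of the swap can be illustrated with a small sketch that relies only on Java's left-to-right short-circuit evaluation of &&. The class ShortCircuitOrder, its methods, and the read counter are hypothetical and exist purely for demonstration; they are not part of the change under review.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo: counts how often the shared flag is actually read.
final class ShortCircuitOrder {
    private final AtomicBoolean hasResponse = new AtomicBoolean(false);
    private int sharedReads = 0; // demo-only counter, not thread-safe

    void markResponse() {
        hasResponse.compareAndExchangeRelease(false, true);
    }

    private boolean readShared() {
        sharedReads++;
        return hasResponse.getAcquire();
    }

    /** The cheap local check comes first, so the shared read is skipped when scroll is non-null. */
    boolean canReturnNull(Object scroll) {
        return scroll == null && readShared();
    }

    int sharedReads() {
        return sharedReads;
    }

    public static void main(String[] args) {
        ShortCircuitOrder s = new ShortCircuitOrder();
        s.markResponse();
        System.out.println("with scroll: " + s.canReturnNull(new Object()) + ", shared reads: " + s.sharedReads());
        System.out.println("without scroll: " + s.canReturnNull(null) + ", shared reads: " + s.sharedReads());
    }
}
```

With the operands in this order, a request that carries a scroll never touches the shared flag at all; the result of the expression is the same either way.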
