-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Fix SearchErrorTraceIT and friends to work with batched query execution #127150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix SearchErrorTraceIT and friends to work with batched query execution #127150
Conversation
Making this work with batched execution and fixing a memory leak: * Fix memory leak by removing listener on first message. There really only is a single message here per node anyway with batched execution in the mix. Either it's a single shard on the data node and we get a single query message or it's multiple shards and we get a single batched message, so fine to remove listener after the first message since all tests do a single request only anyway. * Add a new hook that allows inspection of the actual response. This is needed for batched since batched sends a non-error response even if the data node failed all searches. We had this before in the `onResponseSent` hook but checking the instance after it's been sent over the wire causes needless overhead in the production code so moving to a "before-style" hook here.
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
* @param action the request action | ||
* @param response response instance | ||
*/ | ||
default void onBeforeResponseSent(long requestId, String action, TransportResponse response) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DaveCTurner wdyt? this should be ok right? In prod the overhead is negligible I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left one comment below. I have a general question as well—what do you mean:
We had this before in the
onResponseSent
hook
I was a bit surprised that we don't have testing to inspect transport responses being sent from data nodes.
var r = asInstanceOf(SearchQueryThenFetchAsyncAction.NodeQueryResponse.class, response); | ||
for (Object result : r.getResults()) { | ||
if (result instanceof Exception error) { | ||
checkStacktraceStateAndRemove(error, mockTs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is looping + setting the transportMessageHasStackTrace
correct here? What if one result has a stack trace: transportMessageHasStackTrace.set(true)
, then a result in a future iteration doesn't: transportMessageHasStackTrace.set(false)
. Then transportMessageHasStackTrace
really means "did the last-checked shard have a stack trace." I believe our current test cases trigger errors on all shards so the issue is not noticeable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not great, I sort of tried to point this out via the fact that we only see a single request per thread in the op too.
I just didn't want to refactor the test here, it seems we could invert the logic here easily and set a boolean "expectsStacktrace" or so in the transport message listener and then simply assert inline instead of after the fact. If we can only communicate one bit back from the listener there isn't really a mathematical way to cleanly assert on it for multiple things :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this do something like: capture the existence of a stack trace for ALL exception results, and:
- if they're all true:
transportMessageHasStackTrace.set(true)
- if they're all false:
transportMessageHasStackTrace.set(false)
- else there's something wrong, so throw some assertion error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right that would work but also make this even harder to follow? If we want to fix this my vote would be to invest 5 more minutes here and just assert inline so that we can pass the expectation for everything to the listener at the beginning of each test? :) Otherwise if we go for the 3 outcome logic, we'll have some inline assertions and some "at the end of the test" assertions mixed, that's just needlessly complex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see what you mean now - throw out transportMessageHasStackTrace
and pass the "expectsStacktrace
" to assert against inside the onBeforeResponseSent
override. I agree that's better than the mixed assertions.
#125163 see this PR, we were wasting memory holding on to the message just for the noop (in production) hook :) I dropped the holding on to the message till after it's fully flushed to the wire in that PR. |
Making this work with batched execution and fixing a memory leak:
onResponseSent
hook but checking the instance after it's been sent over the wire causes needless overhead in the production code so moving to a "before-style" hook here.part of #125788