KAFKA-20131: ClassicKafkaConsumer does not clear endOffsetRequested flag on failed LIST_OFFSETS calls #21457
Conversation
…ST_OFFSETS call fails
First pass at catching the case of failures. There is still work to do to handle the fact that multiple responses are possible, and thus we don't want to clear the flag prematurely.
@kirktrue went through the code, debugged it locally and ran the added tests, to me the PR looks fine. Since I'm not up to date with the consumer code though, I would request you to get a second opinion too.
…T_OFFSETS-failures
clients/src/main/java/org/apache/kafka/clients/consumer/internals/OffsetsRequestManager.java
.../main/java/org/apache/kafka/clients/consumer/internals/events/ApplicationEventProcessor.java
clients/src/main/java/org/apache/kafka/clients/consumer/internals/OffsetFetcherUtils.java
remainingToSearch.keySet().retainAll(value.partitionsToRetry);

offsetFetcherUtils.updateSubscriptionState(value.fetchedOffsets, isolationLevel);
offsetFetcherUtils.clearPartitionEndOffsetRequests(remainingToSearch.keySet());
Here we're clearing the flag for the partitions that didn't get offsets yet. I agree we need this if we don't have any time left to retry. But if there's still time, the do-while will try again. In that case, do we want to clear the flag here?
I would imagine we don't, because we'll continue retrying while there is time. It could be the case of missing leader info, for instance: we want to keep the flag on for those partitions, hit the client.awaitMetadataUpdate(timer) below, and try again in the next iteration of the do-while, right?
If so, I imagine we could take the timer into consideration here (clear the flag for the failed partitions only if the timer has expired)? Thoughts?
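To make the suggestion concrete, a minimal sketch of the timer-gated clearing being proposed (names follow the snippet above; the exact callback shape around it is assumed):

```java
// Sketch only: clear the "end offset requested" flag for the partitions that
// still have no offset, but only once we know there will be no further retry.
offsetFetcherUtils.updateSubscriptionState(value.fetchedOffsets, isolationLevel);
if (timer.isExpired()) {
    // No time left for another pass of the do-while loop, so nothing else will
    // complete the request for these partitions; clear the flag now.
    offsetFetcherUtils.clearPartitionEndOffsetRequests(remainingToSearch.keySet());
}
// Otherwise keep the flag set: the loop will hit client.awaitMetadataUpdate(timer)
// and try again on the next iteration (e.g. for partitions missing leader info).
```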
> I agree we need this if we don't have any time left to retry. But if there's still time, the do-while will try again. In that case, do we want to clear the flag here?
That's precisely what happens in the currentLag() case, though. It's always using a timeout of 0, so there's never a second pass in that loop.
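For context, a rough sketch of the currentLag() path being referred to (an assumed shape based on the snippets in this thread, not the exact source):

```java
// currentLag() never blocks: it issues the LIST_OFFSETS lookup with a zero
// timeout, so the fetcher's do-while loop only ever makes a single pass.
public OptionalLong currentLag(TopicPartition topicPartition) {
    final Long lag = subscriptions.partitionLag(topicPartition, isolationLevel);
    if (lag == null) {
        // No cached end offset yet: kick off a request but do not wait for the
        // response; time.timer(0L) means there is never a second pass.
        if (subscriptions.partitionEndOffset(topicPartition, isolationLevel) == null) {
            subscriptions.requestPartitionEndOffset(topicPartition);
            fetcher.endOffsets(Collections.singleton(topicPartition), time.timer(0L));
        }
        return OptionalLong.empty();
    }
    return OptionalLong.of(lag);
}
```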
OK, we both agree we need it for currentLag/timer expired. But the way it's called now, it applies to all cases; that's my concern. Isn't this going to clear the flag also in the case where there is time left to retry and there is a partition that didn't have a known leader?
I've added an explicit parameter to 'clear end offsets requests' that only the ClassicKafkaConsumer.currentLag() sets to true. This should prevent other callers from clearing the flag, regardless of the timeout setting.
@Override
public void onFailure(RuntimeException e) {
    offsetFetcherUtils.clearPartitionEndOffsetRequests(remainingToSearch.keySet());
clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java
clients/src/test/java/org/apache/kafka/clients/consumer/KafkaConsumerTest.java
…T_OFFSETS-failures
…ocations in KafkaConsumerTest
offsetFetcherUtils.updateSubscriptionState(value.fetchedOffsets, isolationLevel);

if (isZeroTimestamp && shouldClearPartitionEndOffsets)
Shouldn't we clear the flag if (isZeroTimestamp)?
I see that shouldClearPartitionEndOffsets is passed as true only from currentLag, but what about a call to consumer.endOffsets/beginningOffsets/offsetsForTimes with Duration.ZERO? How would the flag get cleared?
The flag is only set and only checked on the currentLag() path, so the other paths shouldn't need to worry about it.
It's possible for a user to pass in a zero timeout to offsetsForTimes(), et al., but we don't need to clear the flag in those cases.
Ack, makes sense that we cannot consolidate on the time check because it depends on the caller.
But then, shouldn't we consolidate on shouldClearPartitionEndOffsets? Why do we need to check isZeroTimestamp here, rather than simply "if shouldClearPartitionEndOffsets, then clear"?
In that method, if isZeroTimestamp is set to true, it's a synonym for "only execute a single pass of the loop." In the case where isZeroTimestamp is false, it's "possibly execute the pass multiple times."
We don't want to clear partitionEndOffsetsRequested until we've finished all the passes of the loop we're going to make. So if shouldClearPartitionEndOffsets is true but isZeroTimestamp is false, clearing the partitionEndOffsetsRequested flag could be premature, because there could be enough time on the timer for a second pass of the loop that finds the offsets for a partition that the first pass didn't.
I agree that it's confusing. I've tried a couple of different approaches, but they weren't much clearer 😦
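Putting the two conditions side by side, a sketch of the guard as described (names are taken from the diff above; the surrounding callback code is assumed):

```java
offsetFetcherUtils.updateSubscriptionState(value.fetchedOffsets, isolationLevel);

// shouldClearPartitionEndOffsets: only the currentLag() caller opts in to clearing the flag.
// isZeroTimestamp: the loop runs exactly once, so clearing here cannot be premature;
// with a longer timeout, a later pass might still find the missing offsets.
if (isZeroTimestamp && shouldClearPartitionEndOffsets)
    offsetFetcherUtils.clearPartitionEndOffsetRequests(remainingToSearch.keySet());
```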
> if shouldClearPartitionEndOffsets is true but isZeroTimestamp is false
This is never true, right? (shouldClear is only used for lag)
This is too twisty I'm afraid. A few points.
- I wonder whether it might be better to call subscriptions.requestPartitionEndOffset in this method too. Then you are setting it and clearing it all together.
- The flag isZeroTimestamp seems misnamed to me. isZeroTimeout surely.
- I think that isZeroTimestamp (and the equivalent timer.timeoutMs() == 0L) has a couple of effects. First, it clears the partition end offset requests flag when the future completes. Second, it exits the loop midway through the first iteration without polling the client.
- I would change the check if (timer.timeoutMs() == 0L) to use isZeroTimeout too.
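For illustration, one possible shape of that suggestion (the method body, return type, and the sendAndPollListOffsets helper are hypothetical, not the actual patch): set the flag and clear it in the same method, keyed off an isZeroTimeout flag.

```java
private ListOffsetResult fetchOffsetsByTimes(Map<TopicPartition, Long> timestampsToSearch,
                                             Timer timer,
                                             boolean requireTimestamps) {
    boolean isZeroTimeout = timer.timeoutMs() == 0L;
    // Set the "end offset requested" flag here, right next to where it is cleared...
    timestampsToSearch.keySet().forEach(subscriptions::requestPartitionEndOffset);
    try {
        // ...issue the LIST_OFFSETS request(s) and, unless isZeroTimeout, keep
        // polling while the timer has time left (hypothetical helper).
        return sendAndPollListOffsets(timestampsToSearch, timer, requireTimestamps, isZeroTimeout);
    } finally {
        if (isZeroTimeout) {
            // Single-pass callers (currentLag()) get the flag cleared for whatever is
            // still outstanding, including failed or unanswered requests.
            offsetFetcherUtils.clearPartitionEndOffsetRequests(timestampsToSearch.keySet());
        }
    }
}
```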
I've refactored the currentLag() method in ClassicKafkaConsumer and OffsetFetcher so that the logic resides in the latter. Now OffsetFetcher.fetchOffsetsByTimes() can a) set and clear the partition end offset with much closer locality, and b) revert back to the original logic related to the timeout.
PTAL.
    }
} while (timer.notExpired());

if (shouldClearPartitionEndOffsets) {
Isn't this always going to be false here (so unneeded)?
shouldClearPartitionEndOffsets is true for currentLag only, so the timeout is 0, which always returns early above, right?
You're right. I was trying to keep the logic generalized, but maybe that's only more confusing.
AndrewJSchofield left a comment
Thanks for the continued effort on this one. A few comments.
…T_OFFSETS-failures
      Timer timer,
-     boolean requireTimestamps) {
+     boolean requireTimestamps,
+     boolean shouldUpdatePartitionEndOffsets) {
This name is very confusing because it says it will update end offsets (the first thing that comes to mind is an actual change to positions, not a flag).
Would it help to rename it to mention that it updates a flag (maybe updatePartitionEndOffsetsFlag), or at least to add a description of the param?
I changed the variable name to updatePartitionEndOffsetsFlag.
// we may get the answer; we do not need to wait for the return value
// since we would not try to poll the network client synchronously
if (lag == null) {
    if (subscriptions.partitionEndOffset(topicPartition, isolationLevel) == null) {
Aren't we missing a check here to ensure there is no request in flight?
We should also ensure we have a test for this: two consecutive calls to currentLag at this level; the first should generate a request (with no response yet), and the second should not generate another request. I would expect such a test to fail now.
Added a unit test to catch multiple in-flight LIST_OFFSETS requests.
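A hedged sketch of the scenario that test needs to cover (the consumer/tp0/client fixtures and the listOffsetsRequestCount helper are made up for illustration; the real test in KafkaConsumerTest will differ):

```java
@Test
public void testCurrentLagSendsOnlyOneListOffsetsRequest() {
    consumer.assign(Collections.singleton(tp0));

    // First call: no cached end offset and no request in flight, so a
    // LIST_OFFSETS request should be generated.
    assertEquals(OptionalLong.empty(), consumer.currentLag(tp0));
    assertEquals(1, listOffsetsRequestCount(client));

    // Second call before any response arrives: the "end offset requested" flag
    // is still set, so no additional LIST_OFFSETS request should be sent.
    assertEquals(OptionalLong.empty(), consumer.currentLag(tp0));
    assertEquals(1, listOffsetsRequestCount(client));
}
```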
clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java
…ag and adding maybeSetPartitionEndOffsetRequest
clients/src/main/java/org/apache/kafka/clients/consumer/internals/OffsetFetcher.java
lianetm left a comment
Thanks @kirktrue! LGTM.
I'll let @AndrewJSchofield take a look in case there are more comments before merging.
Updates the ClassicKafkaConsumer to clear out the SubscriptionState endOffsetRequested flag if the LIST_OFFSETS call fails.

Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Lianet Magrans <lmagrans@confluent.io>, Andrew Schofield <aschofield@confluent.io>