KAFKA-20131: ClassicKafkaConsumer does not clear endOffsetRequested flag on failed LIST_OFFSETS calls #21457

Merged
AndrewJSchofield merged 26 commits into apache:trunk from kirktrue:KAFKA-20131-clear-endOffsetRequested-on-LIST_OFFSETS-failures
Feb 20, 2026

Conversation

@kirktrue
Contributor

@kirktrue kirktrue commented Feb 11, 2026

Updates the ClassicKafkaConsumer to clear out the SubscriptionState
endOffsetRequested flag if the LIST_OFFSETS call fails.
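
As a rough illustration of the failure mode described above (not the actual patch), the sketch below uses a hypothetical `EndOffsetFlags` class in place of SubscriptionState and a `CompletableFuture` in place of the LIST_OFFSETS call. The only point is that the flag is cleared on both the success and the failure path, so a failed lookup cannot leave it set permanently.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the per-partition flag; not Kafka's SubscriptionState.
class EndOffsetFlags {
    private final Set<String> requested = new HashSet<>();

    boolean isRequested(String partition) {
        return requested.contains(partition);
    }

    // Sets the flag, then clears it when the lookup completes, whether it
    // completed normally or exceptionally.
    void requestUntilDone(String partition, CompletableFuture<Long> lookup) {
        requested.add(partition);
        lookup.whenComplete((offset, error) -> requested.remove(partition));
    }

    public static void main(String[] args) {
        EndOffsetFlags flags = new EndOffsetFlags();
        CompletableFuture<Long> lookup = new CompletableFuture<>();
        flags.requestUntilDone("topic-0", lookup);
        System.out.println("pending: " + flags.isRequested("topic-0"));
        // Simulated failure of the offset lookup.
        lookup.completeExceptionally(new RuntimeException("simulated LIST_OFFSETS failure"));
        System.out.println("after failure: " + flags.isRequested("topic-0"));
    }
}
```

If the clearing were attached to the success path only, the second check in `main` would still report the flag as set, which is the stuck state the PR title refers to.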

Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Lianet Magrans
<lmagrans@confluent.io>, Andrew Schofield <aschofield@confluent.io>

…ST_OFFSETS call fails

First pass at catching the case of failures. Still work to do to handle the fact that multiple responses are possible and thus we don't want to clear the flag prematurely.
@github-actions github-actions bot added consumer clients triage PRs from the community labels Feb 11, 2026
@kirktrue kirktrue marked this pull request as ready for review February 12, 2026 00:29
@viktorsomogyi viktorsomogyi self-requested a review February 12, 2026 15:26
Contributor

@viktorsomogyi viktorsomogyi left a comment

@kirktrue I went through the code, debugged it locally, and ran the added tests; to me the PR looks fine. Since I'm not up to date with the consumer code, though, I'd suggest getting a second opinion too.

Member

@lianetm lianetm left a comment

Thanks for the fix, @kirktrue! Here's an initial high-level comment on the changes related to the retry logic.

Member

@lianetm lianetm left a comment

Some more comments. Thanks!

remainingToSearch.keySet().retainAll(value.partitionsToRetry);

offsetFetcherUtils.updateSubscriptionState(value.fetchedOffsets, isolationLevel);
offsetFetcherUtils.clearPartitionEndOffsetRequests(remainingToSearch.keySet());
Member

here we're clearing the flag for the partitions that didn't get offsets yet. I agree we need this if we don't have any time left to retry. But if there's still time, the do-while will try again. In that case, do we want to clear the flag here?

I would imagine we don't, because we'll continue retrying while there is time. It could be the case of missing leader info for instance: we want to keep the flag on for those partitions, hit the client.awaitMetadataUpdate(timer) below, and try again in the next iteration of the do-while, right?

If so, I imagine we could take the timer into consideration here? (clear the flag for the failed partitions only if timer expired?). Thoughts?
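
One way to read this suggestion is sketched below, purely as an illustration: the hypothetical `RetryingLookup.run` keeps the flag set for unresolved partitions while the timeout still allows another pass, and clears it only once it gives up. The names, the `UnaryOperator` standing in for a LIST_OFFSETS round trip, and the wall-clock deadline are all assumptions; none of this is the OffsetFetcher code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not the OffsetFetcher implementation.
class RetryingLookup {
    // 'attempt' takes the partitions still being searched and returns the
    // ones that still have no offset. 'flagged' models the per-partition flag.
    static Set<String> run(Set<String> partitions, long timeoutMs,
                           UnaryOperator<Set<String>> attempt, Set<String> flagged) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        Set<String> remaining = new HashSet<>(partitions);
        do {
            Set<String> unresolved = new HashSet<>(attempt.apply(remaining));
            Set<String> resolved = new HashSet<>(remaining);
            resolved.removeAll(unresolved);
            flagged.removeAll(resolved);   // an offset arrived: flag no longer needed
            remaining = unresolved;
            // The flag for 'remaining' stays set so a later pass can still succeed.
        } while (!remaining.isEmpty() && System.currentTimeMillis() < deadline);
        flagged.removeAll(remaining);      // no more passes: clear the rest
        return remaining;
    }

    public static void main(String[] args) {
        Set<String> flagged = new HashSet<>(Arrays.asList("a", "b"));
        // With a zero timeout there is exactly one pass, as in the currentLag() case.
        Set<String> left = run(new HashSet<>(flagged), 0L,
                r -> Collections.singleton("b"), flagged);
        System.out.println("unresolved=" + left + " flagged=" + flagged);
    }
}
```

With a zero timeout the loop body runs once and the final clear happens immediately, so the sketch covers both the single-pass caller and callers that have time to retry.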

Contributor Author

> I agree we need this if we don't have any time left to retry. But if there's still time, the do-while will try again. In that case, do we want to clear the flag here?

That's precisely what happens in the currentLag() case, though. It's always using a timeout of 0, so there's never a second pass in that loop.

Member

ok, we both agree we need it for currentLag/timerExpired. But in the way it's called now it applies to all cases, that's my concern. Isn't this going to clear the flag also in the case where there is time left to retry, and there is a partition that didn't have a known leader?

Contributor Author

I've added an explicit parameter to 'clear end offsets requests' that only the ClassicKafkaConsumer.currentLag() sets to true. This should prevent other callers from clearing the flag, regardless of the timeout setting.


@Override
public void onFailure(RuntimeException e) {
    offsetFetcherUtils.clearPartitionEndOffsetRequests(remainingToSearch.keySet());
Member

same as above

@lianetm lianetm removed the triage PRs from the community label Feb 12, 2026
@kirktrue kirktrue changed the title KAFKA-20131: SubscriptionState endOffsetRequested remains permanently set if LIST_OFFSETS call fails KAFKA-20131: ClassicKafkaConsumer does not clear endOffsetRequested flag on failed LIST_OFFSETS calls Feb 13, 2026

offsetFetcherUtils.updateSubscriptionState(value.fetchedOffsets, isolationLevel);

if (isZeroTimestamp && shouldClearPartitionEndOffsets)
Member

shouldn't we clear the flag if (isZeroTimestamp)?

I see the shouldClearPartitionEndOffsets is passed true only from currentLag, but what about a call to consumer.endOffsets/beginningOffsets/offsetsForTimes when called with Duration.ZERO? how would the flag get cleared?

Contributor Author

The flag is only set and only checked on the currentLag() path, so the other paths shouldn't need to worry about it.

It's possible for a user to pass in a zero timeout to offsetsForTimes(), et al., but we don't need to clear the flag in those cases.

Member

ack, makes sense that we cannot consolidate on the time check because it depends on the caller.

But then, shouldn't we consolidate on the shouldClearPartitionEndOffsets? Why do we need to check isZeroTimestamp here? vs simply if shouldClearPartitionEndOffsets then clear

Contributor Author

@kirktrue kirktrue Feb 17, 2026

In that method, isZeroTimestamp being true is a synonym for 'execute only a single pass of the loop.' When isZeroTimestamp is false, it means 'possibly execute the loop multiple times.'

We don't want to clear the partitionEndOffsetsRequested flag until we've finished all the passes of the loop we're going to make. So if shouldClearPartitionEndOffsets is true but isZeroTimestamp is false, clearing the flag could be premature, because there could be enough time on the timer for a second pass of the loop that finds the offsets for a partition that the first pass didn't.

I agree that it's confusing. I've tried a couple of different approaches, but they weren't much clearer 😦

Member

> if shouldClearPartitionEndOffsets is true but isZeroTimestamp is false

This is never true, right? (shouldClear is only used for lag)

Member

This is too twisty I'm afraid. A few points.

  • I wonder whether it might be better to call subscriptions.requestPartitionEndOffset in this method too. Then you are setting it and clearing it all together.
  • The flag isZeroTimestamp seems misnamed to me. isZeroTimeout surely.
  • I think that isZeroTimestamp (and the equivalent timer.timeoutMs() == 0L) has a couple of effects. First, it clears the partition end offset requests flag when the future completes. Second, it exits the loop midway through the first iteration without polling the client.
  • I would change the check if (timer.timeoutMs() == 0L) to use isZeroTimeout too.

Contributor Author

I've refactored the currentLag() method in ClassicKafkaConsumer and OffsetFetcher so that the logic resides in the latter. Now OffsetFetcher.fetchOffsetsByTimes() can a) set and clear the partition end offset with much closer locality, and b) revert back to the original logic related to the timeout.

PTAL.

}
} while (timer.notExpired());

if (shouldClearPartitionEndOffsets) {
Member

isn't this going to be always false here? (so unneeded?)
shouldClearPartitionEndOffsets is true for currentLag only, so time=0 which always early returns above, right?

Contributor Author

You're right. I was trying to keep the logic generalized, but maybe that's only more confusing.


Member

@AndrewJSchofield AndrewJSchofield left a comment

Thanks for the continued effort on this one. A few comments.


Member

@lianetm lianetm left a comment

Thanks @kirktrue, nice refactoring! There's an important but easy-to-fix gap, which also needs test coverage; other than that, this seems almost there.

  Timer timer,
- boolean requireTimestamps) {
+ boolean requireTimestamps,
+ boolean shouldUpdatePartitionEndOffsets) {
Member

this name is very confusing because it clearly says it will update end offsets (first thing that comes to mind is an actual change to positions, not a flag).

Would it help if we rename to mention it's to update a flag (maybe updatePartitionEndOffsetsFlag), or at least a description of the param?

Contributor Author

I changed the variable name to updatePartitionEndOffsetsFlag.

// we may get the answer; we do not need to wait for the return value
// since we would not try to poll the network client synchronously
if (lag == null) {
if (subscriptions.partitionEndOffset(topicPartition, isolationLevel) == null) {
Member

aren't we missing the check here to ensure there is no request in-flight?

We should also ensure we have a test for this: 2 consecutive calls to currentLag at this level, where the first one should generate a request and get no response, and the second call should not generate a request. I would expect such a test to be failing now.
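
The scenario described here can be sketched in isolation, with the caveat that everything below is hypothetical: `LagProbe`, its `requestsSent` counter, and the string partition keys are illustrative stand-ins, not the consumer's classes or the test added in the PR. The sketch shows a non-blocking lookup that sends at most one request per partition while a response is outstanding.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

// Hypothetical model of a non-blocking end-offset lookup; not Kafka code.
class LagProbe {
    private final Map<String, Long> endOffsets = new HashMap<>();
    private final Set<String> inFlight = new HashSet<>();
    int requestsSent = 0;

    // Returns the cached end offset if known. Otherwise counts a new request
    // only when none is already pending for the partition.
    OptionalLong endOffset(String partition) {
        Long cached = endOffsets.get(partition);
        if (cached != null) return OptionalLong.of(cached);
        if (inFlight.add(partition)) requestsSent++;
        return OptionalLong.empty();
    }

    void onResponse(String partition, long offset) {
        inFlight.remove(partition);
        endOffsets.put(partition, offset);
    }

    void onFailure(String partition) {
        inFlight.remove(partition);   // allow a later call to retry
    }

    public static void main(String[] args) {
        LagProbe probe = new LagProbe();
        probe.endOffset("topic-0");
        probe.endOffset("topic-0");   // response still outstanding
        System.out.println("requests sent: " + probe.requestsSent);
    }
}
```

A test along the lines suggested above would call `endOffset` twice without delivering a response and assert that the request counter is still one; calling `onFailure` in between should allow a second request.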

Contributor Author

Added a unit test to catch multiple inflight LIST_OFFSETS requests.

Member

@lianetm lianetm left a comment

Thanks @kirktrue! LGTM.
I'll let @AndrewJSchofield take a look in case there are more comments before merging.

Member

@AndrewJSchofield AndrewJSchofield left a comment

lgtm

@AndrewJSchofield AndrewJSchofield merged commit abcbef6 into apache:trunk Feb 20, 2026
29 checks passed