Skip to content

Lookup sender only retry 3 times for finding leader even if a large lookup retry count is configured #2051

@swuferhong

Description

@swuferhong

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.8.0 (latest release)

Please describe the bug 🐞

The error is as follow:

2025-11-30 11:45:07
java.lang.Exception: Could not complete the stream element: Record @ 1764473843648 : +I(115188743,422ef291f654cc155b80c73595804f858191cb9a6bc0eeb208765e0d9c095392d4bbbe18cc480ec4ace3777098644cc2673a,7e021e49c346fa7427545304b26c725e4898a26a699f59d142e2f9747137d7e94d81276953170a085256201322c48b8c10ad,57388).
	at org.apache.flink.streaming.api.operators.async.AsyncWaitOperator$ResultHandler.completeExceptionally(AsyncWaitOperator.java:637)
	at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinRunner$JoinedRowResultFuture.completeExceptionally(AsyncLookupJoinRunner.java:253)
	at org.apache.flink.table.runtime.operators.join.lookup.DelegatingResultFuture.accept(DelegatingResultFuture.java:46)
	at org.apache.flink.table.runtime.operators.join.lookup.DelegatingResultFuture.accept(DelegatingResultFuture.java:32)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
	at org.apache.flink.table.functions.AsyncLookupFunction.lambda$eval$0(AsyncLookupFunction.java:56)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
	at org.apache.fluss.flink.source.lookup.FlinkAsyncLookupFunction.lambda$asyncLookup$0(FlinkAsyncLookupFunction.java:140)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
	at org.apache.fluss.client.lookup.LookupSender.groupByLeaderAndType(LookupSender.java:155)
	at org.apache.fluss.client.lookup.LookupSender.sendLookups(LookupSender.java:135)
	at org.apache.fluss.client.lookup.LookupSender.runOnce(LookupSender.java:126)
	at org.apache.fluss.client.lookup.LookupSender.run(LookupSender.java:99)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:879)
Caused by: org.apache.flink.table.api.TableException: Failed to asynchronously lookup entries with key '+I(57388)'
	at org.apache.flink.table.functions.AsyncLookupFunction.lambda$eval$0(AsyncLookupFunction.java:58)
	... 18 more
Caused by: java.lang.RuntimeException: Execution of Fluss asyncLookup failed: org.apache.fluss.exception.FlussRuntimeException: Leader not found after retry  3 times for table bucket: TableBucket{tableId=32, bucket=127}
	at org.apache.fluss.flink.source.lookup.FlinkAsyncLookupFunction.lambda$asyncLookup$0(FlinkAsyncLookupFunction.java:143)
	... 13 more
Caused by: java.util.concurrent.CompletionException: org.apache.fluss.exception.FlussRuntimeException: Leader not found after retry  3 times for table bucket: TableBucket{tableId=32, bucket=127}
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
	... 11 more
Caused by: org.apache.fluss.exception.FlussRuntimeException: Leader not found after retry  3 times for table bucket: TableBucket{tableId=32, bucket=127}
	at org.apache.fluss.client.metadata.MetadataUpdater.leaderFor(MetadataUpdater.java:122)
	at org.apache.fluss.client.lookup.LookupSender.groupByLeaderAndType(LookupSender.java:153)
	... 8 more

Solution

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions