[Client] Client deadlock when concurrent metadata updates block Netty IO threads #2209

@platinumhamburg

Description

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.8.0 (latest release)

Please describe the bug 🐞

The client can deadlock when many concurrent requests trigger synchronous metadata updates on Netty IO threads. In lookup operations, the synchronous getPartitionId() call blocks the calling IO thread until the metadata response arrives. If enough concurrent requests need a metadata update at the same time, every IO thread ends up blocked and none is left to process the responses they are waiting for.

Root Cause:

  • PrimaryKeyLookuper.lookup() calls getPartitionId() synchronously on Netty IO threads
  • Each call blocks its IO thread until the metadata update completes
  • Under high concurrency, all IO threads can be blocked at once, each waiting for a metadata response
  • No thread is left to process the incoming metadata responses → deadlock
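
The starvation pattern described above can be reproduced in miniature without Netty. The following is a minimal sketch, not Fluss code: a single-thread executor stands in for a fully occupied IO thread pool, and the class and method names are hypothetical. A timeout is used only so the demo terminates; with an untimed wait the task would hang.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; it only illustrates the blocking pattern.
public class IoThreadStarvationDemo {

    /** Returns true if the blocking wait timed out, i.e. the "IO thread" starved itself. */
    static boolean blockingLookupTimesOut() {
        // One thread stands in for an IO thread pool whose threads are all busy.
        ExecutorService ioThread = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> lookup = ioThread.submit(() -> {
                CompletableFuture<Long> metadataResponse = new CompletableFuture<>();
                // The response handler can only run on the same (busy) IO thread.
                ioThread.execute(() -> metadataResponse.complete(42L));
                try {
                    // Blocking here prevents the handler queued above from running.
                    metadataResponse.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true; // with an untimed get() this task would never finish
                }
            });
            return lookup.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            ioThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking lookup timed out: " + blockingLookupTimesOut());
    }
}
```

The response handler is queued behind the very task that is waiting for it, so the wait cannot succeed. With several IO threads the same thing happens once every thread is occupied by a blocked lookup.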

Impact:

  • Client hangs indefinitely when lookup operations require partition metadata
  • Affects all operations using primary key or secondary index lookups
  • More likely to occur with partitioned tables under high concurrency

Solution

Refactor the metadata update mechanism to be fully asynchronous, with request batching and deduplication:

  1. Async Metadata Updates: Provide CompletableFuture-based APIs (checkAndUpdatePartitionMetadataAsync(), updateMetadataAsync()) to avoid blocking Netty IO threads

  2. Request Batching: Aggregate multiple concurrent metadata requests into a single RPC call to reduce network overhead and contention

  3. Request Deduplication: Multiple concurrent requests for the same resource (table/partition) share the same update future, preventing duplicate RPC calls

  4. Atomic Resource Keys: Each metadata resource (table, partition, partition ID) is treated as an atomic unit for deduplication
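
The four points above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not the proposed implementation: the class name, the String resource keys, the explicit flush() trigger, and the injected rpc function are all hypothetical stand-ins, and the real APIs named above (checkAndUpdatePartitionMetadataAsync(), updateMetadataAsync()) may look quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch of async, batched, deduplicated metadata updates.
public class AsyncMetadataUpdaterSketch {
    // Assumed RPC shape: a batch of resource keys in, key -> id map out, asynchronously.
    private final Function<Set<String>, CompletableFuture<Map<String, Long>>> rpc;
    // Keys already sent and awaiting a response.
    private final Map<String, CompletableFuture<Long>> inFlight = new HashMap<>();
    // Keys queued for the next batch.
    private final Map<String, CompletableFuture<Long>> pending = new HashMap<>();

    AsyncMetadataUpdaterSketch(Function<Set<String>, CompletableFuture<Map<String, Long>>> rpc) {
        this.rpc = rpc;
    }

    /** Never blocks: callers asking for the same key share one future. */
    synchronized CompletableFuture<Long> updateAsync(String key) {
        CompletableFuture<Long> existing = inFlight.get(key);
        if (existing != null) {
            return existing; // deduplicate against a request already on the wire
        }
        return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
    }

    /** Sends every queued key in a single RPC and completes the shared futures. */
    void flush() {
        Map<String, CompletableFuture<Long>> batch;
        synchronized (this) {
            if (pending.isEmpty()) {
                return;
            }
            batch = new HashMap<>(pending);
            inFlight.putAll(batch);
            pending.clear();
        }
        rpc.apply(batch.keySet()).whenComplete((ids, error) -> {
            synchronized (this) {
                inFlight.keySet().removeAll(batch.keySet());
            }
            batch.forEach((key, future) -> {
                if (error != null) {
                    future.completeExceptionally(error);
                } else if (ids.containsKey(key)) {
                    future.complete(ids.get(key));
                } else {
                    future.completeExceptionally(
                            new IllegalStateException("no metadata for " + key));
                }
            });
        });
    }

    public static void main(String[] args) {
        // Fake RPC for illustration: answers immediately with made-up ids.
        AsyncMetadataUpdaterSketch updater = new AsyncMetadataUpdaterSketch(keys -> {
            Map<String, Long> ids = new HashMap<>();
            for (String key : keys) {
                ids.put(key, (long) key.length());
            }
            return CompletableFuture.completedFuture(ids);
        });
        updater.updateAsync("table-a").thenAccept(id -> System.out.println("table-a -> " + id));
        updater.updateAsync("table-a").thenAccept(id -> System.out.println("shared future -> " + id));
        updater.flush();
    }
}
```

In this sketch the lookup path would chain its work onto the returned future (for example with thenCompose) instead of waiting on it, so the IO thread is released immediately. When and how batches are flushed (a timer, a dedicated sender thread, or the first caller) is left open here.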

Are you willing to submit a PR?

  • I'm willing to submit a PR!
