Skip to content

[rpc] Prevent EventLoop blocking and implement TCP-level backpressure in RequestChannel #2064

@platinumhamburg

Description

@platinumhamburg

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.8.0 (latest release)

Please describe the bug 🐞

The current RequestChannel implementation uses a bounded ArrayBlockingQueue for request queuing. When the queue is full, putRequest() blocks the calling thread, which can cause Netty EventLoop threads to block. This leads to:

  1. EventLoop Thread Blocking: When the request queue reaches capacity, ArrayBlockingQueue.put() blocks the EventLoop thread, severely degrading server performance and responsiveness.

  2. No Backpressure Control: The bounded queue provides a hard limit but doesn't implement proper backpressure at the TCP level. When the queue is full, new requests are blocked at the application level rather than being controlled at the network layer.

  3. Memory Management Issues: Without proper backpressure, the system cannot gracefully handle traffic spikes, potentially leading to memory exhaustion or degraded performance.

Solution

  1. Use Unbounded Buffer Queue: Replace ArrayBlockingQueue with unbounded queue to ensure putRequest() never blocks, preventing EventLoop threads from being blocked.

  2. Implement TCP-Level Backpressure:

    • When queue size exceeds backpressureThreshold, pause all associated Netty channels by setting autoRead(false)
    • When queue size drops below resumeThreshold (50% of backpressure threshold), resume channels by setting autoRead(true)
    • This provides hysteresis to avoid thrashing between pause/resume states
  3. Channel Lifecycle Management:

    • Register channels when they become active in NettyServerHandler.channelActive()
    • Unregister channels when they become inactive in NettyServerHandler.channelInactive()
    • Each RequestChannel manages its own set of associated channels independently
  4. Thread Safety and Performance:

    • Use ReentrantLock to protect backpressure state transitions atomically
    • Use volatile boolean isBackpressureActive for fast-path checks to avoid unnecessary lock contention
    • All channel operations are submitted to the channel's EventLoop thread for thread safety

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Metadata

Labels

No labels
No labels

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions