Is your feature request related to a problem?
In #640, we introduced an adaptive rate limiter to control the bulk request rate per executor node. While this mechanism has helped reduce ingestion throttling, further improvements are needed to make it more effective, especially for large-scale Spark workloads running on many executors.
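For context, below is a minimal sketch of what such a per-executor adaptive limiter might look like, assuming an AIMD-style policy (additive increase, multiplicative decrease). The class name, parameters, and `acquire`/`onResponse` methods are hypothetical illustrations, not the actual #640 implementation:

```scala
import java.util.concurrent.atomic.AtomicLong

/**
 * Illustrative per-executor adaptive bulk rate limiter: additive increase on
 * success, multiplicative decrease when the bulk response reports throttling
 * (e.g. HTTP 429). Names, steps, and thresholds are assumptions.
 */
class AdaptiveBulkRateLimiter(
    initialRate: Double,  // permits (e.g. documents) per second to start with
    minRate: Double,
    maxRate: Double,
    increaseStep: Double, // additive increase after a clean bulk response
    decreaseRatio: Double // multiplicative decrease after a throttled response
) {
  @volatile private var currentRate: Double = initialRate
  private val nextPermitAtNanos = new AtomicLong(System.nanoTime())

  /** Block until the next bulk request of `units` permits may be sent (no burst handling). */
  def acquire(units: Long): Unit = {
    val intervalNanos = (units / currentRate * 1e9).toLong
    val readyAt = nextPermitAtNanos.getAndAdd(intervalNanos)
    val waitNanos = readyAt - System.nanoTime()
    if (waitNanos > 0) Thread.sleep(waitNanos / 1000000, (waitNanos % 1000000).toInt)
  }

  /** Feed back the outcome of the last bulk request to adapt the rate. */
  def onResponse(throttled: Boolean): Unit = {
    currentRate =
      if (throttled) math.max(minRate, currentRate * decreaseRatio)
      else math.min(maxRate, currentRate + increaseStep)
  }

  def rate: Double = currentRate
}
```

In this sketch, the bulk writer on each executor would call `acquire` before sending a request and `onResponse` with whether the response indicated throttling (for example, a 429 status).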
What solution would you like?
Assumptions
- The AOSS collection is not a contention point, and the Spark streaming job is expected to self-throttle.
- This is subtly different from the Internet congestion control problem, where clients compete for a shared and unpredictable network resource.
Objectives
- Scale up to fully utilize the target OCU capacity as quickly as possible, similar to QoS with reserved capacity (see the ramp-up sketch after this list);
- Maintain high throughput consistently until the end of ingestion.
Together, these ensure optimal performance and minimize ingestion time and Spark compute cost.
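As a rough illustration of how the first objective could translate into an initial ramp-up phase, the sketch below assumes the target rate can be estimated up front from the reserved OCU capacity; all names and constants are hypothetical:

```scala
/**
 * Illustrative ramp-up policy: start from a boosted fraction of a configured
 * target rate (assumed here to be derived from the reserved OCU capacity),
 * grow the rate multiplicatively until the first throttle signal, then fall
 * back to a gentle additive increase.
 */
class RampUpPolicy(
    targetRate: Double,          // assumed to be estimated from the reserved OCUs
    boostFraction: Double = 0.5,
    growthFactor: Double = 2.0
) {
  private var rampingUp = true

  def initialRate: Double = targetRate * boostFraction

  /** Next rate given the current rate and whether the last bulk was throttled. */
  def nextRate(currentRate: Double, throttled: Boolean): Double = {
    if (throttled) {
      rampingUp = false                                 // first throttle ends the boost phase
      currentRate * 0.8                                 // back off; steady-state control takes over
    } else if (rampingUp) {
      math.min(targetRate, currentRate * growthFactor)  // fast exponential ramp toward the target
    } else {
      currentRate + 1.0                                 // gentle probing once at steady state
    }
  }
}
```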
This meta issue tracks the following follow-up improvements (a sketch of the multi-signal feedback idea follows the list):
- [FEATURE] Improve adaptive rate limit to handle lower rate #1060
- [FEATURE] Improve adaptive rate limiter with multi-signal feedback #1084 [High priority]
- [FEATURE] Support initial rate boost in adaptive rate limiter #1085
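To make the multi-signal feedback idea in #1084 concrete, the sketch below combines the HTTP status, the per-item retryable failure ratio, and request latency into a single rate adjustment. The signal set, thresholds, and names are placeholder assumptions, not a committed design:

```scala
/** Signals observed for one bulk response; the exact signal set is an assumption. */
final case class BulkFeedback(
    statusCode: Int,            // HTTP status of the bulk response
    retryableItemRatio: Double, // fraction of items that failed with retryable errors
    latencyMillis: Long         // end-to-end latency of the bulk request
)

/**
 * Illustrative multi-signal policy: an explicit 429 or a high ratio of
 * retryable item failures triggers a multiplicative decrease, elevated
 * latency holds the rate, and a clean fast response allows an additive
 * increase. Thresholds are placeholder values.
 */
class MultiSignalPolicy(
    latencyThresholdMillis: Long = 2000,
    retryableRatioThreshold: Double = 0.1
) {
  def adjust(rate: Double, fb: BulkFeedback): Double = {
    if (fb.statusCode == 429 || fb.retryableItemRatio > retryableRatioThreshold)
      rate * 0.5  // explicit or implicit throttling: back off hard
    else if (fb.latencyMillis > latencyThresholdMillis)
      rate        // pressure building: hold the current rate
    else
      rate + 1.0  // healthy: probe for more capacity
  }
}
```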
What alternatives have you considered?
N/A
Do you have any additional context?
N/A