Description
Is your feature request related to a problem? Please describe
Currently, the worker coordinator actor is responsible for collecting samples from worker actors and processing them at set intervals. In high-throughput tests, processing hundreds of thousands of samples becomes slow and expensive for the worker coordinator actor, especially because samples are left to accumulate for 30 seconds between processing runs.
At high TPS, these pile-ups can eventually cause problems, specifically with memory, and lead to the issues described in this RFC.
Describe the solution you'd like
To take this work off the worker coordinator actor, there should be a new MetricsActor that is solely responsible for processing these samples. Rather than waiting 30 seconds to process samples, this actor can process them at much shorter intervals, leaving the worker coordinator free to manage the Worker actors and coordinate the test.
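As a rough illustration of the idea, here is a minimal sketch in plain Python, not tied to any actor framework. Everything in it is hypothetical: the `Sample` fields, the `MetricsProcessor` class (a stand-in for the proposed MetricsActor), and the `max_pending` threshold are assumptions made for the example, not existing code in the project. It shows samples being aggregated in small batches as they arrive, instead of accumulating for a long interval.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical, simplified sample; real samples carry more fields.
    task: str
    latency_ms: float


class MetricsProcessor:
    """Hypothetical stand-in for the proposed MetricsActor."""

    def __init__(self, max_pending=1000):
        # Assumed threshold: process as soon as this many samples are buffered.
        self.max_pending = max_pending
        self.pending = []
        self.count = defaultdict(int)
        self.total_latency_ms = defaultdict(float)

    def receive_samples(self, samples):
        """Called whenever a worker forwards a batch of samples."""
        self.pending.extend(samples)
        if len(self.pending) >= self.max_pending:
            self.process_pending()

    def process_pending(self):
        """Aggregate buffered samples; could also be driven by a short timer."""
        batch, self.pending = self.pending, []
        for sample in batch:
            self.count[sample.task] += 1
            self.total_latency_ms[sample.task] += sample.latency_ms
        return len(batch)

    def mean_latency_ms(self, task):
        """Return the mean latency for a task, or None if nothing was processed."""
        n = self.count.get(task, 0)
        if n == 0:
            return None
        return self.total_latency_ms[task] / n
```

In this sketch the buffer never grows past `max_pending` samples before being drained, so memory use stays bounded regardless of test duration. In an actual actor-based design, `receive_samples` would correspond to handling a message from a worker, and `process_pending` would additionally run on a short wakeup interval.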
At the same time, we can move all post-processing work to a separate file within the worker_coordinator folder. Currently, all the classes and methods related to metrics collection reside in the worker_coordinator.py file. Placing them in the new file alongside the new metrics actor could reduce the size of worker_coordinator.py by 500-700 lines and create a clear separation of concerns.
Describe alternatives you've considered
No response
Additional context
No response