[Data] Delay 'cluster resources not enough' warning until operator is persistently starved (#63969)
## Description
A dataset's resource allocator depends on the `AutoscalingCoordinator`
server to get its share of allocated resources. To improve reliability,
#62725 made calls to the server
non-blocking. One consequence of this change is that the dataset gets
zero resources at the very start of execution while it waits for the
first response from the autoscaling coordinator. As a result, we'd
consistently emit spurious warnings like this at the start of execution:
```
Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadRange]. The job may hang forever unless the cluster scales up.
```
To avoid this confusion, I've made it so that we only emit the warning
after the first eligible operator has been starved for a minute.
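
For illustration only, the debounce described above can be sketched as follows. This is not the code from this change: `StarvationWarningGate`, `should_warn`, and `STARVATION_WARNING_DELAY_S` are hypothetical names, and warning at most once per execution is an assumption.

```python
import time
from typing import Callable, Optional

# Hypothetical constant; the description only says "a minute".
STARVATION_WARNING_DELAY_S = 60.0


class StarvationWarningGate:
    """Hypothetical helper that delays a warning until starvation persists."""

    def __init__(
        self,
        delay_s: float = STARVATION_WARNING_DELAY_S,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._delay_s = delay_s
        self._clock = clock  # injectable so tests can control time
        self._starved_since: Optional[float] = None
        self._warned = False  # assumption: warn at most once

    def should_warn(self, is_starved: bool) -> bool:
        """Return True once, after `delay_s` of uninterrupted starvation."""
        if not is_starved:
            # Any successful allocation resets the starvation timer.
            self._starved_since = None
            return False
        now = self._clock()
        if self._starved_since is None:
            self._starved_since = now
        if self._warned:
            return False
        if now - self._starved_since >= self._delay_s:
            self._warned = True
            return True
        return False
```

A scheduling loop would call `should_warn(...)` on each iteration and log the warning only when it returns `True`, so the brief zero-resource window before the first coordinator response stays silent.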
## Related issues
## Additional information
---------
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>