Tasks cannot run normally after CPU overload and recovery #7901
gochendong started this conversation in General
Replies: 1 comment 4 replies
-
Which version of Dolphin are you running? Dolphin currently supports failure recovery and thread waiting. For the configuration of master and worker memory usage, please refer to the following:
-
I found that Dolphin has a very serious bug. I deployed it in a cluster of 2 masters and 3 workers. When some machines experience CPU overload and then recover after a period of time, scheduled tasks no longer run normally, and the Dolphin service has to be restarted.