workflow controller keeps crashing under load #14232
static-moonlight
started this conversation in
General
Replies: 1 comment
-
related to death by |
-
Scenario: we use Argo to run lots of small workflows, around 500 per hour during normal operation.
After a small outage, Argo gets flooded with 1000+ workflows. It seems the workflow controller can't handle that.
This is a serious problem. I need to know why the workflow controller keeps crashing. How do I find out? Where do I need to look?
On that note: I also need ideas on how to make it more stable, resilient, and reliable.
Any ideas?