Open
Description
For reasons outlined in #72, the support bundle collection process could not complete on one node in a cluster. It looks like we wait here indefinitely to receive all expected bundles before proceeding. Since the collection agent on one node failed before checking in, we did not proceed to finish creating the bundle, and the user had nothing to send to support.
Some suggested resolutions:
- A timeout mechanism could automatically send on `m.ch` after some time, even if all bundles had not been received. This would ensure we got something, though we would have to determine what a reasonable timeout should be.
- Watch for DaemonSet Pod restarts. After some threshold (or maybe just one), stop expecting the corresponding collection agent to send a bundle.
- The collection agent could survive errors like the one the user experienced and send at least something to the manager. This probably doesn't help us in a network partition, etc.