Describe the bug
Suppose bluechi-controller or bluechi-agent crashes, for example as in https://github.com/containers/eclipse-bluechi/issues/425
- Should the bluechi service (the same applies to the agent) stay down after "bluechi.service: Start request repeated too quickly."?
- Or should it keep retrying until it recovers (e.g. after a new config has been sent over the network)? If so, how long should it wait between restart attempts?
- What is the minimum possible wait before restarting the node or redeploying? (Agents depend on the manager node to report.)
There are systemd unit keys that might help with this behavior: StartLimitInterval (named StartLimitIntervalSec in current systemd) and StartLimitBurst.
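To make the discussion concrete, here is a minimal sketch of a drop-in that would keep the service retrying instead of giving up. It is only an illustration, not a proposed default: the drop-in path and unit name are assumptions (the status output below still uses the hirte.service name), and the values are placeholders to be tuned. By default systemd allows 5 starts within 10 seconds, which matches the "restart counter is at 5" line in the log below.

```ini
# Hypothetical drop-in: /etc/systemd/system/bluechi-controller.service.d/restart.conf
# (unit name and file name are assumptions; adjust to the installed unit)

[Unit]
# Setting the interval to 0 disables start rate limiting, so systemd
# never stops with "Start request repeated too quickly".
StartLimitIntervalSec=0
# Alternative: keep rate limiting but widen it, e.g.
# StartLimitIntervalSec=60
# StartLimitBurst=10

[Service]
# Restart whenever the process exits with a failure.
Restart=on-failure
# Wait between attempts (systemd's default is 100ms); 5s is an illustrative value.
RestartSec=5s
```

After adding such a file, `systemctl daemon-reload` is needed for it to take effect, and `systemctl reset-failed <unit>` clears an already-tripped start limit. Whether unlimited retries are desirable is exactly the open question in this issue.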
Output of systemctl status bluechi-controller:
× hirte.service - Hirte systemd service controller manager daemon
Loaded: loaded (/usr/local/lib/systemd/system/hirte.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Wed 2023-08-02 05:45:53 UTC; 1s ago
Duration: 3ms
Docs: man:hirte(1)
man:hirte.conf(5)
Process: 214542 ExecStart=/usr/bin/hirte -c /etc/hirte/hirte.conf (code=exited, status=1/FAILURE)
Main PID: 214542 (code=exited, status=1/FAILURE)
CPU: 3ms
Aug 02 05:45:53 control systemd[1]: hirte.service: Scheduled restart job, restart counter is at 5.
Aug 02 05:45:53 control systemd[1]: Stopped Hirte systemd service controller manager daemon.
Aug 02 05:45:53 control systemd[1]: hirte.service: Start request repeated too quickly.
Aug 02 05:45:53 control systemd[1]: hirte.service: Failed with result 'exit-code'.
Aug 02 05:45:53 control systemd[1]: Failed to start Hirte systemd service controller manager daemon.