Description
I'm facing an issue with frequent restarts on v0.0.95.
The liveness endpoint (/metrics) sometimes takes 1-5 seconds to respond. Of course I can just increase the probe timeout (which is 1 second by default), but that only hides the problem. I believe there is some inefficiency in the code that affects even /metrics responses.
There are quite a lot of secrets and configmaps in the cluster, so that might put some strain on the service, but there are no CPU or memory limits, so it should be able to take as much as it needs and keep working. I think /metrics should be served from its own thread, or even better, there should be dedicated /readiness and /liveness endpoints that actually check and accurately report the status of the service (roughly as sketched below). Otherwise it is unreliable to run in production, especially considering there is no HA: if the pod is restarted, I believe it will lose any information about the resources and may miss triggering reloads.
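A minimal sketch of what I mean, assuming the service is written in Go (the paths, ports, and readiness flag here are placeholders I made up, not the project's actual code): the probe handlers run on their own listener and do no heavy work, so a slow /metrics scrape can never cause the kubelet to kill the pod.

```go
package main

import (
	"log"
	"net/http"
	"sync/atomic"
	"time"
)

// ready flips to true once startup work (e.g. the initial listing of secrets
// and configmaps) has finished; the readiness handler only reads the flag.
var ready atomic.Bool

func main() {
	// Hypothetical startup work standing in for the initial cache sync.
	go func() {
		time.Sleep(2 * time.Second)
		ready.Store(true)
	}()

	// Probe endpoints get their own mux and listener, so responding to the
	// kubelet is cheap and never blocks behind metrics or reconcile work.
	probes := http.NewServeMux()
	probes.HandleFunc("/liveness", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK) // the process is up and serving HTTP
	})
	probes.HandleFunc("/readiness", func(w http.ResponseWriter, r *http.Request) {
		if ready.Load() {
			w.WriteHeader(http.StatusOK)
		} else {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
	})
	go func() {
		log.Fatal(http.ListenAndServe(":8081", probes)) // dedicated probe port
	}()

	// Metrics (and anything else that can be slow) stay on the existing port.
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("# placeholder for the real metrics handler\n"))
	})
	log.Fatal(http.ListenAndServe(":9090", nil))
}
```

With something like this, liveness only answers "is the process alive", while readiness reflects whether the initial sync has completed, and the probes no longer depend on how long /metrics takes.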