The crawler currently handles browser crashes (or other interruptions) by exiting with a specific error code, assuming that the crawler container will be restarted. This has many advantages (ensuring full cleanup, etc.) and works well with Kubernetes pod behavior.
Since we run the crawler in production only in Kubernetes, we have leaned into this behavior more and more.
However, we understand many users don't want to run the crawler in Kubernetes, or with an external controller or process manager.
For these deployments, having the crawler simply exit with a status code is not ideal. Perhaps a wrapper shell script that restarts the node process with exponential backoff would provide a good standalone feature? We would then default to running with `restartsOnError` set to true, and that would be the default path.
The exponential back-off script could be something simple; there are many examples, like: https://gist.github.com/nathforge/62456d9b18e276954f58eb61bf234c17
It would need to have additional properties, to mimic the Kubernetes behavior:
- Reset the backoff timer if the crawler has been running successfully for some amount of time (e.g. 10 without any exits)
- Don't restart on certain error codes, such as time limit reached or out of disk space
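To make the idea concrete, here is a minimal sketch of such a wrapper. The specific exit codes (14, 16) and the timing parameters are placeholders, not actual Browsertrix Crawler values, and the function/variable names are hypothetical:

```shell
#!/bin/sh
# Sketch of a standalone restart wrapper with exponential backoff.
# NOTE: exit codes and thresholds below are illustrative assumptions.

MAX_BACKOFF=300           # cap the delay between restarts (seconds)
RESET_AFTER=600           # a run lasting at least this long resets the backoff
NO_RESTART_CODES="14 16"  # hypothetical terminal codes (time limit, disk full)

run_with_backoff() {
  backoff=1
  while true; do
    start=$(date +%s)
    "$@"
    code=$?
    elapsed=$(( $(date +%s) - start ))

    # Clean exit: nothing to restart
    [ "$code" -eq 0 ] && return 0

    # Mimic Kubernetes behavior: some failures are terminal, don't restart
    for c in $NO_RESTART_CODES; do
      [ "$code" -eq "$c" ] && return "$code"
    done

    # Reset the backoff if the crawler ran successfully for long enough
    [ "$elapsed" -ge "$RESET_AFTER" ] && backoff=1

    echo "crawler exited with code $code; restarting in ${backoff}s" >&2
    sleep "$backoff"
    backoff=$(( backoff * 2 ))
    [ "$backoff" -gt "$MAX_BACKOFF" ] && backoff=$MAX_BACKOFF
  done
}

# Usage (assumed invocation, for illustration only):
# run_with_backoff node crawler.js
```

The wrapper propagates terminal exit codes to the caller, so an outer process manager (or a human) can still distinguish "time limit reached" from a crash loop.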
As this wouldn't be our production deployment, we would want help from the community in testing this approach, since we won't have much bandwidth to test it ourselves, especially for longer-running crawls.
I can see it being helpful for issues such as the one in #927 and especially for openzim/zimit#527
For users running larger-scale crawls with just Browsertrix Crawler (@benoit74, @gitreich, @Mr0grog): would you be willing to help test this type of setup? What do you think of this approach?