Description
I’ve got some crawls that I run regularly and that sometimes crash partway through, so I’ve wrapped them in code that restarts them from the latest state file. Before restarting, the wrapper checks the log output to decide whether a restart makes sense (mainly I just look for the log line that says “crawl status: interrupted”).
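A minimal Python sketch of that kind of restart wrapper might look like the following. The function names and the `run_crawl` callable are hypothetical. Only the “crawl status: interrupted” log line comes from the description above, and how the crawl is actually launched and pointed at the latest state file is deliberately left out.

```python
from typing import Callable

# Log text that signals an interrupted crawl (taken from the description above).
INTERRUPTED_MARKER = "crawl status: interrupted"


def should_restart(log_text: str) -> bool:
    """Return True if any log line reports that the crawl was interrupted."""
    return any(INTERRUPTED_MARKER in line for line in log_text.splitlines())


def run_with_restarts(run_crawl: Callable[[int], str], max_restarts: int = 3) -> int:
    """Call run_crawl until its log no longer shows an interruption.

    run_crawl is a hypothetical callable: it receives the attempt number,
    launches the crawl (resuming from the latest state file when the
    attempt number is greater than 0), and returns the captured log output.
    Returns the total number of attempts made.
    """
    attempts = 0
    while True:
        log_text = run_crawl(attempts)
        attempts += 1
        # Stop once the crawl was not interrupted, or the restart budget is spent.
        if not should_restart(log_text) or attempts > max_restarts:
            return attempts
```

Capping the number of restarts keeps a crawl that is interrupted on every run from looping forever.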
The --help docs mention a --restartsOnError option that I’m not currently using:
if set, assume will be restarted if interrupted, don't run post-crawl processes on interrupt [boolean] [default: false]
But that description seemed overly concise and left me with some questions (what exactly doesn’t happen? What do I need to account for?). A brief search of the code also makes it clear this option has some non-obvious implications for general behavior: it changes the 17 exit code to 0 and causes some things to throw instead of exiting more gracefully (which might also affect exit codes?).
It would be great if there were some more detail about what this option does and how to use it effectively, maybe as part of the docs on interrupting/restarting crawls. I’d be happy to take a stab at that if someone could fill me in on the details a bit more here.