Summary
Provide a pluggable, persistent frontier and seen-set so long crawls can resume after crashes or restarts.
Motivation
- Reliability for multi-day jobs
- Lower revisit rates after restart
Scope
- Frontier interface remains the same
- Implement
badger or bolt backend
--state-dir config; resume <jobID> CLI
Acceptance Criteria
- Kill the process mid-crawl; on resume, crawl continues with minimal revisits
- State file size and write rate documented
Tasks