Make Corso backups restartable #2853
Description
First backups can take a long time because of the amount of data to back up and throttling. Currently these backups get only minor benefits from kopia-assisted incrementals, leading to long runtimes if a backup fails partway through and needs to be restarted.
Right now, restarting a backup avoids reuploading previously uploaded data but still requires enumerating all items again. Since item enumeration can be time-consuming, this draws out backup time.
Corso should be enhanced to have a way to restart backups that fail partway through. The enhancements should take liveness into consideration (as an extreme example, ensuring Corso will make progress and eventually generate a backup even if it crashes every 5 minutes).
This ticket provides tracking information for the individual items that need to be completed to implement restartable backups.
Basic features
- fetch enough information to make a `details.ItemInfo` entry for an item during enumeration #2854
- persist enumerated item pages to kopia #2855
- feed persisted enumerated item pages back into GraphConnector #2856
- use persisted item pages to source items to look up during backup #2857
- recreate `details.ItemInfo` entries based on persisted item pages #2859
- define a state machine for Corso backups #2860
- find and reuse the most recent next or delta token from persisted item enumeration pages #2863