Fix backups for tasks using a mounted file share #9972
Conversation
Force-pushed from 62343ef to c25755a
Codecov Report

Additional details and impacted files:

@@ Coverage Diff @@
## develop #9972 +/- ##
===========================================
+ Coverage 73.88% 76.12% +2.23%
===========================================
Files 428 428
Lines 46257 46247 -10
Branches 4139 4139
===========================================
+ Hits 34179 35204 +1025
+ Misses 12078 11043 -1035
Force-pushed from 35affd4 to 8abde7d
zhiltsov-max left a comment:
Could you check if this PR closes any other issues from the list? I can see there are 4 similar issues.
Wow, apparently everyone on the team encountered this at some point. 🤪 Updated the description.
Force-pushed from 8abde7d to 60ad3f3
…rame range (#10004)

Let's say we have a cloud storage-based task that was created with 6 input images, and the following settings: start frame 1, stop frame 4, frame step 2. A heavyweight backup of such a task will then contain only frames 1 and 3 in the `data` directory; however, the manifest will still contain all 6 frames. If you try to restore such a backup, CVAT will fail, because it checks that the files in the `data` directory correspond 1:1 to the manifest.

This could potentially be fixed in the restore code; however, it seems to me that the backups created in this case are incorrect, as they have manifests referencing nonexistent files. As such, I think it's more appropriate to fix this in the backup code.

The fix is to filter the manifest during backup, leaving only entries corresponding to frames that actually get backed up. We also need to reset the frame range to the default, so that it matches the filtered manifest.

Note that the same bug also affects backups of tasks based on the mounted share. It should be reasonably easy to fix (just use `_write_filtered_media_manifest` in the `StorageChoice.SHARE` branch), but I cannot test such a fix, because share backups are currently broken entirely. So I will defer this fix until #9972.

This bug does _not_ affect backups of tasks based on local files, because those currently include _all_ frames, even those that fall outside that frame range. I think that behavior is probably wrong, but it doesn't require immediate fixing.
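The frame selection and manifest filtering described in this commit message can be sketched as follows. The helper names and the manifest shape here are illustrative assumptions, not CVAT's actual internals; `filter_manifest_entries` is a stand-in for what `_write_filtered_media_manifest` does.

```python
# Illustrative sketch of the manifest filtering described above.
# Helper names and the manifest shape are assumptions, not CVAT's real internals.

def backed_up_frames(start_frame: int, stop_frame: int, frame_step: int) -> list[int]:
    """Frame indices that a heavyweight backup actually copies into `data`."""
    return list(range(start_frame, stop_frame + 1, frame_step))

def filter_manifest_entries(
    entries: list[dict], start: int, stop: int, step: int
) -> list[dict]:
    """Keep only the manifest entries for frames that get backed up."""
    kept = set(backed_up_frames(start, stop, step))
    return [entry for index, entry in enumerate(entries) if index in kept]

# The example from the commit message: 6 images, start 1, stop 4, step 2.
entries = [{"name": f"image_{i}.png"} for i in range(6)]
filtered = filter_manifest_entries(entries, start=1, stop=4, step=2)
print([e["name"] for e in filtered])  # only frames 1 and 3 remain
```

With the manifest filtered this way, resetting the frame range to the default keeps it consistent with the new, smaller manifest.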
Force-pushed from 60ad3f3 to 871dc7b
Currently, backing up such a task succeeds, but restoring fails with "No such file or directory: <root>/share/manifest.jsonl".

IMO, the problem is not really in the restore logic, but in the backup logic. Share task backups are "heavyweight", in that they include a copy of the media files. But the `storage` field in `task.json` is still set to "share". As a result, CVAT tries to import it as a share task, and that logic is broken.

I think we should handle such backups consistently with heavyweight cloud storage backups, and set `storage` to "local". The restore logic will then work perfectly well. In the future, if we happen to implement lightweight backups for share tasks, we can use `storage: "share"` for those.
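The proposed change amounts to rewriting the storage type when the backup is heavyweight. A minimal sketch, assuming `task.json` keeps the storage type under a top-level `storage` key (the actual CVAT schema may differ):

```python
import json

def normalize_backup_storage(task_json_text: str) -> str:
    """Rewrite a share task's storage type for a heavyweight backup.

    Since the backup bundles copies of the media files, restore should
    treat it like a local-storage task rather than a share task.
    (The top-level "storage" key is an assumption for illustration.)
    """
    task = json.loads(task_json_text)
    if task.get("storage") == "share":
        task["storage"] = "local"
    return json.dumps(task)

print(normalize_backup_storage('{"name": "my task", "storage": "share"}'))
# {"name": "my task", "storage": "local"}
```

Doing this at backup time means the restore path needs no special handling for share tasks at all.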
There doesn't seem to be much purpose to this check - chunks are not even used for backups, and backups of such tasks work fine with the check removed.
This is the same issue as in cvat-ai#10004, but for tasks with share storage. Use the same fix.
This code tries to fix up backups of tasks with share storage. As of the previous commit, we now do this while creating the backup instead. The code is safe to delete, because importing a task with `storage="share"` currently doesn't work anyway.
Force-pushed from 871dc7b to f85319b
We should start using this filtering for local tasks as well at some point; maybe it makes sense to deprecate the inclusion of extra files in the backup manifest for them.



Fixes #4989
Fixes #6149
Fixes #7965
Fixes #8297
Motivation and context
There are three problems here.
First, CVAT refuses to back up such tasks if they use static chunks. There doesn't seem to be much purpose for this; the backup process doesn't even use chunks. With the check removed, the backups work fine.
Second, even when backing up succeeds, restoring fails with "No such file or directory: <root>/share/manifest.jsonl".
IMO, the problem here is not really in the restore logic, but in the backup logic. Share task backups are "heavyweight", in that they include a copy of the media files. But the `storage` field in `task.json` is still set to "share". As a result, CVAT tries to import it as a share task, and that logic is broken. I think we should handle such backups consistently with heavyweight cloud storage backups, and set `storage` to "local". The restore logic will then work perfectly well. In the future, if we happen to implement lightweight backups for share tasks, we can use `storage: "share"` for those.
Third, if a task has images and a custom frame range (start frame/stop frame/step), then importing the backup also doesn't work, for reasons described in #10004. We can fix this using the `_write_filtered_media_manifest` function introduced in that PR.
This PR also removes logic from `create_thread` that tries to fix up backups that have `storage="share"`. After this patch, such backups no longer need fixing up (and before the patch, they couldn't be imported anyway).
How has this been tested?
Unit tests.
Checklist
- I submit my changes into the `develop` branch
- [ ] I have updated the documentation accordingly

License

Feel free to contact the maintainers if that's a concern.