What steps did you take and what happened:
When waiting for a CSI snapshot to complete, the CSI plugin checks for the SnapHandle every 5 seconds until csiSnapshotTimeout (default 10 minutes) is reached. This is a problem for workloads that use Microsoft VSS, because VSS unfreezes the filesystem after 10 seconds, and that limit is not configurable. If a workload has 2 volumes, the 5-second polling interval will almost always result in a forced unfreeze before the post hook runs, and likely before the last PVC's snapshot is done.
See the VSS doc here: https://learn.microsoft.com/en-us/windows/win32/vss/overview-of-processing-a-backup-under-vss
Note that the 10-second unfreeze is not configurable.
What did you expect to happen:
PVC backup to complete before FS unfreeze.
A forthcoming PR will refactor this to poll every second for the first 10 seconds, then fall back to the previous 5-second interval until the snapshot timeout is reached.
The following information will help us better understand what's going on:
If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle and attach it to this issue. For more options, refer to velero debug --help.
If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
velero backup logs <backupname>
velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
velero restore logs <restorename>
Anything else you would like to add:
Environment:
- Velero version (use velero version):
- Velero features (use velero client config get features):
- Kubernetes version (use kubectl version):
- Kubernetes installer & version:
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
Vote on this issue!
This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.
- 👍 for "I would like to see this bug fixed as soon as possible"
- 👎 for "There are more important bugs to focus on right now"