Skip to content
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 9 additions & 2 deletions cluster/scripts/node-backup.sh
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ source "${TOOLS_LIB}/libcli.source"
source "${SPLICE_ROOT}/cluster/scripts/utils.source"

RUN_ID=$(date +%s)
WAIT_FOR_BACKUP_RETRIES=450

##### PVC Backup & Restore

Expand Down Expand Up @@ -54,7 +55,10 @@ function wait_for_pvc_backup() {
_info "Backup of $description PVC ready!"
break
else
(( i++ )) && (( i > 300 )) && _error "Timed out waiting for backup of $description PVC"
(( i++ )) && (( i > WAIT_FOR_BACKUP_RETRIES )) && {
kubectl delete volumesnapshot -n "$namespace" "$backupName";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know we're allowed to delete here at all? If an operation is running then this will just fail too, won't it? But I guess that's fine because then we'll still retry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we'll still retry

Hm or will this script just die, because we're not capturing the error code of the kubectl delete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully it works and it gets deleted, but otherwise you're correct, it will exit 1 and retry

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm and assuming the delete never works... will we loop forever here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_error has an exit 1 inside, so either kubectl delete fails or _error exits

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it does... so fine then, not veto-ing if you feel confident about this.

_error "Timed out waiting for backup of $description PVC";
}
sleep 5
_info "still waiting..."
fi
Expand Down Expand Up @@ -148,7 +152,10 @@ function wait_for_cloudsql_backup() {
_info "Backup of $description ready! Backup ID: $id "
break
else
(( i++ ))&& (( i > 300 )) &&_error "Timed out waiting for backup of $description db"
(( i++ ))&& (( i > WAIT_FOR_BACKUP_RETRIES )) && {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this time out too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, but I'd rather mirror the behavior between both

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the consistency but sentences that contain both delete and backup always make scared and look more closely...

gcloud sql backups delete "$id" --instance "$db_id" --quiet;
_error "Timed out waiting for backup of $description db";
}
sleep 5
_info "still waiting..."
fi
Expand Down