In this tutorial, you will learn how to deploy Velero to your Kubernetes cluster, create backups, and recover from a backup if something goes wrong. You can back up your entire cluster, or optionally choose a namespace or label selector to back up.
Backups can be run one-off or on a schedule. It’s a good idea to have scheduled backups, so you are certain you have a recent backup to easily fall back to. You can also create backup hooks, if you want to execute actions before or after a backup is made.
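As a quick taste of backup hooks, Velero reads them from pod annotations. The snippet below is an illustrative sketch only (the pod name, image, and data path are hypothetical); it would freeze a filesystem before the backup runs and unfreeze it afterwards:

```yaml
# Illustrative pod sketch - names and paths are placeholders, not from this tutorial.
apiVersion: v1
kind: Pod
metadata:
  name: sample-app
  annotations:
    # run before the backup of this pod starts
    pre.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--freeze", "/var/lib/app-data"]'
    # run after the backup of this pod completes
    post.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--unfreeze", "/var/lib/app-data"]'
spec:
  containers:
    - name: app
      image: nginx:stable
```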
Why choose Velero?
Velero gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes. You can run Velero with a cloud provider or on-premises.
Advantages of using Velero:
- Take full backups of your cluster and restore in case of data loss.
- Migrate from one cluster to another.
- Replicate your production cluster to development and testing clusters.
Velero consists of two parts:
- A server that runs on your cluster.
- A command-line client that runs locally.
Each Velero operation – on-demand backup, scheduled backup, restore – is a custom resource, defined with a Kubernetes Custom Resource Definition (CRD) and stored in etcd. Velero also includes controllers that process the custom resources to perform backups, restores, and all related operations.
Whenever you execute a backup command, the Velero CLI makes a call to the Kubernetes API server to create a Backup object. The Backup Controller is notified about the change, and inspects and validates the Backup object (i.e., whether it is a cluster backup, a namespace backup, etc.). Then, it makes a call to the Kubernetes API server to query the data to be backed up, and starts the backup process once it has collected all the data. Finally, the data is backed up to DigitalOcean Spaces storage as a tarball file (.tar.gz).
Similarly, whenever you execute a restore command, the Velero CLI makes a call to the Kubernetes API server to restore from a backup object. Based on the restore command executed, the Velero Restore Controller makes a call to DigitalOcean Spaces, and initiates the restore from the particular backup object.
Below is a diagram that shows the Backup/Restore workflow for Velero:
Velero is ideal for the disaster recovery use case, as well as for snapshotting your application state, prior to performing system operations on your cluster, like upgrades. For more details on this topic, please visit the How Velero Works official page.
After finishing this tutorial, you should be able to:
- Configure the DO Spaces storage backend for Velero to use.
- Backup and restore your applications.
- Backup and restore your entire DOKS cluster.
- Create scheduled backups for your applications.
- Create retention policies for your backups.
- Introduction
- Prerequisites
- Step 1 - Installing Velero using Helm
- Step 2 - Namespace Backup and Restore Example
- Step 3 - Backup and Restore Whole Cluster Example
- Step 4 - Scheduled Backups
- Step 5 - Deleting Backups
- Conclusion
To complete this tutorial, you need the following:
- A DO Spaces bucket and access keys. Save the access and secret keys in a safe place for later use.
- A DigitalOcean API token for Velero to use.
- A Git client, to clone the Starter Kit repository.
- Helm, for managing Velero releases and upgrades.
- Doctl, for DigitalOcean API interaction.
- Kubectl, for Kubernetes interaction.
- Velero client, to manage Velero backups.
In this step, you will deploy Velero and all the required components, so that it will be able to perform backups of your Kubernetes cluster resources (including PVs). Backup data will be stored in the DO Spaces bucket created earlier in the Prerequisites section.
Steps to follow:
- First, clone the Starter Kit Git repository and change directory to your local copy:

  git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
  cd Kubernetes-Starter-Kit-Developers

- Next, add the Helm repository and list the available charts:

  helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
  helm repo update vmware-tanzu
  helm search repo vmware-tanzu
The output looks similar to the following:
NAME                  CHART VERSION   APP VERSION   DESCRIPTION
vmware-tanzu/velero   2.29.7          1.8.1         A Helm chart for velero

Note: The chart of interest is vmware-tanzu/velero, which will install Velero on the cluster. Please visit the velero-chart page for more details about this chart.
- Then, open and inspect the Velero Helm values file provided in the Starter Kit repository, using an editor of your choice (preferably with YAML lint support). You can use VS Code, for example:

  VELERO_CHART_VERSION="2.29.7"
  code 05-setup-backup-restore/assets/manifests/velero-values-v${VELERO_CHART_VERSION}.yaml

- Next, please replace the <> placeholders accordingly for your DO Spaces Velero bucket (name, region and secrets). Make sure that you provide your DigitalOcean API token as well (the DIGITALOCEAN_TOKEN key).

- Finally, install Velero using Helm:

  VELERO_CHART_VERSION="2.29.7"
  helm install velero vmware-tanzu/velero --version "${VELERO_CHART_VERSION}" \
    --namespace velero \
    --create-namespace \
    -f 05-setup-backup-restore/assets/manifests/velero-values-v${VELERO_CHART_VERSION}.yaml
Note: A specific version of the Velero Helm chart is used. In this case 2.29.7 is picked, which maps to version 1.8.1 of the application (see the output of the helm search command above). In general, it's good practice to lock to a specific chart version. This helps to have predictable results, and allows version control via Git.
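For orientation, the backup storage portion of the Helm values file typically looks similar to the sketch below. This is a hedged sketch for the 2.x chart layout only - the exact keys depend on the chart version, and the `<>` values are placeholders that must stay out of Git:

```yaml
# Hedged sketch of the relevant velero-values section (vmware-tanzu/velero 2.x layout).
# All <...> values are placeholders - never commit real secrets.
configuration:
  provider: aws                      # DO Spaces is S3-compatible, so the aws provider is used
  backupStorageLocation:
    bucket: <YOUR_DO_SPACES_BUCKET_NAME>
    config:
      region: <YOUR_DO_SPACES_REGION>
      s3Url: https://<YOUR_DO_SPACES_REGION>.digitaloceanspaces.com
```

Consult the values file shipped in the Starter Kit repository for the authoritative set of keys, including the credentials and plugin sections.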
Now, please check your Velero deployment:
helm ls -n velero

The output looks similar to the following (the STATUS column should display deployed):
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
velero velero 1 2022-06-09 08:38:24.868664 +0300 EEST deployed velero-2.29.7 1.8.1
Next, verify that Velero is up and running:
kubectl get deployment velero -n velero

The output looks similar to the following (deployment pods must be in the Ready state):
NAME READY UP-TO-DATE AVAILABLE AGE
velero 1/1 1 1 67s
Notes:
- If you’re interested in looking further, you can view Velero’s server-side components:
  kubectl -n velero get all

- Explore the Velero CLI help pages, to see what commands and sub-commands are available. You can get help for each by using the --help flag.

  List all the available commands for Velero:

  velero --help

  List backup command options for Velero:

  velero backup --help
Velero uses a number of CRDs (Custom Resource Definitions) to represent its own resources, like backups and backup schedules. You'll discover each in the next steps of the tutorial, along with some basic examples.
In this step, you will learn how to perform a one-time backup of an entire namespace from your DOKS cluster, and restore it afterwards, making sure that all the resources are re-created. The namespace in question is ambassador.
Next, you will perform the following tasks:
- Create the ambassador namespace backup, using the Velero CLI.
- Delete the ambassador namespace.
- Restore the ambassador namespace, using the Velero CLI.
- Check the ambassador namespace restoration.
First, initiate the backup:
velero backup create ambassador-backup --include-namespaces ambassador

Next, check that the backup was created:

velero backup get

The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
ambassador-backup Completed 0 0 2021-08-25 19:33:03 +0300 EEST 29d default <none>
Then, after a few moments, you can inspect it:
velero backup describe ambassador-backup --details

The output looks similar to:
Name: ambassador-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: ambassador
Excluded: <none>
...
Hints:
- Look for the Phase line. It should say Completed.
- Check that no Errors are reported as well.
- A new Kubernetes Backup object is created:

  kubectl get backup/ambassador-backup -n velero -o yaml

  apiVersion: velero.io/v1
  kind: Backup
  metadata:
    annotations:
      velero.io/source-cluster-k8s-gitversion: v1.21.2
      velero.io/source-cluster-k8s-major-version: "1"
      velero.io/source-cluster-k8s-minor-version: "21"
  ...
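The Phase and Errors check from the hints above can be scripted. The sketch below defines a small helper (the function name is hypothetical, not part of the Velero CLI) that pulls the Phase value out of describe output; it is fed a captured sample here so it runs standalone, but in practice you would pipe `velero backup describe ambassador-backup` into it:

```shell
# Hypothetical helper: extract the Phase value from `velero backup describe` output.
get_backup_phase() {
  # reads describe output on stdin, prints the value after "Phase:"
  awk -F': *' '/^Phase:/ {print $2; exit}'
}

# Offline illustration using captured describe output
# (live usage: velero backup describe ambassador-backup | get_backup_phase)
printf 'Name: ambassador-backup\nPhase: Completed\nErrors: 0\n' | get_backup_phase
# prints: Completed
```

The same pattern works for `velero restore describe` output, since it prints a Phase line in the same format.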
Finally, take a look at the DO Spaces bucket - there's a new folder named backups, which contains the assets created for your ambassador-backup:
First, simulate a disaster, by intentionally deleting the ambassador namespace:
kubectl delete namespace ambassador

Next, check that the namespace was deleted (the namespaces listing should not print ambassador):

kubectl get namespaces

Finally, verify that the echo and quote backend services are DOWN (please refer to Creating the Ambassador Edge Stack Backend Services, regarding the backend applications used in the Starter Kit tutorial). You can use curl to test (or you can use your web browser):

curl -Li http://quote.starter-kit.online/quote/
curl -Li http://echo.starter-kit.online/echo/

First, restore the ambassador-backup:

velero restore create --from-backup ambassador-backup

Important note:
When you delete the ambassador namespace, the load balancer resource associated with the ambassador service is deleted as well. So, when you restore the ambassador service, the LB is recreated by DigitalOcean. The issue is that you will get a NEW IP address for your LB, so you will need to update the DNS A records that route traffic to your domains hosted on the cluster.
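One way to find the new external IP after the restore is to query the ambassador service. The live command in the comment assumes the service name used in this tutorial; the runnable part below parses a captured service line instead, so the sketch works standalone:

```shell
# Live usage against the cluster (assumes the ambassador service from this tutorial):
#   kubectl get svc ambassador -n ambassador \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# Offline illustration: in plain `kubectl get all` output, the external IP
# of a LoadBalancer service is the 4th whitespace-separated column.
svc_line='service/ambassador   LoadBalancer   10.245.74.214   159.89.215.200   80:32091/TCP,443:31423/TCP   18h'
echo "$svc_line" | awk '{print $4}'
# prints: 159.89.215.200
```

Once you have the new IP, update your DigitalOcean DNS A records to point at it.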
First, check the Phase line from the ambassador-backup restore command output. It should say Completed (also, please take note of the Warnings section - it tells you if anything went wrong):

velero restore describe ambassador-backup

Next, verify that all the resources were restored for the ambassador namespace (look for the ambassador pods, services and deployments):

kubectl get all --namespace ambassador

The output looks similar to:
NAME READY STATUS RESTARTS AGE
pod/ambassador-5bdc64f9f6-9qnz6 1/1 Running 0 18h
pod/ambassador-5bdc64f9f6-twgxb 1/1 Running 0 18h
pod/ambassador-agent-bcdd8ccc8-8pcwg 1/1 Running 0 18h
pod/ambassador-redis-64b7c668b9-jzxb5 1/1 Running 0 18h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ambassador LoadBalancer 10.245.74.214 159.89.215.200 80:32091/TCP,443:31423/TCP 18h
service/ambassador-admin ClusterIP 10.245.204.189 <none> 8877/TCP,8005/TCP 18h
service/ambassador-redis ClusterIP 10.245.180.25 <none> 6379/TCP 18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ambassador 2/2 2 2 18h
deployment.apps/ambassador-agent 1/1 1 1 18h
deployment.apps/ambassador-redis 1/1 1 1 18h
NAME DESIRED CURRENT READY AGE
replicaset.apps/ambassador-5bdc64f9f6 2 2 2 18h
replicaset.apps/ambassador-agent-bcdd8ccc8 1 1 1 18h
replicaset.apps/ambassador-redis-64b7c668b9 1 1 1 18h
Ambassador Hosts:
kubectl get hosts -n ambassador

The output looks similar to the following (STATE should be Ready, and the HOSTNAME column should point to the fully qualified host name):
NAME HOSTNAME STATE PHASE COMPLETED PHASE PENDING AGE
echo-host echo.starter-kit.online Ready 11m
quote-host quote.starter-kit.online Ready 11m
Ambassador Mappings:
kubectl get mappings -n ambassador

The output looks similar to the following (notice the echo-backend line, which is mapped to the echo.starter-kit.online host and the /echo/ source prefix; same for quote-backend):
NAME SOURCE HOST SOURCE PREFIX DEST SERVICE STATE REASON
ambassador-devportal /documentation/ 127.0.0.1:8500
ambassador-devportal-api /openapi/ 127.0.0.1:8500
ambassador-devportal-assets /documentation/(assets|styles)/(.*)(.css) 127.0.0.1:8500
ambassador-devportal-demo /docs/ 127.0.0.1:8500
echo-backend echo.starter-kit.online /echo/ echo.backend
quote-backend quote.starter-kit.online /quote/ quote.backend
Finally, after reconfiguring your Load Balancer and DigitalOcean domain settings, please verify that the echo and quote backend services are UP (please refer to Creating the Ambassador Edge Stack Backend Services). For example, you can use curl to test each endpoint:
curl -Li https://quote.starter-kit.online/quote/
curl -Li https://echo.starter-kit.online/echo/

In the next step, you will simulate a disaster by intentionally deleting your DOKS cluster (the Starter Kit DOKS cluster).
In this step, you will simulate a disaster recovery scenario. The whole DOKS cluster will be deleted, and then restored from a previous backup.
Next, you will perform the following tasks:
- Create the DOKS cluster backup, using the Velero CLI.
- Delete the DOKS cluster, using doctl.
- Restore the important DOKS cluster applications, using the Velero CLI.
- Check the DOKS cluster applications state.
First, create a backup for the whole DOKS cluster:
velero backup create all-cluster-backup

Next, check that the backup was created and that it's not reporting any errors. The following command lists all the available backups:

velero backup get

The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
all-cluster-backup Completed 0 0 2021-08-25 19:43:03 +0300 EEST 29d default <none>
Finally, inspect the backup state and logs (check that no errors are reported):
velero backup describe all-cluster-backup
velero backup logs all-cluster-backup

An important aspect to keep in mind: whenever you destroy a DOKS cluster without specifying the --dangerous flag to the doctl command and then restore it, the same Load Balancer with the same IP is re-created, so you don't need to update your DigitalOcean DNS A records. When the --dangerous flag is supplied, the existing Load Balancer is destroyed, and a new Load Balancer with a new external IP is created when Velero restores your ingress controller. In that case, please make sure to update your DigitalOcean DNS A records accordingly.
First, delete the whole DOKS cluster (make sure to replace the <> placeholders accordingly):
doctl kubernetes cluster delete <DOKS_CLUSTER_NAME>

The above deletes the Kubernetes cluster without destroying the associated Load Balancer. Alternatively:

doctl kubernetes cluster delete <DOKS_CLUSTER_NAME> --dangerous

This deletes the Kubernetes cluster and also destroys the associated Load Balancer.
Next, re-create the cluster, as described in Section 1 - Set up DigitalOcean Kubernetes. Please make sure the new DOKS cluster node count is equal to or greater than the original one - this is important!
Then, install Velero CLI and Server, as described in the Prerequisites section, and Step 1 - Installing Velero using Helm respectively. Please make sure to use the same Helm Chart version - this is important!
Finally, restore everything by using the command below:

velero restore create --from-backup all-cluster-backup

First, check the Phase line from the all-cluster-backup restore describe command output (please replace the <> placeholders accordingly). It should say Completed (also, please take note of the Warnings section - it tells you if anything went wrong):
velero restore describe all-cluster-backup-<timestamp>
Now, verify all cluster Kubernetes resources (you should have everything in place):
kubectl get all --all-namespaces

Finally, the backend applications should respond to HTTP requests as well (please refer to Creating the Ambassador Edge Stack Backend Services, regarding the backend applications used in the Starter Kit tutorial):
curl -Li http://quote.starter-kit.online/quote/
curl -Li http://echo.starter-kit.online/echo/

In the next step, you will learn how to perform scheduled (or automatic) backups for your DOKS cluster applications.
Taking backups automatically based on a schedule is a really useful feature to have. It allows you to rewind time, and restore the system to a previous working state if something goes wrong.
Creating a scheduled backup is a very straightforward process. An example is provided below for a one-minute interval (the kube-system namespace was picked).
First, create the schedule:
velero schedule create kube-system-minute-backup --schedule="@every 1m" --include-namespaces kube-system

Hint: The Linux cron format is supported as well:

schedule="*/1 * * * *"
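Before creating a schedule, you can sanity-check the expression locally. The helper below is a rough, hypothetical check (not a full cron parser) that accepts either the @every duration shorthand or a 5-field cron string:

```shell
# Hypothetical sanity check for Velero schedule expressions (not a full parser).
valid_schedule() {
  case "$1" in
    '@every '*[smh]) return 0 ;;   # duration shorthand, e.g. @every 1m
  esac
  # crude 5-field cron check: five space-separated fields of digits and * / , -
  echo "$1" | grep -Eq '^([0-9*/,-]+ ){4}[0-9*/,-]+$'
}

valid_schedule '@every 1m'    && echo 'interval form ok'
valid_schedule '*/1 * * * *'  && echo 'cron form ok'
valid_schedule 'every minute' || echo 'rejected'
```

A check like this is only a guard against obvious typos; Velero itself validates the schedule when the Schedule resource is created.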
Next, verify that the schedule was created:
velero schedule get

The output looks similar to:
NAME STATUS CREATED SCHEDULE BACKUP TTL LAST BACKUP SELECTOR
kube-system-minute-backup Enabled 2021-08-26 12:37:44 +0300 EEST @every 1m 720h0m0s 32s ago <none>
Then, inspect all the backups, after one minute or so:
velero backup get

The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
kube-system-minute-backup-20210826093916 Completed 0 0 2021-08-26 12:39:20 +0300 EEST 29d default <none>
kube-system-minute-backup-20210826093744 Completed 0 0 2021-08-26 12:37:44 +0300 EEST 29d default <none>
First, check the Phase line from one of the backups (please replace the <> placeholders accordingly) - it should say Completed:
velero backup describe kube-system-minute-backup-<timestamp>

Finally, take note of possible Errors and Warnings in the above command output as well - they tell you if anything went wrong.
To restore one of the minute backups, please follow the same steps as in the previous parts of this tutorial. This is a good way to exercise and test the knowledge you have accumulated so far.
In the next step, you will learn how to manually or automatically delete specific backups you created over time.
When you decide that some older backups are no longer needed, you can free up resources both on the Kubernetes cluster and in the Velero DO Spaces bucket.
In this step, you will learn how to use one of the following methods to delete Velero backups:
- Manually (by hand), using the Velero CLI.
- Automatically, by setting a backup TTL (Time To Live), via the Velero CLI.
First, pick one of the minute backups, for example, and issue the following command (please replace the <> placeholders accordingly):

velero backup delete kube-system-minute-backup-<timestamp>

Now, check that it's gone from the velero backup get command output. It should be deleted from the DO Spaces bucket as well.
Next, you will learn how to delete multiple backups at once, by using a selector. The velero backup delete subcommand provides a flag called --selector. It allows you to delete multiple backups at once based on Kubernetes Labels. The same rules apply as for Kubernetes Label Selectors.
First, list the available backups:
velero backup get

The output looks similar to:
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
ambassador-backup Completed 0 0 2021-08-25 19:33:03 +0300 EEST 23d default <none>
backend-minute-backup-20210826094116 Completed 0 0 2021-08-26 12:41:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826094016 Completed 0 0 2021-08-26 12:40:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093916 Completed 0 0 2021-08-26 12:39:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093816 Completed 0 0 2021-08-26 12:38:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093716 Completed 0 0 2021-08-26 12:37:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093616 Completed 0 0 2021-08-26 12:36:16 +0300 EEST 24d default <none>
backend-minute-backup-20210826093509 Completed 0 0 2021-08-26 12:35:09 +0300 EEST 24d default <none>
Next, say that you want to delete all the backend-minute-backup-* assets. Pick a backup from the list, and inspect the Labels:
velero backup describe backend-minute-backup-20210826094116

The output looks similar to the following (notice the velero.io/schedule-name label value):
Name: backend-minute-backup-20210826094116
Namespace: velero
Labels: velero.io/schedule-name=backend-minute-backup
velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: backend
Excluded: <none>
...
Next, you can delete all the backups that match the backend-minute-backup value of the velero.io/schedule-name label:
velero backup delete --selector velero.io/schedule-name=backend-minute-backup

Finally, check that all the backend-minute-backup-* assets disappeared from the velero backup get command output, as well as from the DO Spaces bucket.
When you create a backup, you can specify a TTL (Time To Live), by using the --ttl flag. If Velero sees that an existing backup resource is expired, it removes:
- The Backup resource
- The backup file from cloud object storage
- All PersistentVolume snapshots
- All associated Restores
The TTL flag allows the user to specify the backup retention period with the value specified in hours, minutes and seconds in the form --ttl 24h0m0s. If not specified, a default TTL value of 30 days will be applied.
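When scripting retention checks, it can help to convert the TTL string into seconds. This is a hedged sketch with a hypothetical helper name; it assumes the HhMmSs form that the --ttl flag uses:

```shell
# Hypothetical helper: convert a --ttl value like 24h0m0s into seconds.
ttl_to_seconds() {
  local h rest m s
  h=${1%%h*}            # hours: everything before 'h'
  rest=${1#*h}
  m=${rest%%m*}         # minutes: everything before 'm'
  s=${rest#*m}; s=${s%s} # seconds: strip the trailing 's'
  echo $(( h * 3600 + m * 60 + s ))
}

ttl_to_seconds 0h3m0s    # prints 180
ttl_to_seconds 24h0m0s   # prints 86400
```

For example, the default 30-day retention corresponds to 720h0m0s, i.e. 2592000 seconds.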
Next, you will create a short-lived backup for the ambassador namespace, with a TTL value set to 3 minutes.
First, create the ambassador backup, using a TTL value of 3 minutes:
velero backup create ambassador-backup-3min-ttl --ttl 0h3m0s --include-namespaces ambassador

Next, inspect the ambassador backup:

velero backup describe ambassador-backup-3min-ttl

The output looks similar to the following (notice the Namespaces -> Included section - it should display ambassador, and the TTL field is set to 3m0s):
Name: ambassador-backup-3min-ttl
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: ambassador
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 3m0s
...
A new folder should be created in the DO Spaces Velero bucket as well, named ambassador-backup-3min-ttl.
Finally, after three minutes or so, the backup and associated resources should be automatically deleted. You can verify that the backup object was destroyed using velero backup describe ambassador-backup-3min-ttl. It should fail with an error stating that the backup doesn't exist anymore. The corresponding ambassador-backup-3min-ttl folder in the DO Spaces Velero bucket should be gone as well.
Going further, you can explore all the available velero backup delete options, via:
velero backup delete --help

In this tutorial, you learned how to perform one-time as well as scheduled backups, and how to restore everything back. Having scheduled backups in place is very important, as it allows you to revert to a previous point in time if something goes wrong along the way. You also walked through a disaster recovery scenario.
You can learn more about Velero by exploring the topics below:
- Backup Command Reference
- Restore Command Reference
- Backup Hooks
- Cluster Migration
- Velero Troubleshooting
Next, you will learn how to handle Kubernetes Secrets using the Sealed Secrets Controller or External Secrets Operator.

