What happened?
We are using vCluster OSS and want to be able to back up and restore vclusters properly. We also use Rancher as our k8s management tool.
Past tests with Velero failed because of the Rancher integration.
The snapshot runs fine and we get a file in our S3 target. During the snapshot we see the VolumeSnapshots in the host cluster.
When we restore a snapshot without deleting the deployments or PVCs inside the vcluster, the vcluster comes up, but the old data is not restored; the workload keeps using the existing PVC.
When we delete the deployment and PVCs in the vcluster and then restore, the vcluster comes up, but the PVCs are stuck in the "Pending" state.
Error shown when describing an affected PVC:
Warning SyncError 7m51s persistent-volume-claim-syncer Error syncing to host cluster: update object status: persistentvolumeclaims "mysql-pv-claim-x-test-x-kw-test123" is forbidden: User "system:serviceaccount:kw-test123:vc-kw-test123" cannot update resource "persistentvolumeclaims/status" in API group "" in the namespace "kw-test123"
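To confirm the missing permission independently of the syncer, the host cluster API server can be asked directly. This is only a minimal sketch: it assumes kubectl access to the host cluster with impersonation rights. The namespace and service account are copied from the error above; the variable and function names are illustrative.

```shell
# Values copied from the SyncError event above.
NAMESPACE="kw-test123"
SERVICE_ACCOUNT="vc-kw-test123"

# Identity of the vcluster service account, in the form the error message uses.
SA_USER="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"

# Ask the host cluster whether this identity may update the PVC status subresource.
# Prints "yes" or "no"; needs impersonation rights on the host cluster.
check_pvc_status_access() {
  kubectl auth can-i update persistentvolumeclaims \
    --subresource=status \
    --as="${SA_USER}" \
    --namespace="${NAMESPACE}"
}

# Run the check only when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  check_pvc_status_access || echo "service account is not allowed (or the check failed)"
fi
```

A "no" here would match the SyncError and point to the Role bound to the vcluster service account in the host namespace rather than to the snapshot archive itself.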
What did you expect to happen?
The PVCs are restored successfully, including their data.
How can we reproduce it (as minimally and precisely as possible)?
- Create a vcluster
- Create a stateful deployment (with the ceph-rbd CSI driver StorageClass synced from the host cluster)
- Run vcluster snapshot create --include-volumes
- Verify the snapshot archive is written to S3
- Delete the workload and its PVCs
- Run vcluster snapshot restore
- Observe that PVCs are created but remain Pending with no dataSource
Anything else we need to know?
- Host-Cluster OS is Talos OS
- Rancher Environment
- Ceph for CSI and S3-Endpoint
Host cluster Kubernetes version
$ kubectl version
Server Version: v1.33.4
vcluster version
$ vcluster --version
vcluster version 0.34.0
VCluster Config
sync:
  toHost:
    ingresses:
      enabled: true
    secrets:
      enabled: true
      all: true
  fromHost:
    nodes:
      enabled: true
    storageClasses:
      enabled: true
controlPlane:
  backingStore:
    etcd:
      deploy:
        enabled: true
        statefulSet:
          highAvailability:
            replicas: 3
          resources:
            requests:
              cpu: 20m
              memory: 150Mi
  service:
    annotations:
      "loft.sh/uninstall-on-cluster-delete": "true"
  statefulSet:
    highAvailability:
      replicas: 3
    security:
      podSecurityContext:
        fsGroup: 65532
      containerSecurityContext:
        runAsUser: 65532
        runAsNonRoot: true