Velero Panic on backup - Observed a panic: sync: negative WaitGroup counter #8708

@Gui13

Description

What steps did you take and what happened:

During a Velero backup, we hit a panic (this is not the same issue as #8657):


time="2025-02-20T09:35:08Z" level=info msg="plugin process exited" backup-storage-location=velero/default cmd=/plugins/velero-plugin-for-microsoft-azure controller=backup-storage-location id=2068 logSource="pkg/plugin/clientmgmt/process/logrus_adapter.go:80" plugin=/plugins/velero-plugin-for-microsoft-azure
time="2025-02-20T09:35:15Z" level=info msg="The request has status 'InProgress', skip." controller=backup-deletion deletebackuprequest=velero/velero-braincube-20250120000012-gt6xm logSource="pkg/controller/backup_deletion_controller.go:145"
time="2025-02-20T09:35:27Z" level=error msg="pod volume backup failed: data path backup canceled: PVB is canceled" backup=velero/manual-velero-braincube-20250220 logSource="pkg/podvolume/backupper.go:382"
time="2025-02-20T09:35:27Z" level=error msg="pod volume backup failed: data path backup canceled: PVB is canceled" backup=velero/manual-velero-braincube-20250220 logSource="pkg/podvolume/backupper.go:382"
E0220 09:35:27.132115       1 runtime.go:77] Observed a panic: sync: negative WaitGroup counter
goroutine 757 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x26c8ce0, 0x314da30})
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:75 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00467f6c0?})
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:49 +0x6b
panic({0x26c8ce0?, 0x314da30?})
	/usr/local/go/src/runtime/panic.go:770 +0x132
sync.(*WaitGroup).Add(0xc007cec270?, 0x2c4e600?)
	/usr/local/go/src/sync/waitgroup.go:62 +0xd8
sync.(*WaitGroup).Done(...)
	/usr/local/go/src/sync/waitgroup.go:87
github.com/vmware-tanzu/velero/pkg/podvolume.newBackupper.func1({0x40a3d2?, 0xc007cf6060?}, {0x2c4e600?, 0xc00da55208?})
	/go/src/github.com/vmware-tanzu/velero/pkg/podvolume/backupper.go:167 +0x1eb
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate(...)
	/go/pkg/mod/k8s.io/client-go@v0.29.0/tools/cache/controller.go:246
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/pkg/mod/k8s.io/client-go@v0.29.0/tools/cache/shared_informer.go:970 +0xea
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:226 +0x33
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0030d9f70, {0x3156de0, 0xc008688c60}, 0x1, 0xc0037a0ba0)
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:227 +0xaf
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0006b7770, 0x3b9aca00, 0x0, 0x1, 0xc0037a0ba0)
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...)
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:161
k8s.io/client-go/tools/cache.(*processorListener).run(0xc007ce63f0)
	/go/pkg/mod/k8s.io/client-go@v0.29.0/tools/cache/shared_informer.go:966 +0x69
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/wait/wait.go:72 +0x52
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start in goroutine 379
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/wait/wait.go:70 +0x73
panic: sync: negative WaitGroup counter [recovered]
	panic: sync: negative WaitGroup counter
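
For context, this class of panic occurs whenever `Done()` is called more times than the counter was incremented with `Add()`. The following is a minimal sketch (not Velero code; the handler-fires-twice scenario is an assumption for illustration) showing how an informer event handler that calls `Done()` twice for a single tracked item drives the counter negative:

```go
package main

import (
	"fmt"
	"sync"
)

// demonstrateDoubleDone reproduces the "sync: negative WaitGroup counter"
// panic: one Add(1) paired with two Done() calls, as could happen if an
// update handler observes the same terminal state twice for one item.
func demonstrateDoubleDone() (msg string) {
	defer func() {
		// Recover so we can inspect the panic message.
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var wg sync.WaitGroup
	wg.Add(1)
	wg.Done() // counter: 1 -> 0
	wg.Done() // counter would go to -1: panics
	return ""
}

func main() {
	fmt.Println(demonstrateDoubleDone())
}
```

Running this prints the same panic message seen in the trace above, which suggests the update handler in `backupper.go:167` decremented the group for a PVB it had already accounted for (e.g. after the cancellation path also marked it done).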

What did you expect to happen:

No crash :-)

The following information will help us better understand what's going on:

I have collected a support bundle, but I would like to send it privately so as not to disclose too much information.

Anything else you would like to add:

Environment:

  • Velero version (use velero version): 1.15.2
  • Velero features (use velero client config get features): FSB (File System Backup) is used extensively
  • Kubernetes version (use kubectl version): 1.31
  • Kubernetes installer & version: AKS (Azure)
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): Azure Linux

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
