Description
Describe the bug: We use localpv with ext4 hard quotas. They work quite fine, but from time to time we hit a problem where the quota is exceeded even though the folder contains less than the configured quota (10GiB). Today I was able to track the problem down to 2 PVCs that obviously had the same project quota ID set:
```
/nvme/disk# ls
lost+found  pvc-2fabebc9-8143-4b60-beef-563180845e64  pvc-6d3a015a-c547-4292-9ed6-95b35a7aea41

/nvme/disk/pvc-6d3a015a-c547-4292-9ed6-95b35a7aea41# du -h --max-depth=1
4.2G    ./workspace
33M     ./remoting
8.0K    ./caches
4.3G    .

/nvme/disk# du -h --max-depth=1
6.1G    ./pvc-2fabebc9-8143-4b60-beef-563180845e64
16K     ./lost+found
4.3G    ./pvc-6d3a015a-c547-4292-9ed6-95b35a7aea41
11G     .

/nvme/disk# repquota -avugP
*** Report for project quotas on device /dev/md0
Block grace time: 7days; Inode grace time: 7days
                        Block limits                 File limits
Project         used     soft     hard  grace    used  soft  hard  grace
------------------------------------------------------------------------
#0        --        20        0        0             2     0     0
#1        --         0 10737419 10737419             0     0     0
#2        --         0 10737419 10737419             0     0     0
#3        --         0 10737419 10737419             0     0     0
#4        --  10737416 10737419 10737419          6122     0     0
#5        --         0 10737419 10737419             0     0     0
#6        --         0 10737419 10737419             0     0     0
```
I think the problem occurs because of a race condition when determining the project ID:
https://github.com/openebs/dynamic-localpv-provisioner/blob/e797585cb1e2c3578b914102bfe0e8768b04d950/cmd/provisioner-localpv/app/helper_hostpath.go#L294+L295
I see two possible workarounds: either make sure that only one create-quota pod can run at a time on a single node, or assign a random project number instead of incrementing the existing ones (see the sketch below).
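To illustrate the random-ID workaround, here is a minimal Go sketch. It is not the provisioner's actual code; `isProjectIDInUse`, the ID range, and the `repquota` parsing are assumptions made purely for illustration:

```go
package main

import (
	"fmt"
	"math/rand"
	"os/exec"
	"strings"
)

// isProjectIDInUse is a hypothetical helper: it runs `repquota -P <mountpoint>`
// and reports whether the given project ID already appears in the report.
// The real provisioner may track project IDs differently (see helper_hostpath.go).
func isProjectIDInUse(mountPoint string, id uint32) (bool, error) {
	out, err := exec.Command("repquota", "-P", mountPoint).Output()
	if err != nil {
		return false, err
	}
	needle := fmt.Sprintf("#%d ", id)
	for _, line := range strings.Split(string(out), "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), needle) {
			return true, nil
		}
	}
	return false, nil
}

// pickRandomProjectID draws random candidates instead of incrementing the
// highest existing ID, which makes it far less likely that two concurrent
// create-quota pods settle on the same number.
func pickRandomProjectID(mountPoint string, maxAttempts int) (uint32, error) {
	for i := 0; i < maxAttempts; i++ {
		// Project ID 0 is used for files without an assigned project, so start at 1.
		candidate := uint32(rand.Int31n(1<<20-1) + 1)
		inUse, err := isProjectIDInUse(mountPoint, candidate)
		if err != nil {
			return 0, err
		}
		if !inUse {
			return candidate, nil
		}
	}
	return 0, fmt.Errorf("no free project ID found after %d attempts", maxAttempts)
}

func main() {
	id, err := pickRandomProjectID("/nvme/disk", 10)
	if err != nil {
		panic(err)
	}
	fmt.Println("chosen project ID:", id)
}
```

Note that a random pick still leaves a small check-then-set window, so serializing quota creation per node (the first workaround) would be the more robust fix; randomizing mainly reduces the chance of a collision.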
Expected behaviour: each PVC gets its own project quota matching the size it is configured with.
Steps to reproduce the bug:
Unfortunately, the bug is really hard to reproduce, as it only happens now and then. During tests I scaled a deployment with a PVC up and down very quickly to check creation and cleanup and saw no problem. Maybe it can be reproduced with more than one deployment scaled up in parallel.
The output of the following commands will help us better understand what's going on:
```
kubectl get pods -n <openebs_namespace> --show-labels
nvme-provisioner-localpv-provisioner-68f8494cf7-84hdv   1/1   Running   80 (12h ago)   32d   app=localpv-provisioner,chart=localpv-provisioner-3.3.0,component=localpv-provisioner,heritage=Helm,name=openebs-localpv-provisioner,openebs.io/component-name=openebs-localpv-provisioner,openebs.io/version=3.3.0,pod-template-hash=68f8494cf7,release=nvme-provisioner
```
Anything else we need to know?:
The provisioner pod has lots of restarts. We don't know why, and there is no error in the pod log, but it does not seem to be related.
Environment details:
- OpenEBS version (use `kubectl get po -n openebs --show-labels`): 3.3.0
- Kubernetes version (use `kubectl version`): 1.23.15
- Cloud provider or hardware configuration: AWS
- OS (e.g. `cat /etc/os-release`): Amazon Linux 2
- Kernel (e.g. `uname -a`): 5.4.228-131.415.amzn2.x86_64