flock/unlink issue #4694

@jkroonza

Hi,

We ran into some problems with a distributed lock mechanism built on flock() on top of glusterfs. The basic lock strategy is the usual:

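while (true) {
  fd = open(filename, O_CREAT|O_RDWR, 0600);
  flock(fd, LOCK_EX);
  /* the lock only counts if filename still refers to the file we locked */
  if (stat(filename, &st_name) == 0 && fstat(fd, &st_fd) == 0 &&
      st_name.st_dev == st_fd.st_dev && st_name.st_ino == st_fd.st_ino)
     break; /* success */
  close(fd);
}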

And for unlocking:

unlink(filename);
close(fd);
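
For anyone wanting to reproduce this without the attached benchmark, the lock and unlock steps above can be sketched as compilable C. This is only a minimal sketch of the strategy as described, not the code from flock_bench.c; the helper names lockfile_acquire and lockfile_release and the 0600 mode are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns an fd holding an exclusive flock on `path`,
 * or -1 if open() or flock() fails. */
static int lockfile_acquire(const char *path)
{
    for (;;) {
        int fd = open(path, O_CREAT | O_RDWR, 0600); /* mode is an assumption */
        if (fd < 0)
            return -1;

        if (flock(fd, LOCK_EX) < 0) {
            close(fd);
            return -1;
        }

        /* Compare the inode behind the name with the inode behind the fd. */
        struct stat by_name, by_fd;
        if (stat(path, &by_name) == 0 && fstat(fd, &by_fd) == 0 &&
            by_name.st_dev == by_fd.st_dev &&
            by_name.st_ino == by_fd.st_ino)
            return fd; /* path still names the file we locked */

        /* The previous holder unlinked the file while we waited: retry. */
        close(fd);
    }
}

/* Hypothetical helper: unlink first, then close (which drops the lock). */
static int lockfile_release(const char *path, int fd)
{
    int rc = unlink(path);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

The unlink happens before the close so that any waiter that wakes up with a lock on the old inode fails the stat/fstat comparison and retries with a fresh file.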

We've bumped into a few cases where this blatantly fails, so we wrote some code to help with troubleshooting the problem.

It seems everything works just fine as long as we don't issue the unlink() call. I'm not sure where things go wrong. This is the very basic volume that I'm using:

gluster volume create bench replica 2 192.168.255.255:/mnt/b/{a,b} force
gluster volume start bench
mount -t glusterfs localhost:bench /mnt/m

# gluster volume info bench 
 
Volume Name: bench
Type: Replicate
Volume ID: 705eb5a1-19f9-46a1-aabd-bbde8c12d0b8
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.255.255:/mnt/b/a
Brick2: 192.168.255.255:/mnt/b/b
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

For reference, local ext4 fs:

$ ./flock_bench -d fl -T 10 -t 100 -u
Stats:
                  Attempts:         5675150 (100.00 %)
       Attempts Crosscheck:         5675150 (100.00 %)
                  Obtained:          674103 ( 11.88 %)
              Full Success:          674103 ( 11.88 %)
               Open failed:               0 (  0.00 %)
     flock(LOCK_EX) failed:               0 (  0.00 %)
flock(LOCK_EX) would block:               0 (  0.00 %)
          stat failed lock:         1713814 ( 30.20 %)
        stat failed unlock:               0 (  0.00 %)
              fstat failed:               0 (  0.00 %)
  lock failed (wrong file):         3287233 ( 57.92 %)
  file changed during lock:               0 (  0.00 %)
             unlink failed:               0 (  0.00 %)
     flock(LOCK_UN) failed:               0 (  0.00 %)

Via single fuse mount:

$ ./flock_bench -d /mnt/m/ -T 10 -t 100 -u
Stats:
                  Attempts:           50381 (100.00 %)
       Attempts Crosscheck:           50381 (100.00 %)
                  Obtained:            2468 (  4.90 %)
              Full Success:            2468 (  4.90 %)
               Open failed:               0 (  0.00 %)
     flock(LOCK_EX) failed:               0 (  0.00 %)
flock(LOCK_EX) would block:               0 (  0.00 %)
          stat failed lock:            1369 (  2.72 %)
        stat failed unlock:               0 (  0.00 %)
              fstat failed:               0 (  0.00 %)
  lock failed (wrong file):           46544 ( 92.38 %)
  file changed during lock:               0 (  0.00 %)
             unlink failed:               0 (  0.00 %)
     flock(LOCK_UN) failed:               0 (  0.00 %)

So that all seems sane up to this point, including the massive drop in performance, which is acceptable for our use case at least.

When we move to multiple fuse mounts, still without unlink(), everything keeps working:

# ./flock_bench -g localhost:bench -d /tmp/b -T 10 -t 100
-- trim moaning about umount ...
                  Attempts:           39651 (100.00 %)
       Attempts Crosscheck:           39651 (100.00 %)
                  Obtained:           39651 (100.00 %)
              Full Success:           39651 (100.00 %)
               Open failed:               0 (  0.00 %)
     flock(LOCK_EX) failed:               0 (  0.00 %)
flock(LOCK_EX) would block:               0 (  0.00 %)
          stat failed lock:               0 (  0.00 %)
        stat failed unlock:               0 (  0.00 %)
              fstat failed:               0 (  0.00 %)
  lock failed (wrong file):               0 (  0.00 %)
  file changed during lock:               0 (  0.00 %)
             unlink failed:               0 (  0.00 %)
     flock(LOCK_UN) failed:               0 (  0.00 %)

However, with unlink:

# ./flock_bench -g localhost:bench -d /tmp/b -T 10 -t 100 -u 2>&1 | tee /tmp/multimount_fuse_with_unlink.txt
flock: No such file or directory
flock: No such file or directory
flock: No such file or directory
flock: No such file or directory
flock: No such file or directory
flock: No such file or directory
flock: No such file or directory
flock: No such file or directory
...
unlink(139902779909824/lockfile): No such file or directory
...
unlink(139899491575488/lockfile): No such file or directory
unlink(139901664204480/lockfile): No such file or directory
... stuff about umount again ...
Stats:
                  Attempts:          171100 (100.00 %)
       Attempts Crosscheck:          172458 (100.79 %)
                  Obtained:            2167 (  1.27 %)
              Full Success:            2167 (  1.27 %)
               Open failed:               0 (  0.00 %)
     flock(LOCK_EX) failed:          168929 ( 98.73 %)
flock(LOCK_EX) would block:               0 (  0.00 %)
          stat failed lock:               3 (  0.00 %)
        stat failed unlock:               0 (  0.00 %)
              fstat failed:               0 (  0.00 %)
  lock failed (wrong file):               1 (  0.00 %)
  file changed during lock:               0 (  0.00 %)
             unlink failed:            1358 (  0.79 %)
     flock(LOCK_UN) failed:               0 (  0.00 %)

So it seems that as long as the lock file never gets unlinked, we do okay. We'll be updating our use cases to detect the containing filesystem and, if it's glusterfs, skip the unlink, preferring to leak the file rather than run into these problems. It would still be awesome if this could be tracked down and fixed.
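
A rough sketch of the detection step, assuming Linux and statfs(2). Note the caveat: f_type only identifies a FUSE mount in general, so this check treats every FUSE filesystem like glusterfs; telling glusterfs apart specifically would need something more, such as reading the mount table. The helper name on_fuse_mount is hypothetical, and the magic value is the one listed for FUSE in the statfs(2) man page.

```c
#include <sys/vfs.h>

/* Value listed for FUSE in statfs(2); defined here in case the
 * installed headers do not provide it. */
#ifndef FUSE_SUPER_MAGIC
#define FUSE_SUPER_MAGIC 0x65735546
#endif

/* Hypothetical helper: returns 1 if `dir` is on a FUSE mount (which
 * includes glusterfs fuse mounts), 0 if not, -1 if statfs() fails. */
static int on_fuse_mount(const char *dir)
{
    struct statfs sfs;

    if (statfs(dir, &sfs) != 0)
        return -1;
    return sfs.f_type == FUSE_SUPER_MAGIC ? 1 : 0;
}
```

The unlock path would then call unlink(filename) only when on_fuse_mount() returns 0 for the lock directory, and just close(fd) otherwise, leaking the lock file.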

flock_bench.c

multimount_fuse_with_unlink.txt
