Skip to content

Timeout Responding to GETFDPW #158

@kevincox

Description

@kevincox

It appears that in some cases, seemingly when some NSS module takes too long—which has been noticed to occur when the network state of the machine is changing—nsncd will fail to respond to GETFDPW requests resulting in a 5s hang until glibc times out and makes a GETPWBYUID request which succeeds fairly quickly.

Example log from ZSH startup:

     0.000089 getppid()                 = 7445
     0.000048 getpid()                  = 7447
     0.000047 getuid()                  = 1000
     0.000064 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 11
     0.000082 connect(11, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
     0.000230 sendto(11, "\2\0\0\0\v\0\0\0\7\0\0\0passwd\0", 19, MSG_NOSIGNAL, NULL, 0) = 19
     0.000246 poll([{fd=11, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 0 (Timeout)
     5.004649 close(11)                 = 0
     0.000249 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 11
     0.000113 connect(11, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
     0.000149 sendto(11, "\2\0\0\0\1\0\0\0\5\0\0\0001000\0", 17, MSG_NOSIGNAL, NULL, 0) = 17
     0.000121 poll([{fd=11, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=11, revents=POLLIN|POLLERR|POLLHUP}])
     1.373307 read(11, 0x7ffd8b2dc4d0, 36) = -1 ECONNRESET (Connection reset by peer)
     0.000456 close(11)                 = 0
     0.000302 newfstatat(AT_FDCWD, "/etc/nsswitch.conf", {st_mode=S_IFREG|0444, st_size=256, ...}, 0) = 0
     0.000323 newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
     0.000068 openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 11
     0.000143 fstat(11, {st_mode=S_IFREG|0444, st_size=256, ...}) = 0
     0.000124 read(11, "passwd:    files systemd\ngroup: "..., 4096) = 256
     0.000131 read(11, "", 4096)        = 0
     0.000074 fstat(11, {st_mode=S_IFREG|0444, st_size=256, ...}) = 0
     0.000058 close(11)                 = 0
     0.000106 openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 11
     0.000078 fstat(11, {st_mode=S_IFREG|0644, st_size=11018, ...}) = 0
     0.000051 lseek(11, 0, SEEK_SET)    = 0
     0.000055 read(11, "root:x:0:0:System administrator:"..., 4096) = 4096
     0.000130 close(11)                 = 0

nsncd logs:

Starting Name Service Cache Daemon (nsncd)...
Apr 29 19:24:11.253 INFO started, config: Config { ignored_request_types: {}, worker_count: 8, handoff_timeout: 10s }, path: "/var/run/nscd/socket"
Started Name Service Cache Daemon (nsncd).
Apr 29 19:29:25.645 ERRO timed out waiting for an available worker, thread: accept
Apr 29 19:29:25.645 INFO shutting down

This was original discussed on the nixpkgs bug tracker: NixOS/nixpkgs#402693

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions