Issue with sleep infinity in awesome-akash repo
Problem Description
In the awesome-akash repository, several containers are configured to run sleep infinity as their main process. This can lead to the accumulation of zombie (defunct) processes: sleep never calls wait(), so any exited child process it is responsible for remains in a defunct state.
The following files currently use sleep infinity:
awesome-akash$ git grep -i sleep |grep infin
Ethereum_2.0/main.sh:sleep infinity
Falcon-7B/Dockerfile:CMD python3 falcon7b.py && sleep infinity
Sentinel-dVPN-node/main.sh: sleep infinity
bitcoin/main.sh:sleep infinity
cryptodredge-c11/entrypoint.sh:sleep infinity
semantra/deploy.yaml: sleep infinity ;'
softether-vpn/launch:sleep infinity
Impact
Some deployments start subprocesses whose parent never calls wait() on them. A subprocess that has exited, but whose exit status has not been read by its parent, stays in the system’s process table as a <defunct> process, also known as a “zombie” process. If nothing reaps them, zombies can accumulate over time and occupy all available process slots, leading to resource exhaustion.
Zombie processes are mostly harmless (they use no CPU or memory and do not count against cgroup CPU/memory limits) unless they fill the whole process table, at which point no new processes can be spawned. The limit is:
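The mechanism can be reproduced outside a container. The following is a minimal, Linux-only sketch, not taken from the repository: an inner shell starts a short-lived child and then replaces itself with sleep via exec, so the child ends up with a parent that never calls wait(). The helper name zombie_children is hypothetical, and the script assumes /proc is mounted and that sleep accepts fractional seconds.

```shell
#!/bin/sh
# Linux-only sketch: reproduce a zombie under a parent that never calls wait().

# Print the PIDs of zombie children of the given parent PID, one per line.
# (Hypothetical helper; reads the State/Pid/PPid fields of /proc/<pid>/status.)
zombie_children() {
    parent="$1"
    for status in /proc/[0-9]*/status; do
        # Processes may exit while we scan, so ignore read errors.
        awk -v parent="$parent" '
            $1 == "State:" { state = $2 }
            $1 == "Pid:"   { pid = $2 }
            $1 == "PPid:"  { ppid = $2 }
            END { if (ppid == parent && state == "Z") print pid }
        ' "$status" 2>/dev/null
    done
}

# The inner shell starts a short-lived child, then execs into `sleep`.
# exec keeps the same PID, so the child's parent is now a `sleep` process.
sh -c 'sleep 0.2 & exec sleep 3' &
parent_pid=$!

sleep 1   # give the short-lived child time to exit
zombies="$(zombie_children "$parent_pid")"
echo "zombie children of $parent_pid: ${zombies:-none}"

# When the parent exits, the zombie is reparented and normally reaped.
wait "$parent_pid"
```

In a container the parent would be PID 1 and would not exit, so a zombie like this one would stay in the process table for the lifetime of the container.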
$ cat /proc/sys/kernel/pid_max
4194304
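To judge how close a host or container is to that limit, the number of zombies can be compared with pid_max. This is a rough, Linux-only sketch; count_zombies is a hypothetical helper, and it only sees processes visible in the current PID namespace.

```shell
#!/bin/sh
# Linux-only sketch: compare the number of zombie processes with kernel.pid_max.

# Count processes whose state in /proc/<pid>/status is Z (zombie).
count_zombies() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    cat /proc/[0-9]*/status 2>/dev/null | grep -c '^State:[[:space:]]*Z' || true
}

zombies="$(count_zombies)"
pid_max="$(cat /proc/sys/kernel/pid_max)"
echo "zombies: $zombies / pid_max: $pid_max"
```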
If sleep infinity is set as the main container process (PID 1), it fails to properly reap child processes, leading to their accumulation as zombie processes. Containers with such configurations may be terminated by the zombie killer cron job, implemented by some providers to handle these defunct processes.
Proposed Solutions
- Use a Proper Primary Process: Containers should use their main application or a robust init system as the primary process. For example, running /usr/sbin/sshd -D is preferable to sleep infinity.
- Process Management in Containers: Consider using dedicated init systems like tini, dumb-init, or runit, which are designed to handle child processes correctly.
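As an illustration of both suggestions, an entrypoint script could end by handing control to the real workload under an init instead of ending with sleep infinity. The sketch below is an assumption about how such a script might look, not code from the repository: the file name entrypoint.sh and the function names are hypothetical, and it assumes tini may or may not be installed in the image.

```shell
#!/bin/sh
# Hypothetical entrypoint.sh: run the workload under a minimal init
# rather than finishing the script with `sleep infinity`.

# Print "tini" if it is installed in the image, otherwise print nothing.
find_init() {
    if command -v tini >/dev/null 2>&1; then
        echo tini
    fi
}

# Replace this shell with the workload so no idle shell or sleep is left as PID 1.
run_workload() {
    init="$(find_init)"
    if [ -n "$init" ]; then
        # tini stays as the parent, forwards signals, and reaps zombies.
        exec "$init" -- "$@"
    fi
    # Fallback: the workload itself becomes the main process and
    # is then responsible for reaping its own children.
    exec "$@"
}

# Only act when a command was passed, e.g.: entrypoint.sh /usr/sbin/sshd -D
if [ "$#" -gt 0 ]; then
    run_workload "$@"
fi
```

Invoked as entrypoint.sh /usr/sbin/sshd -D, this would leave either tini or sshd as the container's main process. Using exec matters here: without it, the shell would remain as the parent and the workload would not receive signals directly.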
Additional Resources
- Multi-service containers and managing child processes
- Phusion's baseimage as a solid init system
- Docker and the PID 1 zombie reaping problem
- Container Init Process
Request for Action
I suggest reviewing the current use of sleep infinity across the repository and discussing potential alternatives for better process management. This change could improve the stability and performance of deployments using this repository.