PHP-FPM constantly restarting children with New Relic extension enabled in container with non root mode #862
Description
I am trying to implement rootless containers for our application (php-fpm + nginx in k8s).
The problem occurs only when the New Relic extension is enabled (more precisely, when the extension successfully connects to the daemon; if it does not, everything works fine).
Depending on the pm mode, I observe the following situations:
- In dynamic mode, at some point children start receiving SIGKILL after ~10-20 seconds. The master process constantly restarts these processes, as it expects children to finish with SIGQUIT.
- In static mode, if max_requests is set, php-fpm stops working correctly after all children exceed the max_requests value: the master process can't respawn children.
- In static mode, if max_requests is 0, everything works fine.
When running the container as the root user, everything works fine as well.
At the same time, I don't see any suspicious logs, and metrics/transactions are correctly sent to the New Relic collectors. There is no traffic on the application except the kube probe, and there are no spikes in memory/CPU utilisation.
Steps to Reproduce
- Set up a pod with a security context enabled, a non-root user, and the New Relic extension enabled
- Generate traffic (for example, 5 VU for 5 minutes)
- Stop traffic
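For anyone trying to reproduce the traffic step without a dedicated load tool, here is a minimal sketch of "5 VU for 5 minutes" using only the Python standard library. The function names (run_load, http_get) and the URL http://localhost:8080/ are assumptions for illustration and are not part of the original setup; point the URL at whatever nginx endpoint fronts the php-fpm pool.

```python
import http.client
import threading
import time
import urllib.request


def run_load(fetch, virtual_users=5, duration_s=300.0, pause_s=0.1):
    """Call fetch() in a loop from `virtual_users` threads until `duration_s` elapses.

    Returns the total number of calls that completed without an error.
    """
    deadline = time.monotonic() + duration_s
    counts = [0] * virtual_users  # one slot per thread, so no locking is needed

    def worker(index):
        while time.monotonic() < deadline:
            try:
                fetch()
                counts[index] += 1
            except (OSError, http.client.HTTPException):
                pass  # failed requests are ignored; the goal is only to generate load
            time.sleep(pause_s)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(virtual_users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(counts)


def http_get(url="http://localhost:8080/"):
    """Fetch one page. The default URL is a placeholder, not taken from the report."""
    with urllib.request.urlopen(url, timeout=5) as response:
        response.read()


if __name__ == "__main__":
    total = run_load(http_get)  # 5 virtual users for 5 minutes
    print(f"completed requests: {total}")
```

Once the script exits, traffic has stopped and only the kube probe remains, which is the state in which the restarts described above show up.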
Expected Behavior
PHP-FPM should work as expected under a non-root user
Relevant Logs / Console output
[20-Mar-2024 09:49:15.245718] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:15.249579] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:15.249662] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 287 exited on signal 9 (SIGKILL) after 14.015627 seconds from start
[20-Mar-2024 09:49:15.249741] DEBUG: pid 1, fpm_children_make(), line 405: blocking signals before child birth
[20-Mar-2024 09:49:15.249427] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 2 events
[20-Mar-2024 09:49:15.249520] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:15.253627] NOTICE: pid 1, fpm_children_make(), line 435: [pool www] child 294 started
[20-Mar-2024 09:49:15.253408] DEBUG: pid 1, fpm_children_make(), line 429: unblocking signals, child born
2024/03/20 09:49:16.176045 (26) Debug: received binary message, len=9940
2024-03-20 09:49:16.178 +0000 (288 288) debug: APPINFO reply connected
2024-03-20 09:49:16.179 +0000 (288 288) debug: APPINFO reply full app='***' agent_run_id=BQAEdEbs***
2024-03-20 09:49:16.179 +0000 (288 288) debug: Adaptive sampling configuration. Connect: 1710928127000000 us. Frequency: 60000000 us. Target: 10.
2024-03-20 09:49:16.181 +0000 (288 288) debug: 'WT_IS_FILENAME & SCRIPT_FILENAME' naming is '/index.php'
2024-03-20 09:49:16.183 +0000 (288 288) warning: User instrumentation from opcache: missing 'scripts' key in status information
2024-03-20 09:49:16.184 +0000 (288 288) debug: detected library=Guzzle 6
2024-03-20 09:49:16.244 +0000 (288 288) debug: detected library=Monolog
[20-Mar-2024 09:49:16.247757] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 1 active children, 6 spare children, 7 running children. Spawning rate 1
2024-03-20 09:49:16.255 +0000 (289 289) debug: closed daemon connection fd=7
2024-03-20 09:49:16.254 +0000 (289 289) debug: MSHUTDOWN processing started
2024-03-20 09:49:16.330 +0000 (288 288) debug: ignoring this transaction
2024-03-20 09:49:16.334 +0000 (288 288) debug: # elements: 5, # buckets used: 4
[20-Mar-2024 09:49:16.334698] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
2024-03-20 09:49:16.334 +0000 (288 288) debug: collisions - min: 1, max: 2, avg: 2
2024/03/20 09:49:17.086909 (26) Debug: harvesting log events
2024/03/20 09:49:17.087126 (26) Debug: harvesting error events
2024/03/20 09:49:17.087204 (26) Debug: harvesting custom events
2024/03/20 09:49:17.087026 (26) Debug: harvesting transaction events
[20-Mar-2024 09:49:17.249764] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
2024-03-20 09:49:17.255 +0000 (288 288) debug: closed daemon connection fd=7
2024-03-20 09:49:17.255 +0000 (288 288) debug: MSHUTDOWN processing started
[20-Mar-2024 09:49:18.251010] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:18.256774] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:18.256957] DEBUG: pid 1, fpm_children_make(), line 405: blocking signals before child birth
[20-Mar-2024 09:49:18.256902] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 288 exited on signal 9 (SIGKILL) after 15.021025 seconds from start
[20-Mar-2024 09:49:18.256844] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:18.256668] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:18.256574] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:18.259668] DEBUG: pid 1, fpm_children_make(), line 429: unblocking signals, child born
[20-Mar-2024 09:49:18.259823] NOTICE: pid 1, fpm_children_make(), line 435: [pool www] child 295 started
[20-Mar-2024 09:49:19.252095] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:19.256189] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:19.256306] DEBUG: pid 1, fpm_children_make(), line 405: blocking signals before child birth
[20-Mar-2024 09:49:19.256250] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 289 exited on signal 9 (SIGKILL) after 14.015966 seconds from start
[20-Mar-2024 09:49:19.256113] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 2 events
[20-Mar-2024 09:49:19.256177] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:19.259848] DEBUG: pid 1, fpm_children_make(), line 429: unblocking signals, child born
[20-Mar-2024 09:49:19.260000] NOTICE: pid 1, fpm_children_make(), line 435: [pool www] child 296 started
[20-Mar-2024 09:49:20.254242] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
2024-03-20 09:49:20.262 +0000 (290 290) debug: closed daemon connection fd=7
2024-03-20 09:49:20.262 +0000 (290 290) debug: MSHUTDOWN processing started
[20-Mar-2024 09:49:21.255477] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:21.259052] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:21.259138] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 290 exited on signal 9 (SIGKILL) after 14.018960 seconds from start
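To make the pattern in the log above easier to see, here is a small sketch that pulls the child lifetimes out of the fpm_children_bury() warnings. The regular expression is written against the message format visible in this log only, and the helper name killed_children is hypothetical; the sample lines are copied from the output above.

```python
import re
from statistics import mean

# Matches the fpm_children_bury() warning as it appears in the log above.
BURY_RE = re.compile(
    r"child (?P<pid>\d+) exited on signal (?P<signo>\d+) \((?P<signame>\w+)\) "
    r"after (?P<seconds>\d+\.\d+) seconds from start"
)


def killed_children(log_text):
    """Return (pid, signal name, lifetime in seconds) for each buried child."""
    results = []
    for line in log_text.splitlines():
        match = BURY_RE.search(line)
        if match:
            results.append(
                (int(match["pid"]), match["signame"], float(match["seconds"]))
            )
    return results


SAMPLE = """\
[20-Mar-2024 09:49:15.249662] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 287 exited on signal 9 (SIGKILL) after 14.015627 seconds from start
[20-Mar-2024 09:49:18.256902] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 288 exited on signal 9 (SIGKILL) after 15.021025 seconds from start
[20-Mar-2024 09:49:19.256250] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 289 exited on signal 9 (SIGKILL) after 14.015966 seconds from start
[20-Mar-2024 09:49:21.259138] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 290 exited on signal 9 (SIGKILL) after 14.018960 seconds from start
"""

if __name__ == "__main__":
    children = killed_children(SAMPLE)
    for pid, signame, seconds in children:
        print(f"child {pid}: {signame} after {seconds:.3f}s")
    print(f"mean lifetime: {mean(s for _, _, s in children):.3f}s")
```

In this excerpt every child is killed roughly 14 to 15 seconds after it starts, which matches the ~10-20 second window mentioned in the description.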
Your Environment
Dockerfile
FROM ubuntu:focal
ENV \
USER='user' \
UID=1033 \
GROUP='app' \
GID=1033 \
PHP_VERSION="8.0"
RUN \
set -x && \
groupadd --system --gid ${GID} ${GROUP} && \
useradd --system --gid ${GID} --no-create-home --home /nonexistent --shell /bin/false --uid ${UID} ${USER}
# Installing packages
RUN \
apt-get update && \
apt-get install -y \
curl \
apt-utils \
gettext-base \
apt-transport-https \
ca-certificates \
language-pack-en-base \
software-properties-common && \
LC_ALL=en_US.UTF-8 apt-add-repository ppa:ondrej/php && \
echo 'deb http://apt.newrelic.com/debian/ newrelic non-free' | tee /etc/apt/sources.list.d/newrelic.list && \
curl https://download.newrelic.com/548C16BF.gpg | apt-key add - && \
apt-get update && \
apt-get install -y --no-install-recommends \
php${PHP_VERSION} \
php${PHP_VERSION}-fpm \
php${PHP_VERSION}-bcmath \
php${PHP_VERSION}-bz2 \
php${PHP_VERSION}-yaml \
php${PHP_VERSION}-curl \
php${PHP_VERSION}-dba \
php${PHP_VERSION}-gd \
php${PHP_VERSION}-gmp \
php${PHP_VERSION}-apcu \
php${PHP_VERSION}-imap \
php${PHP_VERSION}-intl \
php${PHP_VERSION}-ldap \
php${PHP_VERSION}-mbstring \
php${PHP_VERSION}-mysql \
php${PHP_VERSION}-opcache \
php${PHP_VERSION}-pgsql \
php${PHP_VERSION}-readline \
php${PHP_VERSION}-soap \
php${PHP_VERSION}-xml \
php${PHP_VERSION}-xmlrpc \
php${PHP_VERSION}-xsl \
php${PHP_VERSION}-zip \
php${PHP_VERSION}-amqp \
php${PHP_VERSION}-gnupg \
php${PHP_VERSION}-igbinary \
php${PHP_VERSION}-memcached \
php${PHP_VERSION}-imagick \
php${PHP_VERSION}-mongo \
php${PHP_VERSION}-mongodb \
php${PHP_VERSION}-redis \
php${PHP_VERSION}-ssh2 \
php${PHP_VERSION}-tideways \
php${PHP_VERSION}-xhprof \
php${PHP_VERSION}-grpc \
newrelic-php5 && \
apt-get autoremove -y && \
apt-get clean -y && \
rm -rf /var/lib/apt/lists/* && \
# creating symlink for the current php version
ln -s $(which php-fpm${PHP_VERSION}) /usr/local/sbin/php-fpm
COPY configs/php-fpm.template /etc/php/${PHP_VERSION}/fpm/php-fpm.template
COPY configs/newrelic.ini.template /etc/php/${PHP_VERSION}/fpm/newrelic.ini.template
COPY scripts/startup.sh /usr/local/bin/startup.sh
COPY configs/conf_dir.php_mod_apcu/ /etc/php/${PHP_VERSION}
RUN \
mkdir -p \
/app \
/run/php/ \
/var/run/syslog-ng/ \
/var/log/php \
/usr/local/etc/webapp && \
# stage env for old projects, and default value for the image
echo "test" > /usr/local/etc/webapp/srvtype && \
touch /run/php${PHP_VERSION}-fpm.pid && \
touch /run/php${PHP_VERSION}-newrelic.pid && \
touch /tmp/.newrelic.sock && \
chown -R ${USER}:${GROUP} \
/run/php${PHP_VERSION}-fpm.pid \
/run/php${PHP_VERSION}-newrelic.pid \
/etc/php/ \
/run/php/ \
/var/log/ \
/usr/local/etc/webapp/srvtype \
/tmp/.newrelic.sock \
/app
WORKDIR /app
COPY .opcache.preload.php /app
USER ${USER}
ENV \
# Clear env=no, for the app's that are using vars in the fpm pool
PHP_FPM_CLEAR_ENV='yes' \
PHP_FPM_ERROR_LOG='/var/log/php/php-error.log' \
PHP_FPM_SLOW_LOG='/var/log/php/php-slow.log' \
PHP_FPM_SLOW_LOG_TIMEOUT='0' \
# port or socket
PHP_FPM_LISTEN_TYPE='port' \
PHP_FPM_LISTEN_PORT=9000 \
PHP_FPM_LISTEN_SOCKET='/run/php/php-fpm.sock' \
# Old projects env selection /usr/local/etc/webapp/srvtype
SRV_TYPE='test' \
# New relic vars
NEW_RELIC_ENABLED='false' \
NEW_RELIC_LICENSE='' \
NEW_RELIC_APP_NAME='' \
PHP_FPM_STATUS_PATH='/fpm_status'
# Generating default php-fpm config.
# Config will be overwritten if default variables are changed.
# Or in case the image entrypoint is overridden, so the startup script will not be launched.
RUN \
/usr/local/bin/startup.sh ls -l /etc/php/${PHP_VERSION}/fpm/php-fpm.conf
EXPOSE ${PHP_FPM_LISTEN_PORT}
STOPSIGNAL SIGQUIT
ENTRYPOINT [ "/usr/local/bin/startup.sh" ]
CMD [ "php-fpm" ]
PHP-FPM configuration
[global]
pid = /run/php${PHP_VERSION}-fpm.pid
error_log = ${PHP_FPM_ERROR_LOG}
daemonize = no
log_level = debug
[www]
clear_env = ${PHP_FPM_CLEAR_ENV}
catch_workers_output = yes
user = ${USER}
group = ${GROUP}
listen = ${PHP_FPM_LISTEN_REAL_ADDR}
listen.mode = 0660
listen.owner = ${USER}
listen.group = ${GROUP}
pm = dynamic
pm.status_path = ${PHP_FPM_STATUS_PATH}
pm.max_children = 10
pm.start_servers = 2
pm.max_requests = 0
pm.min_spare_servers = 2
pm.max_spare_servers = 5
pm.process_idle_timeout = 60s
slowlog = ${PHP_FPM_SLOW_LOG}
request_slowlog_timeout = ${PHP_FPM_SLOW_LOG_TIMEOUT}
request_terminate_timeout = 0
New Relic configuration
extension = "newrelic.so"
[newrelic]
newrelic.enabled = "${NEW_RELIC_ENABLED}"
newrelic.daemon.app_connect_timeout=15s
newrelic.daemon.start_timeout=5s
newrelic.license = "${NEW_RELIC_LICENSE}"
newrelic.logfile = "${PHP_FPM_ERROR_LOG}"
newrelic.loglevel = "debug"
newrelic.appname = "${NEW_RELIC_APP_NAME}"
newrelic.daemon.logfile = "${PHP_FPM_ERROR_LOG}"
newrelic.daemon.loglevel = "debug"
newrelic.daemon.port = "/tmp/.newrelic.sock"
newrelic.daemon.pidfile = "/run/php${PHP_VERSION}-newrelic.pid"
newrelic.browser_monitoring.auto_instrument = false
PHP version:
PHP 8.0.28 (cli) (built: Feb 14 2023 18:32:57) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.28, Copyright (c) Zend Technologies
with Zend OPcache v8.0.28, Copyright (c), by Zend Technologies
New Relic version: New Relic daemon version 10.18.0.8-e78afe1fd086