PHP-FPM constantly restarting children with New Relic extension enabled in container with non root mode #862

Open
@Aspirin4k

Description

I am trying to run our application (php-fpm + nginx in k8s) in rootless containers.
The problem occurs only when the New Relic extension is enabled; more precisely, only when the extension successfully connects to the daemon. If it does not connect, everything works fine.
Depending on the pm mode, I observe the following situations:

  1. In dynamic mode, at some point children start being terminated by SIGKILL about 10-20 seconds after they start. The master process keeps restarting these processes, as it expects children to finish with SIGQUIT.
  2. In static mode with max_requests set, php-fpm stops working correctly once all children have exceeded the max_requests value: the master process can't respawn children.
  3. In static mode with max_requests set to 0, everything works fine.

When the container runs as the root user, everything works fine as well.

At the same time, I don't see any suspicious logs, and metrics/transactions are sent to the New Relic collectors correctly. There is no traffic to the application except the kube probe, and there are no spikes in memory or CPU utilisation.

Steps to Reproduce

  1. Set up a pod with a security context that runs the container as a non-root user, with the New Relic extension enabled

  2. Generate traffic (for example, 5 VU for 5 minutes)

  3. Stop traffic
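
For anyone trying to reproduce step 2 without a dedicated load-testing tool, here is a minimal sketch of a traffic generator. It reads "5 VU" as five concurrent virtual users looping over the same request. The names `fetch_url` and `run_load` and the target URL are hypothetical and are not part of the original setup.

```python
import threading
import time
import urllib.request


def fetch_url(url, timeout_s=5.0):
    """Issue one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.status


def run_load(fetch, users=5, duration_s=300.0, pause_s=0.1):
    """Run `users` concurrent loops that call `fetch` until `duration_s` elapses.

    Returns a tuple (ok_count, error_count).
    """
    deadline = time.monotonic() + duration_s
    counts = {"ok": 0, "error": 0}
    lock = threading.Lock()

    def virtual_user():
        while time.monotonic() < deadline:
            try:
                fetch()
                outcome = "ok"
            except Exception:
                # Connection errors and HTTP 4xx/5xx both land here.
                outcome = "error"
            with lock:
                counts[outcome] += 1
            time.sleep(pause_s)

    threads = [threading.Thread(target=virtual_user) for _ in range(users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts["ok"], counts["error"]


# Example call (the URL is an assumption, adjust it to your ingress or port-forward):
# ok, failed = run_load(lambda: fetch_url("http://localhost:8080/index.php"))
```

The defaults (5 users, 300 seconds) mirror the numbers in step 2. After the call returns, traffic has stopped, which corresponds to step 3.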

Expected Behavior

PHP-FPM should work as expected when running as a non-root user

Relevant Logs / Console output

[20-Mar-2024 09:49:15.245718] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:15.249579] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:15.249662] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 287 exited on signal 9 (SIGKILL) after 14.015627 seconds from start
[20-Mar-2024 09:49:15.249741] DEBUG: pid 1, fpm_children_make(), line 405: blocking signals before child birth
[20-Mar-2024 09:49:15.249427] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 2 events
[20-Mar-2024 09:49:15.249520] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:15.253627] NOTICE: pid 1, fpm_children_make(), line 435: [pool www] child 294 started
[20-Mar-2024 09:49:15.253408] DEBUG: pid 1, fpm_children_make(), line 429: unblocking signals, child born
2024/03/20 09:49:16.176045 (26) Debug: received binary message, len=9940
2024-03-20 09:49:16.178 +0000 (288 288) debug: APPINFO reply connected
2024-03-20 09:49:16.179 +0000 (288 288) debug: APPINFO reply full app='***' agent_run_id=BQAEdEbs***
2024-03-20 09:49:16.179 +0000 (288 288) debug: Adaptive sampling configuration. Connect: 1710928127000000 us. Frequency: 60000000 us. Target: 10.
2024-03-20 09:49:16.181 +0000 (288 288) debug: 'WT_IS_FILENAME & SCRIPT_FILENAME' naming is '/index.php'
2024-03-20 09:49:16.183 +0000 (288 288) warning: User instrumentation from opcache: missing 'scripts' key in status information
2024-03-20 09:49:16.184 +0000 (288 288) debug: detected library=Guzzle 6
2024-03-20 09:49:16.244 +0000 (288 288) debug: detected library=Monolog
[20-Mar-2024 09:49:16.247757] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 1 active children, 6 spare children, 7 running children. Spawning rate 1
2024-03-20 09:49:16.255 +0000 (289 289) debug: closed daemon connection fd=7
2024-03-20 09:49:16.254 +0000 (289 289) debug: MSHUTDOWN processing started
2024-03-20 09:49:16.330 +0000 (288 288) debug: ignoring this transaction
2024-03-20 09:49:16.334 +0000 (288 288) debug: # elements: 5, # buckets used: 4
[20-Mar-2024 09:49:16.334698] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
2024-03-20 09:49:16.334 +0000 (288 288) debug: collisions - min: 1, max: 2, avg: 2
2024/03/20 09:49:17.086909 (26) Debug: harvesting log events
2024/03/20 09:49:17.087126 (26) Debug: harvesting error events
2024/03/20 09:49:17.087204 (26) Debug: harvesting custom events
2024/03/20 09:49:17.087026 (26) Debug: harvesting transaction events
[20-Mar-2024 09:49:17.249764] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
2024-03-20 09:49:17.255 +0000 (288 288) debug: closed daemon connection fd=7
2024-03-20 09:49:17.255 +0000 (288 288) debug: MSHUTDOWN processing started
[20-Mar-2024 09:49:18.251010] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:18.256774] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:18.256957] DEBUG: pid 1, fpm_children_make(), line 405: blocking signals before child birth
[20-Mar-2024 09:49:18.256902] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 288 exited on signal 9 (SIGKILL) after 15.021025 seconds from start
[20-Mar-2024 09:49:18.256844] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:18.256668] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:18.256574] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:18.259668] DEBUG: pid 1, fpm_children_make(), line 429: unblocking signals, child born
[20-Mar-2024 09:49:18.259823] NOTICE: pid 1, fpm_children_make(), line 435: [pool www] child 295 started
[20-Mar-2024 09:49:19.252095] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:19.256189] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 1 events
[20-Mar-2024 09:49:19.256306] DEBUG: pid 1, fpm_children_make(), line 405: blocking signals before child birth
[20-Mar-2024 09:49:19.256250] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 289 exited on signal 9 (SIGKILL) after 14.015966 seconds from start
[20-Mar-2024 09:49:19.256113] DEBUG: pid 1, fpm_event_loop(), line 440: event module triggered 2 events
[20-Mar-2024 09:49:19.256177] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:19.259848] DEBUG: pid 1, fpm_children_make(), line 429: unblocking signals, child born
[20-Mar-2024 09:49:19.260000] NOTICE: pid 1, fpm_children_make(), line 435: [pool www] child 296 started
[20-Mar-2024 09:49:20.254242] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
2024-03-20 09:49:20.262 +0000 (290 290) debug: closed daemon connection fd=7
2024-03-20 09:49:20.262 +0000 (290 290) debug: MSHUTDOWN processing started
[20-Mar-2024 09:49:21.255477] DEBUG: pid 1, fpm_pctl_perform_idle_server_maintenance(), line 395: [pool www] currently 0 active children, 7 spare children, 7 running children. Spawning rate 1
[20-Mar-2024 09:49:21.259052] DEBUG: pid 1, fpm_got_signal(), line 82: received SIGCHLD
[20-Mar-2024 09:49:21.259138] WARNING: pid 1, fpm_children_bury(), line 258: [pool www] child 290 exited on signal 9 (SIGKILL) after 14.018960 seconds from start
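
To quantify the pattern in the log above (which signal ended each child and how long it lived), a small parser can help when comparing root and non-root runs. This is a rough sketch: the regular expression is derived only from the `fpm_children_bury()` lines shown here, so it matches signal exits and nothing else, and the function name `summarize_child_exits` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   [pool www] child 287 exited on signal 9 (SIGKILL) after 14.015627 seconds from start
BURY_RE = re.compile(
    r"child (?P<pid>\d+) exited on signal (?P<signo>\d+) \((?P<signame>\w+)\) "
    r"after (?P<seconds>\d+\.\d+) seconds from start"
)


def summarize_child_exits(log_text):
    """Collect (pid, signal name, lifetime in seconds) for every signal exit."""
    exits = [
        (int(m["pid"]), m["signame"], float(m["seconds"]))
        for m in BURY_RE.finditer(log_text)
    ]
    lifetimes = [seconds for _, _, seconds in exits]
    return {
        "exits": exits,
        "by_signal": dict(Counter(signame for _, signame, _ in exits)),
        "mean_lifetime_s": sum(lifetimes) / len(lifetimes) if lifetimes else None,
    }
```

Feeding it the contents of the php-fpm error log (here the file pointed to by `PHP_FPM_ERROR_LOG`) gives a per-signal count and the mean child lifetime, which makes it easier to see whether the 14-15 second lifetime is stable across runs.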

Your Environment

Dockerfile

FROM ubuntu:focal

ENV \
  USER='user' \
  UID=1033 \
  GROUP='app' \
  GID=1033 \
  PHP_VERSION="8.0"

RUN \
  set -x && \
  groupadd --system --gid ${GID} ${GROUP}  && \
  useradd --system --gid ${GID} --no-create-home --home /nonexistent --shell /bin/false --uid ${UID} ${USER}

# Installing packages
RUN \
  apt-get update && \
  apt-get install -y \
    curl \
    apt-utils \
    gettext-base \
    apt-transport-https \
    ca-certificates \
    language-pack-en-base \
    software-properties-common && \
  LC_ALL=en_US.UTF-8 apt-add-repository ppa:ondrej/php && \
  echo 'deb http://apt.newrelic.com/debian/ newrelic non-free' | tee /etc/apt/sources.list.d/newrelic.list && \
  curl https://download.newrelic.com/548C16BF.gpg | apt-key add - && \
  apt-get update && \
  apt-get install -y --no-install-recommends \
    php${PHP_VERSION} \
    php${PHP_VERSION}-fpm \
    php${PHP_VERSION}-bcmath \
    php${PHP_VERSION}-bz2 \
    php${PHP_VERSION}-yaml \
    php${PHP_VERSION}-curl \
    php${PHP_VERSION}-dba \
    php${PHP_VERSION}-gd \
    php${PHP_VERSION}-gmp \
    php${PHP_VERSION}-apcu \
    php${PHP_VERSION}-imap \
    php${PHP_VERSION}-intl \
    php${PHP_VERSION}-ldap \
    php${PHP_VERSION}-mbstring \
    php${PHP_VERSION}-mysql \
    php${PHP_VERSION}-opcache \
    php${PHP_VERSION}-pgsql \
    php${PHP_VERSION}-readline \
    php${PHP_VERSION}-soap \
    php${PHP_VERSION}-xml \
    php${PHP_VERSION}-xmlrpc \
    php${PHP_VERSION}-xsl \
    php${PHP_VERSION}-zip \
    php${PHP_VERSION}-amqp \
    php${PHP_VERSION}-gnupg \
    php${PHP_VERSION}-igbinary \
    php${PHP_VERSION}-memcached \
    php${PHP_VERSION}-imagick \
    php${PHP_VERSION}-mongo \
    php${PHP_VERSION}-mongodb \
    php${PHP_VERSION}-redis \
    php${PHP_VERSION}-ssh2 \
    php${PHP_VERSION}-tideways \
    php${PHP_VERSION}-xhprof \
    php${PHP_VERSION}-grpc \
    newrelic-php5 && \
  apt-get autoremove -y && \
  apt-get clean -y && \
  rm -rf /var/lib/apt/lists/* && \
  # creating symlink for the current php version
  ln -s $(which php-fpm${PHP_VERSION}) /usr/local/sbin/php-fpm

COPY configs/php-fpm.template /etc/php/${PHP_VERSION}/fpm/php-fpm.template
COPY configs/newrelic.ini.template /etc/php/${PHP_VERSION}/fpm/newrelic.ini.template
COPY scripts/startup.sh /usr/local/bin/startup.sh
COPY configs/conf_dir.php_mod_apcu/ /etc/php/${PHP_VERSION}

RUN \
  mkdir -p \
    /app \
    /run/php/ \
    /var/run/syslog-ng/ \
    /var/log/php \
    /usr/local/etc/webapp && \
  # stage env for old projects, and default value for the image
  echo "test" > /usr/local/etc/webapp/srvtype && \
  touch /run/php${PHP_VERSION}-fpm.pid && \
  touch /run/php${PHP_VERSION}-newrelic.pid && \
  touch /tmp/.newrelic.sock && \
  chown -R ${USER}:${GROUP} \
    /run/php${PHP_VERSION}-fpm.pid \
    /run/php${PHP_VERSION}-newrelic.pid \
    /etc/php/ \
    /run/php/ \
    /var/log/ \
    /usr/local/etc/webapp/srvtype \
    /tmp/.newrelic.sock \
    /app

WORKDIR /app
COPY .opcache.preload.php /app

USER ${USER}

ENV \
  # Clear env=no, for the apps that are using vars in the fpm pool
  PHP_FPM_CLEAR_ENV='yes' \
  PHP_FPM_ERROR_LOG='/var/log/php/php-error.log' \
  PHP_FPM_SLOW_LOG='/var/log/php/php-slow.log' \
  PHP_FPM_SLOW_LOG_TIMEOUT='0' \
  # port or socket
  PHP_FPM_LISTEN_TYPE='port' \
  PHP_FPM_LISTEN_PORT=9000 \
  PHP_FPM_LISTEN_SOCKET='/run/php/php-fpm.sock' \
  # Old projects env selection /usr/local/etc/webapp/srvtype
  SRV_TYPE='test' \
  # New relic vars
  NEW_RELIC_ENABLED='false' \
  NEW_RELIC_LICENSE='' \
  NEW_RELIC_APP_NAME='' \
  PHP_FPM_STATUS_PATH='/fpm_status'

# Generating default php-fpm config.
# Config will be overwritten if default variables are changed
# Or in case the image entrypoint is overridden, so the startup script will not be launched.
RUN \
  /usr/local/bin/startup.sh ls -l /etc/php/${PHP_VERSION}/fpm/php-fpm.conf

EXPOSE ${PHP_FPM_LISTEN_PORT}

STOPSIGNAL SIGQUIT

ENTRYPOINT [ "/usr/local/bin/startup.sh" ]

CMD [ "php-fpm" ]

PHP-FPM configuration

[global]
pid = /run/php${PHP_VERSION}-fpm.pid
error_log = ${PHP_FPM_ERROR_LOG}
daemonize = no
log_level = debug

[www]
clear_env = ${PHP_FPM_CLEAR_ENV}

catch_workers_output = yes

user   = ${USER}
group  = ${GROUP}
listen = ${PHP_FPM_LISTEN_REAL_ADDR}
listen.mode  = 0660
listen.owner = ${USER}
listen.group = ${GROUP}

pm = dynamic
pm.status_path = ${PHP_FPM_STATUS_PATH}
pm.max_children  = 10
pm.start_servers = 2
pm.max_requests  = 0
pm.min_spare_servers = 2
pm.max_spare_servers = 5
pm.process_idle_timeout = 60s

slowlog = ${PHP_FPM_SLOW_LOG}
request_slowlog_timeout   = ${PHP_FPM_SLOW_LOG_TIMEOUT}
request_terminate_timeout = 0

New Relic configuration

extension = "newrelic.so"

[newrelic]
newrelic.enabled = "${NEW_RELIC_ENABLED}"

newrelic.daemon.app_connect_timeout=15s
newrelic.daemon.start_timeout=5s

newrelic.license = "${NEW_RELIC_LICENSE}"

newrelic.logfile = "${PHP_FPM_ERROR_LOG}"
newrelic.loglevel = "debug"

newrelic.appname = "${NEW_RELIC_APP_NAME}"

newrelic.daemon.logfile = "${PHP_FPM_ERROR_LOG}"
newrelic.daemon.loglevel = "debug"

newrelic.daemon.port = "/tmp/.newrelic.sock"

newrelic.daemon.pidfile = "/run/php${PHP_VERSION}-newrelic.pid"

newrelic.browser_monitoring.auto_instrument = false

PHP version:

PHP 8.0.28 (cli) (built: Feb 14 2023 18:32:57) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.28, Copyright (c) Zend Technologies
    with Zend OPcache v8.0.28, Copyright (c), by Zend Technologies

New Relic version: New Relic daemon version 10.18.0.8-e78afe1fd086
