Description
Context
Following up on issue #806, we conducted an isolated investigation to better understand the CPU spikes observed when the New Relic PHP agent is enabled. Our testing was performed in a controlled environment with a single container on a dedicated Kubernetes node.
Environment
- Kubernetes 1.30 (EKS)
- Instance type: m7a
- PHP-FPM 8.2
- New Relic PHP Agent: Latest version with all features disabled
Findings
CPU Usage Pattern
Figure 1: Grafana CPU metrics showing distinct usage patterns:
- Baseline period with normal activity
- Spike to ~100% CPU with New Relic disabled (16:10)
- Spike to ~300% CPU with New Relic enabled (16:15)
Flame Graph Comparison
Figure 2: System-wide flame graph (test 4) with New Relic disabled, showing normal system call patterns and CPU usage distribution
Figure 3: System-wide flame graph (test 4) with New Relic enabled, demonstrating significantly increased fstatat64 system calls and higher CPU utilization across all cores
This pattern remained consistent across multiple test runs and was not affected by:
- Disabling all New Relic features
- Using the latest agent version
- Different sampling frequencies (99Hz and 997Hz)
System Call Analysis
Through system-wide performance profiling, we identified a significant increase in fstatat64 system calls when the New Relic agent is enabled. This suggests that the agent is performing excessive file metadata lookups (stat calls).
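To put a number on the increase, the share of samples that pass through fstatat64 can be compared between the enabled and disabled profiles. The sketch below is one possible way to do that, not the procedure used for the attached results. It assumes the perf data has been converted to folded stacks (for example with `perf script | stackcollapse-perf.pl` from the FlameGraph project), where each line is a semicolon-separated stack followed by a sample count. The function name `symbol_share` and the file names in the usage note are hypothetical.

```shell
# symbol_share SYMBOL FOLDED_FILE
# Prints the percentage of samples whose stack contains SYMBOL.
# Assumes folded-stack input: "frame1;frame2;...;frameN <count>" per line.
symbol_share() {
  symbol="$1"
  folded_file="$2"
  awk -v sym="$symbol" '
    {
      count = $NF                        # last field is the sample count
      total += count
      if (index($0, sym) > 0) matched += count
    }
    END {
      if (total == 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * matched / total
    }
  ' "$folded_file"
}
```

Running `symbol_share fstatat64 nr_on.folded` and `symbol_share fstatat64 nr_off.folded` on two such files would give directly comparable percentages.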
Testing Methodology
We conducted extensive profiling using:
- PHP-FPM specific profiling at different sampling rates:
  perf record -F [99|997] -p $(pgrep php-fpm -o) -a -g --call-graph fp -- sleep 60
- System-wide profiling:
  perf record -F [99|997] -a -g -- sleep 60
- System call tracing:
  timeout 60 strace -tt -f -C -p $(pgrep -o php-fpm)
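To see which paths are being stat'ed most often, the strace output can be summarized per path. The following is a minimal sketch, assuming the trace above was saved to a file (for example by adding `-o strace.log`, which is not part of the original command). Depending on architecture and strace version, the call may be logged as `newfstatat` rather than `fstatat64`, so the pattern accepts both. The function name `top_stat_paths` is hypothetical.

```shell
# top_stat_paths LOG_FILE [LIMIT]
# Prints "<count> <path>" for the most frequently stat'ed paths in a saved strace log.
top_stat_paths() {
  log_file="$1"
  limit="${2:-10}"
  # Keep only the call name and its quoted path argument, then strip down to the path.
  grep -oE '(newfstatat|fstatat64)\([^,]+, "[^"]*"' "$log_file" \
    | sed -E 's/^[^"]*"//; s/"$//' \
    | sort | uniq -c | sort -rn \
    | head -n "$limit" \
    | awk '{ c = $1; sub(/^ *[0-9]+ /, ""); print c, $0 }'
}
```

A small number of paths dominating the output would point to repeated lookups of the same files; a long, flat list would point to a directory scan instead.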
Version Impact
This performance regression appears to have been introduced between versions 10.0.0.312 and 10.7.0.319. Earlier versions did not exhibit this behavior.
Supporting Evidence
All profiling results are attached to this issue in newrelic_profiling_results.zip, which includes:
PHP-FPM Specific Profiles
- With New Relic disabled:
  - 99Hz sampling (phpfpm_nr_off_99hz.*)
  - 997Hz sampling (phpfpm_nr_off_997hz.*)
- With New Relic enabled:
  - 99Hz sampling (phpfpm_nr_on_99hz.*)
  - 997Hz sampling (phpfpm_nr_on_997hz.*)
System-Wide Profiles
- With New Relic disabled:
  - Test 3 (system_nr_off_99hz_test3.*)
  - Test 4 (system_nr_off_99hz_test4.*)
- With New Relic enabled:
  - Test 3 (system_nr_on_99hz_test3.*)
  - Test 4 (system_nr_on_99hz_test4.*)
Questions
- Is there a known reason for the increased frequency of fstatat64 calls?
- Are there plans to optimize file operations in future releases?
- Could this be related to the agent's file monitoring or instrumentation mechanisms?