
Commit b6d67fa

lupin012 and claude authored
protect execution TIP under RPC load (#19905)
This PR introduces a two-level admission control system to protect the Staged Sync pipeline from being starved or delayed by high RPC load.

**Root Cause Analysis:** Under heavy RPC traffic, the node accumulates a large number of goroutines blocked on `roTxsLimiter.Acquire`. When DB slots become available, the backlog drains in a way that starves the staged sync pipeline. The goroutine pile-up also causes a significant spike in virtual memory and overall system instability.

**Solution:** Two gates work in tandem:

1. **HTTP admission handler (`rpcAdmissionHandler`)** — outer gate installed at the top of every HTTP RPC stack, before CORS, Gzip, or JSON decoding. If the number of inflight requests exceeds the configured limit, the request is rejected immediately with HTTP 503. This prevents goroutine accumulation at the source. On every admitted request the handler tags the context with `WithRPCContext` (limit value) so the DB layer can identify the caller.
2. **`BeginRo` inner gate** — if the context carries a positive RPC limit, `BeginRo` uses `TryAcquire` on `roTxsLimiter` and returns `ErrServerOverloaded` immediately if the semaphore is full. Internal callers (staged sync, background workers) always use blocking `Acquire` and are never rejected.

This two-level approach means most overload is shed at the HTTP layer (goroutines never enter the system), while any RPC requests that slip through under transient concurrency spikes still fail fast at the DB layer rather than piling up behind the semaphore.

**Configuration:**

- `--rpc.max.concurrency`: HTTP admission limit.
  - `0` (default): uses `--db.read.concurrency` (auto-tuned to GOMAXPROCS × 64, capped at 9000)
  - `> 0`: explicit limit
  - `-1`: unlimited (admission control disabled; `BeginRo` falls back to blocking `Acquire`, as in the old behaviour)

### Summary of Resource Management Improvements

| Resource | Result |
| :--- | :--- |
| **Goroutine pile-up** | ✅ Requests rejected at HTTP layer before CORS, Gzip, or JSON decoding |
| **Staged sync starvation** | ✅ Internal callers (staged sync, workers) use blocking `Acquire` and are never rejected; RPC uses `TryAcquire` fail-fast |
| **Transient overload spikes** | ✅ `BeginRo` inner gate catches RPC requests that pass the HTTP layer during concurrency spikes |
| **Scalability** | ✅ Default limit auto-tuned to `GOMAXPROCS × 64` (capped at 9000) via `--db.read.concurrency` |
| **Configuration** | ✅ Zero required config, one optional flag (`--rpc.max.concurrency`) |
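For illustration, the outer gate can be thought of as a semaphore-guarded `http.Handler` that sheds excess load with 503 instead of queuing. The sketch below is a minimal restatement of that idea under assumed names (`admission`, `admissionHandler`), not the PR's actual `rpcAdmissionHandler`:

```go
// Package admission: illustrative sketch only, not the PR's implementation.
package admission

import (
	"net/http"

	"golang.org/x/sync/semaphore"
)

// admissionHandler wraps next with a fixed concurrency limit. Requests beyond
// the limit are rejected immediately with HTTP 503, so their goroutines never
// reach CORS, Gzip, JSON decoding, or the DB layer.
func admissionHandler(next http.Handler, limit int64) http.Handler {
	if limit <= 0 {
		return next // a non-positive limit means admission control is disabled
	}
	sem := semaphore.NewWeighted(limit)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// TryAcquire never blocks: if all slots are busy, fail fast.
		if !sem.TryAcquire(1) {
			http.Error(w, "server overloaded, retry later", http.StatusServiceUnavailable)
			return
		}
		defer sem.Release(1)
		next.ServeHTTP(w, r)
	})
}
```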
### Benchmark & Stress Test Results

Setup: 32 cores, 64 GB RAM, 70 GB swap. Minimal node in sync. Parallel `eth_call` stress tests (28k QPS).

<details>
<summary><b>Click to expand: Benchmark Data (Before vs After on local node)</b></summary>

### Current SW (main release)

**CPU**

```
03:23:56 PM all 29.55 0.00 22.30 34.33 0.00 13.83
03:24:06 PM all 56.41 0.00 15.44 10.83 0.00 17.32
03:24:16 PM all 75.60 0.00 13.36 2.86 0.00 8.18
03:24:26 PM all 73.19 0.00 14.35 2.82 0.00 9.63
03:24:36 PM all 73.35 0.00 14.56 2.75 0.00 9.34
```

**Memory**

```
15:23:30 rss=31.89GB vsz=7.65TB proc_swap=11.81GB sys_swap=27.21/72.00GB MemAvail=1.15GB SwapAvail=44.79GB
15:23:40 rss=32.74GB vsz=7.65TB proc_swap=11.00GB sys_swap=27.02/72.00GB MemAvail=1.50GB SwapAvail=44.98GB
15:23:50 rss=33.83GB vsz=7.65TB proc_swap=9.89GB sys_swap=25.65/72.00GB MemAvail=1.44GB SwapAvail=46.35GB
15:24:00 rss=36.33GB vsz=7.65TB proc_swap=7.60GB sys_swap=23.55/72.00GB MemAvail=1.67GB SwapAvail=48.45GB
15:24:10 rss=37.85GB vsz=7.65TB proc_swap=6.91GB sys_swap=21.83/72.00GB MemAvail=5.10GB SwapAvail=50.17GB
15:24:20 rss=39.30GB vsz=7.65TB proc_swap=6.69GB sys_swap=20.23/72.00GB MemAvail=7.28GB SwapAvail=51.77GB
15:24:30 rss=40.40GB vsz=7.65TB proc_swap=6.20GB sys_swap=17.94/72.00GB MemAvail=10.20GB SwapAvail=54.06GB
15:24:40 rss=41.44GB vsz=7.65TB proc_swap=5.23GB sys_swap=14.95/72.00GB MemAvail=20.01GB SwapAvail=57.05GB
15:24:50 rss=41.68GB vsz=7.65TB proc_swap=5.20GB sys_swap=14.92/72.00GB MemAvail=16.14GB SwapAvail=57.08GB
15:25:00 rss=42.77GB vsz=7.65TB proc_swap=4.95GB sys_swap=14.87/72.00GB MemAvail=11.41GB SwapAvail=57.13GB
15:25:11 rss=42.78GB vsz=7.65TB proc_swap=5.26GB sys_swap=15.55/72.00GB MemAvail=8.58GB SwapAvail=56.45GB
15:25:21 rss=40.79GB vsz=7.65TB proc_swap=6.88GB sys_swap=17.46/72.00GB MemAvail=5.65GB SwapAvail=54.54GB
```

**TIP tracking**

```
[15:21:44] block #24,656,279 ts=2026-03-14 15:19:47 lag=+117.8s ALERT: lag=117.8s — node is behind the tip!
[15:21:44] block #24,656,280 ts=2026-03-14 15:19:59 lag=+105.8s ALERT: lag=105.8s — node is behind the tip!
[15:21:44] block #24,656,281 ts=2026-03-14 15:20:11 lag=+93.8s ALERT: lag=93.8s — node is behind the tip!
[15:21:44] block #24,656,282 ts=2026-03-14 15:20:23 lag=+81.8s ALERT: lag=81.8s — node is behind the tip!
[15:21:44] block #24,656,283 ts=2026-03-14 15:20:47 lag=+57.8s ALERT: lag=57.8s — node is behind the tip!
[15:21:57] block #24,656,284 ts=2026-03-14 15:20:59 lag=+58.0s ALERT: lag=58.0s — node is behind the tip!
[15:21:57] block #24,656,285 ts=2026-03-14 15:21:11 lag=+46.0s ALERT: lag=46.0s — node is behind the tip!
[15:21:57] block #24,656,286 ts=2026-03-14 15:21:23 lag=+34.0s ALERT: lag=34.0s — node is behind the tip!
[15:21:57] block #24,656,287 ts=2026-03-14 15:21:35 lag=+22.0s ALERT: lag=22.0s — node is behind the tip!
[15:21:57] block #24,656,288 ts=2026-03-14 15:21:47 lag=+10.0s OK
[15:22:07] block #24,656,289 ts=2026-03-14 15:21:59 lag=+8.0s OK
[15:22:19] block #24,656,290 ts=2026-03-14 15:22:11 lag=+8.3s OK
[15:22:32] block #24,656,291 ts=2026-03-14 15:22:23 lag=+9.3s OK
[15:23:02] ALERT: no new block for 30s (last block #24656291) — node may be losing the tip!
[15:23:32] ALERT: no new block for 60s (last block #24656291) — node may be losing the tip!
[15:24:02] ALERT: no new block for 90s (last block #24656291) — node may be losing the tip!
[15:24:24] block #24,656,292 ts=2026-03-14 15:22:35 lag=+109.5s ALERT: lag=109.5s — node is behind the tip!
[15:24:24] block #24,656,293 ts=2026-03-14 15:22:47 lag=+97.5s ALERT: lag=97.5s — node is behind the tip!
[15:24:24] block #24,656,294 ts=2026-03-14 15:22:59 lag=+85.5s ALERT: lag=85.5s — node is behind the tip!
[15:24:24] block #24,656,295 ts=2026-03-14 15:23:11 lag=+73.5s ALERT: lag=73.5s — node is behind the tip!
[15:24:54] ALERT: no new block for 30s (last block #24656295) — node may be losing the tip!
[15:25:17] block #24,656,296 ts=2026-03-14 15:23:23 lag=+114.2s ALERT: lag=114.2s — node is behind the tip!
[15:25:17] block #24,656,297 ts=2026-03-14 15:23:35 lag=+102.2s ALERT: lag=102.2s — node is behind the tip!
[15:25:17] block #24,656,298 ts=2026-03-14 15:23:47 lag=+90.2s ALERT: lag=90.2s — node is behind the tip!
[15:25:17] block #24,656,299 ts=2026-03-14 15:23:59 lag=+78.2s ALERT: lag=78.2s — node is behind the tip!
[15:25:17] block #24,656,300 ts=2026-03-14 15:24:11 lag=+66.2s ALERT: lag=66.2s — node is behind the tip!
[15:25:17] block #24,656,301 ts=2026-03-14 15:24:23 lag=+54.2s ALERT: lag=54.2s — node is behind the tip!
[15:25:17] block #24,656,302 ts=2026-03-14 15:24:35 lag=+42.2s ALERT: lag=42.2s — node is behind the tip!
[15:25:17] block #24,656,303 ts=2026-03-14 15:24:47 lag=+30.2s ALERT: lag=30.2s — node is behind the tip!
```

**Stress test runs**

```
> ./run_perf_tests.py -p pattern/mainnet/stress_test_eth_call_001_latest.tar -t 28000:60 -y eth_call -m 2 -r 100 -Z
Performance Test started
Test repetitions: 100 on sequence: 28000:60 for pattern: pattern/mainnet/stress_test_eth_call_001_latest.tar
Test on port: http://localhost:8545
[1. 1] daemon: executes test qps: 28000 time: 60 -> [R=100.00% max=1m39s]
[1. 2] daemon: executes test qps: 28000 time: 60 -> [R=100.00% max=1m46s]
[1. 3] daemon: executes test qps: 28000 time: 60 -> [R=100.00% max=1m38s]

> ./run_perf_tests.py -p pattern/mainnet/stress_test_eth_call_001_latest.tar -t 28000:60 -y eth_call -m 2 -r 100 -Z
Performance Test started
Test repetitions: 100 on sequence: 28000:60 for pattern: pattern/mainnet/stress_test_eth_call_001_latest.tar
Test on port: http://localhost:8545
[1. 1] daemon: executes test qps: 28000 time: 60 -> [R=100.00% max=1m39s]
[1. 2] daemon: executes test qps: 28000 time: 60 -> [R=100.00% max=1m45s]
[1. 3] daemon: executes test qps: 28000 time: 60 -> [R=100.00% max=1m40s]
```

### NEW Software (with PR)

**CPU**

```
07:58:51 AM all 51.09 0.00 6.16 0.35 0.00 42.40
07:58:56 AM all 49.26 0.00 5.82 0.03 0.00 44.89
07:59:01 AM all 50.34 0.00 5.95 0.20 0.00 43.51
07:59:06 AM all 51.60 0.00 5.88 0.04 0.00 42.47
07:59:11 AM all 48.97 0.00 5.90 0.06 0.00 45.07
07:59:16 AM all 49.59 0.00 6.11 0.36 0.00 43.93
07:59:21 AM all 48.69 0.00 5.78 0.03 0.00 45.51
07:59:26 AM all 53.50 0.00 6.66 0.26 0.00 39.59
07:59:31 AM all 50.45 0.00 6.37 0.02 0.00 43.16
07:59:36 AM all 48.71 0.00 6.18 0.03 0.00 45.08
07:59:41 AM all 53.58 0.00 6.45 0.15 0.00 39.81
07:59:46 AM all 53.74 0.00 6.13 0.05 0.00 40.07
07:59:51 AM all 31.76 0.00 3.95 0.23 0.00 64.06
07:59:56 AM all 37.20 0.00 5.05 0.03 0.00 57.71
08:00:01 AM all 77.10 0.00 12.95 0.01 0.00 9.94
08:00:06 AM all 78.22 0.00 12.58 0.08 0.00 9.11
08:00:11 AM all 77.64 0.00 12.50 0.00 0.00 9.86
08:00:16 AM all 77.48 0.00 12.61 0.08 0.00 9.83
08:00:21 AM all 77.61 0.00 12.47 0.01 0.00 9.90
08:00:26 AM all 77.35 0.00 12.89 0.06 0.00 9.70
08:00:31 AM all 77.85 0.00 12.92 0.04 0.00 9.19
08:00:36 AM all 77.73 0.00 12.80 0.02 0.00 9.44
08:00:41 AM all 78.42 0.00 12.95 0.05 0.00 8.59
08:00:46 AM all 78.52 0.00 12.55 0.01 0.00 8.93
08:00:51 AM all 78.42 0.00 12.77 0.19 0.00 8.62
08:00:56 AM all 56.98 0.00 8.64 0.11 0.00 34.28
```

**Memory**

```
2026-03-20 08:00:36 pid=1117840 rss=30.04GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.93GB SwapAvail=71.02GB
2026-03-20 08:00:41 pid=1117840 rss=30.20GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.86GB SwapAvail=71.02GB
2026-03-20 08:00:46 pid=1117840 rss=30.20GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.90GB SwapAvail=71.02GB
2026-03-20 08:00:51 pid=1117840 rss=30.28GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.88GB SwapAvail=71.02GB
2026-03-20 08:00:56 pid=1117840 rss=30.54GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=40.39GB SwapAvail=71.02GB
2026-03-20 08:01:02 pid=1117840 rss=30.61GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=40.25GB SwapAvail=71.02GB
2026-03-20 08:01:07 pid=1117840 rss=30.61GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.97GB SwapAvail=71.02GB
2026-03-20 08:01:12 pid=1117840 rss=30.62GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.48GB SwapAvail=71.02GB
2026-03-20 08:01:17 pid=1117840 rss=30.71GB vsz=7.49TB proc_swap=0.00GB sys_swap=0.98/72.00GB MemAvail=39.57GB SwapAvail=71.02GB
```

**TIP tracking**

```
[07:56:10] block #24,697,055 ts=2026-03-20 07:55:59 lag=+12.0s OK
[07:56:15] block #24,697,056 ts=2026-03-20 07:56:11 lag=+4.5s OK
[07:56:25] block #24,697,057 ts=2026-03-20 07:56:23 lag=+2.5s OK
[07:56:38] block #24,697,058 ts=2026-03-20 07:56:35 lag=+3.4s OK
[07:56:50] block #24,697,059 ts=2026-03-20 07:56:47 lag=+3.5s OK
[07:57:02] block #24,697,060 ts=2026-03-20 07:56:59 lag=+3.6s OK
[07:57:16] block #24,697,061 ts=2026-03-20 07:57:11 lag=+5.6s OK
[07:57:27] block #24,697,062 ts=2026-03-20 07:57:23 lag=+4.7s OK
[07:57:39] block #24,697,063 ts=2026-03-20 07:57:35 lag=+4.3s OK
[07:57:49] block #24,697,064 ts=2026-03-20 07:57:47 lag=+2.4s OK
[07:58:01] block #24,697,065 ts=2026-03-20 07:57:59 lag=+2.9s OK
[07:58:13] block #24,697,066 ts=2026-03-20 07:58:11 lag=+2.8s OK
[07:58:25] block #24,697,067 ts=2026-03-20 07:58:23 lag=+2.4s OK
[07:58:37] block #24,697,068 ts=2026-03-20 07:58:35 lag=+2.7s OK
[07:58:49] block #24,697,069 ts=2026-03-20 07:58:47 lag=+2.3s OK
[07:59:01] block #24,697,070 ts=2026-03-20 07:58:59 lag=+2.1s OK
[07:59:15] block #24,697,071 ts=2026-03-20 07:59:11 lag=+4.3s OK
[07:59:25] block #24,697,072 ts=2026-03-20 07:59:23 lag=+2.6s OK
[07:59:40] block #24,697,073 ts=2026-03-20 07:59:35 lag=+5.3s OK
[08:00:02] block #24,697,074 ts=2026-03-20 07:59:59 lag=+3.9s OK
[08:00:13] block #24,697,075 ts=2026-03-20 08:00:11 lag=+2.8s OK
```

**Stress test runs**

```
> ./run_perf_tests.py -p pattern/mainnet/stress_test_eth_call_001_latest.tar -t 28000:60 -y eth_call -m 2 -r 100 -Z
Performance Test started
Test repetitions: 100 on sequence: 28000:60 for pattern: pattern/mainnet/stress_test_eth_call_001_latest.tar
Test on port: http://localhost:8545
[1. 1] daemon: executes test qps: 28000 time: 60 -> [R=51.39% max=605.449ms error=503 Service Unavailable]
[1. 2] daemon: executes test qps: 28000 time: 60 -> [R=51.55% max=442.974ms error=503 Service Unavailable]
[1. 3] daemon: executes test qps: 28000 time: 60 -> [R=49.52% max=440.405ms error=503 Service Unavailable]
[1. 4] daemon: executes test qps: 28000 time: 60 -> [R=51.01% max=440.004ms error=503 Service Unavailable]
[1. 5] daemon: executes test qps: 28000 time: 60 -> [R=49.66% max=597.333ms error=503 Service Unavailable]

> ./run_perf_tests.py -p pattern/mainnet/stress_test_eth_call_001_latest.tar -t 28000:60 -y eth_call -m 2 -r 100 -Z
Performance Test started
Test repetitions: 100 on sequence: 28000:60 for pattern: pattern/mainnet/stress_test_eth_call_001_latest.tar
Test on port: http://localhost:8545
[1. 1] daemon: executes test qps: 28000 time: 60 -> [R=51.51% max=581.793ms error=503 Service Unavailable]
[1. 2] daemon: executes test qps: 28000 time: 60 -> [R=51.61% max=431.222ms error=503 Service Unavailable]
[1. 3] daemon: executes test qps: 28000 time: 60 -> [R=49.48% max=495.57ms error=503 Service Unavailable]
[1. 4] daemon: executes test qps: 28000 time: 60 -> [R=50.91% max=433.208ms error=503 Service Unavailable]
[1. 5] daemon: executes test qps: 28000 time: 60 -> [R=49.57% max=538.283ms error=503 Service Unavailable]
```

Verified on the CI TIP-tracking infrastructure. Previous software versions experienced "TIP lost" at 3,000 QPS. With these changes, the system now successfully handles up to 6,000 QPS without any TIP loss or degradation.

</details>

### Stress Test Observations (main release)

- **Chain Tip Loss**: Under heavy load, the node fails to stay synced and the chain tip is lost, as the staged sync pipeline is starved of DB read slots by queued RPC goroutines.
- **Virtual Memory Pressure**: The system experiences severe VM pressure, with process swap usage reaching 11.81 GB. The massive accumulation of goroutines blocked on `roTxsLimiter.Acquire` causes excessive paging and swapping. This state is highly unstable and frequently leads to the process being terminated by the OOM killer, causing total node downtime.
- **Request Satisfaction (100%)**: Despite the performance degradation, all requests are eventually satisfied. However, this is achieved at the cost of system stability and synchronization.
- **Increased Latency**: Request latency increases dramatically due to deep queuing, with response times reaching up to 1m 40s.

---

### Stress Test Observations (with PR)

- **Chain Tip Stability**: The two-level admission control prevents goroutine accumulation entirely. The HTTP outer gate rejects excess requests before any processing; the `BeginRo` inner gate ensures that any RPC request that does enter the system uses `TryAcquire` (fail-fast) rather than blocking. Internal callers (staged sync, background workers) always use blocking `Acquire` and are never rejected, so the pipeline makes continuous progress.
- **Virtual Memory Pressure**: Significantly lower memory footprint. By eliminating request queuing at the HTTP layer, the system avoids excessive paging and swapping (0.00 GB swap), keeping the OS stable.
- **Request Satisfaction (~50%)**: Approximately 50% of requests are satisfied; the remainder are immediately rejected with `503 Service Unavailable`. This is the intended fail-fast behavior — goroutines never accumulate, DB slots are never exhausted.
- **Latency Consistency**: Response latency remains consistently low. By refusing to queue requests beyond the system's capacity, the node avoids the massive latency spikes (previously up to 1m 40s) seen before the fix. This behavior is aligned with Nethermind, which returns `503 Service Unavailable` under high load, prioritizing node health over request queuing.

---

### Final Observation

By adopting a fail-fast strategy at two levels — HTTP admission before any expensive processing, and `TryAcquire` inside `BeginRo` for RPC callers — we enforce resource isolation at the core level. Internal execution paths retain guaranteed access to DB read slots via blocking `Acquire`, while external RPC pressure is shed immediately. This approach shifts congestion management responsibility to the external infrastructure (load balancers, proxies), which is better equipped to handle buffering, ensuring that the Erigon node remains stable and synchronized regardless of external RPC load. A sketch of such client-side handling follows.
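Since rejected requests surface as plain HTTP 503s, a client or proxy in front of the node can take over the buffering the node no longer does. A hedged sketch (the endpoint, attempt count, and backoff values are illustrative, not part of the PR):

```go
package rpcclient

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

// callWithRetry treats 503 as "shed by admission control, retry later"
// rather than a dead node, and backs off exponentially between attempts.
func callWithRetry(url string, reqBody []byte, attempts int) (*http.Response, error) {
	backoff := 50 * time.Millisecond
	for i := 0; i < attempts; i++ {
		resp, err := http.Post(url, "application/json", bytes.NewReader(reqBody))
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusServiceUnavailable {
			return resp, nil // success or a non-overload error: hand back to caller
		}
		resp.Body.Close() // discard the rejected response before retrying
		time.Sleep(backoff)
		backoff *= 2 // exponential backoff avoids re-creating the load spike
	}
	return nil, fmt.Errorf("still overloaded after %d attempts", attempts)
}
```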
## 🚀 RPC Concurrency & Resource Management Comparison

| Feature | Erigon (main) | **Erigon (with PR)** |
| :--- | :--- | :--- |
| **Admission control** | ❌ None | ✅ **HTTP outer gate** (`rpcAdmissionHandler`) |
| **Overload response** | Unlimited queuing | ✅ **Immediate HTTP 503** |
| **Rejection point** | ❌ None | ✅ Pre-CORS, Gzip, JSON decode |
| **Goroutine accumulation** | ⚠️ Yes, unlimited | ✅ **Eliminated** — goroutines don't enter the system |
| **Internal pipeline protection** | ❌ RPC and staged sync compete for slots | ✅ **Internal callers** use blocking `Acquire` |
| **DB slots protection** | ❌ None — RPC exhausts slots | ✅ `TryAcquire` in `BeginRo` for RPC |
| **Memory under load** | ❌ Critical — swap up to 11.81 GB, OOM | ✅ **Stable** (0.00 GB swap in test) |
| **Latency under overload** | High (~1m 40s) | ✅ **Consistently low** (fail-fast) |
| **Configuration required** | ❌ No concurrency flags | ✅ **Zero config**; `--rpc.max.concurrency` optional |
| **Execution isolation** | ❌ Chain tip lost under load | ✅ **Guaranteed by design** |

### 📊 Performance Comparison: Main (18/03) vs. PR

This benchmark compares the current `main` branch against this PR using the same set of APIs under heavy load.

| API | main (18/03) post_exec p50 | PR post_exec p50 | Improvement |
| :--- | :---: | :---: | :---: |
| **eth_call** @ 3000 QPS | 6.82s ✅ | 5.89s ✅ | **−14%** |
| **eth_getBlockByNumber** @ 3000 QPS | 13.73s ⚠️ | 5.23s ✅ | **−62%** |
| **eth_getProof** @ 1000–3000 QPS | 49.12s (tip lost) | 2.84s ✅ | **−94%** |

---

### 🔍 Key Observations

* **eth_call**: Neither `main` nor the PR caused a chain tip loss. Since `eth_call` is read-only and light on DB slots, it is inherently more stable, but the PR still delivers a **14% reduction** in p50 latency.
* **eth_getBlockByNumber**: Remains stable up to **6000 QPS** with no actual tip loss. Any observed `sync=0` periods during testing were identified as monitoring false negatives rather than actual node desync.
* **eth_getProof**: This is the most impactful result. While `main` lost the chain tip at only 1000 QPS (p50=49s), the **PR successfully holds up to 3000 QPS** with a p50 of 2.84s — a **94% performance gain**.

### 🏆 Overall Conclusion

The final PR successfully **eliminates chain tip loss** across all tested APIs and QPS levels. No real tip loss was observed in any production-level test run, ensuring much higher node reliability under stress.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent c93a0e9 commit b6d67fa

12 files changed

Lines changed: 179 additions & 12 deletions


.github/workflows/qa-rpc-performance-tests.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -122,7 +122,7 @@ jobs:
         if: matrix.client == 'erigon' || needs.setup.outputs.run_geth == 'true'
         run: |
           rm -rf ${{runner.workspace}}/rpc-tests
-          git -c advice.detachedHead=false clone --depth 1 --branch v1.115.0 https://github.com/erigontech/rpc-tests ${{runner.workspace}}/rpc-tests
+          git -c advice.detachedHead=false clone --depth 1 --branch v1.124.0 https://github.com/erigontech/rpc-tests ${{runner.workspace}}/rpc-tests
           cd ${{runner.workspace}}/rpc-tests
 
       - name: Clean Erigon Build Directory
```

.github/workflows/qa-tip-tracking-with-load.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -82,7 +82,7 @@ jobs:
         if: matrix.client == 'erigon' || (matrix.client == 'geth' && github.event.inputs.run_geth == 'true')
         run: |
           rm -rf ${{runner.workspace}}/rpc-tests
-          git -c advice.detachedHead=false clone --depth 1 --branch v1.78.0 https://github.com/erigontech/rpc-tests ${{runner.workspace}}/rpc-tests
+          git -c advice.detachedHead=false clone --depth 1 --branch v1.124.0 https://github.com/erigontech/rpc-tests ${{runner.workspace}}/rpc-tests
           cd ${{runner.workspace}}/rpc-tests
 
       - name: Clean Erigon Build Directory
```

cmd/rpcdaemon/cli/config.go

Lines changed: 15 additions & 2 deletions
```diff
@@ -129,6 +129,7 @@ func RootCommand() (*cobra.Command, *httpcfg.HttpCfg) {
 	rootCmd.PersistentFlags().BoolVar(&cfg.RpcStreamingDisable, utils.RpcStreamingDisableFlag.Name, false, utils.RpcStreamingDisableFlag.Usage)
 	rootCmd.PersistentFlags().BoolVar(&cfg.DebugSingleRequest, utils.HTTPDebugSingleFlag.Name, false, utils.HTTPDebugSingleFlag.Usage)
 	rootCmd.PersistentFlags().IntVar(&cfg.DBReadConcurrency, utils.DBReadConcurrencyFlag.Name, utils.DBReadConcurrencyFlag.Value, utils.DBReadConcurrencyFlag.Usage)
+	rootCmd.PersistentFlags().IntVar(&cfg.RpcMaxConcurrentRequests, utils.RpcMaxConcurrentRequestsFlag.Name, utils.RpcMaxConcurrentRequestsFlag.Value, utils.RpcMaxConcurrentRequestsFlag.Usage)
 	rootCmd.PersistentFlags().BoolVar(&cfg.TraceCompatibility, "trace.compat", false, "Bug for bug compatibility with OE for trace_ routines")
 	rootCmd.PersistentFlags().BoolVar(&cfg.GethCompatibility, "rpc.gethcompat", false, "Enables Geth-compatible storage iteration order for debug_storageRangeAt (sorted by keccak256 hash). Disabled by default for performance.")
 	rootCmd.PersistentFlags().BoolVar(&cfg.TestingEnabled, "rpc.testing", false, "Enables the testing_ RPC namespace (testing_buildBlockV1). WARNING: do not enable on production networks.")
@@ -747,7 +748,17 @@ func startRegularRpcServer(ctx context.Context, cfg *httpcfg.HttpCfg, rpcAPI []r
 		logger.Info("Socket Endpoint opened", "url", socketUrl)
 	}
 
-	httpHandler := node.NewHTTPHandlerStack(srv, cfg.HttpCORSDomain, cfg.HttpVirtualHost, cfg.HttpCompression)
+	// RPC admission limit: -1 = unlimited, 0 = use db.read.concurrency, >0 = explicit limit.
+	var rpcConcurrencyLimit int64
+	switch {
+	case cfg.RpcMaxConcurrentRequests == -1:
+		rpcConcurrencyLimit = 0 // disabled
+	case cfg.RpcMaxConcurrentRequests > 0:
+		rpcConcurrencyLimit = int64(cfg.RpcMaxConcurrentRequests)
+	default:
+		rpcConcurrencyLimit = int64(cfg.DBReadConcurrency)
+	}
+	httpHandler := node.NewHTTPHandlerStack(srv, cfg.HttpCORSDomain, cfg.HttpVirtualHost, cfg.HttpCompression, rpcConcurrencyLimit, true)
 	var wsHandler http.Handler
 	if cfg.WebsocketEnabled {
 		wsHandler = srv.WebsocketHandler([]string{"*"}, nil, cfg.WebsocketCompression, logger)
@@ -958,7 +969,9 @@ func createEngineListener(cfg *httpcfg.HttpCfg, engineApi []rpc.API, logger log.
 
 	wsHandler := engineSrv.WebsocketHandler([]string{"*"}, jwtSecret, cfg.WebsocketCompression, logger)
 
-	engineHttpHandler := node.NewHTTPHandlerStack(engineSrv, nil /* authCors */, cfg.AuthRpcVirtualHost, cfg.HttpCompression)
+	// Engine API (auth) is the CL↔EL protocol — not user RPC. Do not tag with TxPriorityRPC
+	// so execution-engine DB operations use blocking Acquire instead of fail-fast TryAcquire.
+	engineHttpHandler := node.NewHTTPHandlerStack(engineSrv, nil /* authCors */, cfg.AuthRpcVirtualHost, cfg.HttpCompression, 0, false)
 
 	graphQLHandler := graphql.CreateHandler(engineApi)
```

cmd/rpcdaemon/cli/httpcfg/http_cfg.go

Lines changed: 1 addition & 0 deletions
```diff
@@ -72,6 +72,7 @@ type HttpCfg struct {
 	RpcStreamingDisable bool
 	RpcFiltersConfig    rpchelper.FiltersConfig
 	DBReadConcurrency   int
+	RpcMaxConcurrentRequests int // HTTP admission control limit; -1 = unlimited
 	TraceCompatibility  bool // Bug for bug compatibility for trace_ routines with OpenEthereum
 	GethCompatibility   bool // Geth-compatible storage iteration order for debug_storageRangeAt
 	TxPoolApiAddr       string
```

cmd/utils/flags.go

Lines changed: 5 additions & 0 deletions
```diff
@@ -394,6 +394,11 @@ var (
 		Usage: "Does limit amount of parallel db reads. Default: equal to GOMAXPROCS (or number of CPU)",
 		Value: min(max(10, runtime.GOMAXPROCS(-1)*64), 9_000),
 	}
+	RpcMaxConcurrentRequestsFlag = cli.IntFlag{
+		Name:  "rpc.max.concurrency",
+		Usage: "Maximum number of concurrent HTTP RPC requests (HTTP admission control). 0 = use db.read.concurrency, -1 = unlimited (no admission control)",
+		Value: 0,
+	}
 	RpcAccessListFlag = cli.StringFlag{
 		Name:  "rpc.accessList",
 		Usage: "Specify granular (method-by-method) API allowlist",
```

db/kv/kv_interface.go

Lines changed: 17 additions & 0 deletions
```diff
@@ -726,6 +726,23 @@
 	//DbGcSelfPnlMergeCalls = metrics.NewCounter(`db_gc_pnl{phase="slef_merge_calls"}`) //nolint
 )
 
+// ErrServerOverloaded is returned by BeginRo when the DB semaphore is full and the caller is an RPC handler.
+var ErrServerOverloaded = errors.New("server overloaded, retry later")
+
+type nonBlockingAcquireKey struct{}
+
+// WithNonBlockingAcquire tags ctx to request fail-fast semaphore acquisition in BeginRo.
+// When set, BeginRo uses TryAcquire and returns ErrServerOverloaded immediately if the
+// read-tx semaphore is full, instead of blocking until a slot is available.
+func WithNonBlockingAcquire(ctx context.Context) context.Context {
+	return context.WithValue(ctx, nonBlockingAcquireKey{}, struct{}{})
+}
+
+// IsNonBlockingAcquire reports whether ctx was tagged by WithNonBlockingAcquire.
+func IsNonBlockingAcquire(ctx context.Context) bool {
+	return ctx.Value(nonBlockingAcquireKey{}) != nil
+}
+
 type Closer interface {
 	Close()
 }
```
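A hedged sketch of how these helpers are meant to be used by an RPC-side caller. The handler shape and variable names are illustrative, and `kv.RoDB` is assumed to be the read-only DB interface; only `kv.WithNonBlockingAcquire`, `kv.ErrServerOverloaded`, and `BeginRo` come from the diff:

```go
package example

import (
	"errors"
	"net/http"

	"github.com/erigontech/erigon/db/kv"
)

// handleRead tags the request context so BeginRo fails fast under load,
// then maps ErrServerOverloaded to the same 503 the HTTP outer gate uses.
func handleRead(db kv.RoDB, w http.ResponseWriter, r *http.Request) {
	ctx := kv.WithNonBlockingAcquire(r.Context())

	tx, err := db.BeginRo(ctx)
	if err != nil {
		if errors.Is(err, kv.ErrServerOverloaded) {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer tx.Rollback()
	// ... perform reads on tx ...
}
```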

db/kv/mdbx/kv_mdbx.go

Lines changed: 10 additions & 2 deletions
```diff
@@ -46,8 +46,11 @@ import (
 	"github.com/erigontech/erigon/db/kv/dbcfg"
 	"github.com/erigontech/erigon/db/kv/order"
 	"github.com/erigontech/erigon/db/kv/stream"
+	"github.com/erigontech/erigon/diagnostics/metrics"
 )
 
+var dbRoTxOverloaded = metrics.GetOrCreateCounter(`db_rotx_overloaded_total`)
+
 func init() {
 	mdbx.MapFullErrorMessage += " You can try remove the database files (e.g., by running rm -rf /path/to/db)"
 }
@@ -582,8 +585,13 @@ func (db *MdbxKV) BeginRo(ctx context.Context) (txn kv.Tx, err error) {
 		return nil, errors.New("db closed")
 	}
 
-	// will return nil err if context is cancelled (may appear to acquire the semaphore)
-	if semErr := db.roTxsLimiter.Acquire(ctx, 1); semErr != nil {
+	if kv.IsNonBlockingAcquire(ctx) {
+		if !db.roTxsLimiter.TryAcquire(1) {
+			db.trackTxEnd()
+			dbRoTxOverloaded.Inc()
+			return nil, kv.ErrServerOverloaded
+		}
+	} else if semErr := db.roTxsLimiter.Acquire(ctx, 1); semErr != nil {
 		db.trackTxEnd()
 		return nil, fmt.Errorf("mdbx.MdbxKV.BeginRo: roTxsLimiter error %w", semErr)
 	}
```
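The behavioral difference between the two branches comes down to standard weighted-semaphore semantics, shown standalone below with plain `golang.org/x/sync/semaphore`, independent of Erigon's code:

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/semaphore"
)

func main() {
	sem := semaphore.NewWeighted(1)

	// Take the only slot, as a long-running internal reader would.
	_ = sem.Acquire(context.Background(), 1)

	// RPC path: TryAcquire returns false immediately instead of queuing,
	// which is where BeginRo would return kv.ErrServerOverloaded.
	if !sem.TryAcquire(1) {
		fmt.Println("overloaded: fail fast, no goroutine parks here")
	}

	sem.Release(1)

	// Internal path: blocking Acquire waits and succeeds once a slot frees.
	if err := sem.Acquire(context.Background(), 1); err == nil {
		fmt.Println("internal caller acquired a slot")
		sem.Release(1)
	}
}
```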

db/kv/remotedbserver/remotedbserver.go

Lines changed: 2 additions & 2 deletions
```diff
@@ -135,7 +135,7 @@ func (s *KvServer) begin(ctx context.Context) (id uint64, err error) {
 	}
 	s.txsMapLock.Lock()
 	defer s.txsMapLock.Unlock()
-	tx, errBegin := s.kv.BeginTemporalRo(ctx) //nolint:gocritic
+	tx, errBegin := s.kv.BeginTemporalRo(ctx) //nolint:gocritic // tx is stored in s.txs and rolled back by rollback(); defer would close it prematurely
 	if errBegin != nil {
 		return 0, errBegin
 	}
@@ -157,7 +157,7 @@ func (s *KvServer) renew(ctx context.Context, id uint64) (err error) {
 		defer tx.Unlock()
 		tx.Rollback()
 	}
-	newTx, errBegin := s.kv.BeginTemporalRo(ctx) //nolint:gocritic
+	newTx, errBegin := s.kv.BeginTemporalRo(ctx) //nolint:gocritic // tx is stored in s.txs and rolled back by rollback(); defer would close it prematurely
 	if errBegin != nil {
 		return fmt.Errorf("kvserver: %w", errBegin)
 	}
```

node/cli/default_flags.go

Lines changed: 1 addition & 0 deletions
```diff
@@ -84,6 +84,7 @@ var DefaultFlags = []cli.Flag{
 	&utils.RpcBatchConcurrencyFlag,
 	&utils.RpcStreamingDisableFlag,
 	&utils.DBReadConcurrencyFlag,
+	&utils.RpcMaxConcurrentRequestsFlag,
 	&utils.RpcAccessListFlag,
 	&utils.RpcTraceCompatFlag,
 	&utils.RpcGethCompatFlag,
```

node/cli/flags.go

Lines changed: 1 addition & 0 deletions
```diff
@@ -447,6 +447,7 @@ func setEmbeddedRpcDaemon(ctx *cli.Context, cfg *nodecfg.Config, logger log.Logg
 		RpcBatchConcurrency: ctx.Uint(utils.RpcBatchConcurrencyFlag.Name),
 		RpcStreamingDisable: ctx.Bool(utils.RpcStreamingDisableFlag.Name),
 		DBReadConcurrency:   ctx.Int(utils.DBReadConcurrencyFlag.Name),
+		RpcMaxConcurrentRequests: ctx.Int(utils.RpcMaxConcurrentRequestsFlag.Name),
 		RpcAllowListFilePath: ctx.String(utils.RpcAccessListFlag.Name),
 		RpcFiltersConfig: rpchelper.FiltersConfig{
 			RpcSubscriptionFiltersMaxLogs: ctx.Int(RpcSubscriptionFiltersMaxLogsFlag.Name),
```
