Description
The issue I encountered is that after disabling irqbalance and manually spreading the network card's interrupt affinities across NUMA nodes, the scx_simple scheduler shows roughly a 30% performance degradation compared to running with irqbalance.
My test machine has 4 NUMA nodes, each with 32 CPUs.
The test case involves 40 Redis instances running redis-benchmark for SET and GET operations.
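Roughly, the load looks like the following sketch (hypothetical, since the exact ports, client counts, and request counts are not given above; the port range and benchmark parameters are placeholders):

for i in $(seq 0 39); do
    port=$((6379 + i))                               # placeholder port range for the 40 instances
    redis-server --port $port --daemonize yes
    redis-benchmark -p $port -t set,get -n 100000 > bench_$port.log &
done
wait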
[root@sd home]# /home/linux/tools/sched_ext/build/bin/scx_simple
local=4 global=1
local=811 global=22
local=1689 global=46
local=2556 global=67
local=3402 global=101
local=4213 global=130
local=5095 global=150
local=5938 global=201
local=6784 global=228
local=7630 global=264
local=8429 global=281
local=9289 global=295
local=10524 global=403
local=143708 global=1485
local=296284 global=2335
local=448743 global=2832
local=562712 global=2962
local=564082 global=3126
...
Under the default system configuration, all 24 interrupts of the network card were bound to CPUs on NUMA node 0 (a sketch of how these masks can be read back follows the listing):
IRQ 101 -> CPU MASK: 00000000,00000000,00000000,00000001
IRQ 102 -> CPU MASK: 00000000,00000000,00000000,00000002
IRQ 103 -> CPU MASK: 00000000,00000000,00000000,00000004
IRQ 104 -> CPU MASK: 00000000,00000000,00000000,00000008
IRQ 105 -> CPU MASK: 00000000,00000000,00000000,00000010
IRQ 106 -> CPU MASK: 00000000,00000000,00000000,00000020
IRQ 107 -> CPU MASK: 00000000,00000000,00000000,00000040
IRQ 108 -> CPU MASK: 00000000,00000000,00000000,00000080
IRQ 109 -> CPU MASK: 00000000,00000000,00000000,00000100
IRQ 110 -> CPU MASK: 00000000,00000000,00000000,00000200
IRQ 111 -> CPU MASK: 00000000,00000000,00000000,00000400
IRQ 112 -> CPU MASK: 00000000,00000000,00000000,00000800
IRQ 113 -> CPU MASK: 00000000,00000000,00000000,00001000
IRQ 114 -> CPU MASK: 00000000,00000000,00000000,00002000
IRQ 115 -> CPU MASK: 00000000,00000000,00000000,00004000
IRQ 116 -> CPU MASK: 00000000,00000000,00000000,00008000
IRQ 117 -> CPU MASK: 00000000,00000000,00000000,00010000
IRQ 118 -> CPU MASK: 00000000,00000000,00000000,00020000
IRQ 119 -> CPU MASK: 00000000,00000000,00000000,00040000
IRQ 120 -> CPU MASK: 00000000,00000000,00000000,00080000
IRQ 121 -> CPU MASK: 00000000,00000000,00000000,00100000
IRQ 122 -> CPU MASK: 00000000,00000000,00000000,00200000
IRQ 123 -> CPU MASK: 00000000,00000000,00000000,00400000
IRQ 124 -> CPU MASK: 00000000,00000000,00000000,00800000
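For reference, per-IRQ masks like the ones above can be read back roughly like this (a minimal sketch; the IRQ range 101-124 is taken from the listing, and the output format is adjusted to match it):

for irq in $(seq 101 124); do
    printf 'IRQ %s -> CPU MASK: %s\n' "$irq" "$(cat /proc/irq/$irq/smp_affinity)"
done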
The average performance results for SET and GET were approximately 23827.40 and 23920.64 requests per second, respectively.
Then irqbalance was disabled and 6 interrupts were bound to each NUMA node, spread across the node's cores at equal intervals (see the sketch after this listing):
IRQ 101 -> CPU MASK: 00000000,00000000,00000000,00000001
IRQ 102 -> CPU MASK: 00000000,00000000,00000000,00000040
IRQ 103 -> CPU MASK: 00000000,00000000,00000000,00001000
IRQ 104 -> CPU MASK: 00000000,00000000,00000000,00040000
IRQ 105 -> CPU MASK: 00000000,00000000,00000000,01000000
IRQ 106 -> CPU MASK: 00000000,00000000,00000000,40000000
IRQ 107 -> CPU MASK: 00000000,00000000,00000001,00000000
IRQ 108 -> CPU MASK: 00000000,00000000,00000040,00000000
IRQ 109 -> CPU MASK: 00000000,00000000,00001000,00000000
IRQ 110 -> CPU MASK: 00000000,00000000,00040000,00000000
IRQ 111 -> CPU MASK: 00000000,00000000,01000000,00000000
IRQ 112 -> CPU MASK: 00000000,00000000,40000000,00000000
IRQ 113 -> CPU MASK: 00000000,00000001,00000000,00000000
IRQ 114 -> CPU MASK: 00000000,00000040,00000000,00000000
IRQ 115 -> CPU MASK: 00000000,00001000,00000000,00000000
IRQ 116 -> CPU MASK: 00000000,00040000,00000000,00000000
IRQ 117 -> CPU MASK: 00000000,01000000,00000000,00000000
IRQ 118 -> CPU MASK: 00000000,40000000,00000000,00000000
IRQ 119 -> CPU MASK: 00000001,00000000,00000000,00000000
IRQ 120 -> CPU MASK: 00000040,00000000,00000000,00000000
IRQ 121 -> CPU MASK: 00001000,00000000,00000000,00000000
IRQ 122 -> CPU MASK: 00040000,00000000,00000000,00000000
IRQ 123 -> CPU MASK: 01000000,00000000,00000000,00000000
IRQ 124 -> CPU MASK: 40000000,00000000,00000000,00000000
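The distribution above corresponds to something along these lines (a sketch, not the exact script used; it assumes the IRQ range 101-124 from the listing, the 32-CPUs-per-node layout of this machine, and that irqbalance runs as a systemd service):

systemctl stop irqbalance                      # assumption: irqbalance is managed by systemd
for i in $(seq 0 23); do
    irq=$((101 + i))
    node=$((i / 6))                            # 6 IRQs per NUMA node
    cpu=$((32 * node + 6 * (i % 6)))           # every 6th core within the node
    echo $cpu > /proc/irq/$irq/smp_affinity_list
done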
The subsequent average performance results for SET and GET were approximately 15116.20 and 16338.75 requests per second, respectively.
SET: 23827.40 --> 15116.20
GET: 23920.64 --> 16338.75
This represents a degradation of 36.56% for SET operations and 31.70% for GET operations.
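For reference, the percentages follow directly from the throughput numbers above, computed as (before - after) / before:

awk 'BEGIN { printf "SET: %.2f%%  GET: %.2f%%\n",
             (23827.40 - 15116.20) / 23827.40 * 100,
             (23920.64 - 16338.75) / 23920.64 * 100 }'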
Additionally, the following two tests were conducted:
1. Under the same test conditions, other schedulers such as scx_central and scx_prev also showed varying degrees of performance degradation.
2. Under the same test conditions but with the default kernel scheduler instead of scx_simple, the same interrupt rebinding caused a performance drop of less than 1%.