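The Linux scheduler's time slice is generally about one tick, the reciprocal of HZ. HZ is usually set to 1000 at build time, which gives a 1 ms tick, so each process runs for a time slice of roughly 1 ms (in earlier years it was 10 ms, when HZ was 100). Suppose process 1 blocks, gives up the CPU, and rejoins the scheduling queue while processes 2 and 3 are queued ahead of it: in the worst case, process 1 is scheduled again only after 2 ms. Load determines how long the scheduling queue is. If the process whose turn comes up is already ready to run, nothing is wasted; if it is scheduled but not ready (for example, the network reply it is waiting for has not arrived), that scheduling turn is effectively wasted.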
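The arithmetic above can be sketched in a few lines. This is only an illustration of the simplified model in the paragraph (one full tick per queued process); the function names `tick_ms` and `worst_case_wait_ms` are made up for this example and do not correspond to anything in the kernel.

```python
def tick_ms(hz: int) -> float:
    """Length of one scheduler tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


def worst_case_wait_ms(hz: int, processes_ahead: int) -> float:
    """Upper bound on the wait, assuming every process ahead runs one full tick."""
    return processes_ahead * tick_ms(hz)


if __name__ == "__main__":
    # Compare the HZ values mentioned in the text, with two processes queued ahead.
    for hz in (100, 250, 1000):
        print(f"HZ={hz}: tick={tick_ms(hz):g} ms, "
              f"wait behind 2 processes <= {worst_case_wait_ms(hz, 2):g} ms")
```

The model ignores priorities, preemption, and multiple CPUs, so it should be read as a rough upper bound for the scenario described, not as a description of what the scheduler actually guarantees.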
sched_min_granularity_ns is the most prominent setting. In the original sched-design-CFS.txt this was described as the only “tunable” setting, “to tune the scheduler from ‘desktop’ (low latencies) to ‘server’ (good batching) workloads.”
In other words, we can change this setting to reduce overheads from context-switching, and therefore improve throughput at the cost of responsiveness (“latency”).
The CFS setting can be thought of as mimicking the previous build-time setting, CONFIG_HZ. In the first version of the CFS code, the default value was 1 ms, equivalent to 1000 Hz for “desktop” usage. Other supported values of CONFIG_HZ were 250 Hz (the default), and 100 Hz for the “server” end. 100 Hz was also useful when running Linux on very slow CPUs; this was one of the reasons given when CONFIG_HZ was first added as a build setting on x86.
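To see what value a given machine is using, a small script along these lines may help. It is a sketch under assumptions: the two candidate paths below are where the setting has commonly been exposed (a sysctl file on older kernels, a debugfs file on some newer ones), but which one exists, if any, depends on the kernel version and configuration, and reading debugfs normally requires root. The helper name `read_min_granularity_ns` is made up for this example.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumed locations; neither is guaranteed to exist on a given kernel.
CANDIDATE_PATHS = (
    "/proc/sys/kernel/sched_min_granularity_ns",
    "/sys/kernel/debug/sched/min_granularity_ns",
)


def read_min_granularity_ns(paths: Iterable[str] = CANDIDATE_PATHS) -> Optional[int]:
    """Return the first readable value in nanoseconds, or None if none is found."""
    for p in paths:
        try:
            return int(Path(p).read_text().strip())
        except (OSError, ValueError):
            # Missing file, no permission, or unexpected content: try the next path.
            continue
    return None


if __name__ == "__main__":
    value = read_min_granularity_ns()
    if value is None:
        print("sched_min_granularity_ns not found at the assumed paths")
    else:
        print(f"sched_min_granularity_ns = {value} ns ({value / 1e6:g} ms)")
```

A larger value means fewer forced context switches (better batching), a smaller one means tasks are preempted sooner (better responsiveness), which is the trade-off described above.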
model name : Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
2 physical CPUs, 26 cores/CPU, 2 hardware threads/core = 104 hw threads total
-- No CPU affinity --
10000000 system calls in 1144720626ns (114.5ns/syscall)
2000000 process context switches in 6280519812ns (3140.3ns/ctxsw)
2000000 thread context switches in 6417846724ns (3208.9ns/ctxsw)
2000000 thread context switches in 147035970ns (73.5ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 1109675081ns (111.0ns/syscall)
2000000 process context switches in 4204573541ns (2102.3ns/ctxsw)
2000000 thread context switches in 2740739815ns (1370.4ns/ctxsw)
2000000 thread context switches in 474815006ns (237.4ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 1039827099ns (104.0ns/syscall)
2000000 process context switches in 5622932975ns (2811.5ns/ctxsw)
2000000 thread context switches in 5697704164ns (2848.9ns/ctxsw)
2000000 thread context switches in 143474146ns (71.7ns/ctxsw)

----------
model name : Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz
2 physical CPUs, 16 cores/CPU, 2 hardware threads/core = 64 hw threads total
-- No CPU affinity --
10000000 system calls in 772827735ns (77.3ns/syscall)
2000000 process context switches in 4009838007ns (2004.9ns/ctxsw)
2000000 thread context switches in 5234823470ns (2617.4ns/ctxsw)
2000000 thread context switches in 193276269ns (96.6ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 746578449ns (74.7ns/syscall)
2000000 process context switches in 3598569493ns (1799.3ns/ctxsw)
2000000 thread context switches in 2475733882ns (1237.9ns/ctxsw)
2000000 thread context switches in 381484302ns (190.7ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 746674401ns (74.7ns/syscall)
2000000 process context switches in 4129856807ns (2064.9ns/ctxsw)
2000000 thread context switches in 4226458450ns (2113.2ns/ctxsw)
2000000 thread context switches in 193047255ns (96.5ns/ctxsw)

----------
model name : Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
2 physical CPUs, 24 cores/CPU, 2 hardware threads/core = 96 hw threads total
-- No CPU affinity --
10000000 system calls in 765013680ns (76.5ns/syscall)
2000000 process context switches in 5906908170ns (2953.5ns/ctxsw)
2000000 thread context switches in 6741875538ns (3370.9ns/ctxsw)
2000000 thread context switches in 173271254ns (86.6ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 764139687ns (76.4ns/syscall)
2000000 process context switches in 4040915457ns (2020.5ns/ctxsw)
2000000 thread context switches in 2327904634ns (1164.0ns/ctxsw)
2000000 thread context switches in 378847082ns (189.4ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 762375921ns (76.2ns/syscall)
2000000 process context switches in 5827318932ns (2913.7ns/ctxsw)
2000000 thread context switches in 6360562477ns (3180.3ns/ctxsw)
2000000 thread context switches in 173019064ns (86.5ns/ctxsw)

---------- ECS
model name : Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
1 physical CPUs, 2 cores/CPU, 2 hardware threads/core = 4 hw threads total
-- No CPU affinity --
10000000 system calls in 561242906ns (56.1ns/syscall)
2000000 process context switches in 3025706345ns (1512.9ns/ctxsw)
2000000 thread context switches in 3333843503ns (1666.9ns/ctxsw)
2000000 thread context switches in 145410372ns (72.7ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 586742944ns (58.7ns/syscall)
2000000 process context switches in 2369203084ns (1184.6ns/ctxsw)
2000000 thread context switches in 1929627973ns (964.8ns/ctxsw)
2000000 thread context switches in 335827569ns (167.9ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 630259940ns (63.0ns/syscall)
2000000 process context switches in 3027444795ns (1513.7ns/ctxsw)
2000000 thread context switches in 3172677638ns (1586.3ns/ctxsw)
2000000 thread context switches in 144168251ns (72.1ns/ctxsw)

---------- Kunpeng 920
2 physical CPUs, 96 cores/CPU, 1 hardware threads/core = 192 hw threads total
-- No CPU affinity --
10000000 system calls in 1216730780ns (121.7ns/syscall)
2000000 process context switches in 4653366132ns (2326.7ns/ctxsw)
2000000 thread context switches in 4689966324ns (2345.0ns/ctxsw)
2000000 thread context switches in 167871167ns (83.9ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 1220106854ns (122.0ns/syscall)
2000000 process context switches in 3420506934ns (1710.3ns/ctxsw)
2000000 thread context switches in 2962106029ns (1481.1ns/ctxsw)
2000000 thread context switches in 543325133ns (271.7ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 1216466158ns (121.6ns/syscall)
2000000 process context switches in 2797948549ns (1399.0ns/ctxsw)
2000000 thread context switches in 3119316050ns (1559.7ns/ctxsw)
2000000 thread context switches in 167728516ns (83.9ns/ctxsw)
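Each result line in the output above gives an operation count and a total time, with the per-operation cost in parentheses. To compare machines without reading the numbers by eye, the lines can be parsed and the per-operation cost recomputed. The sketch below assumes the line format shown in these results; the name `per_op_ns` is made up for this example.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "2000000 process context switches in 6280519812ns (3140.3ns/ctxsw)".
LINE_RE = re.compile(
    r"(\d+) (system calls|process context switches|thread context switches) in (\d+)ns"
)


def per_op_ns(line: str) -> Optional[Tuple[str, float]]:
    """Return (operation kind, nanoseconds per operation), or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    count = int(m.group(1))
    total_ns = int(m.group(3))
    return m.group(2), total_ns / count


if __name__ == "__main__":
    # Two lines copied from the first machine's "No CPU affinity" run.
    sample = [
        "10000000 system calls in 1144720626ns (114.5ns/syscall)",
        "2000000 process context switches in 6280519812ns (3140.3ns/ctxsw)",
    ]
    for line in sample:
        kind, cost = per_op_ns(line)
        print(f"{kind}: {cost:.1f} ns each")
```

Dividing the process context switch cost by the system call cost on the same machine gives a rough sense of how much more expensive a switch is than a plain kernel entry.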