09-25-2018, 02:10 PM
Hi all
Is there anyone here who understands how to "tune" the Linux task scheduler? My problem is that at 4.18.9 it does not obviously favour the big cores for CPU-heavy jobs, which seems crazy.
I wrote a little script to demonstrate what I mean. It uses the additional package stress, which can spawn n CPU-bound jobs for t seconds. In my case I run each load for 10 seconds and, after 5 seconds, look at which CPUs are in use:
Code:
# For n = 1..4: spawn n CPU hogs for 10 s, then 5 s into the run list
# every thread with the CPU it last ran on (the CPUID column).
stress -c 1 -t 10 &
sleep 5
ps -L -o pid,lwp,pcpu,cpuid,time
sleep 6     # let the 10 s run finish before starting the next one
stress -c 2 -t 10 &
sleep 5
ps -L -o pid,lwp,pcpu,cpuid,time
sleep 6
stress -c 3 -t 10 &
sleep 5
ps -L -o pid,lwp,pcpu,cpuid,time
sleep 6
stress -c 4 -t 10 &
sleep 5
ps -L -o pid,lwp,pcpu,cpuid,time
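(For reference: on the RK3399, CPUs 0-3 are the little Cortex-A53 cluster and CPUs 4-5 the big Cortex-A72s. If you want to double-check the numbering on your own board, the "CPU part" field in /proc/cpuinfo identifies each core:)
Code:
# 0xd03 = Cortex-A53 (little), 0xd08 = Cortex-A72 (big)
grep -E 'processor|CPU part' /proc/cpuinfo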
On the 4.4.138 (Ayufan) kernel things are pretty much as expected - CPUs 4 & 5 (the big A72 cores) are always used:
Code:
$ ./testcpu
stress: info: [1043] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
  741   741  0.0     4 00:00:00
 1042  1042  0.0     0 00:00:00
 1043  1043  0.0     4 00:00:00
 1045  1045  100     4 00:00:05
 1047  1047  0.0     5 00:00:00
stress: info: [1043] successful run completed in 10s
stress: info: [1050] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
  741   741  0.0     4 00:00:00
 1042  1042  0.0     5 00:00:00
 1050  1050  0.0     5 00:00:00
 1052  1052 98.6     5 00:00:04
 1053  1053  100     4 00:00:05
 1055  1055  0.0     0 00:00:00
stress: info: [1050] successful run completed in 10s
stress: info: [1058] dispatching hogs: 3 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
  741   741  0.0     4 00:00:00
 1042  1042  0.0     5 00:00:00
 1058  1058  0.0     4 00:00:00
 1060  1060 99.8     0 00:00:04
 1061  1061  100     5 00:00:05
 1062  1062  100     4 00:00:05
 1064  1064  0.0     3 00:00:00
stress: info: [1058] successful run completed in 10s
stress: info: [1067] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
  741   741  0.0     4 00:00:00
 1042  1042  0.0     4 00:00:00
 1067  1067  0.0     2 00:00:00
 1069  1069  100     1 00:00:05
 1070  1070  100     4 00:00:05
 1071  1071  100     3 00:00:05
 1072  1072  100     5 00:00:05
 1074  1074  0.0     0 00:00:00
stress: info: [1067] successful run completed in 10s
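(A caveat on reading these: the CPUID column from ps only shows the processor each thread was last running on, so a single snapshot can be a little misleading. Sampling placement over the whole run gives a fuller picture - something like:)
Code:
# Sample the placement of the stress threads once a second
for i in 1 2 3 4 5 6 7 8 9 10; do
    ps -L -o lwp,pcpu,cpuid -C stress
    sleep 1
done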
On 4.18.9 (despite all my efforts configuring the kernel) it is largely a matter of luck whether CPUs 4 or 5 do any work:
Code:
$ ./testcpu
stress: info: [2375] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
 2354  2354  0.1     3 00:00:00
 2374  2374  0.0     0 00:00:00
 2375  2375  0.0     4 00:00:00
 2377  2377 83.5     2 00:00:05
 2379  2379  1.0     5 00:00:00
stress: info: [2375] successful run completed in 10s
stress: info: [2382] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
 2354  2354  0.0     3 00:00:00
 2374  2374  0.0     0 00:00:00
 2382  2382  0.0     1 00:00:00
 2384  2384  100     3 00:00:05
 2385  2385  100     4 00:00:05
 2387  2387  0.0     0 00:00:00
stress: info: [2382] successful run completed in 10s
stress: info: [2390] dispatching hogs: 3 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
 2354  2354  0.0     3 00:00:00
 2374  2374  0.0     1 00:00:00
 2390  2390  0.0     4 00:00:00
 2392  2392  100     3 00:00:05
 2393  2393  100     5 00:00:05
 2394  2394  100     2 00:00:05
 2396  2396  0.0     0 00:00:00
stress: info: [2390] successful run completed in 10s
stress: info: [2399] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
  PID   LWP %CPU CPUID     TIME
 2354  2354  0.0     3 00:00:00
 2374  2374  0.0     3 00:00:00
 2399  2399  0.0     5 00:00:00
 2401  2401  100     0 00:00:05
 2402  2402  100     4 00:00:05
 2403  2403  100     5 00:00:05
 2404  2404  100     2 00:00:05
 2406  2406  0.0     1 00:00:00
stress: info: [2399] successful run completed in 10s
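As a stopgap I can of course force the hogs onto the big cluster by hand with taskset (from util-linux), though that rather defeats the point of having the scheduler do it:
Code:
# Pin both stress workers to CPUs 4 and 5 (the A72 cores)
taskset -c 4,5 stress -c 2 -t 10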
There does seem to be a kernel CONFIG option at 4.4, ARM_ROCKCHIP_CPUFREQ, that could explain the 4.4 behaviour, and which is not available at 4.18.9. But in reality I would expect stock/mainline 4.18 scheduling to cope properly with big.LITTLE CPUs?
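One thing I still need to rule out (an assumption on my part, not yet verified on this board): mainline's capacity-aware scheduling relies on per-core capacity values derived from capacity-dmips-mhz in the device tree, and if those are missing or all equal then CFS sees six identical cores and has no reason to prefer the A72s. A quick check:
Code:
# If all six values are equal (or the files are missing), the scheduler
# has no idea the A72 cores are faster than the A53s
for c in 0 1 2 3 4 5; do
    echo "cpu$c: $(cat /sys/devices/system/cpu/cpu$c/cpu_capacity)"
done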
Any thoughts/suggestions welcomed.
- ROCKPro64 v2.1 2GB, 16GB eMMC for rootfs, SX8200Pro 512GB NVMe for /home, HDMI video & sound, Bluetooth keyboard & mouse. Arch (6.2 kernel, Openbox desktop) for general purpose daily PC.
- PinePhone Pro Explorer Edition, daily driver, rk2aw & U-boot on SPI, Arch/SXMO & Arch/phosh on eMMC
- PinePhone BraveHeart now v1.2b 3/32GB, Tow-boot with Arch/SXMO on eMMC