Howto test and tune Gigabit Networking (1000M)
#11
Just FYI: Using Armbian/Xenial with our default settings and without any tuning, I get about 930 Mbits/sec on average with Pine64+:

Code:
root@armbian:/var/git/Armbian# iperf3 -c 192.168.83.64
Connecting to host 192.168.83.64, port 5201
[  4] local 192.168.83.115 port 60392 connected to 192.168.83.64 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   112 MBytes   938 Mbits/sec    0    356 KBytes      
[  4]   1.00-2.00   sec   112 MBytes   941 Mbits/sec    0    376 KBytes      
[  4]   2.00-3.00   sec   112 MBytes   943 Mbits/sec    0    376 KBytes      
[  4]   3.00-4.00   sec   112 MBytes   941 Mbits/sec    0    376 KBytes      
[  4]   4.00-5.00   sec   112 MBytes   938 Mbits/sec    0    376 KBytes      
[  4]   5.00-6.00   sec   113 MBytes   947 Mbits/sec    0    376 KBytes      
[  4]   6.00-7.00   sec   112 MBytes   940 Mbits/sec    0    395 KBytes      
[  4]   7.00-8.00   sec   112 MBytes   942 Mbits/sec    0    395 KBytes      
[  4]   8.00-9.00   sec   112 MBytes   942 Mbits/sec    0    395 KBytes      
[  4]   9.00-10.00  sec   112 MBytes   942 Mbits/sec    0    395 KBytes      
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec    0             sender
[  4]   0.00-10.00  sec  1.09 GBytes   940 Mbits/sec                  receiver

root@pine64:~# iperf3 -c 192.168.83.115
Connecting to host 192.168.83.115, port 5201
[  4] local 192.168.83.64 port 39363 connected to 192.168.83.115 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   114 MBytes   954 Mbits/sec    0   1.05 MBytes      
[  4]   1.00-2.00   sec   110 MBytes   922 Mbits/sec    0   1.24 MBytes      
[  4]   2.00-3.01   sec   110 MBytes   918 Mbits/sec    0   1.24 MBytes      
[  4]   3.01-4.00   sec   109 MBytes   917 Mbits/sec    0   1.24 MBytes      
[  4]   4.00-5.01   sec   110 MBytes   918 Mbits/sec    0   1.24 MBytes      
[  4]   5.01-6.01   sec   110 MBytes   923 Mbits/sec    0   1.24 MBytes      
[  4]   6.01-7.00   sec   109 MBytes   918 Mbits/sec    0   1.24 MBytes      
[  4]   7.00-8.00   sec   110 MBytes   923 Mbits/sec    0   1.24 MBytes      
[  4]   8.00-9.00   sec   109 MBytes   912 Mbits/sec    0   1.24 MBytes      
[  4]   9.00-10.00  sec   110 MBytes   923 Mbits/sec    0   1.24 MBytes      
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.07 GBytes   923 Mbits/sec    0             sender
[  4]   0.00-10.00  sec  1.07 GBytes   920 Mbits/sec                  receiver
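
To compare runs like the two above by more than their summary line, the per-second intervals can be pulled out of the text output and reduced to mean, min/max and standard deviation. This is only a minimal sketch: the function names are made up for illustration, and the pattern assumes the TCP client output format shown above with rates reported in Mbits/sec.

```python
import re
import statistics

# Matches per-interval lines such as
# "[  4]   0.00-1.00   sec   112 MBytes   938 Mbits/sec    0    356 KBytes"
# Summary lines (ending in "sender"/"receiver") have no Cwnd column and are skipped.
# Assumption: rates are printed in Mbits/sec, as in the runs above.
INTERVAL_RE = re.compile(
    r"^\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+"
    r"[\d.]+\s+\w?Bytes\s+([\d.]+)\s+Mbits/sec\s+\d+\s+[\d.]+\s+\w?Bytes"
)

def interval_rates(text):
    """Return the Mbits/sec value of every per-interval line in iperf3 text output."""
    rates = []
    for line in text.splitlines():
        match = INTERVAL_RE.match(line.strip())
        if match:
            rates.append(float(match.group(1)))
    return rates

def summarize(text):
    """Reduce the interval rates to a few numbers that expose variation."""
    rates = interval_rates(text)
    if len(rates) < 2:
        raise ValueError("need at least two interval lines")
    return {
        "samples": len(rates),
        "mean": statistics.mean(rates),
        "min": min(rates),
        "max": max(rates),
        "stdev": statistics.stdev(rates),
    }
```

Paste the output of a run into a string (or read it from a file) and pass it to summarize(). A large spread between min and max is a hint to look at cpufreq scaling and throttling before blaming the network.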

Important: 
  • iperf/iperf3 run single-threaded in one operation mode, so the reported network performance can in reality be bottlenecked by the CPU and the benchmarking tool itself (the test methodology being wrong)
  • Different Linux distros use different compiler settings, which can also affect performance. For example, the sysbench binary shows 'performance numbers' 30 percent better on Ubuntu Xenial than on Debian Jessie, just because other compiler switches were used when building the distro's packages. I leave it to the audience to test whether and how iperf/iperf3 are affected in the same way with the various distros available for Pine64. Testing is easy: try various distros with iperf/iperf3 against a host that is known to exceed 900 Mbits/sec, reduce Pine64's cpufreq to 480 MHz, record the numbers, and you see how your benchmarking tool 'performs'
  • The cpufreq governor can make a difference: with ondemand, for example, cpufreq scaling reacts far too slowly when only network activity increases. So if the wrong tool and methodology are combined (iperf instead of iperf3 and the default test duration of 10 seconds), you get wrong or at least misleading numbers. Never use any of these benchmarking tools without also running tools like htop, together with longsleep's pine64_health.sh or Armbian's 'sudo armbianmonitor -m' (or install RPi-Monitor; it's easy and only costs you a few minutes)
  • With longsleep's network tuning script applied, performance drops to ~800 Mbits/sec in Armbian, so the script should be avoided when running Armbian (see the updated explanation in the Armbian forum: only the variation increases a lot)
  • Armbian images are always created from scratch and have not been tampered with by humans, so mistakes like those in the official Pine64 Linux images that led to ARP nightmares will not happen. We set the MAC address on first boot.
  • My test setup consists of an x86 box connected to the same el cheapo Gbit switch as Pine64 and a couple of other SBCs. Most importantly, Pine64 is powered reliably (that means not via Micro USB but via the pins on the Euler header) and stays cool. It's known that the RTL8211E Gbit PHY used on Pine64+ needs an additional 350 mW, that using Micro USB for DC-IN is crap (increased consumption leads to voltage drops), and that an overheating SoC consumes more energy than a cool one (details), which might add to powering/undervoltage related problems
  • iperf/iperf3 numbers don't tell the whole story. They're useful to rule out severe problems, but only when used in active benchmarking mode, i.e. while checking for the real bottlenecks.

Details: http://forum.armbian.com/index.php/topic...5-devices/
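
To make the CPU side of a test visible without reading htop by eye, iperf3 can emit a JSON report (-J) that includes its own CPU utilization on both ends. The following is a rough sketch, not a finished tool: the helper names and the 30 second default are my assumptions, and the JSON field names are what I'd expect from current iperf3 releases for a TCP test, so check them against the output of your version.

```python
import json
import subprocess

def run_iperf3(server, seconds=30, reverse=False):
    """Run an iperf3 client test and return the parsed JSON report.

    Assumes an iperf3 server is already listening on `server`.
    """
    cmd = ["iperf3", "-c", server, "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")  # reverse mode: the server sends, the client receives
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

def summarize_report(report):
    """Extract throughput and iperf3's own CPU load from a TCP JSON report."""
    end = report["end"]
    cpu = end["cpu_utilization_percent"]
    return {
        "sent_mbits": end["sum_sent"]["bits_per_second"] / 1e6,
        "received_mbits": end["sum_received"]["bits_per_second"] / 1e6,
        "retransmits": end["sum_sent"].get("retransmits"),
        "local_cpu_percent": cpu["host_total"],
        "remote_cpu_percent": cpu["remote_total"],
    }

# Example (address taken from the runs above, adjust to your network):
#   for reverse in (False, True):
#       print(summarize_report(run_iperf3("192.168.83.64", reverse=reverse)))
```

The CPU figure covers only the iperf3 process, so it does not replace htop or armbianmonitor. If it is high on the board while throughput is low, suspect the CPU and the tool before the network.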

So when throughput differences between various Pine64 boards are reported without stating which distro was used (compiler switches affect the benchmark tool's 'performance') and which cpufreq scaling governor was active (it affects how quickly the CPU clockspeed is adjusted), the numbers are meaningless. Always check CPU utilization with htop, since iperf/iperf3 are single-threaded in one mode, and keep in mind that thermal settings (overheating/throttling) can therefore affect 'network throughput'.
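
A report becomes comparable once it carries the governor and clockspeeds next to the throughput. Here is a small sketch that reads them from the standard Linux cpufreq sysfs files. The function name is made up, and looking only at cpu0 is a simplifying assumption.

```python
from pathlib import Path

# Standard cpufreq sysfs location for the first CPU core; values are in kHz.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

def read_cpufreq_state(cpufreq_dir=CPUFREQ_DIR):
    """Return governor and clockspeeds in MHz, or None if cpufreq is unavailable."""
    cpufreq_dir = Path(cpufreq_dir)
    if not cpufreq_dir.is_dir():
        # No cpufreq support exposed by this kernel, so the clock is fixed.
        return None

    def read(name):
        return (cpufreq_dir / name).read_text().strip()

    return {
        "governor": read("scaling_governor"),
        "cur_mhz": int(read("scaling_cur_freq")) // 1000,
        "min_mhz": int(read("scaling_min_freq")) // 1000,
        "max_mhz": int(read("scaling_max_freq")) // 1000,
    }
```

Call it right before and during a test run and post the result together with the distro name and the iperf3 numbers.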

As an example: Pine64 running the mainline kernel with the new Ethernet driver written from scratch shows these iperf numbers: 430/620 Mbits/sec depending on direction. Why? With the mainline kernel we still have no access to the PMIC, so DVFS / cpufreq scaling isn't working and the SoC is clocked conservatively at just 408 MHz instead of jumping between 480 and 1152 MHz as with the BSP kernel using our community settings. That is the simple reason why 'network performance' is affected there: iperf/iperf3 are bottlenecked by the lower CPU clockspeed on the mainline kernel. In reality iperf/iperf3 are reporting CPU clockspeed and not network throughput. The same applies to IO performance: A64 is a tablet SoC where everything related to network and IO depends on CPU clockspeed, the mainline kernel means 408 MHz for now, and this explains the slower numbers.

Always know your tools and never trust them. Do active benchmarking :)

Messages In This Thread
RE: Howto test and tune Gigabit Networking (1000M) - by tkaiser - 08-29-2016, 12:14 AM
