PINE64 forum (https://forum.pine64.org), ROCK64, Linux on Rock64
Thread: no neon? (/showthread.php?tid=7013)



no neon? - ab1jx - 01-01-2019

The Rock64 doesn't do Neon?  That's a shock. I was trying to build cpuminer https://github.com/pooler/cpuminer

I have it built and running on 2 or 3 RPi 3s, then I turned to the Rock64.  There I get the obnoxious message that the compiler cannot create executables, which can mean anything: all it tells you is that some test program failed to compile.
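When configure gives up with that message, the real compiler output is recorded in config.log in the build directory. A minimal sketch for pulling it out follows; the function name show_configure_errors is made up for illustration, and the default path assumes configure was run in the current directory.

```shell
# Hypothetical helper: list the error lines that configure recorded.
# Assumes config.log is in the current directory unless a path is given.
show_configure_errors() {
    log="${1:-config.log}"
    # -n prefixes each match with its line number in the log;
    # -i also catches "Error" and "ERROR".
    grep -n -i 'error' "$log"
}
```

Run show_configure_errors right after a failed ./configure, then open config.log at the reported line numbers to see the exact compiler command that was tried.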

This doesn't work on my Rock64 (it does on the Pis):
./configure CFLAGS="-O3 -mfpu=neon"

If I drop out the neon part so it's just
./configure CFLAGS="-O3"

Then it works.  But you get just compiled C, not the ARM assembly language that Pooler wrote, so it's slower than a Pi.  I get 0.97 khash/s instead of 1.32 or so (per thread).

If I do cat /proc/cpuinfo, it mentions neon on a Pi but not on a Rock64.


RE: no neon? - ab1jx - 01-02-2019

It's not that it doesn't have Neon, it's that the assembly code was written for 32-bit ARMs.  See https://github.com/pooler/cpuminer/issues/177 for details.  Neon should work fine for other programs.


RE: no neon? - pas059 - 01-03-2019

Hi,
I use 64-bit ARMv8 with neon and this works on the Rock64. As I remember, 32-bit ARM neon assembly or intrinsics are not recognised on the Rock64.
Regards


RE: no neon? - ab1jx - 01-03-2019

I can make it work, but it's no faster than the compiled C: something on the order of 0.95 khash/s (per thread), which is about what a Raspberry Pi can do without the assembly.  On the Pi I can configure with
./configure CFLAGS="-O3 -mfpu=neon"
which uses the assembly and speeds it up to 1.17 - 1.3 khash/s.

On aarch64 the -mfpu=neon flag is not valid, so the configure step fails.  The GitHub link led me to https://stackoverflow.com/questions/29851128/gcc-arm64-aarch64-unrecognized-command-line-option-mfpu-neon which led me to https://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/AArch64-Options.html#AArch64-Options
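One way to keep a single build recipe for both boards is to pick CFLAGS from the machine name. This is only a sketch: pick_cflags is a made-up name, and it assumes uname -m reports something like armv7l on a 32-bit Pi install and aarch64 on the Rock64, and that any armv7 machine it runs on really has neon.

```shell
# Hypothetical helper: choose CFLAGS for configure from the machine name.
# With no argument it asks uname -m.
pick_cflags() {
    machine="${1:-$(uname -m)}"
    case "$machine" in
        armv7*|armv8l)
            # 32-bit ARM userland: neon has to be requested explicitly.
            # Assumption: the board actually has neon.
            echo "-O3 -mfpu=neon" ;;
        *)
            # aarch64 and anything else: -mfpu=neon is not accepted there.
            echo "-O3" ;;
    esac
}
```

It would be used as ./configure CFLAGS="$(pick_cflags)". On aarch64 this only gets configure to pass; it does not bring the 32-bit assembly back.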

I spent most of an hour trying various options but never saw any speedup.  Cpuminer has (at this point) separate assembly language sections for arm, ppc, x64 and x86, but not for aarch64, which is what the Rock64 is.
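Which assembly section a C project can select at build time generally comes down to the compiler's predefined macros. Here is a rough way to see what a given compiler defines, assuming a GCC-compatible cc that supports -dM -E. The filter name arm_macros is made up, and I have not checked which macros cpuminer itself tests.

```shell
# Hypothetical filter: keep only the ARM-related predefined macros from
# a compiler macro dump read on standard input.
arm_macros() {
    grep -E '^#define (__aarch64__|__arm__|__ARM_NEON|__ARM_NEON__) '
}

# Typical use, assuming a GCC-compatible compiler is installed as cc:
#   cc -dM -E - </dev/null | arm_macros
```

On an aarch64 compiler I would expect to see __aarch64__ and __ARM_NEON but not __arm__, so any code guarded only by __arm__ gets skipped there.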

I taught myself x86 assembly in the early 1990s but on an 8088 (a 16-bit CPU with an 8-bit data bus).  The register names are mostly different when you jump even to a 32 bit machine.  Somewhere I read that 32 bit arm is a whole different compiler backend than aarch64.

I'd like to see a variation on cpuminer that ran a couple of threads on the CPU and a couple on the GPU, but that's far-fetched.  And an ASIC is about 1000 times faster anyway, so it's pointless.


RE: no neon? - pas059 - 01-04-2019

(01-01-2019, 10:14 PM)ab1jx Wrote: If I do cat /proc/cpuinfo that mentions neon on a Pi, not on a Rock64.

If I do cat /proc/cpuinfo on my Rock64 under Armbian, it mentions asimd, which is neon, I assume.
Regards
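For comparing boards, the check can be scripted. 64-bit ARM kernels list Advanced SIMD as asimd on the Features line, while 32-bit ARM kernels list it as neon. A small sketch, with simd_feature as a made-up function name:

```shell
# Hypothetical helper: report which SIMD flag a cpuinfo-style file lists.
# Prints "asimd" (64-bit kernel naming), "neon" (32-bit naming) or "none".
simd_feature() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the flag list from the first "Features" line only.
    flags=$(sed -n 's/^Features[[:space:]]*:[[:space:]]*//p' "$cpuinfo" | head -n 1)
    for f in $flags; do
        case "$f" in
            asimd) echo asimd; return 0 ;;
            neon)  echo neon;  return 0 ;;
        esac
    done
    echo none
    return 1
}
```

Calling simd_feature with no argument reads /proc/cpuinfo on the running board; passing a file name lets you check a copy saved from another machine.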


RE: no neon? - ab1jx - 01-04-2019

Yes, I think so, I see "fp asimd evtstrm aes pmull sha1 sha2 crc32" from my Ayufan one. But the cpuminer configure script fails because it's looking for neon in a 32-bit context, plus it has handwritten (I think) assembly language code only for 32-bit arm. It gives me the "compiler cannot create executables" error because the test program it tries to compile as part of the configure process chokes on the "-mfpu=neon" flag: the aarch64 compiler doesn't accept that flag, since neon is built in on aarch64.