no neon?
#4
I can make it work, but it's no faster than the compiled C: something on the order of 0.95 khash/s per thread, which is about what a Raspberry Pi can do without the assembly.  On the Pi I can configure with
./configure CFLAGS="-O3 -mfpu=neon"
which uses the assembly and speeds it up to 1.17 - 1.3 khash/s.

On aarch64 the -mfpu=neon flag isn't valid, so the configure step fails.  gcc's aarch64 target has no -mfpu option at all, and Advanced SIMD (NEON) is enabled by default there.  The github link led me to https://stackoverflow.com/questions/2985...-mfpu-neon which led me to https://gcc.gnu.org/onlinedocs/gcc-4.8.2...64-Options
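A quick way to check what the compiler already assumes is to look at its predefined macros.  This is only a sketch, not anything from cpuminer: target_arch, neon_available and print_target are made-up helper names.  It relies on the macros __aarch64__, __arm__ and __ARM_NEON that gcc and clang define according to the target.  On an aarch64 toolchain with default options I'd expect neon_available() to return 1 with no -mfpu flag at all, but I haven't run it on the Rock64.

```c
#include <stdio.h>

/* Hypothetical helper: name of the architecture this file was compiled for,
   decided purely from compiler-predefined macros. */
const char *target_arch(void)
{
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__)
    return "arm (32-bit)";
#elif defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "x86";
#else
    return "other";
#endif
}

/* Hypothetical helper: 1 if the compiler was told NEON / Advanced SIMD
   is available for this build, 0 otherwise. */
int neon_available(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Print both facts on one line. */
void print_target(void)
{
    printf("arch: %s, NEON: %s\n",
           target_arch(), neon_available() ? "yes" : "no");
}
```

Calling print_target() from main() on the Pi, once with and once without -mfpu=neon in CFLAGS, should show the difference the flag makes there.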

I spent most of an hour trying various options but never saw any speedup.  Cpuminer has (at this point) separate assembly language sections for arm, ppc, x64 and x86, but not for aarch64, which is what the Rock64 is.
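Short of writing a new assembly section, one thing that could be tried is NEON intrinsics from arm_neon.h, since the same C source compiles for both 32-bit arm and aarch64.  The sketch below is an illustration only and not cpuminer code: rotl7_x4 is a made-up function that rotates four 32-bit words left by 7 bits, one of the rotations in the Salsa20 core that scrypt uses.  It falls back to plain C when NEON isn't available, and I make no claim that it beats what the compiler already generates.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical example: rotate each of four 32-bit words left by 7 bits. */
void rotl7_x4(uint32_t out[4], const uint32_t in[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* NEON path: all four lanes handled in one 128-bit register. */
    uint32x4_t v = vld1q_u32(in);
    uint32x4_t r = vorrq_u32(vshlq_n_u32(v, 7), vshrq_n_u32(v, 25));
    vst1q_u32(out, r);
#else
    /* Portable fallback: one word at a time. */
    for (int i = 0; i < 4; i++)
        out[i] = (in[i] << 7) | (in[i] >> 25);
#endif
}
```

Both paths give the same result, so the plain C version can be used to check the NEON one.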

I taught myself x86 assembly in the early 1990s, but on an 8088 machine (16-bit registers, 8-bit data bus).  The register names are mostly different when you jump even to a 32-bit machine.  Somewhere it says that 32-bit arm is a whole different compiler backend from aarch64.

I'd like to see a variation on cpuminer that ran a couple of threads on the CPU and a couple on the GPU, but that's far-fetched.  And an ASIC is about 1000 times faster anyway, so it's pointless.


Messages In This Thread
no neon? - by ab1jx - 01-01-2019, 10:14 PM
RE: no neon? - by ab1jx - 01-02-2019, 10:57 PM
RE: no neon? - by pas059 - 01-03-2019, 05:30 AM
RE: no neon? - by ab1jx - 01-03-2019, 12:50 PM
RE: no neon? - by pas059 - 01-04-2019, 05:12 AM
RE: no neon? - by ab1jx - 01-04-2019, 11:16 AM
