gcc neon assembler
#1
Hi,

When I add inline NEON assembly code to a C++ program, I get the error: "Unknown mnemonic".

My version of gcc is: gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0.

Can someone help me?

regards
#2
The GCC manual is something like 900 pages, and there are tons of command-line switches; you probably need an -march option or something similar. https://gcc.gnu.org/onlinedocs/gcc/index...C_Contents I thought I saw something about NEON in there the last time I was looking for something else.
#3
Hi,
I tested with this option: '-march=armv8-a+simd neon2.c' (and with arm8.1/2/3/4...), but it always gives the same kind of error, for example:

/tmp/cc6QtiJV.s:22: Error: unknown mnemonic `vadd.i16' -- `vadd.i16 q0,q1,q2'

The strange thing is that using intrinsics works!

Any ideas?
Regards
#4
(07-21-2018, 08:52 AM)pas059 Wrote: /tmp/cc6QtiJV.s:22: Error: unknown mnemonic `vadd.i16' -- `vadd.i16 q0,q1,q2'
Are you using an AArch32 compiler? What bitness does your code target?
ANT - my hobby OS for x86 and ARM.
#5
hi,
Since I specify armv8... as the -march option, I assume that AArch64 is being used.
This gcc comes with ayufan's Ubuntu 18.04 image, so I assume that AArch64 is the default ($gcc -dumpmachine gives aarch64-linux-gnu).
regards
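
The target reported by gcc -dumpmachine can also be checked from inside the program. The following is a minimal sketch, assuming a GCC or Clang toolchain that predefines `__aarch64__`, `__arm__` and `__ARM_NEON`; the function names `target_isa` and `has_neon` are hypothetical and only used for illustration.

```cpp
#include <cassert>
#include <string>

// Report which ARM execution state the compiler is generating code for,
// based on macros that GCC and Clang predefine for the target.
std::string target_isa() {
#if defined(__aarch64__)
    return "AArch64";
#elif defined(__arm__)
    return "AArch32";
#else
    return "not an ARM target";
#endif
}

// True when the compiler advertises Advanced SIMD (NEON) support.
bool has_neon() {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return true;
#else
    return false;
#endif
}
```

On a toolchain whose -dumpmachine output is aarch64-linux-gnu, `target_isa()` would be expected to return "AArch64", which tells you which assembler syntax the inline assembly must use.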
#6
(07-23-2018, 01:47 AM)pas059 Wrote: gcc comes with ayufan's ubuntu 18.04 image, so i assume that this arch64 is the default ($gcc -dumpmachine gives aarch64-linux-gnu).
The instructions you showed are NOT A64 SIMD instructions; they are A32 SIMD instructions. Take a look at the ARM ARM (the Arm Architecture Reference Manual), where everything is described. There is an alphabetical list of the appropriate instructions too.
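
As an illustration of the difference, here is a minimal sketch of the same 16-bit lane-wise add written with A64 syntax in GCC extended inline assembly. The function name `add_i16x8` is hypothetical, the registers v0 to v2 are an arbitrary choice, and the scalar fallback is there only so the sketch also builds on non-AArch64 hosts.

```cpp
#include <cassert>
#include <cstdint>

// Adds eight 16-bit lanes: out[i] = a[i] + b[i].
void add_i16x8(const int16_t* a, const int16_t* b, int16_t* out) {
#if defined(__aarch64__)
    // A64 syntax: the 128-bit registers are v0..v31 with an arrangement
    // suffix (.8h = eight 16-bit lanes).
    // The A32 form of the same operation was: vadd.i16 q0, q1, q2
    __asm__ volatile(
        "ldr q1, [%[a]]\n\t"              // load 8 x int16 from a
        "ldr q2, [%[b]]\n\t"              // load 8 x int16 from b
        "add v0.8h, v1.8h, v2.8h\n\t"     // lane-wise 16-bit add
        "str q0, [%[out]]\n\t"            // store the 8 results
        :
        : [a] "r"(a), [b] "r"(b), [out] "r"(out)
        : "v0", "v1", "v2", "memory");
#else
    // Portable fallback for hosts that are not AArch64.
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<int16_t>(a[i] + b[i]);
#endif
}
```

Calling `add_i16x8` with two arrays of eight `int16_t` values writes the eight sums to `out`. Listing v0, v1 and v2 as clobbers tells the compiler that the assembly overwrites them.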
#7
hi,
I didn't know until today that Arm had changed the mnemonics between ARMv7 and ARMv8. In all the documents I have (which are about intrinsics), all the equivalent instructions are given for ARMv7. Indeed, using ARMv8 mnemonics, this compiles better. Thank you, z4v4l.
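
This also suggests why the intrinsics compiled without trouble: the compiler, not the programmer, chooses the mnemonic for the target. A minimal sketch, assuming `<arm_neon.h>` is available when `__ARM_NEON` is defined; the function name `add_i16x8_intrinsics` is hypothetical, and the scalar fallback exists only so the sketch builds on non-ARM hosts.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Adds eight 16-bit lanes: out[i] = a[i] + b[i].
void add_i16x8_intrinsics(const int16_t* a, const int16_t* b, int16_t* out) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int16x8_t va = vld1q_s16(a);   // load 8 x int16
    int16x8_t vb = vld1q_s16(b);
    // vaddq_s16 corresponds to:
    //   A32 (ARMv7 / AArch32): vadd.i16 q0, q1, q2
    //   A64 (AArch64):         add v0.8h, v1.8h, v2.8h
    vst1q_s16(out, vaddq_s16(va, vb));
#else
    // Portable fallback for hosts without NEON.
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<int16_t>(a[i] + b[i]);
#endif
}
```

The same source builds for both AArch32 and AArch64, whereas inline assembly has to be rewritten for each instruction set.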
#8
(07-23-2018, 11:37 AM)pas059 Wrote: i didn't know until today that Arm had changed the mnemonics between ARMv7 and ARMv8
It's not just a change of mnemonics; it's a totally different ISA.
#9
Hi,
just some news.
I rewrote some functions from NEON intrinsics to NEON assembler in the hope of a performance improvement. Just after finishing the translation, I was very enthusiastic, because the assembler version was running 4 times faster than the intrinsic version, but in fact this result was obtained with the debug builds. With the release builds, the results were almost identical; if anything, the intrinsic version runs a little faster than the assembler one, but honestly there is no significant gap. The only noticeable difference is in code size, which is 3 times larger with the intrinsic version.
So my conclusion, in my case, is that the compiler generates very fast code from intrinsics.
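
The debug-versus-release trap described above can be guarded against in the benchmark itself. The following is a minimal sketch, not the code used for the measurements in this thread: it assumes GCC or Clang, which define `__OPTIMIZE__` when optimization is enabled, and the helper names `built_with_optimization` and `average_microseconds` are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// True if this translation unit was compiled with optimization enabled
// (GCC and Clang define __OPTIMIZE__ at -O1 and above).
constexpr bool built_with_optimization() {
#if defined(__OPTIMIZE__)
    return true;
#else
    return false;
#endif
}

// Runs fn() `iterations` times and returns the average time per call
// in microseconds, measured with a monotonic clock.
template <typename Fn>
double average_microseconds(Fn&& fn, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

A benchmark could check `built_with_optimization()` first and refuse to print timings from an unoptimized build, then call `average_microseconds` once for the intrinsic version and once for the assembler version of the same function.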

Another remark: this algorithm (image processing), initially written on a PC under Windows, runs in less than 1 msec on one core of an i7-4790 processor at 3.6 GHz, and it takes ~4 msec on one core of my Rock64 (the ARM/NEON version is a little more optimized than the Intel version, which also uses intrinsics). At the beginning, I thought the difference would be larger, but thanks to NEON the final result is a little better than expected, and a Rock64 is much cheaper than an Intel solution and consumes much less power.
I think that faster memory is something that could increase the performance of the Rock64; its memory works in 32 bits, although the processor supports 64-bit access.
Regards