Thread: gcc neon assembler (PINE64 forum, ROCK64, Linux on Rock64)
gcc neon assembler - pas059 - 07-16-2018

Hi,
When I add inline NEON assembly code to a C++ program, I get the error "Unknown mnemonic". My version of gcc is: gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0.
Can someone help me?
regards

RE: gcc neon assembler - ab1jx - 07-21-2018

The GCC manual is something like 900 pages, and there are tons of command-line switches; you probably need an -march or something. https://gcc.gnu.org/onlinedocs/gcc/index.html#SEC_Contents
I thought I saw something about NEON in there last time I was looking for something else.

RE: gcc neon assembler - pas059 - 07-21-2018

Hi,
I had already tested with this option: '-march=armv8-a+simd neon2.c' (and with armv8.1/2/3/4...), but it always gives the same kind of error:
/tmp/cc6QtiJV.s:22: Error: unknown mnemonic `vadd.i16' -- `vadd.i16 q0,q1,q2'
The strange thing is that using intrinsics works!
Any idea?
regards

RE: gcc neon assembler - z4v4l - 07-21-2018

(07-21-2018, 08:52 AM) pas059 Wrote: hi,
Are you using AArch32 compilers? What bitness does your code target?

RE: gcc neon assembler - pas059 - 07-23-2018

Hi,
As I specify armv8... as the -march option, I assume that AArch64 is being targeted. gcc comes with ayufan's Ubuntu 18.04 image, so I assume that AArch64 is the default ($gcc -dumpmachine gives aarch64-linux-gnu).
regards

RE: gcc neon assembler - z4v4l - 07-23-2018

(07-23-2018, 01:47 AM) pas059 Wrote: hi,
The instructions you showed are NOT A64 SIMD instructions; they are A32 SIMD instructions. Take a look at the ARM ARM (the Arm Architecture Reference Manual), where everything is described. There is an alphabetical list of the appropriate instructions too.

RE: gcc neon assembler - pas059 - 07-23-2018

Hi,
I didn't know until today that Arm had changed the mnemonics between ARMv7 and ARMv8. In all the documents I have (which are about intrinsics), the equivalent instructions are given for ARMv7 only. Indeed, using ARMv8 mnemonics, this compiles better.
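For anyone hitting the same error, here is a minimal sketch of the difference, not the poster's actual code. The function name `add_i16x8` is hypothetical. It assumes GCC-style extended inline assembly and uses the "w" constraint to let the compiler pick the SIMD registers; on anything other than AArch64 it falls back to a plain loop so the file still builds.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for eight 16-bit lanes. */
void add_i16x8(const int16_t a[8], const int16_t b[8], int16_t out[8])
{
#if defined(__aarch64__)
    int16x8_t va = vld1q_s16(a);
    int16x8_t vb = vld1q_s16(b);
    int16x8_t vr;
    /* A32 (ARMv7 / AArch32) spelling, rejected by the AArch64 assembler:
     *     vadd.i16 q0, q1, q2
     * A64 spelling of the same lane-wise add. Registers are written
     * vN with an arrangement suffix; .8h means eight halfwords. */
    __asm__("add %0.8h, %1.8h, %2.8h" : "=w"(vr) : "w"(va), "w"(vb));
    vst1q_s16(out, vr);
#else
    /* Portable fallback for targets without A64 SIMD. */
    for (int i = 0; i < 8; ++i)
        out[i] = (int16_t)(a[i] + b[i]);
#endif
}
```

Calling `add_i16x8(a, b, out)` with two arrays of eight `int16_t` values fills `out` with the lane-wise sums. Letting the compiler allocate the registers avoids hard-coding v0/v1/v2 and a clobber list.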
Thank you, z4v4l.

RE: gcc neon assembler - z4v4l - 07-23-2018

(07-23-2018, 11:37 AM) pas059 Wrote: hi,
It's not just a change of mnemonics, it's a totally different ISA.

RE: gcc neon assembler - pas059 - 08-14-2018

Hi,
Just some news. I rewrote some functions from NEON intrinsics to NEON assembler in the hope of a performance improvement. Right after finishing the translation, I was very enthusiastic, because the assembler version was running 4 times faster than the intrinsic version, but that result was obtained with the debug builds. With the release builds, the results were almost identical; if anything, the intrinsic version runs a little faster than the assembler version, but honestly there is no significant gap. The only noticeable difference is the code size, which is 3 times larger with the intrinsic version. So my conclusion, in my cases, is that the compiler generates very fast code from intrinsics.

Another remark: this algorithm (image processing), initially written on a PC under Windows, runs in less than 1 ms on one core of an i7-4790 processor at 3.6 GHz, and it takes about 4 ms on one core of my Rock64 (the ARM/NEON version is a little more optimized than the Intel version, which also uses intrinsics). At the beginning, I thought the difference would be larger, but thanks to NEON the final result is a little better than expected, and a Rock64 is much cheaper than an Intel solution and consumes much less power.

I think that something which could increase the performance of the Rock64 would be faster memory; the Rock64's memory works in 32 bits, although the processor supports 64-bit access.
regards
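To illustrate the intrinsics route described above, here is a small sketch of an image-style kernel. It is not the poster's algorithm, which is not shown in the thread; the function `add_saturate_u8` and the choice of a saturating add are assumptions made for illustration. The NEON path is used only when the compiler defines `__ARM_NEON`, and a scalar loop handles the tail and other targets.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical kernel: dst[i] = min(255, a[i] + b[i]) over n pixels. */
void add_saturate_u8(const uint8_t *a, const uint8_t *b, uint8_t *dst, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Process 16 pixels per iteration with a saturating vector add. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vqaddq_u8(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or everything on non-NEON targets. */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

When timing something like this against hand-written assembly, it is worth building with optimization enabled (for example `gcc -O2`). Unoptimized builds of intrinsic code tend to be much slower than the optimized ones, which would be consistent with the debug versus release gap reported in the post.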