Posts: 191 | Thanked: 415 times | Joined on Jan 2012
#475
Originally Posted by Skry
If we think about the compiler options used, in practice we would be optimizing for Cortex-A8 with NEON and Thumb-2. This would benefit other similar devices too.
From the Debian wiki:
Code:
NEON

NEON is an extension of the VFP which allows for very efficient manipulation of matrices, and vector data in general. This is notably useful for processing audio and video data, or for fast memcpy().

Programs usually take advantage of NEON thanks to hand-crafted assembly routines. GCC can automatically vectorize code and generate NEON instructions, however this tends to have limited success. It would seem sensible NOT to require NEON in a new port since some modern ARMv7 SoCs such as Marvell Dove and NVidia Tegra2 don't implement it.

It is also possible to use NEON instructions for regular scalar floating point code, and this can give significant (2-3x) speedup on Cortex-A8 hardware. However GCC does not currently implement this, and it is not always applicable as NEON instructions are not fully IEEE compliant.
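To make the auto-vectorization point concrete, here is a rough sketch of the kind of loop GCC may turn into NEON code when targeting Cortex-A8. The function names `saxpy` and `built_with_neon` are made up for illustration, and the build line in the comment is just one plausible flag set, not what any particular port uses. As far as I know, GCC only vectorizes floating-point loops for NEON when `-funsafe-math-optimizations` (or `-ffast-math`) is also given, which matches the IEEE caveat in the quote.

```c
#include <stddef.h>

/*
 * One possible build line (an assumption, adjust for your toolchain):
 *   gcc -O3 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp -mthumb \
 *       -ftree-vectorize -funsafe-math-optimizations -c saxpy.c
 */

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * 'restrict' promises the arrays do not overlap, which makes it
 * easier for the compiler to vectorize the loop. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Returns 1 if this translation unit was compiled with NEON enabled.
 * __ARM_NEON (older GCC: __ARM_NEON__) is predefined by the compiler
 * in that case. */
int built_with_neon(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}
```

Whether the loop really got vectorized is best checked in the generated assembly (`-S`), looking for instructions on the `q` or `d` registers such as `vmla.f32`. On SoCs without NEON (the Tegra2 case from the quote) a binary built this way would fault, so this only makes sense where the target hardware is known.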