[dpdk-dev] [PATCH v6] arch/arm: optimization for memcpy on ARM64

Thomas Monjalon thomas at monjalon.net
Sat Jan 20 17:21:13 CET 2018


19/01/2018 07:10, Herbert Guan:
> This patch provides an option to implement rte_memcpy() using the
> 'restrict' qualifier, which allows GCC to optimize by using more
> efficient instructions, providing some performance gain over memcpy()
> on some ARM64 platforms/environments.
> 
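
For readers unfamiliar with the qualifier, here is a minimal sketch of the
idea, not the code from the patch. The names copy_restrict() and
sketch_memcpy() are hypothetical and exist only for illustration; only
RTE_ARCH_ARM64_MEMCPY comes from the commit message. Declaring both pointers
'restrict' promises the compiler that the buffers do not overlap, which may
let it use wider loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the patch's implementation): copy n bytes
 * between two buffers that the caller guarantees do not overlap.
 * The 'restrict' qualifiers state that guarantee to the compiler.
 */
static inline void *
copy_restrict(void *restrict dst, const void *restrict src, size_t n)
{
	uint8_t *restrict d = dst;
	const uint8_t *restrict s = src;
	size_t i;

	for (i = 0; i < n; i++)
		d[i] = s[i];
	return dst;
}

/*
 * Compile-time selection, assuming the option works as an ordinary
 * preprocessor switch: fall back to libc memcpy() unless
 * RTE_ARCH_ARM64_MEMCPY is defined.
 */
#ifdef RTE_ARCH_ARM64_MEMCPY
#define sketch_memcpy(dst, src, n) copy_restrict((dst), (src), (n))
#else
#define sketch_memcpy(dst, src, n) memcpy((dst), (src), (n))
#endif
```

Passing overlapping buffers to a function like this is undefined behavior, as
it is for memcpy() itself.
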
> Memory copy performance differs between ARM64 platforms, and a more
> recent glibc (e.g. 2.23 or later) can provide better memcpy()
> performance than older glibc versions. Using a more recent glibc is
> always recommended where possible, since the entire system benefits
> from it. If for some reason an old glibc has to be used, this patch
> is provided as an alternative.
> 
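
To find out which side of the 2.23 line a system falls on, something like the
following sketch could be used. It is an illustration only and not part of the
patch: glibc_at_least() and running_glibc_version() are hypothetical helper
names, while gnu_get_libc_version() is the glibc call declared in
<gnu/libc-version.h>. On a non-glibc libc the sketch returns an empty string,
which the parser reports as unparseable.

```c
#include <stdio.h>

#ifdef __GLIBC__
#include <gnu/libc-version.h>
#endif

/*
 * Hypothetical helper: compare a "major.minor" version string against
 * a wanted version. Returns 1 if the string is at least the wanted
 * version, 0 if it is older, and -1 if it cannot be parsed.
 */
static int
glibc_at_least(const char *version, int want_major, int want_minor)
{
	int major, minor;

	if (sscanf(version, "%d.%d", &major, &minor) != 2)
		return -1;
	if (major != want_major)
		return major > want_major;
	return minor >= want_minor;
}

/* Version string of the running glibc, or "" when not built on glibc. */
static const char *
running_glibc_version(void)
{
#ifdef __GLIBC__
	return gnu_get_libc_version();
#else
	return "";
#endif
}
```

A call such as glibc_at_least(running_glibc_version(), 2, 23) then tells
whether the running glibc is in the range the commit message describes as
having the better memcpy().
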
> This implementation can improve memory copy on some ARM64
> platforms when an old glibc (e.g. 2.19, 2.17...) is being used.
> It is disabled by default and needs "RTE_ARCH_ARM64_MEMCPY"
> defined to activate. It does not always provide better performance
> than memcpy(), so users need to run the DPDK unit test
> "memcpy_perf_autotest" and customize parameters in the "customization
> section" of rte_memcpy_64.h for best performance.
> 
> The compiler version also affects rte_memcpy() performance. On some
> platforms, with the same code, a binary compiled with GCC 7.2.0 has
> been observed to perform better than one compiled with GCC 4.8.5.
> GCC 5.4.0 or later is recommended.
> 
> Signed-off-by: Herbert Guan <herbert.guan at arm.com>
> Acked-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>

Applied, thanks


