[dpdk-dev] [PATCH 0/4] Optimize memcpy for AVX512 platforms

Stephen Hemminger stephen at networkplumber.org
Thu Jan 14 17:48:32 CET 2016


On Thu, 14 Jan 2016 01:13:18 -0500
Zhihong Wang <zhihong.wang at intel.com> wrote:

> This patch set optimizes DPDK memcpy for AVX512 platforms, to make full
> use of hardware resources and deliver high performance.
> 
> In current DPDK, memcpy holds a large proportion of execution time in
> libs like Vhost, especially for large packets, and this patch can bring
> considerable benefits.
> 
> The implementation is based on the current DPDK memcpy framework. Some
> background can be found in these threads:
> http://dpdk.org/ml/archives/dev/2014-November/008158.html
> http://dpdk.org/ml/archives/dev/2015-January/011800.html
> 
> Code changes are:
> 
>   1. Read CPUID to check if AVX512 is supported by CPU
> 
>   2. Predefine AVX512 macro if AVX512 is enabled by compiler
> 
>   3. Implement AVX512 memcpy and choose the right implementation based on
>      predefined macros
> 
>   4. Decide alignment unit for memcpy perf test based on predefined macros
> 
> Zhihong Wang (4):
>   lib/librte_eal: Identify AVX512 CPU flag
>   mk: Predefine AVX512 macro for compiler
>   lib/librte_eal: Optimize memcpy for AVX512 platforms
>   app/test: Adjust alignment unit for memcpy perf test
> 
>  app/test/test_memcpy_perf.c                        |   6 +
>  .../common/include/arch/x86/rte_cpuflags.h         |   2 +
>  .../common/include/arch/x86/rte_memcpy.h           | 247 ++++++++++++++++++++-
>  mk/rte.cpuflags.mk                                 |   4 +
>  4 files changed, 255 insertions(+), 4 deletions(-)
> 

This really looks like code that could benefit from GCC
function multiversioning. The current cpuflags model is useless/flawed
in real product deployment.
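
To illustrate what function multiversioning could look like, here is a
minimal sketch. It is not taken from the patch set. It assumes GCC 6 or
later (which added the target_clones attribute) on x86-64 Linux with
ifunc support; copy_bytes, COPY_CLONES and copy_bytes_selfcheck are
hypothetical names used only for this example. The compiler emits one
clone of the function per listed target and selects among them at load
time from CPUID, so a single binary can run on machines with and without
AVX512 instead of fixing the choice through build-time macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumption: target_clones is only applied on GCC-compatible compilers
 * targeting x86-64 Linux. Elsewhere the macro expands to nothing and the
 * function is built once with the default flags.
 */
#if defined(__GNUC__) && defined(__x86_64__) && defined(__linux__)
#define COPY_CLONES __attribute__((target_clones("avx512f", "avx2", "default")))
#else
#define COPY_CLONES
#endif

/*
 * Hypothetical copy routine. The body is plain C; each clone is compiled
 * for its own instruction set, so the loop may be vectorized differently
 * in each version depending on optimization level.
 */
COPY_CLONES
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Small self-check: copy a buffer and compare it against the source. */
int copy_bytes_selfcheck(void)
{
    char src[200], dst[200];

    for (size_t i = 0; i < sizeof(src); i++)
        src[i] = (char)(i * 7);
    memset(dst, 0, sizeof(dst));

    void *ret = copy_bytes(dst, src, sizeof(src));
    assert(ret == dst);

    /* Returns 1 when every byte was copied correctly, 0 otherwise. */
    return memcmp(dst, src, sizeof(src)) == 0;
}
```

Callers just call copy_bytes(); the dispatch is invisible to them. Whether
the resolver overhead and the lack of inlining across the dispatch point
are acceptable for a hot path like memcpy would need to be measured.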

