[dpdk-dev] [PATCH 0/4] Optimize memcpy for AVX512 platforms
Stephen Hemminger
stephen at networkplumber.org
Thu Jan 14 17:48:32 CET 2016
On Thu, 14 Jan 2016 01:13:18 -0500
Zhihong Wang <zhihong.wang at intel.com> wrote:
> This patch set optimizes DPDK memcpy for AVX512 platforms to make full
> use of hardware resources and deliver high performance.
>
> In current DPDK, memcpy accounts for a large proportion of execution time
> in libs like Vhost, especially for large packets, and this patch can bring
> considerable benefits.
>
> The implementation is based on the current DPDK memcpy framework; some
> background can be found in these threads:
> http://dpdk.org/ml/archives/dev/2014-November/008158.html
> http://dpdk.org/ml/archives/dev/2015-January/011800.html
>
> Code changes are:
>
> 1. Read CPUID to check if AVX512 is supported by CPU
>
> 2. Predefine AVX512 macro if AVX512 is enabled by compiler
>
> 3. Implement AVX512 memcpy and choose the right implementation based on
> predefined macros
>
> 4. Decide alignment unit for memcpy perf test based on predefined macros
>
> Zhihong Wang (4):
> lib/librte_eal: Identify AVX512 CPU flag
> mk: Predefine AVX512 macro for compiler
> lib/librte_eal: Optimize memcpy for AVX512 platforms
> app/test: Adjust alignment unit for memcpy perf test
>
> app/test/test_memcpy_perf.c | 6 +
> .../common/include/arch/x86/rte_cpuflags.h | 2 +
> .../common/include/arch/x86/rte_memcpy.h | 247 ++++++++++++++++++++-
> mk/rte.cpuflags.mk | 4 +
> 4 files changed, 255 insertions(+), 4 deletions(-)
>
This really looks like code that could benefit from GCC
function multiversioning. The current cpuflags model is useless/flawed
in real product deployment.
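
For illustration, here is a minimal sketch of what function multiversioning
could look like in C. It assumes GCC 6 or later on x86-64 Linux with glibc
(ifunc support); on other toolchains the attribute is compiled out. The names
mv_copy and COPY_CLONES are hypothetical and are not part of DPDK or of this
patch set, and the plain byte loop only stands in for a real copy routine.

```c
#include <stddef.h>
#include <string.h>   /* also pulls in __GLIBC__ on glibc systems */

/*
 * Assumption: target_clones needs GCC >= 6 and ifunc support, so it is
 * only enabled for GCC-compatible compilers on x86-64 Linux with glibc.
 * Elsewhere the macro expands to nothing and a single version is built.
 */
#if defined(__GNUC__) && defined(__x86_64__) && defined(__linux__) && \
    defined(__GLIBC__)
#define COPY_CLONES \
    __attribute__((target_clones("avx512f", "avx2", "default")))
#else
#define COPY_CLONES
#endif

/*
 * Hypothetical copy routine. With target_clones the compiler emits one
 * clone per listed target plus a resolver; the clone matching the CPU
 * the binary is running on is selected at load time.
 */
COPY_CLONES
void *mv_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return dst;
}
```

Callers just call mv_copy(dst, src, len); the choice of version depends on
the CPU at run time rather than on macros fixed at build time, so one binary
could be shipped to machines with and without AVX512.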