[dpdk-dev] [PATCH v3 3/3] lib/eal: add temporal store memcpy support on AMD platform

Thomas Monjalon thomas at monjalon.net
Tue Oct 26 18:14:08 CEST 2021


26/10/2021 17:56, Aman Kumar:
> This patch provides a rte_memcpy* call with temporal stores.
> Use -Dcpu_instruction_set=znverX with build to enable this API.
> 
> Signed-off-by: Aman Kumar <aman.kumar at vvdntech.in>
> ---
>  config/x86/meson.build           |   2 +
>  lib/eal/x86/include/rte_memcpy.h | 114 +++++++++++++++++++++++++++++++

It looks better as C code.
Do you achieve the same performance as the asm version?

> +#if defined RTE_MEMCPY_AMDEPYC
[...]
> +static __rte_always_inline void *
> +rte_memcpy_aligned_tstore16_generic(void *dst, void *src, int len)

So to be clear, an application will benefit from this optimization only if
1/ DPDK is specifically compiled for AMD
2/ the application is compiled against that DPDK build (because of inlining)

I guess there is no good way to benefit from the optimization
without specific compilation, because of the inlining constraint.
Another design, with fewer constraints but lower performance,
would be to have a function pointer assigned at runtime based on the CPU.
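
A minimal sketch of that function-pointer design, to make the trade-off
concrete. It is not code from the patch: every name below (memcpy_fn,
copy_generic, copy_amd, cpu_is_amd, app_memcpy_init, app_memcpy) is
hypothetical, and copy_amd is only a placeholder that forwards to memcpy()
where a real build would call the AMD-specific store loop. The vendor check
assumes the GCC/clang x86 builtins __builtin_cpu_init() and
__builtin_cpu_is(); on other compilers or architectures it falls back to
the generic copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature shared by every copy implementation (hypothetical name). */
typedef void *(*memcpy_fn)(void *dst, const void *src, size_t len);

/* Portable fallback: plain libc memcpy. */
static void *copy_generic(void *dst, const void *src, size_t len)
{
	return memcpy(dst, src, len);
}

/*
 * Placeholder for the AMD-tuned variant. A real implementation would
 * contain the specialised store loop; here it forwards to memcpy so
 * the sketch stays self-contained.
 */
static void *copy_amd(void *dst, const void *src, size_t len)
{
	return memcpy(dst, src, len);
}

/* Returns 1 on an AMD CPU when the compiler builtins are available. */
static int cpu_is_amd(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	__builtin_cpu_init();
	return __builtin_cpu_is("amd") ? 1 : 0;
#else
	return 0; /* assumption: unknown platform, use the generic path */
#endif
}

/* Starts on the generic path so calls before init are still safe. */
static memcpy_fn copy_impl = copy_generic;

/* Call once at startup to pick the implementation for this CPU. */
static void app_memcpy_init(void)
{
	copy_impl = cpu_is_amd() ? copy_amd : copy_generic;
}

/* Every copy pays one indirect call; the body cannot be inlined. */
static void *app_memcpy(void *dst, const void *src, size_t len)
{
	assert(copy_impl != NULL);
	return copy_impl(dst, src, len);
}
```

An application would call app_memcpy_init() once during startup and then
use app_memcpy() everywhere. The same binary then runs on any x86 CPU,
at the cost of an indirect call per copy and the loss of inlining, which
is why this design is expected to be slower than the compile-time one.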



