[dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization
Zhihong Wang
zhihong.wang at intel.com
Thu Jan 29 03:38:43 CET 2015
This patch set optimizes DPDK's memcpy on both SSE and AVX platforms.
It also extends memcpy test coverage with unaligned cases and more test points.
Optimization techniques are summarized below:
1. Utilize full cache bandwidth
2. Enforce aligned stores
3. Apply load address alignment based on architecture features
4. Make load/store address available as early as possible
5. General optimization techniques such as inlining, branch reduction, and prefetch-friendly access patterns
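
To illustrate technique 2 (enforce aligned stores), here is a minimal sketch.
It is not the code from this patch set: the function name
sketch_copy_aligned_store, the 32-byte threshold, and the use of plain SSE2
intrinsics are all assumptions made for illustration. The idea is to copy the
first 16 bytes with an unaligned store, advance both pointers until the
destination is 16-byte aligned, and then run the main loop with unaligned
loads and aligned stores. The tail is handled with one overlapping unaligned
copy of the last 16 bytes. Source and destination must not overlap.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; not rte_memcpy itself. */
static void *
sketch_copy_aligned_store(void *dst, const void *src, size_t n)
{
	uint8_t *d = (uint8_t *)dst;
	const uint8_t *s = (const uint8_t *)src;

	/* Assumed threshold: small copies are left to the libc memcpy. */
	if (n < 32) {
		memcpy(dst, src, n);
		return dst;
	}

	/* Copy the first 16 bytes with an unaligned store. */
	_mm_storeu_si128((__m128i *)d,
			 _mm_loadu_si128((const __m128i *)s));

	/* Skip 1..16 bytes so that the destination is 16-byte aligned. */
	size_t head = 16 - ((uintptr_t)d & 15);
	d += head;
	s += head;
	n -= head;
	assert(((uintptr_t)d & 15) == 0);

	/* Main loop: unaligned loads, aligned stores. */
	while (n >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);
		_mm_store_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		n -= 16;
	}

	/* Tail: re-copy the last 16 bytes, overlapping what is already done. */
	if (n > 0) {
		_mm_storeu_si128((__m128i *)(d + n - 16),
				 _mm_loadu_si128((const __m128i *)(s + n - 16)));
	}

	return dst;
}
```

The overlapping head and tail copies avoid a byte-by-byte loop and the
branches that come with it, at the cost of writing a few bytes twice.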
--------------
Changes in v2:
1. Reduced the number of constant test cases in app/test/test_memcpy_perf.c for a faster build
2. Modified macro definitions for better code readability and safety
Zhihong Wang (4):
  app/test: Disabled VTA for memcpy test in app/test/Makefile
  app/test: Removed unnecessary test cases in app/test/test_memcpy.c
  app/test: Extended test coverage in app/test/test_memcpy_perf.c
  lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
    and AVX platforms

 app/test/Makefile                                  |   6 +
 app/test/test_memcpy.c                             |  52 +-
 app/test/test_memcpy_perf.c                        | 220 ++++---
 .../common/include/arch/x86/rte_memcpy.h           | 680 +++++++++++++++------
 4 files changed, 654 insertions(+), 304 deletions(-)
--
1.9.3