patch 'eal/x86: improve multiple of 64 bytes memcpy performance' has been queued to stable release 22.11.3

Xueming Li xuemingl at nvidia.com
Sun Jun 25 08:33:55 CEST 2023


Hi,

FYI, your patch has been queued to stable release 22.11.3

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 06/27/23. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e., not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://git.dpdk.org/dpdk-stable/log/?h=22.11-staging

This queued commit can be viewed at:
https://git.dpdk.org/dpdk-stable/commit/?h=22.11-staging&id=5ecf2e459d480631a3dbe7b77157ace8cef76b33

Thanks.

Xueming Li <xuemingl at nvidia.com>

---
From 5ecf2e459d480631a3dbe7b77157ace8cef76b33 Mon Sep 17 00:00:00 2001
From: Leyi Rong <leyi.rong at intel.com>
Date: Wed, 29 Mar 2023 17:16:58 +0800
Subject: [PATCH] eal/x86: improve multiple of 64 bytes memcpy performance
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: Xueming Li <xuemingl at nvidia.com>

[ upstream commit 2ef17be88e8b26f871cfb0265227341e36f486ea ]

In rte_memcpy_aligned(), one redundant round is taken in the 64 bytes
block copy loops if the size is a multiple of 64. So, let the catch-up
copy the last 64 bytes in this case.

Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")

Suggested-by: Morten Brørup <mb at smartsharesystems.com>
Signed-off-by: Leyi Rong <leyi.rong at intel.com>
Reviewed-by: Morten Brørup <mb at smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson at intel.com>
Reviewed-by: David Marchand <david.marchand at redhat.com>
---
 lib/eal/x86/include/rte_memcpy.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h
index d4d7a5cfc8..fd151be708 100644
--- a/lib/eal/x86/include/rte_memcpy.h
+++ b/lib/eal/x86/include/rte_memcpy.h
@@ -846,7 +846,7 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
 	}
 
 	/* Copy 64 bytes blocks */
-	for (; n >= 64; n -= 64) {
+	for (; n > 64; n -= 64) {
 		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
 		dst = (uint8_t *)dst + 64;
 		src = (const uint8_t *)src + 64;
-- 
2.25.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2023-06-25 14:31:59.018237100 +0800
+++ 0017-eal-x86-improve-multiple-of-64-bytes-memcpy-performa.patch	2023-06-25 14:31:58.295773900 +0800
@@ -1 +1 @@
-From 2ef17be88e8b26f871cfb0265227341e36f486ea Mon Sep 17 00:00:00 2001
+From 5ecf2e459d480631a3dbe7b77157ace8cef76b33 Mon Sep 17 00:00:00 2001
@@ -7,0 +8,3 @@
+Cc: Xueming Li <xuemingl at nvidia.com>
+
+[ upstream commit 2ef17be88e8b26f871cfb0265227341e36f486ea ]

