patch 'net/iavf: remove incorrect 16B descriptor read block' has been queued to stable release 23.11.1

Xueming Li xuemingl at nvidia.com
Sat Apr 13 14:48:26 CEST 2024


Hi,

FYI, your patch has been queued to stable release 23.11.1

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 04/15/24. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(ie: not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://git.dpdk.org/dpdk-stable/log/?h=23.11-staging

This queued commit can be viewed at:
https://git.dpdk.org/dpdk-stable/commit/?h=23.11-staging&id=72093d3d41b3a9fcad9010accc7f55e79f205cc9

Thanks.

Xueming Li <xuemingl at nvidia.com>

---
>From 72093d3d41b3a9fcad9010accc7f55e79f205cc9 Mon Sep 17 00:00:00 2001
From: Bruce Richardson <bruce.richardson at intel.com>
Date: Tue, 23 Jan 2024 11:40:50 +0000
Subject: [PATCH] net/iavf: remove incorrect 16B descriptor read block
Cc: Xueming Li <xuemingl at nvidia.com>

[ upstream commit d4ade5d02d188fcbe51871c5a5d66ef075ca0f86 ]

By default, the driver works with 32B descriptors, but has a separate
descriptor read block for reading two descriptors at a time when using
16B descriptors. However, the 32B reads used are not guaranteed to be
atomic, which will cause issues if that is not the case on a system,
since the descriptors may be read in an undefined order.  Remove the
block, to avoid issues, and just use the regular descriptor reading path
for 16B descriptors, if that support is enabled at build time.

Fixes: af0c246a3800 ("net/iavf: enable AVX2 for iavf")

Signed-off-by: Bruce Richardson <bruce.richardson at intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov at intel.com>
---
 drivers/net/iavf/iavf_rxtx_vec_avx2.c | 80 ++++++++-------------------
 1 file changed, 24 insertions(+), 56 deletions(-)

diff --git a/drivers/net/iavf/iavf_rxtx_vec_avx2.c b/drivers/net/iavf/iavf_rxtx_vec_avx2.c
index 510b4d8f1c..49d41af953 100644
--- a/drivers/net/iavf/iavf_rxtx_vec_avx2.c
+++ b/drivers/net/iavf/iavf_rxtx_vec_avx2.c
@@ -193,62 +193,30 @@ _iavf_recv_raw_pkts_vec_avx2(struct iavf_rx_queue *rxq,
 			 _mm256_loadu_si256((void *)&sw_ring[i + 4]));
 #endif
 
-		__m256i raw_desc0_1, raw_desc2_3, raw_desc4_5, raw_desc6_7;
-#ifdef RTE_LIBRTE_IAVF_16BYTE_RX_DESC
-		/* for AVX we need alignment otherwise loads are not atomic */
-		if (avx_aligned) {
-			/* load in descriptors, 2 at a time, in reverse order */
-			raw_desc6_7 = _mm256_load_si256((void *)(rxdp + 6));
-			rte_compiler_barrier();
-			raw_desc4_5 = _mm256_load_si256((void *)(rxdp + 4));
-			rte_compiler_barrier();
-			raw_desc2_3 = _mm256_load_si256((void *)(rxdp + 2));
-			rte_compiler_barrier();
-			raw_desc0_1 = _mm256_load_si256((void *)(rxdp + 0));
-		} else
-#endif
-		{
-			const __m128i raw_desc7 =
-				_mm_load_si128((void *)(rxdp + 7));
-			rte_compiler_barrier();
-			const __m128i raw_desc6 =
-				_mm_load_si128((void *)(rxdp + 6));
-			rte_compiler_barrier();
-			const __m128i raw_desc5 =
-				_mm_load_si128((void *)(rxdp + 5));
-			rte_compiler_barrier();
-			const __m128i raw_desc4 =
-				_mm_load_si128((void *)(rxdp + 4));
-			rte_compiler_barrier();
-			const __m128i raw_desc3 =
-				_mm_load_si128((void *)(rxdp + 3));
-			rte_compiler_barrier();
-			const __m128i raw_desc2 =
-				_mm_load_si128((void *)(rxdp + 2));
-			rte_compiler_barrier();
-			const __m128i raw_desc1 =
-				_mm_load_si128((void *)(rxdp + 1));
-			rte_compiler_barrier();
-			const __m128i raw_desc0 =
-				_mm_load_si128((void *)(rxdp + 0));
-
-			raw_desc6_7 =
-				_mm256_inserti128_si256
-					(_mm256_castsi128_si256(raw_desc6),
-					 raw_desc7, 1);
-			raw_desc4_5 =
-				_mm256_inserti128_si256
-					(_mm256_castsi128_si256(raw_desc4),
-					 raw_desc5, 1);
-			raw_desc2_3 =
-				_mm256_inserti128_si256
-					(_mm256_castsi128_si256(raw_desc2),
-					 raw_desc3, 1);
-			raw_desc0_1 =
-				_mm256_inserti128_si256
-					(_mm256_castsi128_si256(raw_desc0),
-					 raw_desc1, 1);
-		}
+		const __m128i raw_desc7 = _mm_load_si128((void *)(rxdp + 7));
+		rte_compiler_barrier();
+		const __m128i raw_desc6 = _mm_load_si128((void *)(rxdp + 6));
+		rte_compiler_barrier();
+		const __m128i raw_desc5 = _mm_load_si128((void *)(rxdp + 5));
+		rte_compiler_barrier();
+		const __m128i raw_desc4 = _mm_load_si128((void *)(rxdp + 4));
+		rte_compiler_barrier();
+		const __m128i raw_desc3 = _mm_load_si128((void *)(rxdp + 3));
+		rte_compiler_barrier();
+		const __m128i raw_desc2 = _mm_load_si128((void *)(rxdp + 2));
+		rte_compiler_barrier();
+		const __m128i raw_desc1 = _mm_load_si128((void *)(rxdp + 1));
+		rte_compiler_barrier();
+		const __m128i raw_desc0 = _mm_load_si128((void *)(rxdp + 0));
+
+		const __m256i raw_desc6_7 =
+			_mm256_inserti128_si256(_mm256_castsi128_si256(raw_desc6), raw_desc7, 1);
+		const __m256i raw_desc4_5 =
+			_mm256_inserti128_si256(_mm256_castsi128_si256(raw_desc4), raw_desc5, 1);
+		const __m256i raw_desc2_3 =
+			_mm256_inserti128_si256(_mm256_castsi128_si256(raw_desc2), raw_desc3, 1);
+		const __m256i raw_desc0_1 =
+			_mm256_inserti128_si256(_mm256_castsi128_si256(raw_desc0), raw_desc1, 1);
 
 		if (split_packet) {
 			int j;
-- 
2.34.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2024-04-13 20:43:05.929289815 +0800
+++ 0026-net-iavf-remove-incorrect-16B-descriptor-read-block.patch	2024-04-13 20:43:04.937754010 +0800
@@ -1 +1 @@
-From d4ade5d02d188fcbe51871c5a5d66ef075ca0f86 Mon Sep 17 00:00:00 2001
+From 72093d3d41b3a9fcad9010accc7f55e79f205cc9 Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li <xuemingl at nvidia.com>
+
+[ upstream commit d4ade5d02d188fcbe51871c5a5d66ef075ca0f86 ]
@@ -15 +17,0 @@
-Cc: stable at dpdk.org


More information about the stable mailing list