patch 'net/bnxt: fix reordering in NEON Rx' has been queued to stable release 20.11.6

Xueming Li xuemingl at nvidia.com
Tue Jun 21 10:01:48 CEST 2022


Hi,

FYI, your patch has been queued to stable release 20.11.6

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 06/23/22. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(ie: not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/steevenlee/dpdk

This queued commit can be viewed at:
https://github.com/steevenlee/dpdk/commit/1dc7ed3e26af939e3138183affef017962d613e4

Thanks.

Xueming Li <xuemingl at nvidia.com>

---
>From 1dc7ed3e26af939e3138183affef017962d613e4 Mon Sep 17 00:00:00 2001
From: Ruifeng Wang <ruifeng.wang at arm.com>
Date: Wed, 13 Apr 2022 18:31:56 +0800
Subject: [PATCH] net/bnxt: fix reordering in NEON Rx
Cc: Xueming Li <xuemingl at nvidia.com>

[ upstream commit e7f2effc9220dc5d71b0bb550bcc903badc7bac4 ]

Rx descriptor contains a valid bit which indicates readiness of the rest
of descriptor words. Hence, the word contains valid bit must be read
prior to other words.

In NEON vector path, two contiguous 8B descriptor are loaded to a single
NEON register. Given vector load ensures no 16B atomicity, read of the
word that includes valid bit could be reordered after read of other words.
In this case, data could be invalid.

Reloaded lower 64b after read barrier. This ensures what fetched is
correct.

Also fixed comments that not pertains to Arm platform architecture.

Fixes: deae85145c64 ("net/bnxt: handle multiple packets per loop in vector Rx")

Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde at broadcom.com>
---
 drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c
index 3cb94926fe..858e91bb9d 100644
--- a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c
+++ b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c
@@ -225,25 +225,38 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		}
 
 		/*
-		 * Load the four current descriptors into SSE registers in
-		 * reverse order to ensure consistent state.
+		 * Load the four current descriptors into NEON registers.
+		 * IO barriers are used to ensure consistent state.
 		 */
 		rxcmp1[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 7]);
 		rte_io_rmb();
+		/* Reload lower 64b of descriptors to make it ordered after info3_v. */
+		rxcmp1[3] = vreinterpretq_u32_u64(vld1q_lane_u64
+				((void *)&cpr->cp_desc_ring[cons + 7],
+				vreinterpretq_u64_u32(rxcmp1[3]), 0));
 		rxcmp[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 6]);
 
 		rxcmp1[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 5]);
 		rte_io_rmb();
+		rxcmp1[2] = vreinterpretq_u32_u64(vld1q_lane_u64
+				((void *)&cpr->cp_desc_ring[cons + 5],
+				vreinterpretq_u64_u32(rxcmp1[2]), 0));
 		rxcmp[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 4]);
 
 		t1 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[2], rxcmp1[3]));
 
 		rxcmp1[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 3]);
 		rte_io_rmb();
+		rxcmp1[1] = vreinterpretq_u32_u64(vld1q_lane_u64
+				((void *)&cpr->cp_desc_ring[cons + 3],
+				vreinterpretq_u64_u32(rxcmp1[1]), 0));
 		rxcmp[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 2]);
 
 		rxcmp1[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 1]);
 		rte_io_rmb();
+		rxcmp1[0] = vreinterpretq_u32_u64(vld1q_lane_u64
+				((void *)&cpr->cp_desc_ring[cons + 1],
+				vreinterpretq_u64_u32(rxcmp1[0]), 0));
 		rxcmp[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 0]);
 
 		t0 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[0], rxcmp1[1]));
-- 
2.35.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2022-06-21 15:37:51.232399478 +0800
+++ 0042-net-bnxt-fix-reordering-in-NEON-Rx.patch	2022-06-21 15:37:49.037784585 +0800
@@ -1 +1 @@
-From e7f2effc9220dc5d71b0bb550bcc903badc7bac4 Mon Sep 17 00:00:00 2001
+From 1dc7ed3e26af939e3138183affef017962d613e4 Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li <xuemingl at nvidia.com>
+
+[ upstream commit e7f2effc9220dc5d71b0bb550bcc903badc7bac4 ]
@@ -21 +23,0 @@
-Cc: stable at dpdk.org
@@ -30 +32 @@
-index 779e23ac4f..32f8e59b3a 100644
+index 3cb94926fe..858e91bb9d 100644
@@ -33 +35 @@
-@@ -231,25 +231,38 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+@@ -225,25 +225,38 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)


More information about the stable mailing list