[dpdk-stable] patch 'net/i40e: fix risk in descriptor read in NEON Rx' has been queued to stable release 20.11.4

Xueming Li xuemingl at nvidia.com
Wed Nov 10 07:30:50 CET 2021


Hi,

FYI, your patch has been queued to stable release 20.11.4

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/12/21. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e., not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/steevenlee/dpdk

This queued commit can be viewed at:
https://github.com/steevenlee/dpdk/commit/364b772782cfb891c250376d6ce020dc13a508d5

Thanks.

Xueming Li <xuemingl at nvidia.com>

---
From 364b772782cfb891c250376d6ce020dc13a508d5 Mon Sep 17 00:00:00 2001
From: Ruifeng Wang <ruifeng.wang at arm.com>
Date: Wed, 15 Sep 2021 16:33:38 +0800
Subject: [PATCH] net/i40e: fix risk in descriptor read in NEON Rx
Cc: Xueming Li <xuemingl at nvidia.com>

[ upstream commit 778602fe570a138224de94a38eca3ce2e344138c ]

An Rx descriptor is 16B or 32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words hold valid values. Hence, the
word containing the DD bit must be read before the rest of
the descriptor words.

In the NEON vector PMD, a vector load reads two contiguous 8B words of
descriptor data into a vector register. Because a vector load does not
guarantee 16B atomicity, the read of the word that includes the DD field
could be reordered after the reads of the other words. In that case, some
words could contain invalid data.

A read barrier is added after the read of qword1, which includes the DD
field, and qword0 is then reloaded to update the vector register. This
ensures that the fetched data is correct.
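
The ordering described above can be illustrated with a minimal scalar
sketch in portable C11. It is not the driver code: the struct layout, the
DD bit position, and every demo_* name are assumptions made up for
illustration, and C11 atomics stand in for the NEON loads and for
rte_atomic_thread_fence() used in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16B descriptor; the real i40e layout differs. */
struct demo_rx_desc {
	_Atomic uint64_t qword0; /* payload word, written before DD is set */
	_Atomic uint64_t qword1; /* status word; bit 0 plays the DD role here */
};

/* Assumed DD bit position, chosen only for this sketch. */
#define DEMO_DD_BIT UINT64_C(1)

/*
 * Read the status word first, then fence, then read the payload word.
 * Returns true and stores qword0 in *out_qword0 only when DD is set.
 */
static bool
demo_read_desc(struct demo_rx_desc *d, uint64_t *out_qword0)
{
	uint64_t status;

	/* Step 1: load the word that carries the DD bit. */
	status = atomic_load_explicit(&d->qword1, memory_order_relaxed);
	if (!(status & DEMO_DD_BIT))
		return false;

	/* Step 2: acquire fence keeps the later load from moving before step 1. */
	atomic_thread_fence(memory_order_acquire);

	/* Step 3: load (or reload) the payload word after the fence. */
	*out_qword0 = atomic_load_explicit(&d->qword0, memory_order_relaxed);
	return true;
}
```

In the patch itself the same idea appears as a full vector load, an
acquire fence, and a lane reload of qword0 for each of the four
descriptors, so only the first 8B of each vector register is refreshed.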

Testpmd single core test on N1SDP/ThunderX2 showed no performance drop.

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")

Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 0df315b162..67b88e64ec 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -297,6 +297,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
 		descs[1] =  vld1q_u64((uint64_t *)(rxdp + 1));
 		descs[0] =  vld1q_u64((uint64_t *)(rxdp));
 
+		/* Use acquire fence to order loads of descriptor qwords */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+		/* A.2 reload qword0 to make it ordered after qword1 load */
+		descs[3] = vld1q_lane_u64((uint64_t *)(rxdp + 3), descs[3], 0);
+		descs[2] = vld1q_lane_u64((uint64_t *)(rxdp + 2), descs[2], 0);
+		descs[1] = vld1q_lane_u64((uint64_t *)(rxdp + 1), descs[1], 0);
+		descs[0] = vld1q_lane_u64((uint64_t *)(rxdp), descs[0], 0);
+
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);
 
-- 
2.33.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2021-11-10 14:17:09.315002961 +0800
+++ 0166-net-i40e-fix-risk-in-descriptor-read-in-NEON-Rx.patch	2021-11-10 14:17:01.977411885 +0800
@@ -1 +1 @@
-From 778602fe570a138224de94a38eca3ce2e344138c Mon Sep 17 00:00:00 2001
+From 364b772782cfb891c250376d6ce020dc13a508d5 Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li <xuemingl at nvidia.com>
+
+[ upstream commit 778602fe570a138224de94a38eca3ce2e344138c ]
@@ -24 +26,0 @@
-Cc: stable at dpdk.org
@@ -33 +35 @@
-index b2683fda60..71191c7cc8 100644
+index 0df315b162..67b88e64ec 100644
@@ -36 +38 @@
-@@ -286,6 +286,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
+@@ -297,6 +297,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
@@ -48,3 +50,3 @@
- 		/* B.1 load 4 mbuf point */
- 		mbp1 = vld1q_u64((uint64_t *)&sw_ring[pos]);
- 		mbp2 = vld1q_u64((uint64_t *)&sw_ring[pos + 2]);
+ 		/* B.2 copy 2 mbuf point into rx_pkts  */
+ 		vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);
+ 