[dpdk-stable] patch 'net/mlx5: fix instruction hotspot on replenishing Rx buffer' has been queued to LTS release 17.11.6

Yongseok Koh yskoh at mellanox.com
Fri Mar 8 18:47:23 CET 2019


Hi,

FYI, your patch has been queued to LTS release 17.11.6

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections by 03/13/19, so please
shout if you have any objection.

Also note that after the patch there's a diff of the upstream commit vs the patch applied
to the branch. If the code is different (i.e. not only metadata diffs), for example due to
a change in context or macro names, please double-check it.

Thanks.

Yongseok

---
From 11c39d0780379ffd44ba81c9f0f93bd15d4505c8 Mon Sep 17 00:00:00 2001
From: Yongseok Koh <yskoh at mellanox.com>
Date: Mon, 14 Jan 2019 13:16:22 -0800
Subject: [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer

[ backported from upstream commit 12d468a62bc19ca08ee9964dcb923f67f87fba7d ]

On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")

Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
Acked-by: Shahaf Shuler <shahafs at mellanox.com>
---
 drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index 750559b8d..59ae83b56 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -116,8 +116,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
-					      RTE_PKTMBUF_HEADROOM);
+		uintptr_t buf_addr =
+			(uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
+			rte_pktmbuf_priv_size(rxq->mp);
+
+		assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64(buf_addr + RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKEY in WQEs. */
 		if (unlikely(!IS_SINGLE_MR(rxq->mr_ctrl.bh_n)))
 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
-- 
2.11.0
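
For readers unfamiliar with the layout the commit message relies on, here is a
minimal standalone sketch of the idea, not the driver code. It assumes each
element is laid out contiguously as header, then private area, then data
buffer, which is what the patch's assert() checks for mbufs taken from rxq->mp.
The names toy_mbuf, toy_mbuf_alloc and toy_mbuf_calc_buf_addr are hypothetical
stand-ins and are not part of DPDK.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_mbuf; only the field of interest. */
struct toy_mbuf {
	void *buf_addr;   /* the pointer the old code loaded from memory */
	uint64_t pad[15]; /* filler so the header has a realistic size */
};

/*
 * Allocate one element laid out as [header][private area][data buffer]
 * and record the buffer start in the header, as a pool initializer would.
 */
static struct toy_mbuf *
toy_mbuf_alloc(size_t priv_size, size_t buf_len)
{
	struct toy_mbuf *m = malloc(sizeof(*m) + priv_size + buf_len);

	if (m == NULL)
		return NULL;
	m->buf_addr = (char *)m + sizeof(*m) + priv_size;
	return m;
}

/*
 * Derive the buffer address from the header address alone.
 * Nothing is read through m, so no load from the header is needed.
 */
static uintptr_t
toy_mbuf_calc_buf_addr(const struct toy_mbuf *m, size_t priv_size)
{
	return (uintptr_t)m + sizeof(*m) + priv_size;
}
```

The computed value matches the stored pointer only while the buffer really
follows the header and private area in the same allocation; a buffer attached
from elsewhere would break that assumption, which is why the patch keeps the
assert() as a debug-time check.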

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2019-03-08 09:46:42.476434526 -0800
+++ 0045-net-mlx5-fix-instruction-hotspot-on-replenishing-Rx-.patch	2019-03-08 09:46:40.220409000 -0800
@@ -1,38 +1,42 @@
-From 12d468a62bc19ca08ee9964dcb923f67f87fba7d Mon Sep 17 00:00:00 2001
+From 11c39d0780379ffd44ba81c9f0f93bd15d4505c8 Mon Sep 17 00:00:00 2001
 From: Yongseok Koh <yskoh at mellanox.com>
 Date: Mon, 14 Jan 2019 13:16:22 -0800
 Subject: [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer
 
+[ backported from upstream commit 12d468a62bc19ca08ee9964dcb923f67f87fba7d ]
+
 On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
 to be accessed as it is static and easily calculated from the mbuf address.
 Accessing the mbuf content causes unnecessary load stall and it is worsened
 on ARM.
 
 Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
-Cc: stable at dpdk.org
 
 Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
 Acked-by: Shahaf Shuler <shahafs at mellanox.com>
 ---
- drivers/net/mlx5/mlx5_rxtx_vec.h | 5 ++++-
- 1 file changed, 4 insertions(+), 1 deletion(-)
+ drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
-index fda7004e2..5df8e291e 100644
+index 750559b8d..59ae83b56 100644
 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
 +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
-@@ -102,7 +102,10 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
+@@ -116,8 +116,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
  		return;
  	}
  	for (i = 0; i < n; ++i) {
 -		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
-+		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
+-					      RTE_PKTMBUF_HEADROOM);
++		uintptr_t buf_addr =
++			(uintptr_t)elts[i] + sizeof(struct rte_mbuf) +
++			rte_pktmbuf_priv_size(rxq->mp);
 +
-+		assert(buf_addr == elts[i]->buf_addr);
-+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
- 					      RTE_PKTMBUF_HEADROOM);
- 		/* If there's only one MR, no need to replace LKey in WQE. */
- 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))
++		assert(buf_addr == (uintptr_t)elts[i]->buf_addr);
++		wq[i].addr = rte_cpu_to_be_64(buf_addr + RTE_PKTMBUF_HEADROOM);
+ 		/* If there's only one MR, no need to replace LKEY in WQEs. */
+ 		if (unlikely(!IS_SINGLE_MR(rxq->mr_ctrl.bh_n)))
+ 			wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]);
 -- 
 2.11.0
 

