[v5,2/2] net/mlx5: fix instruction hotspot on replenishing Rx buffer

Message ID 20190114211622.6900-2-yskoh@mellanox.com (mailing list archive)
State Accepted, archived
Headers
Series None |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Yongseok Koh Jan. 14, 2019, 9:16 p.m. UTC
  On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---

v5:
* no change

v4:
* no change

v3:
* rte_mbuf_buf_addr_default() -> rte_mbuf_buf_addr()

v2:
* use the newly introduced API - rte_mbuf_buf_addr_default()
* fix error in assert

 drivers/net/mlx5/mlx5_rxtx_vec.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
  

Comments

Kevin Traynor Feb. 6, 2019, 3:54 p.m. UTC | #1
On 01/14/2019 09:16 PM, Yongseok Koh wrote:
> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
> to be accessed as it is static and easily calculated from the mbuf address.
> Accessing the mbuf content causes unnecessary load stall and it is worsened
> on ARM.
> 
> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
> Cc: stable@dpdk.org
> 

This is using the API introduced in 1/2, so it's not really suitable for
backport. Maybe you want to send an alternative for stable?

> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
> ---
  
Kevin Traynor Feb. 21, 2019, 7:10 p.m. UTC | #2
Hi Yongseok,

Can you let me know how you want to proceed with the below. I think we
could just drop as it's a performance optimization, or maybe you have a
different idea.

thanks,
Kevin.

On 06/02/2019 15:54, Kevin Traynor wrote:
> On 01/14/2019 09:16 PM, Yongseok Koh wrote:
>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
>> to be accessed as it is static and easily calculated from the mbuf address.
>> Accessing the mbuf content causes unnecessary load stall and it is worsened
>> on ARM.
>>
>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
>> Cc: stable@dpdk.org
>>
> 
> This is using the API introduced in 1/2, so it's not really suitable for
> backport. Maybe you want to send an alternative for stable?
> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
>> ---
  
Yongseok Koh March 8, 2019, 2:05 a.m. UTC | #3
Oops, I missed this email somehow,
probably out of mind during my long vacation. :-)

I've also encountered this issue with 17.11.6.
http://git.dpdk.org/dpdk-stable/commit/?h=17.11&id=63f06f3fccc87c55adb33248e5c68a0175d213f1

I'll send a backport to you.


Sorry for late reply.
Yongseok

> On Feb 21, 2019, at 11:10 AM, Kevin Traynor <ktraynor@redhat.com> wrote:
> 
> Hi Yongseok,
> 
> Can you let me know how you want to proceed with the below. I think we
> could just drop as it's a performance optimization, or maybe you have a
> different idea.
> 
> thanks,
> Kevin.
> 
> On 06/02/2019 15:54, Kevin Traynor wrote:
>> On 01/14/2019 09:16 PM, Yongseok Koh wrote:
>>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
>>> to be accessed as it is static and easily calculated from the mbuf address.
>>> Accessing the mbuf content causes unnecessary load stall and it is worsened
>>> on ARM.
>>> 
>>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
>>> Cc: stable@dpdk.org
>>> 
>> 
>> This is using the API introduced in 1/2, so it's not really suitable for
>> backport. Maybe you want to send an alternative for stable?
>> 
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
>>> ---
>
  

Patch

diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
index fda7004e2d..5df8e291e6 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
@@ -102,7 +102,10 @@  mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
 		return;
 	}
 	for (i = 0; i < n; ++i) {
-		wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr +
+		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
+
+		assert(buf_addr == elts[i]->buf_addr);
+		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
 					      RTE_PKTMBUF_HEADROOM);
 		/* If there's only one MR, no need to replace LKey in WQE. */
 		if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1))