[dpdk-stable] [dpdk-dev] [PATCH] net/mlx5: revert mbuf address calculation for x86

Yongseok Koh yskoh at mellanox.com
Wed Mar 27 23:21:19 CET 2019


> On Mar 27, 2019, at 4:51 AM, Kevin Traynor <ktraynor at redhat.com> wrote:
> 
> On 25/03/2019 19:13, Yongseok Koh wrote:
>> When replenishing mbufs on Rx, buffer address (mbuf->buf_addr) should be
>> loaded. non-x86 processors (mostly RISC such as ARM and Power) are more
>> vulnerable to load stall. For x86, reducing the number of instructions
>> seems to matter most.
>> 
>> For x86, this is simply a load but for other architectures, it is
>> calculated from the address of mbuf structure by rte_mbuf_buf_addr()
>> without having to load the first cacheline of the mbuf.
>> 
> 
> Hi Yongseok,
> 
>> Fixes: 12d468a62bc1 ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")
> 
> A similar backport was just added into 18.11.1-RC2, should it be
> reverted? I'm not keen to put another fix for it in for 18.11.1 at this
> stage, I think it can be part of 18.11.2. WDYT?

I spoke with Kevin and we decided to drop the old fix.
I have also dropped it from 17.11.6-rc1.

This new fix will be merged to 18.11.2.
I'll merge it to 17.11.6 (or 17.11.7) if it is merged in the master.


thanks,
Yongseok

>> Cc: stable at dpdk.org
>> 
>> Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
>> ---
>> drivers/net/mlx5/mlx5_rxtx_vec.h | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
>> index 5df8e291e6..4220b08dd2 100644
>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
>> @@ -102,9 +102,21 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
>> 		return;
>> 	}
>> 	for (i = 0; i < n; ++i) {
>> -		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
>> +		void *buf_addr;
>> 
>> +		/*
>> +		 * Load the virtual address for Rx WQE. non-x86 processors
>> +		 * (mostly RISC such as ARM and Power) are more vulnerable to
>> +		 * load stall. For x86, reducing the number of instructions
>> +		 * seems to matter most.
>> +		 */
>> +#ifdef RTE_ARCH_X86_64
>> +		buf_addr = elts[i]->buf_addr;
>> +		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
>> +#else
>> +		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
>> 		assert(buf_addr == elts[i]->buf_addr);
>> +#endif
>> 		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
>> 					      RTE_PKTMBUF_HEADROOM);
>> 		/* If there's only one MR, no need to replace LKey in WQE. */
>> 
> 



More information about the stable mailing list