[dpdk-dev] [PATCH] net/mlx5: revert mbuf address calculation for x86

Kevin Traynor ktraynor at redhat.com
Wed Mar 27 12:51:30 CET 2019


On 25/03/2019 19:13, Yongseok Koh wrote:
> When replenishing mbufs on Rx, buffer address (mbuf->buf_addr) should be
> loaded. non-x86 processors (mostly RISC such as ARM and Power) are more
> vulnerable to load stall. For x86, reducing the number of instructions
> seems to matter most.
> 
> For x86, this is simply a load but for other architectures, it is
> calculated from the address of mbuf structure by rte_mbuf_buf_addr()
> without having to load the first cacheline of the mbuf.
> 

Hi Yongseok,

> Fixes: 12d468a62bc1 ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")

A similar backport was just added to 18.11.1-RC2. Should it be
reverted? I'm not keen to put another fix for it into 18.11.1 at this
stage; I think it can be part of 18.11.2. WDYT?

thanks,
Kevin.

> Cc: stable at dpdk.org
> 
> Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_rxtx_vec.h | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index 5df8e291e6..4220b08dd2 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -102,9 +102,21 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n)
>  		return;
>  	}
>  	for (i = 0; i < n; ++i) {
> -		void *buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
> +		void *buf_addr;
>  
> +		/*
> +		 * Load the virtual address for Rx WQE. non-x86 processors
> +		 * (mostly RISC such as ARM and Power) are more vulnerable to
> +		 * load stall. For x86, reducing the number of instructions
> +		 * seems to matter most.
> +		 */
> +#ifdef RTE_ARCH_X86_64
> +		buf_addr = elts[i]->buf_addr;
> +		assert(buf_addr == rte_mbuf_buf_addr(elts[i], rxq->mp));
> +#else
> +		buf_addr = rte_mbuf_buf_addr(elts[i], rxq->mp);
>  		assert(buf_addr == elts[i]->buf_addr);
> +#endif
>  		wq[i].addr = rte_cpu_to_be_64((uintptr_t)buf_addr +
>  					      RTE_PKTMBUF_HEADROOM);
>  		/* If there's only one MR, no need to replace LKey in WQE. */
> 
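
For readers following the thread, here is a minimal, DPDK-free sketch of the
two ways of obtaining the buffer address that the patch selects between. It
assumes the usual direct-mbuf layout (header, then private area, then data
buffer). The toy_* names and struct layouts are hypothetical stand-ins, not
the real rte_mbuf/rte_mempool definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for rte_mempool / rte_mbuf; not the DPDK layouts. */
struct toy_pool {
	uint16_t priv_size; /* private area between header and data buffer */
	uint16_t data_room; /* size of the data buffer */
};

struct toy_mbuf {
	void *buf_addr;     /* stored pointer; reading it touches the header */
	uint16_t data_off;
};

/* Derive the buffer address arithmetically, without reading *mb. */
static inline void *
toy_mbuf_buf_addr(struct toy_mbuf *mb, const struct toy_pool *mp)
{
	return (char *)mb + sizeof(*mb) + mp->priv_size;
}

/* Allocate one object laid out as: header | private area | data buffer. */
static struct toy_mbuf *
toy_mbuf_alloc(const struct toy_pool *mp)
{
	struct toy_mbuf *mb;

	mb = malloc(sizeof(*mb) + mp->priv_size + mp->data_room);
	if (mb == NULL)
		return NULL;
	mb->buf_addr = (char *)mb + sizeof(*mb) + mp->priv_size;
	mb->data_off = 0;
	return mb;
}

/*
 * Pick the address by load or by calculation, mirroring the #ifdef in the
 * patch, and cross-check against the other method as the patch's assert does.
 */
static void *
toy_rx_buf_addr(struct toy_mbuf *mb, const struct toy_pool *mp, int prefer_load)
{
	void *loaded = mb->buf_addr;
	void *computed = toy_mbuf_buf_addr(mb, mp);

	assert(loaded == computed);
	return prefer_load ? loaded : computed;
}
```

Calling toy_rx_buf_addr() with prefer_load set to 1 or 0 should return the
same pointer for an mbuf from toy_mbuf_alloc(). The calculation only holds
while the buffer sits directly behind the header and private area, which is
why both branches of the patch keep an assert against the other method.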


