[PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path

Ruifeng Wang Ruifeng.Wang at arm.com
Thu Feb 10 07:24:50 CET 2022


Ping.
Could you please help review this patch?

Thanks.
Ruifeng

> -----Original Message-----
> From: Ruifeng Wang <ruifeng.wang at arm.com>
> Sent: Tuesday, January 4, 2022 11:01 AM
> To: matan at nvidia.com; viacheslavo at nvidia.com
> Cc: dev at dpdk.org; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; stable at dpdk.org; nd <nd at arm.com>;
> Ruifeng Wang <Ruifeng.Wang at arm.com>
> Subject: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
> 
> In the NEON vector PMD, a vector load fetches two contiguous 8B words of
> descriptor data into a vector register. Since the vector load does not
> guarantee 16B atomicity, the read of the word that includes the op_own
> field could be reordered after the read of the other words. In this case,
> some words could contain invalid data.
> 
> Reload qword0 after the read barrier to update the vector register. This
> ensures that the fetched data is correct.
> 
> A single-core testpmd test on N1SDP and ThunderX2 showed no performance
> drop.
> 
> Fixes: 1742c2d9fab0 ("net/mlx5: fix synchronization on polling Rx
> completions")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
> ---
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index b1d16baa61..b1ec615b51 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -647,6 +647,14 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq,
> volatile struct mlx5_cqe *cq,
>  		c0 = vld1q_u64((uint64_t *)(p0 + 48));
>  		/* Synchronize for loading the rest of blocks. */
>  		rte_io_rmb();
> +		/* B.0 (CQE 3) reload lower half of the block. */
> +		c3 = vld1q_lane_u64((uint64_t *)(p3 + 48), c3, 0);
> +		/* B.0 (CQE 2) reload lower half of the block. */
> +		c2 = vld1q_lane_u64((uint64_t *)(p2 + 48), c2, 0);
> +		/* B.0 (CQE 1) reload lower half of the block. */
> +		c1 = vld1q_lane_u64((uint64_t *)(p1 + 48), c1, 0);
> +		/* B.0 (CQE 0) reload lower half of the block. */
> +		c0 = vld1q_lane_u64((uint64_t *)(p0 + 48), c0, 0);
>  		/* Prefetch next 4 CQEs. */
>  		if (pkts_n - pos >= 2 * MLX5_VPMD_DESCS_PER_LOOP) {
>  			unsigned int next = pos +
> MLX5_VPMD_DESCS_PER_LOOP;
> --
> 2.25.1
