[PATCH 19.11] net/i40e: fix risk in descriptor read in NEON Rx

Christian Ehrhardt christian.ehrhardt at canonical.com
Wed Dec 1 11:31:24 CET 2021


On Wed, Dec 1, 2021 at 8:48 AM Ruifeng Wang <ruifeng.wang at arm.com> wrote:
>
> [ upstream commit 778602fe570a138224de94a38eca3ce2e344138c ]
>

Thanks, applied

> An Rx descriptor is 16B or 32B in size. If the DD bit is set, it
> indicates that the rest of the descriptor words hold valid values.
> Hence, the word containing the DD bit must be read before the rest
> of the descriptor words.
>
> In the NEON vector PMD, a vector load reads two contiguous 8B words
> of descriptor data into a vector register. Because a vector load does
> not guarantee 16B atomicity, the read of the word that includes the DD
> field could be reordered after the reads of the other words. In that
> case, some words could contain invalid data.
>
> A read barrier is added after the read of qword1, which includes the
> DD field, and qword0 is then reloaded to update the vector register.
> This ensures that the fetched data is correct.
>
> A testpmd single-core test on N1SDP/ThunderX2 showed no performance drop.
>
> Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
>
> Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> ---
>  drivers/net/i40e/i40e_rxtx_vec_neon.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> index bd1e0490d..0da6b37da 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> @@ -299,6 +299,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                 /* B.2 copy 2 mbuf point into rx_pkts  */
>                 vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);
>
> +               /* Use acquire fence to order loads of descriptor qwords */
> +               __atomic_thread_fence(__ATOMIC_ACQUIRE);
> +               /* A.2 reload qword0 to make it ordered after qword1 load */
> +               descs[3] = vld1q_lane_u64((uint64_t *)(rxdp + 3), descs[3], 0);
> +               descs[2] = vld1q_lane_u64((uint64_t *)(rxdp + 2), descs[2], 0);
> +               descs[1] = vld1q_lane_u64((uint64_t *)(rxdp + 1), descs[1], 0);
> +               descs[0] = vld1q_lane_u64((uint64_t *)(rxdp), descs[0], 0);
> +
>                 if (split_packet) {
>                         rte_mbuf_prefetch_part2(rx_pkts[pos]);
>                         rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> --
> 2.25.1
>
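
For readers following the ordering argument in the commit message, here is a
minimal scalar sketch of the same pattern, not the driver code itself. The
names demo_rx_desc, DEMO_DD_BIT and demo_read_desc are hypothetical, the DD
bit position is an assumption made for illustration, and plain 8B loads stand
in for the 16B NEON vector load. It copies both words, issues an acquire
fence, then reloads qword0 so that the value kept was read after qword1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16B descriptor layout: qword1 carries the DD bit. */
struct demo_rx_desc {
	uint64_t qword0;
	uint64_t qword1;
};

/* Assumed DD bit position, for illustration only. */
#define DEMO_DD_BIT UINT64_C(1)

/*
 * Copy a descriptor so that the qword0 value kept in 'out' was loaded
 * after qword1. Returns true if the DD bit was set in the qword1 copy.
 */
static bool
demo_read_desc(struct demo_rx_desc *hw, struct demo_rx_desc *out)
{
	/*
	 * Initial copy: stand-in for the vector load, whose two 8B
	 * halves may be observed in either order.
	 */
	out->qword0 = __atomic_load_n(&hw->qword0, __ATOMIC_RELAXED);
	out->qword1 = __atomic_load_n(&hw->qword1, __ATOMIC_RELAXED);

	/* Order the qword1 load above before the reload below. */
	__atomic_thread_fence(__ATOMIC_ACQUIRE);

	/* Reload qword0 so it is ordered after the qword1 load. */
	out->qword0 = __atomic_load_n(&hw->qword0, __ATOMIC_RELAXED);

	return (out->qword1 & DEMO_DD_BIT) != 0;
}
```

A caller would use the qword0 copy only when the function returns true. The
patch applies the same idea per lane with vld1q_lane_u64 across four
descriptors, which keeps the vectorized loop intact.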


-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd

