[PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path

Ruifeng Wang Ruifeng.Wang at arm.com
Wed Jun 29 13:41:29 CEST 2022


> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo at nvidia.com>
> Sent: Wednesday, June 29, 2022 3:55 PM
> To: Ruifeng Wang <Ruifeng.Wang at arm.com>; Ali Alnubani
> <alialnu at nvidia.com>; Matan Azrad <matan at nvidia.com>
> Cc: dev at dpdk.org; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; stable at dpdk.org; nd <nd at arm.com>;
> nd <nd at arm.com>
> Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector
> path
> 
> Hi, Ruifeng
> 
> > -----Original Message-----
> > From: Ruifeng Wang <Ruifeng.Wang at arm.com>
> > Sent: Monday, June 27, 2022 14:08
> > To: Slava Ovsiienko <viacheslavo at nvidia.com>; Ali Alnubani
> > <alialnu at nvidia.com>; Matan Azrad <matan at nvidia.com>
> > Cc: dev at dpdk.org; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>;
> > stable at dpdk.org; nd <nd at arm.com>; nd <nd at arm.com>
> > Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON
> > vector path
> >
> > > -----Original Message-----
> > > From: Slava Ovsiienko <viacheslavo at nvidia.com>
> > > Sent: Monday, June 20, 2022 1:38 PM
> > > To: Ali Alnubani <alialnu at nvidia.com>; Ruifeng Wang
> > > <Ruifeng.Wang at arm.com>; Matan Azrad <matan at nvidia.com>
> > > Cc: dev at dpdk.org; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli at arm.com>; stable at dpdk.org; nd <nd at arm.com>
> > > Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in
> > > NEON vector path
> > >
> > > Hi, Ruifeng
> >
> > Hi Slava,
> >
> > Thanks for your review.
> > >
> > > My apologies for review delay.
> >
> > Apologies too. I was on something else.
> >
> > > As far I understand the hypothetical problem scenario is:
> > > - CPU core reorders reading of qwords of 16B vector
> > > - core reads the second 8B of CQE (old CQE values)
> > > - CQE update
> > > - core reads the first 8B of CQE (new CQE values)
> >
> > Yes, This is the problem.
> > >
> > > How the re-reading of CQEs can resolve the issue?
> > > This wrong scenario might happen on the second read and we would run
> > > into the same issue.
> >
> > Here we are trying to ordering reading of a 16B vector (8B with op_own
> > - high, and 8B without op_own - low).
> > The first read will load 16B. The second read will load and update low
> > 8B (no op_own).
> OK, I got the point, thank you for the explanations.
> Can we avoid the first reading of low 8B (no containing CQE owning field)?
> 
> I mean to update this part to read only upper 8Bs:
>                 /* B.0 (CQE 3) load a block having op_own. */
>                 c3 = vld1q_u64((uint64_t *)(p3 + 48));
>                 /* B.0 (CQE 2) load a block having op_own. */
>                 c2 = vld1q_u64((uint64_t *)(p2 + 48));
>                 /* B.0 (CQE 1) load a block having op_own. */
>                 c1 = vld1q_u64((uint64_t *)(p1 + 48));
>                 /* B.0 (CQE 0) load a block having op_own. */
>                 c0 = vld1q_u64((uint64_t *)(p0 + 48));
>                 /* Synchronize for loading the rest of blocks. */
>                 rte_io_rmb();
> 
> Because lower 8Bs will be overlapped with the second read (in your patch)
> and barrier ensures the correct order.

Hi Slava,

Yes, your suggestion is valid.
Actually, I tried that approach: load higher 8B + barrier + load lower 8B + combine the two 8Bs into a vector.
It also has no observable performance impact but generates more instructions compared to the current patch (the 'combine' operation).
So I followed current approach. 

Thanks.
> 
> 
> With best regards,
> Slava


More information about the stable mailing list