[PATCH 2/2] ethdev: fix race condition in fast-path ops setup

Ruifeng Wang Ruifeng.Wang at arm.com
Tue Feb 21 08:24:19 CET 2023


> -----Original Message-----
> From: fengchengwen <fengchengwen at huawei.com>
> Sent: Monday, February 20, 2023 2:58 PM
> To: Ashok Kaladi <ashok.k.kaladi at intel.com>; jerinj at marvell.com; thomas at monjalon.net
> Cc: dev at dpdk.org; s.v.naga.harish.k at intel.com; erik.g.carrillo at intel.com;
> abhinandan.gujjar at intel.com; stable at dpdk.org; Ruifeng Wang <Ruifeng.Wang at arm.com>
> Subject: Re: [PATCH 2/2] ethdev: fix race condition in fast-path ops setup
> 
> On 2023/2/20 14:08, Ashok Kaladi wrote:
> > If ethdev enqueue or dequeue function is called during
> > eth_dev_fp_ops_setup(), it may get pre-empted after setting the
> > function pointers, but before setting the pointer to port data.
> > In this case the newly registered enqueue/dequeue function will use
> > dummy port data and end up in seg fault.
> >
> > This patch moves the update of each data pointer ahead of the update of the
> > corresponding function pointer.
> >
> > Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate
> > structure")
> > Cc: stable at dpdk.org
> >
> > Signed-off-by: Ashok Kaladi <ashok.k.kaladi at intel.com>
> >
> > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> > index 48090c879a..a0232c669f 100644
> > --- a/lib/ethdev/ethdev_private.c
> > +++ b/lib/ethdev/ethdev_private.c
> > @@ -270,17 +270,17 @@ void
> >  eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
> >  		const struct rte_eth_dev *dev)
> >  {
> > +	fpo->rxq.data = dev->data->rx_queues;
> >  	fpo->rx_pkt_burst = dev->rx_pkt_burst;
> > +	fpo->txq.data = dev->data->tx_queues;
> >  	fpo->tx_pkt_burst = dev->tx_pkt_burst;
> >  	fpo->tx_pkt_prepare = dev->tx_pkt_prepare;
> >  	fpo->rx_queue_count = dev->rx_queue_count;
> >  	fpo->rx_descriptor_status = dev->rx_descriptor_status;
> >  	fpo->tx_descriptor_status = dev->tx_descriptor_status;
> >
> > -	fpo->rxq.data = dev->data->rx_queues;
> >  	fpo->rxq.clbk = (void **)(uintptr_t)dev->post_rx_burst_cbs;
> >
> > -	fpo->txq.data = dev->data->tx_queues;
> >  	fpo->txq.clbk = (void **)(uintptr_t)dev->pre_tx_burst_cbs;
> 
> Hi Ashok,
> 
> The modification is OK for x86 (which has a strong memory order and keeps write-after-write
> order here and read-after-read order in rte_eth_rx/tx_burst), but it will fail on platforms
> with a weak memory order (such as ARM).
> 
> For weak memory ordering, I suggest adding a write-mb here and a read-mb in
> rte_eth_rx/tx_burst.
> However, the read-mb in rte_eth_rx/tx_burst will affect performance, especially since the
> variables change only once, at start.
> 
> So I suggest using a write-mb + delay here:
>    fpo->rxq.data = dev->data->rx_queues;
>    fpo->txq.data = dev->data->tx_queues;
>    rte_wmb();       /* write-mb: order the data stores before the function-pointer stores */
>    rte_delay_ms(5); /* delay, e.g. 5 ms */
>    fpo->rx_pkt_burst = dev->rx_pkt_burst;
>    fpo->tx_pkt_burst = dev->tx_pkt_burst;
> 
> And also cc ARMv8 maintainer.

Thanks Chengwen for the heads up.
Agreed: moving the queue data assignments around won't solve the problem on systems with relaxed memory ordering.
Even a write-mb in eth_dev_fp_ops_setup paired with a read-mb in rte_eth_rx_burst is not perfectly fine: there is still a chance that
the dummy enqueue/dequeue function will use the updated queue data and mess it up.
Adding a delay in eth_dev_fp_ops_setup is not a good approach either, but I haven't found a solution that doesn't hurt fast-path performance.
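
To make the write-mb/read-mb pairing concrete, here is a minimal standalone sketch in C11 atomics. It is not the ethdev code: struct fp_ops_sketch, dummy_rx, real_rx and the *_sketch functions are hypothetical stand-ins, and the release store / acquire load merely play the roles of the write-mb and read-mb discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a burst function; not the real ethdev prototype. */
typedef uint16_t (*burst_fn)(void *queue, uint16_t nb_pkts);

/* Hypothetical stand-in for the fast-path ops entry of one port. */
struct fp_ops_sketch {
	_Atomic(burst_fn) rx_pkt_burst; /* published last, with release */
	void *_Atomic rxq_data;         /* must be visible before the function */
};

/* Placeholder installed while the port is not ready; ignores its queue. */
static uint16_t dummy_rx(void *queue, uint16_t nb_pkts)
{
	(void)queue;
	(void)nb_pkts;
	return 0;
}

/* "Real" burst function: it dereferences the queue pointer, so it must
 * never be called with the placeholder (here NULL) queue data. */
static uint16_t real_rx(void *queue, uint16_t nb_pkts)
{
	assert(queue != NULL);
	uint16_t avail = *(uint16_t *)queue;
	return nb_pkts < avail ? nb_pkts : avail;
}

static void fp_ops_sketch_reset(struct fp_ops_sketch *fpo)
{
	atomic_store_explicit(&fpo->rxq_data, NULL, memory_order_relaxed);
	atomic_store_explicit(&fpo->rx_pkt_burst, dummy_rx, memory_order_release);
}

static void fp_ops_sketch_setup(struct fp_ops_sketch *fpo, burst_fn fn, void *queues)
{
	/* Data pointer first ... */
	atomic_store_explicit(&fpo->rxq_data, queues, memory_order_relaxed);
	/* ... then the function pointer; release orders the data store before it. */
	atomic_store_explicit(&fpo->rx_pkt_burst, fn, memory_order_release);
}

static uint16_t rx_burst_sketch(struct fp_ops_sketch *fpo, uint16_t nb_pkts)
{
	/* Acquire pairs with the release above: a reader that observes the new
	 * function pointer also observes the new queue data. */
	burst_fn fn = atomic_load_explicit(&fpo->rx_pkt_burst, memory_order_acquire);
	void *q = atomic_load_explicit(&fpo->rxq_data, memory_order_relaxed);
	return fn(q, nb_pkts);
}
```

This only covers one direction. A reader that still observes the old (dummy) function pointer may already observe the new queue data, which is the remaining window mentioned above, and the acquire load sits on the fast path, which is the performance cost under discussion.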

> 
> >  }
> >
> >

