[PATCH 2/2] ethdev: fix race condition in fast-path ops setup

Konstantin Ananyev konstantin.ananyev at huawei.com
Wed Feb 22 11:41:10 CET 2023



> -----Original Message-----
> From: fengchengwen <fengchengwen at huawei.com>
> Sent: Wednesday, February 22, 2023 1:07 AM
> To: Stephen Hemminger <stephen at networkplumber.org>; Ruifeng Wang <Ruifeng.Wang at arm.com>
> Cc: Ashok Kaladi <ashok.k.kaladi at intel.com>; jerinj at marvell.com; thomas at monjalon.net; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; dev at dpdk.org; s.v.naga.harish.k at intel.com; erik.g.carrillo at intel.com;
> abhinandan.gujjar at intel.com; stable at dpdk.org; nd <nd at arm.com>
> Subject: Re: [PATCH 2/2] ethdev: fix race condition in fast-path ops setup
> 
> On 2023/2/22 1:00, Stephen Hemminger wrote:
> > On Tue, 21 Feb 2023 07:24:19 +0000
> > Ruifeng Wang <Ruifeng.Wang at arm.com> wrote:
> >
> >>> -----Original Message-----
> >>> From: fengchengwen <fengchengwen at huawei.com>
> >>> Sent: Monday, February 20, 2023 2:58 PM
> >>> To: Ashok Kaladi <ashok.k.kaladi at intel.com>; jerinj at marvell.com; thomas at monjalon.net
> >>> Cc: dev at dpdk.org; s.v.naga.harish.k at intel.com; erik.g.carrillo at intel.com;
> >>> abhinandan.gujjar at intel.com; stable at dpdk.org; Ruifeng Wang <Ruifeng.Wang at arm.com>
> >>> Subject: Re: [PATCH 2/2] ethdev: fix race condition in fast-path ops setup
> >>>
> >>> On 2023/2/20 14:08, Ashok Kaladi wrote:
> >>>> If ethdev enqueue or dequeue function is called during
> >>>> eth_dev_fp_ops_setup(), it may get pre-empted after setting the
> >>>> function pointers, but before setting the pointer to port data.
> >>>> In this case the newly registered enqueue/dequeue function will use
> >>>> dummy port data and end up in seg fault.
> >>>>
> >>>> This patch moves the updation of each data pointers before updating
> >>>> corresponding function pointers.
> >>>>
> >>>> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate
> >>>> structure")
> >>>> Cc: stable at dpdk.org
> >
> > Why is something calling enqueue/dequeue when device is not fully started.
> > A correctly written application would not call rx/tx burst until after
> > ethdev start had finished.
> 
> Please refer the eb0d471a894 (ethdev: add proactive error handling mode), when driver
> recover itself, the application may still invoke enqueue/dequeue API.

Right now DPDK ethdev layer *does not* provide synchronization mechanisms
between data-path and control-path functions.
That was a deliberate deisgn choice. If we want to change that rule, then I suppose
we need a community consensus for it. 
I think that if the driver wants to provide some sort of error recovery procedure,
then it has to provide some synchronization mechanism inside it between data-path
and control-path functions.
Actually looking at eb0d471a894 (ethdev: add proactive error handling mode),
and following patches I wonder how it creeped in?
It seems we just introduced a loophole for race condition with this approach...
It probably needs to be either deprecated or reworked.

> 
> >
> > Would something like this work better?
> >
> > Note: there is another bug in current code. The check for link state interrupt
> > and link_ops could return -ENOTSUP and leave device in indeterminate state.
> > The check should be done before calling PMD.
> >
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> > index 0266cc82acb6..d6c163ed85e7 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -1582,6 +1582,14 @@ rte_eth_dev_start(uint16_t port_id)
> >  		return 0;
> >  	}
> >
> > +	if (dev->data->dev_conf.intr_conf.lsc == 0 &&
> > +	    dev->dev_ops->link_update == NULL) {
> > +		RTE_ETHDEV_LOG(INFO,
> > +			       "Device with port_id=%"PRIu16" link update not supported\n",
> > +			       port_id);
> > +			return -ENOTSUP;
> > +	}
> > +
> >  	ret = rte_eth_dev_info_get(port_id, &dev_info);
> >  	if (ret != 0)
> >  		return ret;
> > @@ -1591,9 +1599,7 @@ rte_eth_dev_start(uint16_t port_id)
> >  		eth_dev_mac_restore(dev, &dev_info);
> >
> >  	diag = (*dev->dev_ops->dev_start)(dev);
> > -	if (diag == 0)
> > -		dev->data->dev_started = 1;
> > -	else
> > +	if (diag != 0)
> >  		return eth_err(port_id, diag);
> >
> >  	ret = eth_dev_config_restore(dev, &dev_info, port_id);
> > @@ -1611,16 +1617,18 @@ rte_eth_dev_start(uint16_t port_id)
> >  		return ret;
> >  	}
> >
> > -	if (dev->data->dev_conf.intr_conf.lsc == 0) {
> > -		if (*dev->dev_ops->link_update == NULL)
> > -			return -ENOTSUP;
> > -		(*dev->dev_ops->link_update)(dev, 0);
> > -	}
> > -
> >  	/* expose selection of PMD fast-path functions */
> >  	eth_dev_fp_ops_setup(rte_eth_fp_ops + port_id, dev);
> >
> > +	/* ensure state is set before marking device ready */
> > +	rte_smp_wmb();
> > +
> >  	rte_ethdev_trace_start(port_id);
> > +
> > +	/* Update current link state */
> > +	if (dev->data->dev_conf.intr_conf.lsc == 0)
> > +		(*dev->dev_ops->link_update)(dev, 0);
> > +
> >  	return 0;
> >  }
> >
> >
> > .
> >



More information about the stable mailing list