[PATCH 2/2] ethdev: fix race condition in fast-path ops setup

fengchengwen fengchengwen at huawei.com
Wed Feb 22 02:07:01 CET 2023


On 2023/2/22 1:00, Stephen Hemminger wrote:
> On Tue, 21 Feb 2023 07:24:19 +0000
> Ruifeng Wang <Ruifeng.Wang at arm.com> wrote:
> 
>>> -----Original Message-----
>>> From: fengchengwen <fengchengwen at huawei.com>
>>> Sent: Monday, February 20, 2023 2:58 PM
>>> To: Ashok Kaladi <ashok.k.kaladi at intel.com>; jerinj at marvell.com; thomas at monjalon.net
>>> Cc: dev at dpdk.org; s.v.naga.harish.k at intel.com; erik.g.carrillo at intel.com;
>>> abhinandan.gujjar at intel.com; stable at dpdk.org; Ruifeng Wang <Ruifeng.Wang at arm.com>
>>> Subject: Re: [PATCH 2/2] ethdev: fix race condition in fast-path ops setup
>>>
>>> On 2023/2/20 14:08, Ashok Kaladi wrote:  
>>>> If ethdev enqueue or dequeue function is called during
>>>> eth_dev_fp_ops_setup(), it may get pre-empted after setting the
>>>> function pointers, but before setting the pointer to port data.
>>>> In this case the newly registered enqueue/dequeue function will use
>>>> dummy port data and end up in seg fault.
>>>>
>>>> This patch moves the updation of each data pointers before updating
>>>> corresponding function pointers.
>>>>
>>>> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate
>>>> structure")
>>>> Cc: stable at dpdk.org
> 
> Why is something calling enqueue/dequeue when device is not fully started.
> A correctly written application would not call rx/tx burst until after
> ethdev start had finished.

Please refer the eb0d471a894 (ethdev: add proactive error handling mode), when driver
recover itself, the application may still invoke enqueue/dequeue API.

> 
> Would something like this work better?
> 
> Note: there is another bug in current code. The check for link state interrupt
> and link_ops could return -ENOTSUP and leave device in indeterminate state.
> The check should be done before calling PMD.
> 
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 0266cc82acb6..d6c163ed85e7 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1582,6 +1582,14 @@ rte_eth_dev_start(uint16_t port_id)
>  		return 0;
>  	}
>  
> +	if (dev->data->dev_conf.intr_conf.lsc == 0 &&
> +	    dev->dev_ops->link_update == NULL) {
> +		RTE_ETHDEV_LOG(INFO,
> +			       "Device with port_id=%"PRIu16" link update not supported\n",
> +			       port_id);
> +			return -ENOTSUP;
> +	}
> +
>  	ret = rte_eth_dev_info_get(port_id, &dev_info);
>  	if (ret != 0)
>  		return ret;
> @@ -1591,9 +1599,7 @@ rte_eth_dev_start(uint16_t port_id)
>  		eth_dev_mac_restore(dev, &dev_info);
>  
>  	diag = (*dev->dev_ops->dev_start)(dev);
> -	if (diag == 0)
> -		dev->data->dev_started = 1;
> -	else
> +	if (diag != 0)
>  		return eth_err(port_id, diag);
>  
>  	ret = eth_dev_config_restore(dev, &dev_info, port_id);
> @@ -1611,16 +1617,18 @@ rte_eth_dev_start(uint16_t port_id)
>  		return ret;
>  	}
>  
> -	if (dev->data->dev_conf.intr_conf.lsc == 0) {
> -		if (*dev->dev_ops->link_update == NULL)
> -			return -ENOTSUP;
> -		(*dev->dev_ops->link_update)(dev, 0);
> -	}
> -
>  	/* expose selection of PMD fast-path functions */
>  	eth_dev_fp_ops_setup(rte_eth_fp_ops + port_id, dev);
>  
> +	/* ensure state is set before marking device ready */
> +	rte_smp_wmb();
> +
>  	rte_ethdev_trace_start(port_id);
> +
> +	/* Update current link state */
> +	if (dev->data->dev_conf.intr_conf.lsc == 0)
> +		(*dev->dev_ops->link_update)(dev, 0);
> +
>  	return 0;
>  }
>  
> 
> .
> 


More information about the dev mailing list