[dpdk-stable] [PATCH] net/ixgbe: fix link state timing issue on fiber ports

Lijian Zhang Lijian.Zhang at arm.com
Thu Mar 19 11:51:21 CET 2020


This issue was first observed with an ixgbe NIC in the VPP project, a software switching application based on DPDK.
A daemon thread runs in the background and keeps polling the hardware link status using ixgbe_dev_link_update_share().
Once the IXGBE_FLAG_NEED_LINK_CONFIG flag is set, ixgbe_dev_link_update_share() just returns link-down status without actually polling the hardware.

In this issue, the IXGBE_FLAG_NEED_LINK_CONFIG flag stays set and is never cleared, so ixgbe_dev_link_update_share() never reads the hardware status and always reports link down.

The sequence that leaves IXGBE_FLAG_NEED_LINK_CONFIG permanently set is described below.

ixgbe_dev_link_update_share() runs continuously in the background. In the normal case:
1. Initially, IXGBE_FLAG_NEED_LINK_CONFIG is 0 and the link is down.
2. ixgbe_dev_link_update_share() sets IXGBE_FLAG_NEED_LINK_CONFIG to 1.
3. It then starts the ixgbe_dev_setup_link_thread_handler() thread to configure the interface.
4. At the end of the configuration thread, ixgbe_dev_setup_link_thread_handler() clears IXGBE_FLAG_NEED_LINK_CONFIG.
5. With IXGBE_FLAG_NEED_LINK_CONFIG cleared, ixgbe_dev_link_update_share() can poll the hardware link status in the next round.
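
The five steps above can be sketched as a small single-threaded model. This is only an illustration of the flag's lifecycle, not the driver code: struct model_port, model_link_update() and model_setup_link() are hypothetical names, and the model assumes that link setup succeeds in bringing the link up.

```c
#include <stdbool.h>

/* Hypothetical model of the driver state; not the real ixgbe structures. */
struct model_port {
	bool need_link_config; /* models IXGBE_FLAG_NEED_LINK_CONFIG */
	bool hw_link_up;       /* what the hardware would report if polled */
};

/* Models ixgbe_dev_link_update_share(): returns the link state the caller sees. */
static bool model_link_update(struct model_port *p)
{
	if (p->need_link_config)
		return false;               /* flag set: report down, skip polling */
	if (!p->hw_link_up)
		p->need_link_config = true; /* step 2: request link configuration */
	return p->hw_link_up;
}

/* Models ixgbe_dev_setup_link_thread_handler() running to completion. */
static void model_setup_link(struct model_port *p)
{
	p->hw_link_up = true;        /* assumption: setup brings the link up */
	p->need_link_config = false; /* step 4: clear the flag */
}
```

In this model, every call to model_link_update() made while the flag is set reports link down without looking at hw_link_up. Only after model_setup_link() has run does the next call see the hardware state.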

But when the user sets the interface link up or down from the CLI, ixgbe_dev_start() or ixgbe_dev_stop() is called. Both functions call ixgbe_dev_cancel_link_thread() to interrupt any running configuration thread (the one running in steps 3 and 4 above) without clearing IXGBE_FLAG_NEED_LINK_CONFIG. The flag then stays set, and ixgbe_dev_link_update_share() can no longer read the hardware status.
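
The cancellation problem can be reproduced outside the driver with plain pthreads. The sketch below is a hypothetical stand-alone model, not ixgbe code: need_link_config stands in for IXGBE_FLAG_NEED_LINK_CONFIG, slow_setup_handler() for the link setup thread, and the sleep() call for the time spent configuring the link. It assumes the default deferred cancellation type, so the thread is cancelled inside sleep() before it reaches the line that clears the flag.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

static atomic_bool need_link_config; /* models IXGBE_FLAG_NEED_LINK_CONFIG */

/* Models the link setup thread: slow work first, flag cleared only at the end. */
static void *slow_setup_handler(void *arg)
{
	(void)arg;
	sleep(5); /* stands in for link setup; sleep() is a cancellation point */
	atomic_store(&need_link_config, false);
	return NULL;
}

/* Returns the flag value left behind after the handler has been cancelled. */
static bool flag_after_cancel(bool clear_on_cancel)
{
	pthread_t tid;

	atomic_store(&need_link_config, true);
	if (pthread_create(&tid, NULL, slow_setup_handler, NULL) == 0) {
		pthread_cancel(tid);
		pthread_join(tid, NULL);
	}
	if (clear_on_cancel)
		atomic_store(&need_link_config, false); /* what the patch adds */
	return atomic_load(&need_link_config);
}
```

With clear_on_cancel set to false the flag is still set after the join, which corresponds to the stuck link-down state described above. With clear_on_cancel set to true the canceller clears the flag itself, as the patch does in ixgbe_dev_cancel_link_thread().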
Thanks.

> -----Original Message-----
> From: Phil Yang <phil.yang at arm.com>
> Sent: March 19, 2020 14:42
> To: dev at dpdk.org; konstantin.ananyev at intel.com; wenzhuo.lu at intel.com
> Cc: qi.z.zhang at intel.com; Lijian Zhang <Lijian.Zhang at arm.com>; Gavin Hu
> <Gavin.Hu at arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>; stable at dpdk.org
> Subject: [PATCH] net/ixgbe: fix link state timing issue on fiber ports
> 
> With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it is possible
> when a port is started to experience a timing issue which prevents the link
> from ever being fully set up.
> 
> In ixgbe_dev_link_update_share(), if the media type is fiber and the link is
> down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback to
> ixgbe_dev_setup_link_thread_handler() is scheduled which should try to set up
> the link and clear the flag afterwards.
> 
> If the device is started before the flag is cleared, the scheduled callback is
> cancelled. This causes the flag to remain set and subsequent calls to
> ixgbe_dev_link_update_share() return without trying to retrieve the link state
> because the flag is set.
> 
> In ixgbe_dev_cancel_link_thread(), after cancelling the callback, unset the flag
> on the device to avoid this condition.
> 
> Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
> Cc: stable at dpdk.org
> 
> Bugzilla ID: 388
> 
> Signed-off-by: Phil Yang <phil.yang at arm.com>
> Signed-off-by: Lijian Zhang <lijian.zhang at arm.com>
> Reviewed-by: Gavin Hu <gavin.hu at arm.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 23b3f5b..2b65750 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -4147,11 +4147,19 @@ static void
>  ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)
>  {
>  	struct ixgbe_adapter *ad = dev->data->dev_private;
> +	struct ixgbe_interrupt *intr =
> +		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
>  	void *retval;
> 
>  	if (rte_atomic32_read(&ad->link_thread_running)) {
>  		pthread_cancel(ad->link_thread_tid);
>  		pthread_join(ad->link_thread_tid, &retval);
> +		/* clear this flag once the thread has been
> +		 * cancelled, to avoid link status error in
> +		 * case unfinished threads cannot clean up
> +		 * this flag.
> +		 */
> +		intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
>  		rte_atomic32_clear(&ad->link_thread_running);
>  	}
>  }
> @@ -4262,8 +4270,12 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> 
>  	if (link_up == 0) {
>  		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> -			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>  			if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
> +				/* To avoid race condition between threads, set
> +				 * the IXGBE_FLAG_NEED_LINK_CONFIG flag only
> +				 * when there is no link thread running.
> +				 */
> +				intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>  				if (rte_ctrl_thread_create(&ad->link_thread_tid,
>  					"ixgbe-link-handler",
>  					NULL,
> --
> 2.7.4


