[dpdk-dev,1/2] net/mlx5: support device removal event

Message ID: 1502627112-53405-1-git-send-email-matan@mellanox.com (mailing list archive)
State: Superseded, archived
Delegated to: Ferruh Yigit

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Matan Azrad Aug. 13, 2017, 12:25 p.m. UTC
Extend the LSC event handling to support device removal as well.
The Verbs library may send several related events, which are
different from the LSC event.

The mlx5 event handling has been made capable of receiving and
signaling several event types at once.

This support includes the following:
1. Removal event detection according to the user configuration.
2. Calling all registered mlx5 removal callbacks.
3. Capabilities extension to include removal interrupt handling.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        |   2 +-
 drivers/net/mlx5/mlx5_ethdev.c | 100 +++++++++++++++++++++++++++--------------
 2 files changed, 68 insertions(+), 34 deletions(-)

Hi,
This patch is based on top of Nélio's latest mlx5 cleanup patches.
  

Comments

Nélio Laranjeiro Aug. 23, 2017, 9:40 a.m. UTC | #1
Hi Matan,

On Sun, Aug 13, 2017 at 03:25:11PM +0300, Matan Azrad wrote:
> Extend the LSC event handling to support the device removal as well.
> The Verbs library may send several related events, which are
> different from LSC event.
> 
> The mlx5 event handling has been made capable of receiving and
> signaling several event types at once.
> 
> This support includes next:
> 1. Removal event detection according to the user configuration.
> 2. Calling to all registered mlx5 removal callbacks.
> 3. Capabilities extension to include removal interrupt handling.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.c        |   2 +-
>  drivers/net/mlx5/mlx5_ethdev.c | 100 +++++++++++++++++++++++++++--------------
>  2 files changed, 68 insertions(+), 34 deletions(-)
> 
> Hi 
> This patch based on top of last Nelio mlx5 cleanup patches.
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index bd66a7c..1a3d7f1 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -865,7 +865,7 @@ static struct rte_pci_driver mlx5_driver = {
>  	},
>  	.id_table = mlx5_pci_id_map,
>  	.probe = mlx5_pci_probe,
> -	.drv_flags = RTE_PCI_DRV_INTR_LSC,
> +	.drv_flags = RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV,
>  };
>  
>  /**
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index 57f6237..404d8f4 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -1112,47 +1112,75 @@ mlx5_ibv_device_to_pci_addr(const struct ibv_device *device,
>  }
>  
>  /**
> - * Link status handler.
> + * Update the link status.
> + * Set alarm if the device link status is inconsistent.

Adding such a comment should also explain the issue this alarm is solving,
i.e. why the link is inconsistent and why the alarm helps to fix the issue.

>   *
>   * @param priv
>   *   Pointer to private structure.
> - * @param dev
> - *   Pointer to the rte_eth_dev structure.
>   *
>   * @return
> - *   Nonzero if the callback process can be called immediately.
> + *   Zero if alarm is not set and the link status is consistent.
>   */
>  static int
> -priv_dev_link_status_handler(struct priv *priv, struct rte_eth_dev *dev)
> +priv_link_status_alarm_update(struct priv *priv)

The old name is more accurate; the fact that we need to program an alarm is a
workaround to get the correct status from ethtool.  If it were possible to avoid
it, this alarm would not exist.

> +{
> +	struct rte_eth_link *link = &priv->dev->data->dev_link;
> +
> +	mlx5_link_update(priv->dev, 0);
> +	if (((link->link_speed == 0) && link->link_status) ||
> +		((link->link_speed != 0) && !link->link_status)) {
> +		if (!priv->pending_alarm) {
> +			/* Inconsistent status, check again later. */
> +			priv->pending_alarm = 1;
> +			rte_eal_alarm_set(MLX5_ALARM_TIMEOUT_US,
> +				mlx5_dev_link_status_handler,
> +				priv->dev);
> +		}
> +		return 1;
> +	} else if (unlikely(priv->pending_alarm)) {
> +		/* In case of link interrupt while link alarm was setting. */
> +		priv->pending_alarm = 0;
> +		rte_eal_alarm_cancel(mlx5_dev_link_status_handler, priv->dev);
> +	}
> +	return 0;
> +}
> +
>[...]
>  
> @@ -1172,11 +1200,11 @@ mlx5_dev_link_status_handler(void *arg)
>  	priv_lock(priv);
>  	assert(priv->pending_alarm == 1);
>  	priv->pending_alarm = 0;
> -	ret = priv_dev_link_status_handler(priv, dev);
> +	ret = priv_link_status_alarm_update(priv);

This is not clear: it calls an alarm_update without getting the link status.
The function name is "link_status_handler", so why does the behavior not
reflect the function name?

It is too confusing to be integrated as is. We have had several bugs in this part
of the code, so keep it clear by keeping the old function names.

Thanks,
  
Matan Azrad Aug. 23, 2017, 7:44 p.m. UTC | #2
Hi Nelio

> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> Sent: Wednesday, August 23, 2017 12:41 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org
> Subject: Re: [PATCH 1/2] net/mlx5: support device removal event
> 
> Hi Matan,
> 
> On Sun, Aug 13, 2017 at 03:25:11PM +0300, Matan Azrad wrote:
> > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > b/drivers/net/mlx5/mlx5_ethdev.c index 57f6237..404d8f4 100644
> > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > @@ -1112,47 +1112,75 @@ mlx5_ibv_device_to_pci_addr(const struct
> > ibv_device *device,  }
> >
> >  /**
> > - * Link status handler.
> > + * Update the link status.
> > + * Set alarm if the device link status is inconsistent.
> 
> Adding such comment should also comment about the issue this alarm is
> solving i.e. why the link is inconsistent and why the alarm help to fix the
> issue.
> 
I didn't see any comment about that in the old code, hence I didn't write one.
I think you are right and this could be added (even before this patch).

> >   *
> >   * @param priv
> >   *   Pointer to private structure.
> > - * @param dev
> > - *   Pointer to the rte_eth_dev structure.
> >   *
> >   * @return
> > - *   Nonzero if the callback process can be called immediately.
> > + *   Zero if alarm is not set and the link status is consistent.
> >   */
> >  static int
> > -priv_dev_link_status_handler(struct priv *priv, struct rte_eth_dev
> > *dev)
> > +priv_link_status_alarm_update(struct priv *priv)
> 	
> The old name is more accurate, the fact we need to program an alarm is a
> work around to get the correct status from ethtool.  If it was possible to avoid
> it, this alarm would not exists.
> 
You probably got confused here because of the git +/- format and this specific patch.
Actually, priv_link_status_alarm_update is a new function and doesn't replace priv_dev_link_status_handler.

The new name of the latter is priv_dev_status_handler, since
it now handles not just link events but also removal events
(and maybe more interrupt types in the future).


> > +{
> > +	struct rte_eth_link *link = &priv->dev->data->dev_link;
> > +
> > +	mlx5_link_update(priv->dev, 0);
> > +	if (((link->link_speed == 0) && link->link_status) ||
> > +		((link->link_speed != 0) && !link->link_status)) {
> > +		if (!priv->pending_alarm) {
> > +			/* Inconsistent status, check again later. */
> > +			priv->pending_alarm = 1;
> > +			rte_eal_alarm_set(MLX5_ALARM_TIMEOUT_US,
> > +				mlx5_dev_link_status_handler,
> > +				priv->dev);
> > +		}
> > +		return 1;
> > +	} else if (unlikely(priv->pending_alarm)) {
> > +		/* In case of link interrupt while link alarm was setting. */
> > +		priv->pending_alarm = 0;
> > +		rte_eal_alarm_cancel(mlx5_dev_link_status_handler, priv-
> >dev);
> > +	}
> > +	return 0;
> > +}
> > +
> >[...]
> >
> > @@ -1172,11 +1200,11 @@ mlx5_dev_link_status_handler(void *arg)
> >  	priv_lock(priv);
> >  	assert(priv->pending_alarm == 1);
> >  	priv->pending_alarm = 0;
> > -	ret = priv_dev_link_status_handler(priv, dev);
> > +	ret = priv_link_status_alarm_update(priv);
> 
> It is not clear, this calls an alarm_update without getting the link status, the
> function name is "link_status_handler" why does the behavior does not
> reflect the function name?
> 
> It is too confusing to be integrated as is, we had several bugs in this part of
> the code, keep it clear, by keeping the old functions name.
> 
Just to explain what was changed in the link functions:

priv_dev_link_status_handler was renamed
to priv_dev_status_handler, as I already explained.

Some of the priv_dev_status_handler code was moved to
a new function named priv_link_status_alarm_update.

This function updates the link status and sets/removes the
link inconsistency alarm if needed.
So, it updates both the link status and the alarm setting.
I am open to other name suggestions :)

I did this because I think the alarm handler (mlx5_dev_link_status_handler)
shouldn't call priv_dev_status_handler to try to update
the link again, since:
1. We can't know who is calling (the interrupt or the alarm), and the logic
differs accordingly:
In case of an interrupt we must update the link only when the interrupt type is LSC.
In case of an alarm we should always call the link update.
2. It doesn't need to read new events from Verbs (it is not a new interrupt).
Therefore, the alarm handler just calls the new function.

So, the new function is called either by priv_dev_status_handler
in case of an LSC interrupt or by mlx5_dev_link_status_handler for
another chance to get a consistent link status.

> Thanks,
> 
> --
> Nélio Laranjeiro
> 6WIND

Regards
Matan Azrad
  
Nélio Laranjeiro Aug. 24, 2017, 7:38 a.m. UTC | #3
On Wed, Aug 23, 2017 at 07:44:45PM +0000, Matan Azrad wrote:
> Hi Nelio
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > Sent: Wednesday, August 23, 2017 12:41 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org
> > Subject: Re: [PATCH 1/2] net/mlx5: support device removal event
> > 
> > Hi Matan,
> > 
> > On Sun, Aug 13, 2017 at 03:25:11PM +0300, Matan Azrad wrote:
> > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > > b/drivers/net/mlx5/mlx5_ethdev.c index 57f6237..404d8f4 100644
> > > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > > @@ -1112,47 +1112,75 @@ mlx5_ibv_device_to_pci_addr(const struct
> > > ibv_device *device,  }
> > >
> > >  /**
> > > - * Link status handler.
> > > + * Update the link status.
> > > + * Set alarm if the device link status is inconsistent.
> > 
> > Adding such comment should also comment about the issue this alarm is
> > solving i.e. why the link is inconsistent and why the alarm help to fix the
> > issue.
> > 
> I didn't see any comments about that in the old code , Hence I didn't write it.

That is normal, as the alarm is a workaround specifically necessary for the
Mellanox PMD.  Now that you explicitly announce that this function programs an
alarm, the question is: why is it necessary?

> I think you right and this could be added.(even before this patch).

No, in the current code, it updates the link; if the link is inconsistent, it
tries to get a correct link ASAP.  There is no need to advertise that this
function will program an alarm; that is an internal detail.

> > >   *
> > >   * @param priv
> > >   *   Pointer to private structure.
> > > - * @param dev
> > > - *   Pointer to the rte_eth_dev structure.
> > >   *
> > >   * @return
> > > - *   Nonzero if the callback process can be called immediately.
> > > + *   Zero if alarm is not set and the link status is consistent.
> > >   */
> > >  static int
> > > -priv_dev_link_status_handler(struct priv *priv, struct rte_eth_dev
> > > *dev)
> > > +priv_link_status_alarm_update(struct priv *priv)
> > 	
> > The old name is more accurate, the fact we need to program an alarm is a
> > work around to get the correct status from ethtool.  If it was possible to avoid
> > it, this alarm would not exists.
> > 
> Probably because of the git +- format and this specific patch you got confuse here.

No, I applied your patch and read your code.  You did not understand my
comment.

>[...]

When I read:

>  void
>  mlx5_dev_link_status_handler(void *arg)
>  {
>         struct rte_eth_dev *dev = arg;
>         struct priv *priv = dev->data->dev_private;
>         int ret;
> 
>         priv_lock(priv);
>         assert(priv->pending_alarm == 1);
>         priv->pending_alarm = 0;
> -       ret = priv_dev_link_status_handler(priv, dev);
> +       ret = priv_link_status_alarm_update(priv);
>         priv_unlock(priv);
> -       if (ret)
> +       if (!ret)
>                 _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC, NULL,
> -                                             NULL);
> +                       NULL);
>  }

I am expecting to find something related to a link update, but what I see is an
alarm update.  I don't expect to update an alarm but a link.  The names and actions
are inconsistent, i.e. mlx5_dev_link_status_handler() should handle a link, not
an alarm.

I understand there is a need to add more function levels, but the
priv_link_status_alarm_update() should be renamed to something like
priv_link_status_update().

Regards,
  
Matan Azrad Aug. 24, 2017, 2:33 p.m. UTC | #4
Hi Nelio

> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> Sent: Thursday, August 24, 2017 10:38 AM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org
> Subject: Re: [PATCH 1/2] net/mlx5: support device removal event
> 
> On Wed, Aug 23, 2017 at 07:44:45PM +0000, Matan Azrad wrote:
> > Hi Nelio
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > > Sent: Wednesday, August 23, 2017 12:41 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org
> > > Subject: Re: [PATCH 1/2] net/mlx5: support device removal event
> > >
> > > Hi Matan,
> > >
> > > On Sun, Aug 13, 2017 at 03:25:11PM +0300, Matan Azrad wrote:
> > > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > > > b/drivers/net/mlx5/mlx5_ethdev.c index 57f6237..404d8f4 100644
> > > > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > > > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > > > @@ -1112,47 +1112,75 @@ mlx5_ibv_device_to_pci_addr(const struct
> > > > ibv_device *device,  }
> > > >
> > > >  /**
> > > > - * Link status handler.
> > > > + * Update the link status.
> > > > + * Set alarm if the device link status is inconsistent.
> > >
> > > Adding such comment should also comment about the issue this alarm
> > > is solving i.e. why the link is inconsistent and why the alarm help
> > > to fix the issue.
> > >
> > I didn't see any comments about that in the old code , Hence I didn't write
> it.
> 
> Normal as the alarm is a work around specifically necessary to Mellanox PMD.
> Now you explicitly announce that this function program an alarm, the
> question is why is it necessary?
> 

> > I think you right and this could be added.(even before this patch).
> 
> No, in the current code, it update the link, if it inconsistent it tries to have a
> link correct ASAP.  There is no need to inform this function will program an
> alarm, it is internal cooking.
> 
> > > >   *
> > > >   * @param priv
> > > >   *   Pointer to private structure.
> > > > - * @param dev
> > > > - *   Pointer to the rte_eth_dev structure.
> > > >   *
> > > >   * @return
> > > > - *   Nonzero if the callback process can be called immediately.
> > > > + *   Zero if alarm is not set and the link status is consistent.
> > > >   */
> > > >  static int
> > > > -priv_dev_link_status_handler(struct priv *priv, struct
> > > > rte_eth_dev
> > > > *dev)
> > > > +priv_link_status_alarm_update(struct priv *priv)
> > >
> > > The old name is more accurate, the fact we need to program an alarm
> > > is a work around to get the correct status from ethtool.  If it was
> > > possible to avoid it, this alarm would not exists.
> > >
> > Probably because of the git +- format and this specific patch you got
> confuse here.
> 
> No I applied your patch and read your code.  You did not understand my
> comment.
>
I thought so because you said "old name" in relation to a new function name :)
 
> >[...]
> 
> When I read:
> 
> >  void
> >  mlx5_dev_link_status_handler(void *arg)  {
> >         struct rte_eth_dev *dev = arg;
> >         struct priv *priv = dev->data->dev_private;
> >         int ret;
> >
> >         priv_lock(priv);
> >         assert(priv->pending_alarm == 1);
> >         priv->pending_alarm = 0;
> > -       ret = priv_dev_link_status_handler(priv, dev);
> > +       ret = priv_link_status_alarm_update(priv);
> >         priv_unlock(priv);
> > -       if (ret)
> > +       if (!ret)
> >                 _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC,
> NULL,
> > -                                             NULL);
> > +                       NULL);
> >  }
> 
> I am expecting to find something related to a link update, what I see is an
> alarm update.  I don't expect to update an alarm but a link.  The names and
> action are inconsistent i.e. mlx5_dev_link_status_handler() should handle a
> link not an alarm.
> 
> I understand there is a need to add more function levels, but the
> priv_link_status_alarm_update() should be renamed to something like
> priv_link_status_update().

OK, I think I understand you.

Because the alarm is a workaround, you don't think it should be mentioned
in the function description or the function name
(also, the subject of the function should be the link status and not the alarm).
I can agree with you on that,
and I will create a v2 with your suggestion: priv_link_status_update.

The return value description can keep the semantics of the old code:
Zero if the callback process can be called immediately.

Do you agree?

Maybe we can say something about the alarm and the reason for the inconsistency
in this function's description or in an internal comment, for future code review.
If you want it, please suggest a comment.

Thank you.
> 
> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND

Regards
Matan Azrad
  
Nélio Laranjeiro Aug. 25, 2017, 8:29 a.m. UTC | #5
On Thu, Aug 24, 2017 at 02:33:43PM +0000, Matan Azrad wrote:
> Hi Nelio
>[...] 
> > 
> > I am expecting to find something related to a link update, what I see is an
> > alarm update.  I don't expect to update an alarm but a link.  The names and
> > action are inconsistent i.e. mlx5_dev_link_status_handler() should handle a
> > link not an alarm.
> > 
> > I understand there is a need to add more function levels, but the
> > priv_link_status_alarm_update() should be renamed to something like
> > priv_link_status_update().
> 
> OK, I think I understand you.
> 
> Because the alarm is a workaround you don't think it should be mentioned
> in function description or function name.
> (also the function subject should be the link status and not the alarm)
> I can agree with you about it.
> And I will create v2 with your suggestion - priv_link_status_update.

Thanks,

> The return value description can stay as in old code semantic:
> Zero if the callback process can be called immediately.
> 
> Are you agree?

Yes.

> Maybe we can tell something about the alarm and inconsistent reason
> In this function description or internal comment for future code review.
> If you want it, please suggest comment.

Yes, the comment can be added internally.

Thanks,
  

Patch

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index bd66a7c..1a3d7f1 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -865,7 +865,7 @@  static struct rte_pci_driver mlx5_driver = {
 	},
 	.id_table = mlx5_pci_id_map,
 	.probe = mlx5_pci_probe,
-	.drv_flags = RTE_PCI_DRV_INTR_LSC,
+	.drv_flags = RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV,
 };
 
 /**
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 57f6237..404d8f4 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1112,47 +1112,75 @@  mlx5_ibv_device_to_pci_addr(const struct ibv_device *device,
 }
 
 /**
- * Link status handler.
+ * Update the link status.
+ * Set alarm if the device link status is inconsistent.
  *
  * @param priv
  *   Pointer to private structure.
- * @param dev
- *   Pointer to the rte_eth_dev structure.
  *
  * @return
- *   Nonzero if the callback process can be called immediately.
+ *   Zero if alarm is not set and the link status is consistent.
  */
 static int
-priv_dev_link_status_handler(struct priv *priv, struct rte_eth_dev *dev)
+priv_link_status_alarm_update(struct priv *priv)
+{
+	struct rte_eth_link *link = &priv->dev->data->dev_link;
+
+	mlx5_link_update(priv->dev, 0);
+	if (((link->link_speed == 0) && link->link_status) ||
+		((link->link_speed != 0) && !link->link_status)) {
+		if (!priv->pending_alarm) {
+			/* Inconsistent status, check again later. */
+			priv->pending_alarm = 1;
+			rte_eal_alarm_set(MLX5_ALARM_TIMEOUT_US,
+				mlx5_dev_link_status_handler,
+				priv->dev);
+		}
+		return 1;
+	} else if (unlikely(priv->pending_alarm)) {
+		/* In case of link interrupt while link alarm was setting. */
+		priv->pending_alarm = 0;
+		rte_eal_alarm_cancel(mlx5_dev_link_status_handler, priv->dev);
+	}
+	return 0;
+}
+
+/**
+ * Device status handler.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param events
+ *   Pointer to event flags holder.
+ *
+ * @return
+ *   Events bitmap of callback process which can be called immediately.
+ */
+static uint32_t
+priv_dev_status_handler(struct priv *priv)
 {
 	struct ibv_async_event event;
-	struct rte_eth_link *link = &dev->data->dev_link;
-	int ret = 0;
+	uint32_t ret = 0;
 
 	/* Read all message and acknowledge them. */
 	for (;;) {
 		if (ibv_get_async_event(priv->ctx, &event))
 			break;
-
-		if (event.event_type != IBV_EVENT_PORT_ACTIVE &&
-		    event.event_type != IBV_EVENT_PORT_ERR)
+		if ((event.event_type == IBV_EVENT_PORT_ACTIVE ||
+			event.event_type == IBV_EVENT_PORT_ERR) &&
+			(priv->dev->data->dev_conf.intr_conf.lsc == 1))
+			ret |= (1 << RTE_ETH_EVENT_INTR_LSC);
+		else if (event.event_type == IBV_EVENT_DEVICE_FATAL &&
+			priv->dev->data->dev_conf.intr_conf.rmv == 1)
+			ret |= (1 << RTE_ETH_EVENT_INTR_RMV);
+		else
 			DEBUG("event type %d on port %d not handled",
-			      event.event_type, event.element.port_num);
+				event.event_type, event.element.port_num);
 		ibv_ack_async_event(&event);
 	}
-	mlx5_link_update(dev, 0);
-	if (((link->link_speed == 0) && link->link_status) ||
-	    ((link->link_speed != 0) && !link->link_status)) {
-		if (!priv->pending_alarm) {
-			/* Inconsistent status, check again later. */
-			priv->pending_alarm = 1;
-			rte_eal_alarm_set(MLX5_ALARM_TIMEOUT_US,
-					  mlx5_dev_link_status_handler,
-					  dev);
-		}
-	} else {
-		ret = 1;
-	}
+	if (ret & (1 << RTE_ETH_EVENT_INTR_LSC))
+		if (priv_link_status_alarm_update(priv))
+			ret &= ~(1 << RTE_ETH_EVENT_INTR_LSC);
 	return ret;
 }
 
@@ -1172,11 +1200,11 @@  mlx5_dev_link_status_handler(void *arg)
 	priv_lock(priv);
 	assert(priv->pending_alarm == 1);
 	priv->pending_alarm = 0;
-	ret = priv_dev_link_status_handler(priv, dev);
+	ret = priv_link_status_alarm_update(priv);
 	priv_unlock(priv);
-	if (ret)
+	if (!ret)
 		_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC, NULL,
-					      NULL);
+			NULL);
 }
 
 /**
@@ -1192,14 +1220,17 @@  mlx5_dev_interrupt_handler(void *cb_arg)
 {
 	struct rte_eth_dev *dev = cb_arg;
 	struct priv *priv = dev->data->dev_private;
-	int ret;
+	uint32_t events;
 
 	priv_lock(priv);
-	ret = priv_dev_link_status_handler(priv, dev);
+	events = priv_dev_status_handler(priv);
 	priv_unlock(priv);
-	if (ret)
+	if (events & (1 << RTE_ETH_EVENT_INTR_LSC))
 		_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC, NULL,
-					      NULL);
+			NULL);
+	if (events & (1 << RTE_ETH_EVENT_INTR_RMV))
+		_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_RMV, NULL,
+			NULL);
 }
 
 /**
@@ -1213,7 +1244,8 @@  mlx5_dev_interrupt_handler(void *cb_arg)
 void
 priv_dev_interrupt_handler_uninstall(struct priv *priv, struct rte_eth_dev *dev)
 {
-	if (!dev->data->dev_conf.intr_conf.lsc)
+	if (!dev->data->dev_conf.intr_conf.lsc &&
+		!dev->data->dev_conf.intr_conf.rmv)
 		return;
 	rte_intr_callback_unregister(&priv->intr_handle,
 				     mlx5_dev_interrupt_handler,
@@ -1238,7 +1270,8 @@  priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 {
 	int rc, flags;
 
-	if (!dev->data->dev_conf.intr_conf.lsc)
+	if (!dev->data->dev_conf.intr_conf.lsc &&
+		!dev->data->dev_conf.intr_conf.rmv)
 		return;
 	assert(priv->ctx->async_fd > 0);
 	flags = fcntl(priv->ctx->async_fd, F_GETFL);
@@ -1246,6 +1279,7 @@  priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 	if (rc < 0) {
 		INFO("failed to change file descriptor async event queue");
 		dev->data->dev_conf.intr_conf.lsc = 0;
+		dev->data->dev_conf.intr_conf.rmv = 0;
 	} else {
 		priv->intr_handle.fd = priv->ctx->async_fd;
 		priv->intr_handle.type = RTE_INTR_HANDLE_EXT;