[dpdk-dev] net/failsafe: do not probe device if plugged out

Message ID: 20170712182812.18404-1-thomas@monjalon.net
State: Rejected, archived
Delegated to: Ferruh Yigit
Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  success  Compilation OK

Commit Message

Thomas Monjalon July 12, 2017, 6:28 p.m. UTC
When probing a sub-device which does not exist in the system,
some errors are logged:
	EAL: Cannot find unplugged device (0002:00:02.0)
	PMD: net_failsafe: ERROR: sub_device 0 probe failed

It is normal to have these errors when initializing the failsafe
device and its sub-devices.
But we do not need to log them at each probe.
And it is even less relevant if the sub-device has been plugged out.

The unavailable devices are skipped after the first probe, considering
two exceptions:
- The vdevs are always probed because they do not really exist on the
virtual bus before probing them.
- The sub-device list given by an executed command line may change
from one call to the next, therefore each call is always treated as
the first probe. Anyway, such an external command should check the
device availability before passing it to the failsafe PMD.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
This patch is on top of the failsafe series which is
pending for integration in 17.08-rc2.
---
 drivers/net/failsafe/failsafe_eal.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
  

Comments

Gaëtan Rivet July 12, 2017, 8:39 p.m. UTC | #1
Hi Thomas,

Nice idea. A few remarks below:

On Wed, Jul 12, 2017 at 08:28:12PM +0200, Thomas Monjalon wrote:
> When probing a sub-device which does not exist in the system,
> some errors are logged:
> 	EAL: Cannot find unplugged device (0002:00:02.0)
> 	PMD: net_failsafe: ERROR: sub_device 0 probe failed
> 
> It is normal to have these errors when initializing the failsafe
> device and its sub-devices.
> But we do not need to log them at each probe.
> And it is even less relevant if the sub-device has been plugged out.
> 
> The unavailable devices are skipped after the first probe, considering
> two exceptions:
> - The vdevs are always probed because they do not really exist on the
> virtual bus before probing them.
> - The sub-device list given by an executed command line may change
> from one call to the next one, therefore it is considered to be always
> the first one. Anyway, such external command should check the
> device availability before passing it to the failsafe PMD.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
> This patch is on top of the failsafe series which is
> pending for integration in 17.08-rc2.
> ---
>  drivers/net/failsafe/failsafe_eal.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
> index 3321dda71..5181066ed 100644
> --- a/drivers/net/failsafe/failsafe_eal.c
> +++ b/drivers/net/failsafe/failsafe_eal.c
> @@ -49,6 +49,11 @@ fs_find_ethdev(const struct rte_device *dev)

This needs to be rebased on the latest version.
Sorry I forgot to supersede the previous one.

>  	return NULL;
>  }
>  
> +static int cmp_dev_name(const struct rte_device *dev, const void *name)
> +{
> +	return strcmp(dev->name, name);
> +}
> +
>  static int
>  fs_bus_init(struct rte_eth_dev *dev)
>  {
> @@ -56,12 +61,24 @@ fs_bus_init(struct rte_eth_dev *dev)
>  	struct rte_device *rdev;
>  	struct rte_devargs *da;
>  	uint8_t i;
> +	static int first_init = 1;

I did not do that before within this PMD, but a boolean would be better
there.
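
For illustration only, a minimal sketch of the boolean variant, assuming
C99 <stdbool.h> is acceptable in this file. is_first_init() is a
hypothetical helper written for this example, not something that exists
in the failsafe PMD:

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the flag handling in fs_bus_init():
 * returns true on the first call only, false on every later call.
 */
static bool
is_first_init(void)
{
	static bool first_init = true;
	bool was_first = first_init;

	/* Clearing the flag unconditionally avoids the extra "if". */
	first_init = false;
	return was_first;
}
```

The flag is still shared by all failsafe instances in the process, exactly
as with the static int in the patch; only the type changes.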

>  	int ret;
>  
>  	FOREACH_SUBDEV(sdev, i, dev) {
>  		if (sdev->state != DEV_PARSED)
>  			continue;
>  		da = &sdev->devargs;
> +

Superfluous line.

> +		/* skip plugged out devices */
> +		if (! first_init
> +				&& sdev->cmdline == NULL
> +				&& strcmp(da->bus->name, "vdev") != 0) {

Use first_init == false instead of negation.
&& should be at the end of the line instead of the start of the next
one.
Indentation is wrong.
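
As a rough sketch of the first two remarks (explicit comparison and
trailing &&), the condition could be written as below. The helper name
fs_needs_presence_check() and its parameters are assumptions made for
this example; they stand in for first_init, sdev->cmdline and
da->bus->name. The continuation indentation is kept as in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: tells whether a
 * sub-device must be looked up on its bus before being probed.
 */
static bool
fs_needs_presence_check(bool first_init, const char *cmdline,
		const char *bus_name)
{
	return first_init == false &&
			cmdline == NULL &&
			strcmp(bus_name, "vdev") != 0;
}
```

Only a sub-device that is not on its first probe, was not produced by an
executed command line and does not sit on the vdev bus would go through
the scan and lookup step.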

> +			da->bus->scan();
> +			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
> +				continue; /* device not found */

da->bus->find_device instead of bus->find_device.
This function currently cannot find the device on the PCI bus,
which blocks the plugging of a VF.

The PCI bus will scan the VF while no rte_devargs exists to
describe it within the global list. If the device exists, it will
detect it, allocate it and then set its name.
Without any rte_devargs, the name of a PCI device falls back to its
canonical name (DomBDF instead of BDF). The name comparison with
da->name can only succeed if the slave was declared using the DomBDF
format.

The fix is to do a deep copy of the rte_devargs (the API has been
sent previously with the rte_devargs rework but I have since removed
it) and insert it using rte_eal_devargs_insert(). This is essentially
the solution I used for the rte_eal_hotplug_add() fix[1].

The alternative fix is to propose an API for buses to transform device
names into their canonical form on demand... And it would certainly only
be useful for the PCI bus.
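
To make the name mismatch concrete, here is a small sketch of what such a
canonicalization could look like for PCI. pci_canonical_name() is a
hypothetical helper invented for this example, not an existing EAL or
bus API, and the parsing is deliberately simplistic:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not an EAL API: rewrite a PCI address given as
 * BDF ("02:00.0") or DomBDF ("0000:02:00.0") into the DomBDF form.
 * Returns 0 on success, -1 if the input does not parse or does not fit.
 */
static int
pci_canonical_name(const char *in, char *out, size_t len)
{
	unsigned int dom = 0, bus = 0, dev = 0, fn = 0;
	char tail;
	int n;

	/* Try DomBDF first; the trailing %c rejects extra characters. */
	if (sscanf(in, "%x:%x:%x.%x%c", &dom, &bus, &dev, &fn, &tail) != 4) {
		/* Fall back to BDF, which implies domain 0. */
		dom = 0;
		if (sscanf(in, "%x:%x.%x%c", &bus, &dev, &fn, &tail) != 3)
			return -1;
	}
	if (dom > 0xffff || bus > 0xff || dev > 0x1f || fn > 0x7)
		return -1;
	n = snprintf(out, len, "%04x:%02x:%02x.%x", dom, bus, dev, fn);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A plain strcmp() between "02:00.0" and "0000:02:00.0" reports a mismatch,
while both strings yield the same output once passed through a helper of
this kind. That is the comparison cmp_dev_name() would need to perform.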

The issue is discussed here:
[1]: http://dpdk.org/ml/archives/dev/2017-July/071155.html

> +		}
> +
> +		/* probe device */
>  		rdev = rte_eal_hotplug_add(da->bus->name,
>  					   da->name,
>  					   da->args);
> @@ -84,6 +101,10 @@ fs_bus_init(struct rte_eth_dev *dev)
>  		ETH(sdev)->state = RTE_ETH_DEV_DEFERRED;
>  		sdev->state = DEV_PROBED;
>  	}
> +

Superfluous line.

> +	if (first_init)
> +		first_init = 0;

first_init = false;

> +
>  	return 0;
>  }
>  
> -- 
> 2.13.2
>
  
Thomas Monjalon July 13, 2017, 6:52 a.m. UTC | #2
12/07/2017 22:39, Gaëtan Rivet:
> Hi Thomas,
> 
> Nice idea. A few remarks below:
> 
> On Wed, Jul 12, 2017 at 08:28:12PM +0200, Thomas Monjalon wrote:
> >  	FOREACH_SUBDEV(sdev, i, dev) {
> >  		if (sdev->state != DEV_PARSED)
> >  			continue;
> >  		da = &sdev->devargs;
> > +
> 
> Superfluous line.

I don't think so :) It is isolating the "skip" block with its comment.

> > +		/* skip plugged out devices */
> > +		if (! first_init
> > +				&& sdev->cmdline == NULL
> > +				&& strcmp(da->bus->name, "vdev") != 0) {
> 
> Use first_init == false instead of negation.
> && should be at the end of the line instead of the start of the next
> one.

Yes

> Indentation is wrong.

No, the coding style is to put 2 tabs for continuation lines.

> > +			da->bus->scan();
> > +			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
> > +				continue; /* device not found */
> 
> da->bus->find_device instead of bus->find_device.
> This function cannot find the device back currently on the PCI bus,
> blocking the plugging of VF.
> 
> The PCI bus will scan the VF while no rte_devargs exists to
> describe it within the global list. If the device exists, it will
> detect it, allocate it and then set its name.
> Without any rte_devargs, the name of a PCI device falls back to its
> canonical name (DomBDF instead of BDF). The name comparison with
> da->name can only succeed if the slave was declared using the DomBDF
> format.
> 
> The fix is to do a deep copy of the rte_devargs (the API has been
> sent previously with the rte_devargs rework but I have since removed
> it) and insert it using rte_eal_devargs_insert(). This is essentially
> the solution I used for the rte_eal_hotplug_add() fix[1].
> 
> The alternative fix is to propose an API for buses to transform device
> names into their canonical form on demand... And it would certainly only
> be useful for the PCI bus.
> 
> The issue is discussed there:
> [1]: http://dpdk.org/ml/archives/dev/2017-July/071155.html

OK, I was not aware of this exact issue.
So I will wait for the above fix.
  
Gaëtan Rivet July 13, 2017, 8:14 a.m. UTC | #3
On Thu, Jul 13, 2017 at 08:52:33AM +0200, Thomas Monjalon wrote:
> 12/07/2017 22:39, Gaëtan Rivet:
> > Hi Thomas,
> > 
> > Nice idea. A few remarks below:
> > 
> > On Wed, Jul 12, 2017 at 08:28:12PM +0200, Thomas Monjalon wrote:
> > >  	FOREACH_SUBDEV(sdev, i, dev) {
> > >  		if (sdev->state != DEV_PARSED)
> > >  			continue;
> > >  		da = &sdev->devargs;
> > > +
> > 
> > Superfluous line.
> 
> I don't think so :) It is isolating the "skip" block with its comment.
> 
> > > +		/* skip plugged out devices */
> > > +		if (! first_init
> > > +				&& sdev->cmdline == NULL
> > > +				&& strcmp(da->bus->name, "vdev") != 0) {
> > 
> > Use first_init == false instead of negation.
> > && should be at the end of the line instead of the start of the next
> > one.
> 
> Yes
> 
> > Indentation is wrong.
> 
> No, the coding style is to put 2 tabs for continuation lines.
> 
> > > +			da->bus->scan();
> > > +			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
> > > +				continue; /* device not found */
> > 
> > da->bus->find_device instead of bus->find_device.
> > This function cannot find the device back currently on the PCI bus,
> > blocking the plugging of VF.
> > 
> > The PCI bus will scan the VF while no rte_devargs exists to
> > describe it within the global list. If the device exists, it will
> > detect it, allocate it and then set its name.
> > Without any rte_devargs, the name of a PCI device falls back to its
> > canonical name (DomBDF instead of BDF). The name comparison with
> > da->name can only succeed if the slave was declared using the DomBDF
> > format.
> > 
> > The fix is to do a deep copy of the rte_devargs (the API has been
> > sent previously with the rte_devargs rework but I have since removed
> > it) and insert it using rte_eal_devargs_insert(). This is essentially
> > the solution I used for the rte_eal_hotplug_add() fix[1].
> > 
> > The alternative fix is to propose an API for buses to transform device
> > names into their canonical form on demand... And it would certainly only
> > be useful for the PCI bus.
> > 
> > The issue is discussed there:
> > [1]: http://dpdk.org/ml/archives/dev/2017-July/071155.html
> 
> OK, I was not aware of this exact issue.
> So I will wait above fix.

The fix above will not solve this issue.
If you are waiting for a proper, general fix, I don't think it will
arrive anytime soon.

I have the rte_eal_devargs_clone and a working version that I can send
in-reply-to if you want. But that's an API extension while in RC2 so...
  
Thomas Monjalon July 13, 2017, 9:17 a.m. UTC | #4
13/07/2017 10:14, Gaëtan Rivet:
> On Thu, Jul 13, 2017 at 08:52:33AM +0200, Thomas Monjalon wrote:
> > 12/07/2017 22:39, Gaëtan Rivet:
> > > Hi Thomas,
> > > 
> > > Nice idea. A few remarks below:
> > > 
> > > On Wed, Jul 12, 2017 at 08:28:12PM +0200, Thomas Monjalon wrote:
> > > >  	FOREACH_SUBDEV(sdev, i, dev) {
> > > >  		if (sdev->state != DEV_PARSED)
> > > >  			continue;
> > > >  		da = &sdev->devargs;
> > > > +
> > > 
> > > Superfluous line.
> > 
> > I don't think so :) It is isolating the "skip" block with its comment.
> > 
> > > > +		/* skip plugged out devices */
> > > > +		if (! first_init
> > > > +				&& sdev->cmdline == NULL
> > > > +				&& strcmp(da->bus->name, "vdev") != 0) {
> > > 
> > > Use first_init == false instead of negation.
> > > && should be at the end of the line instead of the start of the next
> > > one.
> > 
> > Yes
> > 
> > > Indentation is wrong.
> > 
> > No, the coding style is to put 2 tabs for continuation lines.
> > 
> > > > +			da->bus->scan();
> > > > +			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
> > > > +				continue; /* device not found */
> > > 
> > > da->bus->find_device instead of bus->find_device.
> > > This function cannot find the device back currently on the PCI bus,
> > > blocking the plugging of VF.
> > > 
> > > The PCI bus will scan the VF while no rte_devargs exists to
> > > describe it within the global list. If the device exists, it will
> > > detect it, allocate it and then set its name.
> > > Without any rte_devargs, the name of a PCI device falls back to its
> > > canonical name (DomBDF instead of BDF). The name comparison with
> > > da->name can only succeed if the slave was declared using the DomBDF
> > > format.
> > > 
> > > The fix is to do a deep copy of the rte_devargs (the API has been
> > > sent previously with the rte_devargs rework but I have since removed
> > > it) and insert it using rte_eal_devargs_insert(). This is essentially
> > > the solution I used for the rte_eal_hotplug_add() fix[1].
> > > 
> > > The alternative fix is to propose an API for buses to transform device
> > > names into their canonical form on demand... And it would certainly only
> > > be useful for the PCI bus.
> > > 
> > > The issue is discussed there:
> > > [1]: http://dpdk.org/ml/archives/dev/2017-July/071155.html
> > 
> > OK, I was not aware of this exact issue.
> > So I will wait above fix.
> 
> The fix above will not solve this issue.
> If you are waiting for a proper, general fix, I don't think it will
> arrive anytime soon.
> 
> I have the rte_eal_devargs_clone and a working version that I can send
> in-reply-to if you want. But that's an API extension while in RC2 so...

No worries, it can be fixed in 17.11.
  
Ferruh Yigit July 18, 2017, 8:39 a.m. UTC | #5
On 7/13/2017 10:17 AM, Thomas Monjalon wrote:
> 13/07/2017 10:14, Gaëtan Rivet:
>> On Thu, Jul 13, 2017 at 08:52:33AM +0200, Thomas Monjalon wrote:
>>> 12/07/2017 22:39, Gaëtan Rivet:
>>>> Hi Thomas,
>>>>
>>>> Nice idea. A few remarks below:
>>>>
>>>> On Wed, Jul 12, 2017 at 08:28:12PM +0200, Thomas Monjalon wrote:
>>>>>  	FOREACH_SUBDEV(sdev, i, dev) {
>>>>>  		if (sdev->state != DEV_PARSED)
>>>>>  			continue;
>>>>>  		da = &sdev->devargs;
>>>>> +
>>>>
>>>> Superfluous line.
>>>
>>> I don't think so :) It is isolating the "skip" block with its comment.
>>>
>>>>> +		/* skip plugged out devices */
>>>>> +		if (! first_init
>>>>> +				&& sdev->cmdline == NULL
>>>>> +				&& strcmp(da->bus->name, "vdev") != 0) {
>>>>
>>>> Use first_init == false instead of negation.
>>>> && should be at the end of the line instead of the start of the next
>>>> one.
>>>
>>> Yes
>>>
>>>> Indentation is wrong.
>>>
>>> No, the coding style is to put 2 tabs for continuation lines.
>>>
>>>>> +			da->bus->scan();
>>>>> +			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
>>>>> +				continue; /* device not found */
>>>>
>>>> da->bus->find_device instead of bus->find_device.
>>>> This function cannot find the device back currently on the PCI bus,
>>>> blocking the plugging of VF.
>>>>
>>>> The PCI bus will scan the VF while no rte_devargs exists to
>>>> describe it within the global list. If the device exists, it will
>>>> detect it, allocate it and then set its name.
>>>> Without any rte_devargs, the name of a PCI device falls back to its
>>>> canonical name (DomBDF instead of BDF). The name comparison with
>>>> da->name can only succeed if the slave was declared using the DomBDF
>>>> format.
>>>>
>>>> The fix is to do a deep copy of the rte_devargs (the API has been
>>>> sent previously with the rte_devargs rework but I have since removed
>>>> it) and insert it using rte_eal_devargs_insert(). This is essentially
>>>> the solution I used for the rte_eal_hotplug_add() fix[1].
>>>>
>>>> The alternative fix is to propose an API for buses to transform device
>>>> names into their canonical form on demand... And it would certainly only
>>>> be useful for the PCI bus.
>>>>
>>>> The issue is discussed there:
>>>> [1]: http://dpdk.org/ml/archives/dev/2017-July/071155.html
>>>
>>> OK, I was not aware of this exact issue.
>>> So I will wait above fix.
>>
>> The fix above will not solve this issue.
>> If you are waiting for a proper, general fix, I don't think it will
>> arrive anytime soon.
>>
>> I have the rte_eal_devargs_clone and a working version that I can send
>> in-reply-to if you want. But that's an API extension while in RC2 so...
> 
> No worries, it can be fixed in 17.11.
> 

What is the status of this patch? Is it rejected for this
release, or will it be merged into the existing failsafe series?
  
Thomas Monjalon July 18, 2017, 8:49 p.m. UTC | #6
18/07/2017 11:39, Ferruh Yigit:
> On 7/13/2017 10:17 AM, Thomas Monjalon wrote:
> > 13/07/2017 10:14, Gaëtan Rivet:
> >> On Thu, Jul 13, 2017 at 08:52:33AM +0200, Thomas Monjalon wrote:
> >>> 12/07/2017 22:39, Gaëtan Rivet:
> >>>> Hi Thomas,
> >>>>
> >>>> Nice idea. A few remarks below:
> >>>>
> >>>> On Wed, Jul 12, 2017 at 08:28:12PM +0200, Thomas Monjalon wrote:
> >>>>>  	FOREACH_SUBDEV(sdev, i, dev) {
> >>>>>  		if (sdev->state != DEV_PARSED)
> >>>>>  			continue;
> >>>>>  		da = &sdev->devargs;
> >>>>> +
> >>>>
> >>>> Superfluous line.
> >>>
> >>> I don't think so :) It is isolating the "skip" block with its comment.
> >>>
> >>>>> +		/* skip plugged out devices */
> >>>>> +		if (! first_init
> >>>>> +				&& sdev->cmdline == NULL
> >>>>> +				&& strcmp(da->bus->name, "vdev") != 0) {
> >>>>
> >>>> Use first_init == false instead of negation.
> >>>> && should be at the end of the line instead of the start of the next
> >>>> one.
> >>>
> >>> Yes
> >>>
> >>>> Indentation is wrong.
> >>>
> >>> No, the coding style is to put 2 tabs for continuation lines.
> >>>
> >>>>> +			da->bus->scan();
> >>>>> +			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
> >>>>> +				continue; /* device not found */
> >>>>
> >>>> da->bus->find_device instead of bus->find_device.
> >>>> This function cannot find the device back currently on the PCI bus,
> >>>> blocking the plugging of VF.
> >>>>
> >>>> The PCI bus will scan the VF while no rte_devargs exists to
> >>>> describe it within the global list. If the device exists, it will
> >>>> detect it, allocate it and then set its name.
> >>>> Without any rte_devargs, the name of a PCI device falls back to its
> >>>> canonical name (DomBDF instead of BDF). The name comparison with
> >>>> da->name can only succeed if the slave was declared using the DomBDF
> >>>> format.
> >>>>
> >>>> The fix is to do a deep copy of the rte_devargs (the API has been
> >>>> sent previously with the rte_devargs rework but I have since removed
> >>>> it) and insert it using rte_eal_devargs_insert(). This is essentially
> >>>> the solution I used for the rte_eal_hotplug_add() fix[1].
> >>>>
> >>>> The alternative fix is to propose an API for buses to transform device
> >>>> names into their canonical form on demand... And it would certainly only
> >>>> be useful for the PCI bus.
> >>>>
> >>>> The issue is discussed there:
> >>>> [1]: http://dpdk.org/ml/archives/dev/2017-July/071155.html
> >>>
> >>> OK, I was not aware of this exact issue.
> >>> So I will wait above fix.
> >>
> >> The fix above will not solve this issue.
> >> If you are waiting for a proper, general fix, I don't think it will
> >> arrive anytime soon.
> >>
> >> I have the rte_eal_devargs_clone and a working version that I can send
> >> in-reply-to if you want. But that's an API extension while in RC2 so...
> > 
> > No worries, it can be fixed in 17.11.
> > 
> 
> What is the status of the patch, is this patch rejected for this
> release, or will be merged into existing failsafe?

I think this patch is rejected.
  

Patch

diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index 3321dda71..5181066ed 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -49,6 +49,11 @@  fs_find_ethdev(const struct rte_device *dev)
 	return NULL;
 }
 
+static int cmp_dev_name(const struct rte_device *dev, const void *name)
+{
+	return strcmp(dev->name, name);
+}
+
 static int
 fs_bus_init(struct rte_eth_dev *dev)
 {
@@ -56,12 +61,24 @@  fs_bus_init(struct rte_eth_dev *dev)
 	struct rte_device *rdev;
 	struct rte_devargs *da;
 	uint8_t i;
+	static int first_init = 1;
 	int ret;
 
 	FOREACH_SUBDEV(sdev, i, dev) {
 		if (sdev->state != DEV_PARSED)
 			continue;
 		da = &sdev->devargs;
+
+		/* skip plugged out devices */
+		if (! first_init
+				&& sdev->cmdline == NULL
+				&& strcmp(da->bus->name, "vdev") != 0) {
+			da->bus->scan();
+			if (bus->find_device(NULL, cmp_dev_name, da->name) == NULL)
+				continue; /* device not found */
+		}
+
+		/* probe device */
 		rdev = rte_eal_hotplug_add(da->bus->name,
 					   da->name,
 					   da->args);
@@ -84,6 +101,10 @@  fs_bus_init(struct rte_eth_dev *dev)
 		ETH(sdev)->state = RTE_ETH_DEV_DEFERRED;
 		sdev->state = DEV_PROBED;
 	}
+
+	if (first_init)
+		first_init = 0;
+
 	return 0;
 }