[dpdk-dev] rte_eth_dev_attach returns 0, although device is not attached

Bruce Richardson bruce.richardson at intel.com
Fri Aug 5 14:29:42 CEST 2016


On Thu, Aug 04, 2016 at 04:47:25PM +0100, Ferruh Yigit wrote:
> On 8/4/2016 3:54 PM, Igor Ryzhov wrote:
> > 
> >> 4 авг. 2016 г., в 16:21, Ferruh Yigit <ferruh.yigit at intel.com
> >> <mailto:ferruh.yigit at intel.com>> написал(а):
> >>
> >> On 8/4/2016 12:51 PM, Igor Ryzhov wrote:
> >>> Hello Ferruh,
> >>>
> >>>> 4 авг. 2016 г., в 14:33, Ferruh Yigit <ferruh.yigit at intel.com
> >>>> <mailto:ferruh.yigit at intel.com>> написал(а):
> >>>>
> >>>> Hi Igor,
> >>>>
> >>>> On 8/3/2016 5:58 PM, Igor Ryzhov wrote:
> >>>>> Hello.
> >>>>>
> >>>>> Function rte_eth_dev_attach can return false positive result.
> >>>>> It happens because rte_eal_pci_probe_one returns zero if no driver
> >>>>> is found for the device:
> >>>>> ret = pci_probe_all_drivers(dev);
> >>>>> if (ret < 0)
> >>>>> goto err_return;
> >>>>> return 0;
> >>>>> (pci_probe_all_drivers returns 1 in that case)
> >>>>>
> >>>>> For example, it can be easily reproduced by trying to attach virtio
> >>>>> device, managed by kernel driver.
> >>>>
> >>>> You are right, and I did able to reproduce this issue with virtio as you
> >>>> suggest.
> >>>>
> >>>> But I wonder why rte_eth_dev_get_port_by_addr() is not catching this.
> >>>> Perhaps a dev->attached check needs to be added into this function.
> >>
> >> With a second check, rte_eth_dev_get_port_by_addr() catches it if the
> >> driver is missing.
> >>
> >> But for virtio case, problem is not missing driver.
> >> Problem is eth_virtio_dev_init() is returning a positive value on fail.
> >>
> >> Call stack is:
> >> rte_eal_pci_probe_one
> >>    pci_probe_all_drivers
> >>        rte_eal_pci_probe_one_driver
> >>            rte_eth_dev_init
> >>               eth_virtio_dev_init
> >>
> >> So rte_eal_pci_probe_one_driver() also returns positive value, as no
> >> driver found, and rte_eth_dev_get_port_by_addr() returns a valid
> >> port_id, since rte_eth_dev_init() allocated an eth_dev.
> >>
> >> Briefly, this can be fixed in virtio pmd, instead of eal pci.
> >>
> >>>>
> >>>>>
> >>>>> I think it should be:
> >>>>> ret = pci_probe_all_drivers(dev);
> >>>>> if (ret)
> >>>>> goto err_return;
> >>>>> return 0;
> >>>>
> >>>> Your proposal looks good to me. Will you send a patch?
> >>>
> >>
> >> Original code silently ignores the if driver is missing for that dev,
> >> although it is still questionable, I think we can keep this as it is.
> >>
> >>> Patch sent.
> >>
> >> Sorry for this, but can you please test with following modification in
> >> virtio:
> >> index 07d6449..c74eeee 100644
> >> --- a/drivers/net/virtio/virtio_ethdev.c
> >> +++ b/drivers/net/virtio/virtio_ethdev.c
> >> @@ -1156,7 +1156,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
> >>        if (pci_dev) {
> >>                ret = vtpci_init(pci_dev, hw, &dev_flags);
> >>                if (ret)
> >> -                       return ret;
> >> +                       return -1;
> >>        }
> >>
> >>        /* Reset the device although not necessary at startup */
> > 
> > I think it's not a good change, because it will break the idea of this
> > patch - http://dpdk.org/browse/dpdk/commit/?id=ac5e1d83
> 
> Yes, breaks this one, I wasn't aware of this patch. But in this patch,
> commit log says: "return 1 to tell the upper layer we
> don't take over this device.", I am not sure upper layer designed for this.
> 
> > 
> > Also, with your patch the application will not start, because
> > rte_eal_pci_probe will fail:
> > 
> > if (ret < 0)
> > rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
> >  " cannot be used\n", dev->addr.domain, dev->addr.bus,
> >  dev->addr.devid, dev->addr.function);
> 
> Yes it fails, and this looks like intended behavior. This failure is
> correct according code.
> 
> > 
> > And now I think that maybe we should change the way rte_eal_pci_probe works.
> > I think we shouldn't stop the application if just one of PCI devices is
> > not probed successfully.
> 
> Agreed. Overall rte_exit() usage already discussed a few times.
> 
> I think best option is:
> - don't exit app if rte_eal_pci_probe() fails, only print an error.

Whether or not the pci probe exits the app or not, I think it should signal
a serious error if the probe fails and a device was explicitly whitelisted on
the commandline. Given the user explicitly requested the device, a failure to
use is probably a problem which requires the user to fix before running the
app.

/Bruce



More information about the dev mailing list