[dpdk-dev] [PATCH] eal: fix to set the rte_device ptr's device args before hotplug

Gaetan Rivet grive at u256.net
Tue Feb 11 09:38:44 CET 2020


[...]
>>> (gdb) p dev2
>>> $5 = (struct rte_pci_device *) 0x54de5e0
>>> (gdb) p /x *dev2
>>> $6 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device =
>>> {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver =
>>> 0x3a6b7f0,
>>>       bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain =
>>> 0x0, bus = 0x6, devid = 0x2, function = 0x0}
>>>
>>> we hit the pci_device ( 0x54de5e0) that was created during the first
>>> time probe of the parent device ?
>>> Since it is already probed, it goes to line 381 where it does frees
>>> the just allocated 'pci_device'  inside pci_scan_one() via 'free(dev)'
>>>
>>> As you can see this pci_device does not have it's devargs set (rightly
>>> so as initially there were no devargs for this device)
>>>
>>> But as shown in the stack trace in my previous reply, when
>>> pci_find_device() walks the rte_pci_devices list , it finds this
>>> earlier probed device (without devargs)
>>>
>>
>> Alright, I think you are right, this is what is happening.
>> The issue IMO, is that the PCI scan is thus hitting an already existing device, but we have
>> missed the case where the new device has more info that the previous one (linked devargs).
>>
> That is correct
> 
>>> pci_find_device (start=0x0, cmp=0x4b92d8 <cmp_dev_name>,
>>> data=0x55a4af8) at /root/dpdk-19.11/drivers/bus/pci/pci_common.c:426
>>> 426                             return &pdev->device;
>>> (gdb) p pdev
>>> $15 = (struct rte_pci_device *) 0x54de5e0
>>>
>>> (gdb) p /x *pdev
>>> $16 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device =
>>> {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver =
>>> 0x3a6b7f0,
>>>       bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain =
>>> 0x0, bus = 0x6, devid = 0x2, function = 0x0},
>>>
>>> Hope this explains why the pci_device has NULL devargs at this point
>>> and how my fix to set it at this point helps solve the problem?
>>>
>>> Please let me know your thoughts
>>>
>>
>> I think the proper fix is instead to have a clean update during the PCI scan.
>> The proper way is to keep the old device, but override its fields as new info was found.
> 
> Agreed
>>
>> Something like calling pci_name_set(dev2); line 359 maybe. BSD should also be updated for consistency.
>>
>> My issue with your patch is that I think this issue is very specific to the PCI bus and the capabilities
>> of some devices on it. It would be better to have a fully compliant scan + probe ops considering
>> the supported capabilities, instead of forcing this on all buses.
> Sure, so i'm guessing you meant something like this ?
> 
> dex 740a2cd..92a41c2 100644
> --- a/drivers/bus/pci/linux/pci.c
> +++ b/drivers/bus/pci/linux/pci.c
> @@ -377,6 +377,8 @@
>                                                   */
>                                                  RTE_LOG(ERR, EAL,
> "Unexpected device scan at %s!\n",
>                                                          filename);
> +                                       else if (dev2->device.devargs
> != dev->device.devargs)
> +                                               pci_name_set(dev2);
>                                  }
>                                  free(dev);
>                          }
> 
> Which is basically checking for devargs mismatch between the 'new'
> device and a match with an
> already probed device and  return the old device while adjusting the
> new info (devargs)
> Is that OK?
> 
>>
>> I'm wondering whether this can happen with an already existing devargs? If so, is it more correct to keep
>> the new one, or ignore the probe, free the new devargs and report an error? If the former,
>> please clean up properly the devargs using rte_devargs_remove() (before calling pci_name_set() of course).

Hello Somnath,

Yes, this is basically it, however you also need to take care of a potential already existing devargs here, because the PCI bus does not guarantee that all devargs of a device will be named the same.


More information about the dev mailing list