[dpdk-dev] PCIe Hot Insert/Remove Support

Walker, Benjamin benjamin.walker at intel.com
Mon Oct 24 20:16:37 CEST 2016


Hi all,

My name is Ben Walker and I'm the technical lead for SPDK (it's like DPDK, but
for storage devices). SPDK relies on DPDK only for the base functionality in the
EAL - memory management, the rings, and the PCI scanning code. A key feature for
storage devices is support for hot insert and remove, so we're currently working
through how best to implement this for a user space driver. While doing this
work, we've run into a few issues with the current DPDK PCI/device/driver
framework that I'd like to discuss with this list. I'm not entirely ramped up on
all of the current activity in this area or what the future plans are, so please
educate me if something is coming that will address our current issues. I'm
working off of the latest commit on the master branch as of today.

Today, there appear to be two lists - one of PCI devices and one of drivers. To
update the list of PCI devices, you call rte_eal_pci_scan(), which scans the PCI
bus. That call does not attempt to load any drivers. One scan is automatically
performed when the EAL is first initialized. To add or remove drivers from the
driver list you call rte_eal_driver_register/unregister. To match drivers in the
driver list to devices in the device list, you call rte_eal_pci_probe.

There are a few problems with how the code works for us. First,
rte_eal_pci_scan's algorithm will not correctly detect devices that are in its
internal list but weren't found by the most recent PCI bus scan (i.e. they were
hot removed). DPDK's scan doesn't seem to comprehend hot remove in any way.
Fortunately there is a public API to remove devices from the device list -
rte_eal_pci_detach. That function will automatically unload any drivers
associated with the device and then remove it from the list. There is a similar
call for adding a device to the list - rte_eal_pci_probe_one, which will add a
device to the device list and then automatically match it to drivers. I think if
rte_eal_pci_scan is going to be a public interface (and it is), it needs to
correctly comprehend the removal of PCI devices. Otherwise, make it a private
API that is only called in response to rte_eal_init and only expose the public
probe_one/detach calls for modifying the list of devices. My preference is for
the former, not the latter.
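
To make the first option concrete, here is a rough sketch of a scan that
reconciles the device list against what is actually on the bus. This is a toy
model in plain C, not DPDK code: toy_dev, toy_dev_list, toy_find, toy_add and
toy_scan are all made-up names, and the "bus" is just an array of address
strings passed in by the caller. The idea is a simple mark and sweep: clear a
"seen" flag on every known device, set it for each device the scan finds, and
then drop whatever is left unmarked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_DEVS 16
#define TOY_ADDR_LEN 16

/* Hypothetical stand-in for one entry in the device list. */
struct toy_dev {
    char addr[TOY_ADDR_LEN]; /* PCI address, e.g. "0000:01:00.0" */
    bool in_use;             /* slot currently holds a device */
    bool seen;               /* device was found by the scan in progress */
};

struct toy_dev_list {
    struct toy_dev devs[TOY_MAX_DEVS];
};

/* Look up a device by address; returns NULL if it is not in the list. */
struct toy_dev *toy_find(struct toy_dev_list *list, const char *addr)
{
    for (size_t i = 0; i < TOY_MAX_DEVS; i++) {
        struct toy_dev *d = &list->devs[i];
        if (d->in_use && strcmp(d->addr, addr) == 0)
            return d;
    }
    return NULL;
}

/* Claim a free slot for a new address; returns NULL if the table is full. */
struct toy_dev *toy_add(struct toy_dev_list *list, const char *addr)
{
    for (size_t i = 0; i < TOY_MAX_DEVS; i++) {
        struct toy_dev *d = &list->devs[i];
        if (!d->in_use) {
            snprintf(d->addr, sizeof(d->addr), "%s", addr);
            d->in_use = true;
            d->seen = false;
            return d;
        }
    }
    return NULL;
}

/*
 * Reconcile the list with the addresses found on the bus.
 * Returns the number of devices removed; *n_added gets the number added.
 */
int toy_scan(struct toy_dev_list *list, const char *const *found,
             size_t n_found, int *n_added)
{
    int n_removed = 0;

    *n_added = 0;

    /* Mark phase, step 1: assume every known device is gone. */
    for (size_t i = 0; i < TOY_MAX_DEVS; i++)
        list->devs[i].seen = false;

    /* Mark phase, step 2: flag everything the bus scan reports. */
    for (size_t i = 0; i < n_found; i++) {
        struct toy_dev *d = toy_find(list, found[i]);
        if (d == NULL) {
            d = toy_add(list, found[i]);
            if (d == NULL)
                continue; /* table full; ignored in this toy model */
            (*n_added)++;
        }
        d->seen = true;
    }

    /* Sweep phase: anything still unmarked was hot removed. */
    for (size_t i = 0; i < TOY_MAX_DEVS; i++) {
        struct toy_dev *d = &list->devs[i];
        if (d->in_use && !d->seen) {
            /* A real implementation would unload the driver here first. */
            d->in_use = false;
            n_removed++;
        }
    }
    return n_removed;
}
```

Calling toy_scan twice, first with two addresses and then with one of them
swapped for a new one, should report one removal and one addition on the second
call. In the real framework the sweep step is where the driver would need to be
unloaded before the device is dropped from the list.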

Second, rte_eal_pci_probe will call the driver initialization functions each
time a probe happens, even if the driver has already been successfully loaded.
This tends to crash a lot of the PMDs. It seems to me like rte_eal_pci_probe is
not safe to call more than once during the lifetime of the program, which is a
real challenge when you have multiple users of the PCI framework. For instance,
an application may manage both storage devices using the rte_eal_pci framework
and NICs, and the initialization routine may go something like:

register NIC drivers
rte_eal_pci_probe()
...
register SSD drivers
rte_eal_pci_probe()

This is almost certainly how any real code is going to function because the code
dealing with NICs is unrelated and probably unaware of the code dealing with the
SSDs. It should be fairly trivial to simply not call the probe() callback for a
device if the driver has already been loaded. Is this a reasonable modification
to make?
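
For what it's worth, the change I have in mind looks roughly like the sketch
below. Again this is a self-contained toy model rather than a patch: mini_dev,
mini_drv, mini_bus, mini_register, mini_probe_all and mini_count_probe are
hypothetical names, and matching is reduced to comparing a vendor ID. The only
point is the early "continue" for devices that already have a driver, which
makes a second probe pass harmless.

```c
#include <stddef.h>

#define MINI_MAX_DEVS 8
#define MINI_MAX_DRVS 4

struct mini_dev;

/* Hypothetical driver: matches on vendor ID and has one probe callback. */
struct mini_drv {
    unsigned vendor_id;
    int (*probe)(struct mini_dev *dev); /* returns 0 on success */
};

/* Hypothetical device: remembers which driver (if any) is bound to it. */
struct mini_dev {
    unsigned vendor_id;
    const struct mini_drv *driver; /* NULL until a driver has loaded */
    int probe_calls;               /* bookkeeping for the example only */
};

/* The two lists: devices and registered drivers. */
struct mini_bus {
    struct mini_dev devs[MINI_MAX_DEVS];
    size_t n_devs;
    const struct mini_drv *drvs[MINI_MAX_DRVS];
    size_t n_drvs;
};

/* Add a driver to the driver list; returns -1 if the list is full. */
int mini_register(struct mini_bus *bus, const struct mini_drv *drv)
{
    if (bus->n_drvs >= MINI_MAX_DRVS)
        return -1;
    bus->drvs[bus->n_drvs++] = drv;
    return 0;
}

/*
 * Match drivers to devices. Devices that already have a driver are
 * skipped, so calling this repeatedly never re-runs a probe callback.
 * Returns the number of devices newly bound by this call.
 */
int mini_probe_all(struct mini_bus *bus)
{
    int n_bound = 0;

    for (size_t i = 0; i < bus->n_devs; i++) {
        struct mini_dev *dev = &bus->devs[i];

        if (dev->driver != NULL)
            continue; /* driver already loaded: do not probe again */

        for (size_t j = 0; j < bus->n_drvs; j++) {
            const struct mini_drv *drv = bus->drvs[j];

            if (drv->vendor_id != dev->vendor_id)
                continue;
            if (drv->probe(dev) == 0) {
                dev->driver = drv;
                n_bound++;
                break;
            }
        }
    }
    return n_bound;
}

/* Example probe callback that just counts how often it was invoked. */
int mini_count_probe(struct mini_dev *dev)
{
    dev->probe_calls++;
    return 0;
}
```

With this in place, the "register NIC drivers, probe, register SSD drivers,
probe" sequence above calls each device's probe callback exactly once, because
the second pass only touches devices that are still unbound.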

Thanks,
Ben