[dpdk-dev] Proposal for a big eal / ethdev cleanup

David Marchand david.marchand at 6wind.com
Sat Jan 16 16:53:11 CET 2016


Hello Jan,

On Thu, Jan 14, 2016 at 12:46 PM, Jan Viktorin <viktorin at rehivetech.com> wrote:
> On Thu, 14 Jan 2016 11:38:16 +0100
> David Marchand <david.marchand at 6wind.com> wrote:
>
>> Hello all,
>>
>> Here is a proposal of a big cleanup in ethdev (cryptodev would have to
>> follow) and eal structures.
>> This is something I wanted to do for quite some time and the arrival of
>> a new bus makes me think we need it.
>>
>> This is an alternative to what Jan proposed [1].
>
> As I need to extend DPDK by a non-PCI bus system, I would prefer any such
> working solution :). By [1], you probably mean:
>
> [1] http://comments.gmane.org/gmane.comp.networking.dpdk.devel/30973
>
> (I didn't find it in the e-mail.)

Thought I put the reference after my signature.
Anyway, yes, I was referring to your thread.


>> ABI is most likely broken with this, but I think this discussion can come later.
>
> I was trying in my initial approach to stay as much API and ABI backwards
> compatible as possible to be acceptable into upstream. As just a few
> people have shown their interest in these changes, I consider the ABI
> compatibility very important.
>
> I can see, that your approach does not care too much... Otherwise, it looks
> good to me. It is closer to the Linux drivers infra, so it seems to be clearer
> then the current one.

I did this on purpose.
>From my point of view, we will have an API/ABI breakage in this code
at one point.
So I sent this mail to show where I'd like us to go, because this
looks saner on the mid/long term.


>> First some context on how dpdk is initialized at the moment :
>>
>> Let's imagine a system with one ixgbe pci device and take some
>> part of ixgbe driver as an example.
>>
>> static struct eth_driver rte_ixgbe_pmd = {
>>         .pci_drv = {
>>                 .name = "rte_ixgbe_pmd",
>>                 .id_table = pci_id_ixgbe_map,
>>                 .drv_flags = RTE_PCI_DRV_NEED_MAPPING |
>> RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_DETACHABLE,
>>         },
>>         .eth_dev_init = eth_ixgbe_dev_init,
>>         .eth_dev_uninit = eth_ixgbe_dev_uninit,
>>         .dev_private_size = sizeof(struct ixgbe_adapter),
>> };
>
> Note, that the biggest issue here is that the eth_driver has no way to
> distinguish among PCI or other subsystem. There is no field that helps
> the generic ethdev code (librte_ether) to decide what bus we are on
> (and it needs to know it in the current DPDK).
>
> Another point is that the ethdev infra should not know about the
> underlying bus infra. The question is whether we do a big clean
> up (extract PCI-bus code out) or a small clean up (give the ethdev
> infra a hint which bus system it deals with).

Yes, and I think these two choices are reflected by our two respective
proposals :-)


>> So now, what I had in mind is something like this.
>> It is far from perfect and I certainly did some shortcuts in my
>> reasoning.
>>
>>
>> Generic device/driver
>>
>> - introduce a rte_device structure,
>> - a rte_device has a name, that identifies it in a unique way across
>> all buses, maybe something like pci:0000:00:01.0, and for vdev,
>> vdev:name
>
> Having a prefix is good but would this break the user API? Is the
> name exposed to users?

Maybe define new apis, and wrap the old one in it ?
Not too sure, I need to experiment.


>> - current rte_driver does not need to know about the pmd_type
>> (pdev/vdev), this is only a way to order drivers init in eal, we could
>> use the rte_driver names to order them or maybe remove this ordering
>
> What is the main reason to have pdev/vdev distinction here? After I
> register the driver, does eal really need to know the pmd_type?

Already did some cleanup in the past because of pmd_type :
http://dpdk.org/browse/dpdk/commit/?id=78aecefed955917753bfb6f44ae970dde4c652d0
http://dpdk.org/browse/dpdk/commit/?id=6bc2415c3387ae72f2ce3677f0e3540e734583d5

For me, there is no reason but the init ordering (vdevs come before
pdevs), and I am pretty sure we don't need this for ethdev.


> Moreover, there is no way how to pass arguments to pdevs, only to
> vdevs. This was shortly disscussed in [2] for the szedata2 PMD.
>
> [2] http://dpdk.org/ml/archives/dev/2015-October/026285.html

Shorty discussed with Thomas, yes.
But after some experiments, it appears that you can pass devargs after
a whitelist option at init (--pci-whitelist
xxxx:xx:xx.x,whataniceoptionwehavehere=1).
This does not work for hotplug.
This is undocumented and this looks like a workaround, so I would
prefer we come up with a better api for 2.3.


>> Impact on PCI device/driver
>>
>> - rte_pci_device is modified to embed a rte_device (embedding makes it
>> possible later to cast the rte_device and get the rte_pci_device in pci
>> specific functions)
>> - no need for a rte_pci_driver reference in rte_pci_device, since we
>> have the rte_device driver
>
> I've noticed that DPDK does not use any kind of the container_of macro
> like the Linux Kernel does. Instead, some considerations like the
> rte_pci_driver MUST be placed at the beginning of the eth_driver (if I
> am not mistaken). Here, it should be considered to do it another way.

Ah, yes, we could use this.
Just did not think of using it :-)


Regards,
-- 
David Marchand


More information about the dev mailing list