[dpdk-dev] [PATCH v4 0/5] lib: add Port Representors

Thomas Monjalon thomas at monjalon.net
Mon Jan 15 17:09:26 CET 2018


15/01/2018 13:12, Doherty, Declan:
> On 10/01/2018 7:26 PM, Thomas Monjalon wrote:
> > 10/01/2018 14:46, Doherty, Declan:
> >> On 09/01/2018 11:22 PM, Thomas Monjalon wrote:
> >>> Hi,
> >>>
> >>> 08/01/2018 15:37, Remy Horton:
> >>>> Port Representors provide a logical presentation in DPDK of VF (virtual
> >>>> function) ports for the purposes of control and monitoring. Each port
> >>>> representor device represents a single VF and is associated with it's
> >>>> parent physical function (PF) PMD which provides the back-end hooks for
> >>>> the representor device ops and defines the control domain to which that
> >>>> port belongs. This allows to use existing DPDK APIs to monitor and control
> >>>> the port without the need to create and maintain VF specific APIs.
> >>>
> >>> Extending control plane ability of DPDK has been discussed
> >>> multiple times.
> >>
> >> It has, and I have yet to see a really strong reason as to why we would
> >> not support control plane functions within DPDK, many of which are
> >> already support today implicitly anyway through our ethdev APIs.
> >>
> >>> The current agreed policy is:
> >>> "
> >>> The primary goal of DPDK is to provide a userspace dataplane.
> >>> Managing VFs from a PF driver is a control plane feature and developers
> >>> should generally rely on the Linux Kernel for that.
> >>> "
> >>> http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations
> >>>
> >>
> >> My understanding is that this particular entry was based around the
> >> discussion on the divergence of functionality between the Linux kernel
> >> PF driver and the DPDK PF driver. I also don't really think the above
> >> statement is valid as a blanket statement for the project as it makes
> >> the assumption that DPDK is only deployed on Linux hosts, what about
> >> FreeBSD? or in the future Windows?
> > 
> > Yes, we must agree on removing this scope limitation while working
> > on a generic VF representor.
> > 
> >> A number of presentations at both Userspace in Dublin and the Summit
> >> in San Jose discussed the support of control plane functionality by
> >> DPDK and there wasn't any strong arguments or opposition against using
> >> DPDK for control plane functions that I saw.
> >>
> >> In any case this patchset is not introducing any new control plane APIs
> >> that don't already exist within DPDK today, it only enables the creation
> >> of a new type of virtual PMDs which are linked to the same base
> >> infrastructure and which can be used to represent VFs in a control plane
> >> application as we have implemented in this patch set.
> >>
> >>> If we relax this policy, I think the representor solution should be
> >>> a real port, not only "for the purposes of control and monitoring".
> >>> It has been asked several times as replies to this series,
> >>> but it is kindly ignored, saying it will be thought later.
> >>>
> >>
> >> I think we have stated in multiple discussions, especially during the
> >> userspace presentation back in September that this solution supports
> >> data path on the representors PMDs, and we have used the
> >> infrastructure proposed here to do exactly what you are asking. As the
> >> representor infrastructure doesn't preclude the support of a data
> >> path, we have used it as it is presented here to implement a data path
> >> for exception path packets for a prototype vswitch offload implementation.
> >>
> >>
> >>> I don't see a general agreement on this series so far.
> >>>
> >>
> >> I think the main issue of contention is that there is a
> >> misunderstanding that this implementation only supports control plane
> >> management and monitoring, but that is not the case and it can be used
> >> for full data path representors, with limited or no control plane
> >> functionality if required, at the end of the day the only limitations
> >> are based on what is implemented by the backend base driver were the
> >> broker is running for the representor ports.
> > 
> > The misunderstanding may originates from what you describe (even in v5):
> > "ports for the purposes of control and monitoring"
> > 
> noted, but that is the scope of what we demostrate in the patchset, but 
> we'll update the introduction to reflect the fact that they can be used 
> to also support data path functions, such as exception path traffic for 
> hw switch.
> 
> > I think everybody agree to have VF representors in DPDK.
> > But there are few things which are not generic enough,
> > and not easy to use.
> > I hoped the discussion started at Dublin would continue
> > on the mailing list but I realize the joint effort with other vendors
> > did not happen.
> > I will elaborate quickly below and more detailed in later review.
> > 
> > 1/ In order to keep track of the relations between PF, VF and
> > representor - which are all ethdev - you create a struct outside
> > of ethdev. I think it should be managed inside ethdev API.
> 
> Initially we had implemented the representor functionality within the 
> context of the ethdev library but ran into a number of scenarios where 
> this didn't work well as it makes the assumption that the base device 
> that the representors are attached to is always an ethdev, we ran into 
> cases were the PF isn't necessarily an ethdev, for example in some 
> smartNICs the PF would be better represented by a switchdev, or it is 
> possible that the device hosting the representor broker could just 
> provide a conduit to a kernel driver.

The base device may be something else than an ethdev PF,
so it may be represented as a rte_device.
But the VF and representor are still ethdev.
I still think the relationship should be described in ethdev.

> > As suggested by others, we could also think whether a switchdev API
> > is interesting or not.
> 
> Indeed if a switchdev is something that is required by the community it 
> would make sense that the representor infrastructure was initialized 
> within the switchdev and not an ethdev. The advantage of keeping the 
> representor infrastructure independent is that it gives the flexibility 
> for representors to be supported independently of device type they are 
> attached to.

switchdev is just another device class.
We must abstract the base device as a basic EAL rte_device for now.

> > 2/ You create a new library for ethdev device management.
> > It is the responsibility of the bus to scan and probe devices.
> > In Intel case, the representor has no real bus so it must rely on
> > the vdev bus. Note that the new custom scan hook may help.
> 
> This isn't the case in latest versions of the patchset, the bus the 
> representors are dependent on is that of the base device, so for the 
> i40e it's the PF PCI device.

I'm suggesting to use vdev as bus of i40e representor,
because there is no PCI identifier for them, right?

> > In Mellanox case, the representor already exists in Linux, and is based
> > on PCI bus.
> > Then, as any other port, it must be managed with whitelist or blacklist.
> 
> I think the suggestion by Yuanhan of using the device whitelist command 
> option makes sense as a option from the commandline, but it would 
> require the newly propose implementation which allows specification of 
> both the bus and device as not all devices are PCI, which have multiple 
> host ports using SR-IOV, but there are cases when an dynamic 
> creation/destruction of ports may also need to be supported, which is 
> what the representor APIs support.

The full proposed syntax of device identification is:
	http://dpdk.org/ml/archives/dev/2017-December/084572.html
With this generic syntax, we can describe port representors and request
their initialization via whitelisting.

[...]
> > 3/ You are using PCI address + index to identify the representor.
> > It is a no-go. We have made effort to abstract buses.
> > As an idea, the identification of a representor could use the new
> > proposed flexible device syntax.
> 
> We are currently using net_representor_%bus%_%device_id%_%vport_id% to 
> identify each representor device but I have no issue changing to either 
> the current convention which would be net_representor_%unique_id% or if 
> I understand the proposal in the RFC "ether: standardize getting the 
> port by name" we would be using something like,
> we should be looking at something along the lines of 
> net_%bus%_%device_id%_%port_id% which is pretty close to what we are 
> using now.

It is introducing yet another syntax.
It would be better to align every device identification usages
on a common and standard syntax.
Then the syntax parsing will be done in bus and libs (e.g. ethdev)
with some helper functions which can be re-used for every usages.
The latest version of representors is using direct PCI parsing instead
of relying on some bus or ethdev helpers.

> In terms of that RFC I'm not clear on if the proposal is just to affect 
> the API for getting a port by name, or actually the name name assigned 
> to the device itself.

It is not about the name. It is a proposal of syntax to describe a
resource of a hardware device, or a virtual device.
The most obvious usage is to replace -w/-b and --vdev.
It can also be used in OVS or any other DPDK configuration.

> > 4/ Such new API must be experimental.
> 
> We will address this in the next revision
> 
> > I propose to better think the representor API inside ethdev
> > with a good multi-vendor collaboration,
> > and submit a deprecation notice for 18.05 integration.
> 
> I would really like to see this included as experimental in 18.02 
> release, if it is agreed by the community that we need to re-integate 
> the representor concept into librte_ethdev during for 18.05 we will 
> support that work.

I don't see the benefit of introducing a lib and a syntax which would
be replaced in the next release.


More information about the dev mailing list