[dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add API's for VF management

Richardson, Bruce bruce.richardson at intel.com
Wed Sep 28 15:01:03 CEST 2016



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Wednesday, September 28, 2016 12:24 PM
> To: Iremonger, Bernard <bernard.iremonger at intel.com>; Richardson, Bruce
> <bruce.richardson at intel.com>
> Cc: Thomas Monjalon <thomas.monjalon at 6wind.com>; dev at dpdk.org; Jerin Jacob
> <jerin.jacob at caviumnetworks.com>; Shah, Rahul R <rahul.r.shah at intel.com>;
> Lu, Wenzhuo <wenzhuo.lu at intel.com>; azelezniak <alexz at att.com>
> Subject: RE: [dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add API's for VF
> management
> 
> Hi lads,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Iremonger,
> > Bernard
> > Sent: Tuesday, September 27, 2016 3:13 PM
> > To: Richardson, Bruce <bruce.richardson at intel.com>
> > Cc: Thomas Monjalon <thomas.monjalon at 6wind.com>; dev at dpdk.org; Jerin
> > Jacob <jerin.jacob at caviumnetworks.com>; Shah, Rahul R
> > <rahul.r.shah at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>;
> > azelezniak <alexz at att.com>
> > Subject: Re: [dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add API's for
> > VF management
> >
> > Hi Bruce,
> >
> > <snip>
> >
> > > > > > > Subject: Re: [dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add
> > > > > > > API's for VF management
> > > > > > >
> > > > > > > 2016-09-23 17:02, Iremonger, Bernard:
> > > > > > > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > > > > > > 2016-09-23 09:53, Richardson, Bruce:
> > > > > > > > > > From: Thomas Monjalon
> > > > > > > > > > [mailto:thomas.monjalon at 6wind.com]
> > > > > > > > > > > 2016-09-23 10:20, Bruce Richardson:
> > > > > > > > > > > > On Thu, Sep 22, 2016 at 07:04:37PM +0200, Thomas
> > > > > > > > > > > > Monjalon
> > > > > wrote:
> > > > > > > > > > > > > 2016-09-15 16:46, Iremonger, Bernard:
> > > > > > > > > > > > > > > > > Do we really need to expose VF specific
> > > > > > > > > > > > > > > > > functions
> > > > > here?
> > > > > > > > > > > > > > > > > It can be generic(PF/VF) function
> > > > > > > > > > > > > > > > > indexed only through
> > > > > > > > > > > port_id.
> > > > > > > > > > > > > > > > > (example: as
> > > > > > > > > > > > > > > > > rte_eth_dev_set_vlan_anti_spoof(uint8_t
> > > > > > > > > > > > > > > > > port_id, uint8_t on)) For instance, In
> > > > > > > > > > > > > > > > > Thunderx PMD, We are not exposing a
> > > > > > > > > > > > > > > > > separate port_id for PF. We only
> > > > > > > > > > > > > > > > > enumerate 0..N VFs as 0..N ethdev
> > > > > > > > > > > > > > > > > port_id
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Our intention with this patch is to
> > > > > > > > > > > > > > > > control the VF from the
> > > > > PF.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The following librte_ether functions
> > > > > > > > > > > > > > > > already work in a similar
> > > > > > > > > > > way:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > rte_eth_dev_set_vf_rxmode(uint8_t port_id,
> > > > > > > > > > > > > > > > uint16_t vf, uint16_t rx_mode, uint8_t on)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > rte_eth_dev_set_vf_rx(uint8_t port_id,
> > > > > > > > > > > > > > > > uint16_t vf, uint8_t
> > > > > > > > > > > > > > > > on)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > rte_eth_dev_set_vf_tx(uint8_t port_id,
> > > > > > > > > > > > > > > > uint16_t vf, uint8_t
> > > > > > > > > > > > > > > > on)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > int rte_eth_set_vf_rate_limit(uint8_t
> > > > > > > > > > > > > > > > port_id, uint16_t vf, uint16_t tx_rate,
> > > > > > > > > > > > > > > > uint64_t q_msk)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I have a bad feeling with these functions
> > > > > > > > > > > > > > > dedicated to VF from
> > > > > > > PF.
> > > > > > > > > > > > > > > Are we sure there is no other way?
> > > > > > > > > > > > > > > I mean we just need to know the VF with a port
> ID.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > When the VF is used in a VM the port ID of the
> > > > > > > > > > > > > > VF is not visible to
> > > > > > > > > > > the PF.
> > > > > > > > > > > > > > I don't think there is another way to do this.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I don't understand why we could not assign a
> > > > > > > > > > > > > port id to the VF from the host instead of
> > > > > > > > > > > > > having the couple PF port id /
> > > > > VF id.
> > > > > > > > > > > > > Can we enumerate all the VFs associated to a PF?
> > > > > > > > > > > > > Then can we allocate them a port id in the array
> > > > > rte_eth_devices?
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Thomas,
> > > > > > > > > > > >
> > > > > > > > > > > > The VF is not a port visible to DPDK, though, so
> > > > > > > > > > > > it shouldn't have a port id IMHO. DPDK can't
> > > > > > > > > > > > actually do anything
> > > > > with it.
> > > > > > > > > > >
> > > > > > > > > > > You say the contrary below.
> > > > > > > > > >
> > > > > > > > > > Well, yes and no. The driver can manipulate things for
> > > > > > > > > > the VF, but DPDK
> > > > > > > > > doesn't actually have a device that corresponds to the VF.
> > > > > > > > > There are no PCI bar mappings for it, DPDK can't do RX
> > > > > > > > > and TX with
> > > it etc.?
> > > > > > > > >
> > > > > > > > > Very good point.
> > > > > > > > > There are only few ethdev functions which are supported
> > > > > > > > > by every drivers, like Rx/Tx and would not be available
> > > > > > > > > for VF from PF
> > > > > interface.
> > > > > > > > >
> > > > > > > > > > > > The PCI device for the VF is likely passed through
> > > > > > > > > > > > to a different VM and being used there.
> > > > > > > > > > > > Unfortunately, the VF still needs certain things
> > > > > > > > > > > > done for it by the PF, so if the PF is under DPDK
> > > > > > > > > > > > control, it needs to provide the functionality to
> > > > > > > > > > > > assist
> > > > > > > the VF.
> > > > > > > > > > >
> > > > > > > > > > > Why not have a VF_from_PF driver which does the
> > > > > > > > > > > mailbox
> > > > > things?
> > > > > > > > > > > So you can manage the VF from the PF with a simple
> port id.
> > > > > > > > > > > It really seems to be the cleanest design to me.
> > > > > > > > > >
> > > > > > > > > > While I see your point, and it could work, I just want
> > > > > > > > > > to be sure that we are
> > > > > > > > > ok with the results of that. Suppose we do create
> > > > > > > > > ethdevs for the VFs controlled by the PF. Does the new
> > > > > > > > > VF get counted in the
> > > > > > > > > rte_eth_dev_count() value (I assume yes)? How are apps
> > > > > > > > > meant to use the port? Do they have to put in a special
> > > > > > > > > case when iterating through all the port ids to check
> > > > > > > > > that it's not a pseudo port that can't do anything. None
> > > > > > > > > of the standard ethdev calls from an app will work on
> > > > > > > > > it, you can't configure nb rx/tx queues on it, you can't
> > > > > > > > > start or
> > > > > > > stop it, you can't do rx or tx on it, etc, etc.
> > > > > > > > >
> > > > > > > > > Yes these devices would be special because their
> > > > > > > > > supported API would be quite different. I was thinking
> > > > > > > > > that in the future you could add most of the
> > > > > > > > > configuration functions through the VF
> > > > > mailbox.
> > > > > > > > > But the Intel mailbox currently support only some
> > > > > > > > > special configurations which are not supported by other
> > > > > > > > > devices even its own VF device (except setting MAC
> address).
> > > > > > > > > And when I read "set drop enable bit in the VF split rx
> > > > > > > > > control register", it becomes clear it is really
> > > > > > > > > specific and has nothing to do in the generic ethdev API.
> > > > > > > > > That's why it is a NACK.
> > > > > > > > >
> > > > > > > > > When we want to use these very specific features we are
> > > > > > > > > aware of the underlying device and driver. So we can
> > > > > > > > > directly include a header from the driver. I suggest to
> > > > > > > > > retrieve a handler for the device which is not a port id
> > > > > > > > > and will allow to call ixgbe functions
> > > > > directly.
> > > > > > > > > It could be achieved by adding an ethdev function like
> > > > > > > > > discussed
> > > here:
> > > > > > > > > 	http://dpdk.org/ml/archives/dev/2016-
> > > September/047392.html
> > > > > > > > >
> > > > > > > >
> > > > > > > > I have been reading the net/vhost mail thread above. The
> > > > > > > > following quote
> > > > > > > is from this thread.
> > > > > > > >
> > > > > > > > "It means I would be in favor of introducing API in
> > > > > > > > drivers for very specific
> > > > > > > features."
> > > > > > > >
> > > > > > > > At present all the PMD functions are accessed through the
> > > > > > > > eth_dev_ops
> > > > > > > structure, there are no PMD API's.
> > > > > > > >
> > > > > > > > Is your proposal to add API(s) to the DPDK ixgbe PMD
> > > > > > > > (similar to a driver
> > > > > > > ioctl API) which can be accessed through a generic API in the
> ethdev?
> > > > > > >
> > > > > > > Not exactly. I'm thinking about a PMD specific API.
> > > > > > > The only ethdev API you need would be a function to retrieve
> > > > > > > a handler (an opaque pointer on the device struct) from the
> port id.
> > > > > > > Then you can include rte_ixgbe.h and directly call the
> > > > > > > specific ixgbe function, passing the device handler.
> > > > > > > How does it sound?
> > > > > >
> > > > > > I have been prototyping this proposed solution, it appears to
> work.
> > > > > >
> > > > > > I have added the following function:
> > > > > >
> > > > > > int  rte_eth_dev_get_pmd_handle(uint8_t port_id, void**
> > > > > > pmd_handle);
> > > > > >
> > > > > > The pmd_handle is a pointer to a dev_ops structure containing
> > > > > > driver
> > > > > specific functions.
> > > > > >
> > > > > > Using the pmd_handle the driver specific functions can be
> > > > > > called (without having them in struct eth_dev_ops)
> > > > > >
> > > > > > Has this proposal been superseded by the discussion on the
> > > > > > following
> > > > > patch?
> > > > > >
> > > > > > [PATCH] net/vhost: Add function to retreive the 'vid' for a
> > > > > > given port id
> > > > >
> > > > > Maybe, it can be superseded by this discussion, yes.
> > > > > Bruce thinks we do not need rte_eth_dev_get_pmd_handle().
> > > > > What is your opinion about using port_id directly and retrieving
> > > > > the structs from the driver via rte_eth_devices?
> > > >
> > > > Looking at the code in rte_eth_devices[]
> > > >
> > > > struct rte_eth_dev  rte_eth_devices[RTE_MAX_ETHPORTS];
> > > >
> > > > struct rte_eth_dev {
> > > >
> > > > ...
> > > >
> > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD
> > > > */
> > > >
> > > > ...
> > > >
> > > >  void *pmd_ops;  /** < exported PMD specific functions */
> > > >
> > > > }
> > > >
> > > > The PMD functions are only accessible at present if they are in
> > > > struct
> > > eth_dev_ops.
> > > >
> > > > Adding a pmd_ops field to struct rte_eth_dev {} makes the PMD
> > > > functions
> > > accessible and is a simpler solution than using
> > > rte_eth_dev_get_pmd_handle() to get access to the PMD functions.
> > > >
> > > > Regards,
> > > >
> > >
> > > Why would an ops structure be needed? If it's a private API for a
> > > driver, there should be no need for function pointers, and instead
> > > the driver can define regular functions in it's header file, no?
> > >
> > > /Bruce
> >
> > The driver functions were static, I have made them public and added
> > them to the rte_pmd_ixgbe.h file, and it works.  These functions will
> also need to be added to the rte_pmd_ixgbe_version.map file, previously
> there were no public functions.
> 
> Sorry for being late in the discussion, but I have a question:
> If we  this way (force user to include driver specific headers and call
> driver specific functions), how you guys plan to make this functionality
> available for multiple driver types.
> From discussion with Bernard  understand that customers would need similar
> functionality for i40e.
> Does it mean that they'll have to re-implement this part of their code
> again?
> Or would have to create (and maintain) their own shim layer that would
> provide some s of abstraction?
> Basically their own version of rte_ethdev?
> 
> Konstantin

Yes, it's a problem. However, the concern right now is that this functionality is not generic across NICs, and will only be implemented for ixgbe and i40e. If it turns out to be implemented for a significant subset of DPDK PMDs, e.g. 4 or 5 drivers from at least two vendors, then I would expect the case to be made to "promote" this functionality to ethdev.

Does this sound reasonable. Any other thoughts?

/Bruce


More information about the dev mailing list