[dpdk-dev] [PATCH v3 2/5] Link Bonding PMD Library (librte_eal/librte_ether link bonding support changes)

Neil Horman nhorman at tuxdriver.com
Fri Jun 13 21:38:15 CEST 2014


On Fri, Jun 13, 2014 at 06:34:04PM +0000, Doherty, Declan wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Friday, June 13, 2014 5:08 PM
> > To: Doherty, Declan
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 2/5] Link Bonding PMD Library
> > (librte_eal/librte_ether link bonding support changes)
> > 
> > On Fri, Jun 13, 2014 at 03:41:59PM +0100, Declan Doherty wrote:
> > > Updating functionality in EAL to support adding link bonding
> > > devices via –vdev option. Link bonding devices will be
> > > initialized after all physical devices have been probed and
> > > initialized.
> > >
> > > Signed-off-by: Declan Doherty <declan.doherty at intel.com>
> > > ---
> > >  lib/librte_eal/common/eal_common_dev.c      | 66
> > +++++++++++++++++++++++++++--
> > >  lib/librte_eal/common/eal_common_pci.c      |  6 +++
> > >  lib/librte_eal/common/include/eal_private.h |  7 +++
> > >  lib/librte_eal/common/include/rte_dev.h     |  1 +
> > >  lib/librte_ether/rte_ethdev.c               | 34 +++++++++++++--
> > >  lib/librte_ether/rte_ethdev.h               |  7 ++-
> > >  lib/librte_pmd_pcap/rte_eth_pcap.c          | 22 +++++-----
> > >  lib/librte_pmd_ring/rte_eth_ring.c          | 32 +++++++-------
> > >  lib/librte_pmd_ring/rte_eth_ring.h          |  3 +-
> > >  lib/librte_pmd_xenvirt/rte_eth_xenvirt.c    |  2 +-
> > >  10 files changed, 144 insertions(+), 36 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_dev.c
> > b/lib/librte_eal/common/eal_common_dev.c
> > > index eae5656..b50c908 100644
> > > --- a/lib/librte_eal/common/eal_common_dev.c
> > > +++ b/lib/librte_eal/common/eal_common_dev.c
> > > @@ -75,14 +75,28 @@ rte_eal_dev_init(void)
> > >
> > >  	/* call the init function for each virtual device */
> > >  	TAILQ_FOREACH(devargs, &devargs_list, next) {
> > > +		uint8_t bdev = 0;
> > >
> > >  		if (devargs->type != RTE_DEVTYPE_VIRTUAL)
> > >  			continue;
> > >
> > >  		TAILQ_FOREACH(driver, &dev_driver_list, next) {
> > > -			if (driver->type != PMD_VDEV)
> > > +			/* RTE_DEVTYPE_VIRTUAL can only be a virtual or bonded
> > device*/
> > > +			if (driver->type != PMD_VDEV && driver->type !=
> > PMD_BDEV)
> > >  				continue;
> > >
> > > +			/*
> > > +			 * Bonded devices are not initialize here, we do it later in
> > > +			 * rte_eal_bonded_dev_init() after all physical devices
> > have been
> > > +			 * probed and initialized
> > > +			 */
> > > +			if (driver->type == PMD_BDEV &&
> > > +					!strncmp(driver->name, devargs-
> > >virtual.drv_name,
> > > +							strlen(driver->name))) {
> > > +				bdev = 1;
> > > +				break;
> > > +			}
> > > +
> > I really don't think you need to add a new device type for bonded devs.  Its got
> > no specific hardware that it drives, and you configure it with a --vdev command,
> > so treat it as one here.  I understand that you need to pass additional
> > information about slaves to a bonded device, which is fine, but you can do that
> > with kvargs pretty easily, at which point its just another vdev. The only other
> > requirement is that you initilize the bonded vdev after the slave vdevs have
> > been created, which you can do by any of several methods (a priority field to
> > indicate that bonded drivers should be initilized last/later, a deferral return
> > code from the init routine, or by dead reckoning via the careful construction of
> > the application command line (placed the bonded --vdev option last on the
> > command line argument list at run time).
> > 
> It was my initial intent to do as you have describe above, but the physical devices
> cause a real issue here, physical devices don't call through to rte_eth_dev_allocate until 
> during rte_eal_pci_probe call, so it's not possible to initialize the bonded device from
> within rte_eal_dev_init as the physical devices have not been fully initialized at this
> point, as a port_id has not been allocated and can't be added as bonding slaves. I don't 
> see away around this without changing the EAL API, which I've tried to avoid with this solution.
> 
> Ordering isn't an issue, and can easily be solved if the above problem didn't exist, and although
> a new device type isn't technically required, I think it's a cleaner solution than doing
> string comparisons.
> 


Ugh, you're right. That stinks.  We still shouldn't just drop the init code in
with the pci probe code though, especially if dpdk starts supporting more
stacked network devices. A notification system for interface creation that you
could monitor for pci_device addresses (B:D:F tupples) would be nice, but might
be overkill since dpdk doesn't really support device freeing yet.  what might be
useful is a generic ordering mechanism for rte_eal_dev_init.  If you added a
priority field to the rte_driver structure, you could create an enumeration for
the field to define when the init routine for that driver was called something
like:
typdef enum {
	PMD_INIT_FIRST = 0,
	PMD_INIT_POST_PROBE = 1,
}

Then you could make multiple calls to rte_eal_dev_init from rte_eal_init at the
appropriate places in the initalization process, each time passing one of the
above values.  rte_eal_dev_init would only initalize those drivers whos priority
matched the passed in priority.  That would be helpful for interfaces like
vlans, vxlans, tunnels, etc.


> > >
> > > +static int
> > > +rte_eth_dev_name_unique(const char* name)
> > > +{
> > > +	unsigned i;
> > > +
> > > +	for (i = 0; i < nb_ports; i++) {
> > > +		if (strcmp(rte_eth_devices[i].data->name, name) == 0)
> > > +			return -1;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  struct rte_eth_dev *
> > > -rte_eth_dev_allocate(void)
> > > +rte_eth_dev_allocate(const char* name)
> > >  {
> > >  	struct rte_eth_dev *eth_dev;
> > >
> > > @@ -165,23 +179,37 @@ rte_eth_dev_allocate(void)
> > >  	if (rte_eth_dev_data == NULL)
> > >  		rte_eth_dev_data_alloc();
> > >
> > > +	if (rte_eth_dev_name_unique(name)) {
> > > +		PMD_DEBUG_TRACE("Ethernet Device with name %s already
> > allocated!\n");
> > > +		return NULL;
> > > +	}
> > > +
> > This seems fairly racy if you allow dynamic device creation at run time from the
> > application, if multiple threads attempt to create bonds in parallel.
> 
> True but if this is an issue then there probably should be some locking around this allocation anyway.
Agreed the allocation of the port id is also racy.

> > 
> > >  /**
> > > diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c
> > b/lib/librte_pmd_pcap/rte_eth_pcap.c
> > Hmm, we're modifying other pmds for the naming feature, I think it would be best
> > split out into a separate patch.  Something entitled "support unique interface
> > naming for virtual pmds" or something.
> 
> True, but if I do that, I will need to break support for virtual devices in this patch set, and
> I would prefer to keep this functionality working initial release of link bonding. Also another
> patchset will be required at some point in the near future to tidy up/rationalize the current 
> handling of virtual devices identification and these changes do not change functionally how
> these devices work.

I'm not sure I'm following what you're saying here.  If you split this into
another patch set, and order earlier in this series, then you won't break
anything.  I'm not suggesting a functional change here, just a organizational
one, so that this change doesn't do more than the changelog indicates.

Neil



More information about the dev mailing list