[dpdk-dev] [PATCH 2/5] ethdev: add port ownership

Ananyev, Konstantin konstantin.ananyev at intel.com
Sun Dec 3 12:10:14 CET 2017



Hi Matan,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matan Azrad
> Sent: Sunday, December 3, 2017 8:05 AM
> To: Neil Horman <nhorman at tuxdriver.com>; Gaëtan Rivet <gaetan.rivet at 6wind.com>
> Cc: Thomas Monjalon <thomas at monjalon.net>; Wu, Jingjing <jingjing.wu at intel.com>; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Friday, December 1, 2017 2:10 PM
> > To: Gaëtan Rivet <gaetan.rivet at 6wind.com>
> > Cc: Matan Azrad <matan at mellanox.com>; Thomas Monjalon
> > <thomas at monjalon.net>; Jingjing Wu <jingjing.wu at intel.com>;
> > dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > Hello Matan, Neil,
> > >
> > > I like the port ownership concept. I think it is needed to clarify
> > > some operations and should be useful to several subsystems.
> > >
> > > This patch could certainly be sub-divided however, and your current
> > > 1/5 should probably come after this one.
> > >
> > > Some comments inline.
> > >
> > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better from the next reasons:
> > > > > 1. It may be convenient for multi-process applications to know which
> > > > >    process is in charge of a port.
> > > > > 2. A library could work on top of a port.
> > > > > 3. A port can work on top of another port.
> > > > >
> > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > We need to check that the user is not trying to use a port which
> > > > > is already managed by fail-safe.
> > > > >
> > > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > > management of a device by different DPDK entities.
> > > > >
> > > > > A port owner is built from owner id(number) and owner name(string)
> > > > > while the owner id must be unique to distinguish between two
> > > > > identical entity instances and the owner name can be any name.
> > > > > The name helps to logically recognize the owner by different DPDK
> > > > > entities and allows easy debug.
> > > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to owns valid ethdev ports.
> > > > > Each DPDK entity can get any port owner status to decide if it can
> > > > > manage the port or not.
> > > > >
> > > > > The current ethdev internal port management is not affected by
> > > > > this feature.
> > > > >
> > >
> > > The internal port management is not affected, but the external
> > > interface is, however. In order to respect port ownership,
> > > applications are forced to modify their port iterator, as shown by your
> > testpmd patch.
> > >
> > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > > would represent the application itself (probably with the ID 0 and an
> > > owner string ""). Only with specific additional configuration should
> > > this default subset of ethdev be divided.
> > >
> > > This would make this evolution seamless for applications, at no cost
> > > to the complexity of the design.
> > >
> > > > > Signed-off-by: Matan Azrad <matan at mellanox.com>
> > > >
> > > >
> > > > This seems fairly racy.  What if one thread attempts to set
> > > > ownership on a port, while another is checking it on another cpu in
> > > > parallel.  It doesn't seem like it will protect against that at all.
> > > > It also doesn't protect against the possibility of multiple threads
> > > > attempting to poll for rx in parallel, which I think was part of
> > > > Thomas's origional statement regarding port ownership (he noted that
> > > > the lockless design implied only a single thread should be allowed to poll
> > for receive or make configuration changes at a time.
> > > >
> > > > Neil
> > > >
> > >
> > > Isn't this race already there for any configuration operation /
> > > polling function? The DPDK arch is expecting applications to solve it.
> > > Why should port ownership be designed differently from other DPDK
> > components?
> > >
> > Yes, but that doesn't mean it should exist in purpituity, nor does it mean that
> > your new api should contain it as well.
> >
> > > Embedding checks for port ownership within operations will force
> > > everyone to bear their costs, even those not interested in using this
> > > API. These checks should be kept outside, within the entity claiming
> > > ownership of the port, in the form of using the proper port iterator
> > > IMO.
> > >
> > No.  At the very least, you need to make the API itself exclusive.  That is to
> > say, you should at least ensure that a port ownership get call doesn't race
> > with a port ownership set call.  It seems rediculous to just leave that sort of
> > locking as an exercize to the user.
> >
> > Neil
> >
> Neil,
> As Thomas mentioned, a DPDK port is designed to be managed by only one
> thread (or synchronized DPDK entity).
> So all the port management includes port ownership shouldn't be synchronized,
> i.e. locks are not needed.
> If some application want to do dual thread port management, the responsibility
> to synchronize the port ownership or any other port management is on this
> application.
> Port ownership doesn't come to allow synchronized management of the port by
> two DPDK entities in parallel, it is just nice way to answer next questions:
> 	1. Is the port already owned by some DPDK entity(in early control path)?
> 	2. If yes, Who is the owner?
> If the answer to the first question is no, the current entity can take the ownership
> without any lock(1 thread).
> If the answer to the first question is yes, you can recognize the owner and take
> decisions accordingly, sometimes you can decide to use the port because you
> logically know what the current owner does and you can be logically synchronized
> with it, sometimes you can just leave this port because you have not any deal with
>  this owner.

If the goal is just to have an ability to recognize is that device is managed by another device
(failsafe, bonding, etc.),  then I think all we need is a pointer to rte_eth_dev_data of the owner
(NULL would mean no owner).
Also I think if we'd like to introduce that mechanism, then it needs to be
- mandatory (control API just don't allow changes to the device configuration if caller is not an owner).
- transparent to the user (no API changes).
 - set/get owner ops need to be atomic if we want this mechanism to be usable for MP.
Konstantin  





> 
> > > > > ---
> > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > ++++++++++++++++++++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > index 6a0c9f9..af639a1 100644
> > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > > > lock. This PMD feature found in som
> > > > >
> > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > capability probing details.
> > > > >
> > > > > -Device Identification and Configuration
> > > > > +Device Identification, Ownership  and Configuration
> > > > >  ---------------------------------------
> > > > >
> > > > >  Device Identification
> > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> > assigned two other identifiers:
> > > > >  *   A port name used to designate the port in console messages, for
> > administration or debugging purposes.
> > > > >      For ease of use, the port name includes the port index.
> > > > >
> > > > > +Port Ownership
> > > > > +~~~~~~~~~~~~~
> > > > > +The Ethernet devices ports can be owned by a single DPDK entity
> > (application, library, PMD, process, etc).
> > > > > +The ownership mechanism is controlled by ethdev APIs and allows to
> > set/remove/get a port owner by DPDK entities.
> > > > > +Allowing this should prevent any multiple management of Ethernet
> > port by different entities.
> > > > > +
> > > > > +.. note::
> > > > > +
> > > > > +    It is the DPDK entity responsibility either to check the port owner
> > before using it or to set the port owner to prevent others from using it.
> > > > > +
> > > > >  Device Configuration
> > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > @@ -71,6 +71,7 @@
> > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER
> > + 1;
> > > > >  static uint8_t eth_dev_last_created_port;
> > > > >
> > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > struct rte_eth_dev *
> > > > >  	if (eth_dev == NULL)
> > > > >  		return -EINVAL;
> > > > >
> > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > +rte_eth_dev_owner));
> > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > >  	return 0;
> > > > >  }
> > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > >  		return 1;
> > > > >  }
> > > > >
> > > > > +static int
> > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > +		return 0;
> > > > > +	}
> > > > > +	return 1;
> > > > > +}
> > > > > +
> > > > > +uint16_t
> > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > +owner_id) {
> > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > +		port_id++;
> > > > > +
> > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > +		return RTE_MAX_ETHPORTS;
> > > > > +
> > > > > +	return port_id;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> > ports.\n");
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> > Ethernet port owners.\n");
> > > > > +		return -EUSERS;
> > > > > +	}
> > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	int ret;
> > > > > +
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> > ports.\n");
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > +		return -EINVAL;
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    port_owner->id != owner->id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot set owner to port %d already owned by
> > %s_%05d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id);
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	ret = snprintf(port_owner->name,
> > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > +		       owner->name);
> > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > +		memset(port_owner->name, 0,
> > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +	port_owner->id = owner->id;
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > > +			    owner->name, owner->id);
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > > > +owner_id) {
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > +		return -EINVAL;
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != owner_id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot remove port %d owner %s_%05d by
> > different owner id %5d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id,
> > owner_id);
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > removed.\n", port_id,
> > > > > +			port_owner->name, port_owner->id);
> > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +void
> > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > +	uint16_t p;
> > > > > +
> > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > +		return;
> > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > +			    "%05d identifier has removed.\n", owner_id); }
> > > > > +
> > > > > +const struct rte_eth_dev_owner *
> > > > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > RTE_ETH_DEV_NO_OWNER)
> > > > > +		return NULL;
> > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > +}
> > > > > +
> > > > >  int
> > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > > index 341c2d6..f54c26d 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > >
> > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > >
> > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > +
> > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > +
> > > > > +struct rte_eth_dev_owner {
> > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> > name. */
> > > > > +};
> > > > > +
> > > > >  /**
> > > > >   * @internal
> > > > >   * The data part, with no function pointers, associated with each
> > ethernet device.
> > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > >  	int numa_node;  /**< NUMA node connection */
> > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > >  	/**< VLAN filter configuration. */
> > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > >  };
> > > > >
> > > > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82
> > > > > @@ struct rte_eth_dev_data {
> > > > >
> > > > >
> > > > >  /**
> > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > + *
> > > > > + * @param port_id
> > > > > + *   The id of the next possible valid owned port.
> > > > > + * @param	owner_id
> > > > > + *  The owner identifier.
> > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> > ports.
> > > > > + * @return
> > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > there is none.
> > > > > + */
> > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > +uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> > owner.
> > > > > + */
> > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > +
> > > > > +/**
> > > > > + * Get a new unique owner identifier.
> > > > > + * An owner identifier is used to owns Ethernet devices by only
> > > > > +one DPDK entity
> > > > > + * to avoid multiple management of device by different entities.
> > > > > + *
> > > > > + * @param	owner_id
> > > > > + *   Owner identifier pointer.
> > > > > + * @return
> > > > > + *   Negative errno value on error, 0 on success.
> > > > > + */
> > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Set an Ethernet device owner.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The identifier of the port to own.
> > > > > + * @param	owner
> > > > > + *  The owner pointer.
> > > > > + * @return
> > > > > + *  Negative errno value on error, 0 on success.
> > > > > + */
> > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > +
> > > > > +/**
> > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The identifier of port to make ownerless.
> > > > > + * @param	owner
> > > > > + *  The owner identifier.
> > > > > + * @return
> > > > > + *  0 on success, negative errno value on error.
> > > > > + */
> > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > +uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Remove owner from all Ethernet devices owned by a specific
> > owner.
> > > > > + *
> > > > > + * @param	owner
> > > > > + *  The owner identifier.
> > > > > + */
> > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Get the owner of an Ethernet device.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The port identifier.
> > > > > + * @return
> > > > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > > > + */
> > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > +uint16_t port_id);
> > > > > +
> > > > > +/**
> > > > >   * Get the total number of Ethernet devices that have been
> > successfully
> > > > >   * initialized by the matching Ethernet driver during the PCI probing
> > phase
> > > > >   * and that are available for applications to use. These devices
> > > > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > index e9681ac..7d07edb 100644
> > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > >
> > > > >  } DPDK_17.08;
> > > > >
> > > > > +DPDK_18.02 {
> > > > > +	global:
> > > > > +
> > > > > +	rte_eth_find_next_owned_by;
> > > > > +	rte_eth_dev_owner_new;
> > > > > +	rte_eth_dev_owner_set;
> > > > > +	rte_eth_dev_owner_remove;
> > > > > +	rte_eth_dev_owner_delete;
> > > > > +	rte_eth_dev_owner_get;
> > > > > +
> > > > > +} DPDK_17.11;
> > > > > +
> > > > >  EXPERIMENTAL {
> > > > >  	global:
> > > > >
> > > > > --
> > > > > 1.8.3.1
> > > > >
> > > > >
> > >
> > > --
> > > Gaëtan Rivet
> > > 6WIND
> > >


More information about the dev mailing list