[dpdk-dev] [PATCH v6 1/6] ethdev: add devop to check removal status

Matan Azrad matan at mellanox.com
Thu Jan 18 19:02:28 CET 2018


Hi Ferruh

From: Ferruh Yigit, Thursday, January 18, 2018 7:18 PM
> On 1/18/2018 11:27 AM, Matan Azrad wrote:
> > There is time between the physical removal of the device until PMDs
> > get a RMV interrupt. At this time DPDK PMDs and applications still
> > don't know about the removal.
> >
> > Current removal detection is achieved only by registration to device
> > RMV event and the notification comes asynchronously. So, there is no
> > option to detect a device removal synchronously.
> > Applications and other DPDK entities may want to check a device
> > removal synchronously and to take an immediate decision accordingly.
> 
> So we will have two methods to detect device removal, one is asynchronous
> as you mentioned.
> Device removal will cause an interrupt which trigger to run user callback.

Yes.

> New method is synchronous, but still triggered from application. I mean
> application should do a rte_eth_dev_is_removed() to learn about status,
> what is the use case here, polling continuously? Won't this also cause some
> latency unless you dedicate a core just polling device status?
> 

It is for application and for other DPDK entities like PMDs, see fail-safe example in this series.
When hotplug in the game I think it can be used for application too.

> > Add new dev op called is_removed to allow DPDK entities to check an
> > Ethernet device removal status immediately.
> >
> > Signed-off-by: Matan Azrad <matan at mellanox.com>
> > Acked-by: Thomas Monjalon <thomas at monjalon.net>
> > ---
> >  lib/librte_ether/rte_ethdev.c           | 28 +++++++++++++++++++++++++---
> >  lib/librte_ether/rte_ethdev.h           | 20 ++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev_version.map |  1 +
> >  3 files changed, 46 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index b349599..c93cec1 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -114,7 +114,8 @@ enum {
> >  rte_eth_find_next(uint16_t port_id)
> >  {
> >  	while (port_id < RTE_MAX_ETHPORTS &&
> > -	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED)
> > +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> > +	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
> 
> If device is removed, why we are not allowed to re-use port_id assigned to
> it?
Sorry, don't understand.
We allow still to iterate over it here.

> Overall I am not clear with RTE_ETH_DEV_REMOVED state, why we are not
> directly setting RTE_ETH_DEV_UNUSED?
 
Someone should release the SW port resources before setting it to UNUSED.

> And state RTE_ETH_DEV_REMOVED set in ethdev layer, and ethdev layer
> won't let reusing it, so what changes the state of dev? Will it stay as it is
> during lifetime of the application?
> 
> >  		port_id++;
> >
> >  	if (port_id >= RTE_MAX_ETHPORTS)
> > @@ -262,8 +263,7 @@ struct rte_eth_dev *
> > rte_eth_dev_is_valid_port(uint16_t port_id)  {
> >  	if (port_id >= RTE_MAX_ETHPORTS ||
> > -	    (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
> > -	     rte_eth_devices[port_id].state != RTE_ETH_DEV_DEFERRED))
> > +	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
> >  		return 0;
> >  	else
> >  		return 1;
> > @@ -1094,6 +1094,28 @@ struct rte_eth_dev *  }
> >
> >  int
> > +rte_eth_dev_is_removed(uint16_t port_id) {
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);
> > +
> > +	if (dev->state == RTE_ETH_DEV_REMOVED)
> > +		return 1;
> 
> Isn't this conflict with below API documentation:
> 

Yes, You absolutely right, we need to change this documentation.

> "
>  * @return
>  *   - 0 when the Ethernet device is removed, otherwise 1.
> "
> 
> > +
> > +	ret = dev->dev_ops->is_removed(dev);
> > +	if (ret != 0)
> > +		dev->state = RTE_ETH_DEV_REMOVED;
> 
> It isn't clear what "dev_ops->is_removed(dev)" should return, and this
> causing incompatible usages in PMDs by time.
> Please add some documentation about expected return values for dev_ops.
>

OK
 
> 
> And this not real remove, PMD signals us and we stop using that device, but
> device can be there, right?

It says that the device is physically removed but there is some software resources which still were not released.

> If there is a real removal, can be possible to use eal hotplug?

I think EAL hotplug is asynchrony  as the current RMV event , so EAl hotplug event can be used instead of RMV event.
 


> <...>


More information about the dev mailing list