[dpdk-dev] [PATCH v6 4/6] ethdev: adjust APIs removal error report

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Jan 19 18:35:45 CET 2018



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, January 19, 2018 4:19 PM
> To: Matan Azrad <matan at mellanox.com>; Adrien Mazarguil <adrien.mazarguil at 6wind.com>; Gaetan Rivet <gaetan.rivet at 6wind.com>
> Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Andrew Rybchenko <arybchenko at solarflare.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; Alejandro Lucero <alejandro.lucero at netronome.com>; Jerin Jacob
> <jerin.jacob at caviumnetworks.com>; Hemant Agrawal <hemant.agrawal at nxp.com>; Shahaf Shuler <shahafs at mellanox.com>; Adrien
> Mazarguil <adrien.mazarguil at 6wind.com>; Olivier MATZ <olivier.matz at 6wind.com>; Zhang, Helin <helin.zhang at intel.com>
> Subject: Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report
> 
> On 1/18/2018 6:10 PM, Matan Azrad wrote:
> > Hi Ferruh
> >
> > From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM
> >> On 1/18/2018 11:27 AM, Matan Azrad wrote:
> >>> rte_eth_dev_is_removed API was added to detect a device removal
> >>> synchronously.
> >>>
> >>> When a device removal occurs during control command execution, many
> >>> different errors can be reported to the user.
> >>>
> >>> Adjust all ethdev APIs error reports to return -EIO in case of device
> >>> removal using rte_eth_dev_is_removed API.
> >>>
> >>> Signed-off-by: Matan Azrad <matan at mellanox.com>
> >>> Acked-by: Thomas Monjalon <thomas at monjalon.net>
> >>> ---
> >>>  lib/librte_ether/rte_ethdev.c | 192
> >>> +++++++++++++++++++++++++++---------------
> >>>  lib/librte_ether/rte_ethdev.h |  51 ++++++++++-
> >>>  2 files changed, 170 insertions(+), 73 deletions(-)
> >>>
> >>> diff --git a/lib/librte_ether/rte_ethdev.c
> >>> b/lib/librte_ether/rte_ethdev.c index c93cec1..7044159 100644
> >>> --- a/lib/librte_ether/rte_ethdev.c
> >>> +++ b/lib/librte_ether/rte_ethdev.c
> >>> @@ -338,6 +338,16 @@ struct rte_eth_dev *
> >>>  	return -ENODEV;
> >>>  }
> >>>
> >>> +static int
> >>> +eth_err(uint16_t port_id, int ret)
> >>> +{
> >>> +	if (ret == 0)
> >>> +		return 0;
> >>> +	if (rte_eth_dev_is_removed(port_id))
> >>> +		return -EIO;
> >>> +	return ret;
> >>> +}
> >>> +
> >>>  /* attach the new device, then store port_id of the device */  int
> >>> rte_eth_dev_attach(const char *devargs, uint16_t *port_id) @@ -492,7
> >>> +502,8 @@ struct rte_eth_dev *
> >>>  		return 0;
> >>>  	}
> >>>
> >>> -	return dev->dev_ops->rx_queue_start(dev, rx_queue_id);
> >>> +	return eth_err(port_id, dev->dev_ops->rx_queue_start(dev,
> >>> +							     rx_queue_id));
> >>>
> >>>  }
> >>
> >> This patch updates *all* ethdev public APIs to add if device is removed
> >> check?
> >
> > Yes.
> >
> >> And each check goes to ethdev is_removed() dev_ops to ask if dev is
> >> removed.
> > Probably, if the REMOVED state setted in will not call device is_remove.
> >
> >> These must be better way of doing this, am I missing something.
> >
> > Suggest.
> 
> With a silly analogy, this is like a blind person asking each time if he is dead
> before talking to other person.

I am agree with Ferruh that it looks a bit clumsy...
Though I don't have any bright ideas here too.
As I can see right now is_removed() is implemented only for mlx PMD.
Would I make sense to hide that check inside mlx implementation then?
BTW:


 int
+rte_eth_dev_is_removed(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0);

I'd says these 2 checks have to be swapped.
Konstantin

+
+	if (dev->state == RTE_ETH_DEV_REMOVED)
+		return 1;
+


> 
> At first glance I can think of a kind of watchdog timer can be implemented in
> ethdev layer. It provides periodic checks and if device is dead it calls the
> registered user callback function.
> 
> This method presented as synchronous method but not triggered from side where
> event happens, I mean not triggered from PMD but from application.
> So does application doing polling continuously if device is dead?
> Or if application is relying this patch to add a check in each API, what happens
> if device removed during data processing, will app rely on asynchronous method?
> 
> I am including a few consumers of the ethdev to the mail thread, clearly I am
> not very supportive of this patch, but specially taking release is being close
> to the account, if there is no objection than me I will take as consensus to get
> the patch in.
> 
> >
> > This code will replace similar code in each PMD.
> >
> >> I definitely would like to see more comments for this patch.
> >>
> >> Another question is what happens if device removed while or before
> >> dev_ops called? There is no synchronizations in drivers for removal, right?
> >>
> >
> > Yes. You right, the device removal can be changed a moment after the call.
> > Actually the caller suspected in removal before call it(and want to validate it) - so it makes sense.
> > From this reason the check in ethdev APIs is called generally in error flows.
> >
> >
> >> <...>



More information about the dev mailing list