[dpdk-stable] [PATCH v2 4/4] net/failsafe: fix removed device handling
Matan Azrad
matan at mellanox.com
Thu Dec 14 15:43:03 CET 2017
Hi
> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet at 6wind.com]
> Sent: Thursday, December 14, 2017 3:27 PM
> To: Matan Azrad <matan at mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarguil at 6wind.com>; Thomas Monjalon
> <thomas at monjalon.net>; dev at dpdk.org; stable at dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
>
> On Thu, Dec 14, 2017 at 01:07:31PM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet at 6wind.com]
> > > Sent: Thursday, December 14, 2017 12:49 PM
> > > To: Matan Azrad <matan at mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarguil at 6wind.com>; Thomas Monjalon
> > > <thomas at monjalon.net>; dev at dpdk.org; stable at dpdk.org
> > > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device
> > > handling
> > >
> > > On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> > > > Hi Gaetan
> > > >
> > >
>
> <snip>
>
> > > > > Ok, actually you were right here to do it this way. The "is_removed"
> > > > > check needs to happen after the operation attempt to effectively
> > > > > mitigate the possible race. Checking before attempting the call
> > > > > will be much less effective.
> > > > >
> > > > > That being said, would it be cleaner to have eth_dev ops return
> > > > > -ENODEV directly, and check against it within fail-safe?
> > > > >
> > > >
> > > > I think that according to the "is_removed" semantics we must return a
> > > > Boolean value (each value different from '0' means that the device is
> > > > removed) like other functions in the C library (for example isspace()).
> > > >
> > >
> > > Sure, I wasn't discussing the interface proposed by
> > > rte_eth_dev_is_removed().
> > >
> > > What I meant was to ask whether checking rte_eth_dev_is_removed()
> > > would be more interesting in the ethdev layer, making the
> > > eth_dev_ops return -ENODEV regardless of the previous error if this
> > > check is supported by the driver and signal that the port is removed.
> > >
> > > I think this information could be interesting to other systems, not
> > > just fail-safe.
> > >
> >
> > Ok. Got you now.
> > Interesting approach - plan:
> > 1. update fs_link_update to use rte_eth* functions.
>
> I'm surprised it doesn't already.
> Either the rte_eth* function was introduced after fail-safe, or there was a
> reason to avoid it, so be wary of potential issues. I don't see a problem
> right now though.
>
> > 2. Maybe -EIO is preferred, because -ENODEV is already used for the
> > "no port" error?
>
> Good point, didn't think about it.
> Be prepared for some arguments about the most relevant error
> code. -EIO seems fine to me, but maybe use a wrapper for all this.
>
> Something like:
>
> ---8<---
>
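> static int
> eth_error(uint16_t pid, int original_ret)
> {
>     int ret;
>
>     /* Success needs no further check. */
>     if (original_ret == 0)
>         return original_ret;
>     /* On failure, ask whether the port has been removed. */
>     ret = rte_eth_dev_is_removed(pid);
>     if (ret == 0 || ret == -ENOTSUP)
>         return original_ret;
>     return -EIO;
> }
>
> int
> rte_eth_ops_xyz(uint16_t pid)
> {
>     int ret;
>
>     /* Pseudocode: call the driver op for this port. */
>     ret = eth_dev(pid).ops_xyz();
>     return eth_error(pid, ret);
> }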
>
> --->8---
>
> This way you would be able to change it easily and the logic would be
> insulated.
>
Nice.
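
To make the wrapper idea concrete, here is a minimal sketch that compiles
without DPDK. Everything prefixed with mock_ (the port tables, the removal
check, the driver op) and MOCK_MAX_PORTS are hypothetical stand-ins for
illustration only, not ethdev APIs; the real check would be whatever
"is_removed" helper ends up in ethdev. Only the error-mapping logic is
meant to mirror the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev layer, for illustration only. */
#define MOCK_MAX_PORTS 4
static int mock_removed[MOCK_MAX_PORTS];   /* what the removal check returns */
static int mock_op_result[MOCK_MAX_PORTS]; /* what the mocked driver op returns */

/* Mock removal check: 0 = present, -ENOTSUP = unsupported, else removed. */
static int
mock_is_removed(uint16_t pid)
{
    assert(pid < MOCK_MAX_PORTS);
    return mock_removed[pid];
}

/* Mock driver operation. */
static int
mock_ops_xyz(uint16_t pid)
{
    assert(pid < MOCK_MAX_PORTS);
    return mock_op_result[pid];
}

/* Replace a driver error with -EIO only when the port is known to be gone. */
static int
eth_error(uint16_t pid, int original_ret)
{
    int ret;

    if (original_ret == 0)
        return original_ret;
    ret = mock_is_removed(pid);
    if (ret == 0 || ret == -ENOTSUP)
        return original_ret;
    return -EIO;
}

/* Every public op funnels its return value through eth_error(). */
static int
mock_eth_ops_xyz(uint16_t pid)
{
    return eth_error(pid, mock_ops_xyz(pid));
}
```

Keeping the mapping in one helper means the error code (-EIO or something
else) can be changed in a single place if the list prefers another value.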
> > 3. update all relevant rte_eth* to use "is_removed" in error flows (1
> > patch for flow APIs and 1 for the others).
> > 4. Change fs checks in error flows to check rte_eth* return values.
> > 5. Remove CC stable from commit message.
> >
> > What do you think?
> >
>
> Agreed otherwise.
>
Will create V3, thanks!
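
For step 4 of the plan, a rough sketch of what the fail-safe side might
look like once the rte_eth* calls report removal as -EIO. The struct
sub_dev type and fs_apply_all() are hypothetical names made up for this
example and are not the real fail-safe structures; op_ret simply stands in
for the value an rte_eth* call returned for that sub-device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical sub-device record; not the real fail-safe structure. */
struct sub_dev {
    int op_ret;  /* value the wrapped ethdev call returned */
    int skipped; /* set when the sub-device was treated as removed */
};

/* Walk all sub-devices, tolerating removed ones and failing on real errors. */
static int
fs_apply_all(struct sub_dev *subs, size_t n)
{
    size_t i;

    assert(subs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int ret = subs[i].op_ret;

        /* -EIO means the port is gone: skip it, do not fail the call. */
        if (ret == -EIO) {
            subs[i].skipped = 1;
            continue;
        }
        /* Any other error is a genuine failure. */
        if (ret != 0)
            return ret;
    }
    return 0;
}
```

The point is only that fail-safe would compare against one return value
instead of calling the removal check itself after each failed operation.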
> Thanks,
>
> > > --
> > > Gaëtan Rivet
> > > 6WIND
>
> --
> Gaëtan Rivet
> 6WIND