[dpdk-dev] net/mlx5: fix error number handling

Message ID 20180604173731.29125-1-yskoh@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Shahaf Shuler
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Yongseok Koh June 4, 2018, 5:37 p.m. UTC
  rte_errno should be saved only if error has occurred because rte_errno
could have garbage value.

Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Nélio Laranjeiro June 5, 2018, 6:52 a.m. UTC | #1
On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> rte_errno should be saved only if error has occurred because rte_errno
> could have garbage value.
> 
> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 994be05be..eaffe7495 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
>  		/* The flow does not match. */
>  		continue;
>  	}
> -	ret = rte_errno; /* Save rte_errno before cleanup. */
> +	if (ret)
> +		ret = rte_errno; /* Save rte_errno before cleanup. */
>  	if (flow)
>  		mlx5_flow_list_destroy(dev, &priv->flows, flow);
>  exit:
> -- 
> 2.11.0

This patch is not enough, the returned value being -rte_errno if no
error is detected by the function it cannot set rte_errno nor return it.

Thanks,
  
Yongseok Koh June 5, 2018, 9:36 p.m. UTC | #2
> On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:

> 

> On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:

>> rte_errno should be saved only if error has occurred because rte_errno

>> could have garbage value.

>> 

>> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")

>> Cc: stable@dpdk.org

>> 

>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

>> ---

>> drivers/net/mlx5/mlx5_flow.c | 3 ++-

>> 1 file changed, 2 insertions(+), 1 deletion(-)

>> 

>> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c

>> index 994be05be..eaffe7495 100644

>> --- a/drivers/net/mlx5/mlx5_flow.c

>> +++ b/drivers/net/mlx5/mlx5_flow.c

>> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,

>> 		/* The flow does not match. */

>> 		continue;

>> 	}

>> -	ret = rte_errno; /* Save rte_errno before cleanup. */

>> +	if (ret)

>> +		ret = rte_errno; /* Save rte_errno before cleanup. */

>> 	if (flow)

>> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);

>> exit:

>> -- 

>> 2.11.0

> 

> This patch is not enough, the returned value being -rte_errno if no

> error is detected by the function it cannot set rte_errno nor return it.


We may need to refactor this kind of code (saving and restoring rte_errno). I
still don't understand why we should preserve rte_errno like this.

Even if this function returns success, there's no obligation to preserve
rte_errno in the function. Once it is called, the ownership of rte_errno belongs
to this function. I can't find how we define this per-lcore variable but, from
the man page of errno,

       The  <errno.h>  header  file  defines  the integer variable errno, which
       is set by system calls and some library functions in the event of an
       error to indicate what went wrong.  Its value is significant only when
       the return value of the call indicated an error (i.e., -1 from most
       system calls; -1 or NULL from most library  functions);
       a function that succeeds is allowed to change errno.

So, I still think an API can change rte_errno even if it succeeds, no need to
preserve it. If needed, the caller has to save it.


Thanks,
Yongseok
  
Nélio Laranjeiro June 6, 2018, 6:55 a.m. UTC | #3
On Tue, Jun 05, 2018 at 09:36:32PM +0000, Yongseok Koh wrote:
> > On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> > 
> > On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> >> rte_errno should be saved only if error has occurred because rte_errno
> >> could have garbage value.
> >> 
> >> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> >> Cc: stable@dpdk.org
> >> 
> >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >> ---
> >> drivers/net/mlx5/mlx5_flow.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> >> index 994be05be..eaffe7495 100644
> >> --- a/drivers/net/mlx5/mlx5_flow.c
> >> +++ b/drivers/net/mlx5/mlx5_flow.c
> >> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
> >> 		/* The flow does not match. */
> >> 		continue;
> >> 	}
> >> -	ret = rte_errno; /* Save rte_errno before cleanup. */
> >> +	if (ret)
> >> +		ret = rte_errno; /* Save rte_errno before cleanup. */
> >> 	if (flow)
> >> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
> >> exit:
> >> -- 
> >> 2.11.0
> > 
> > This patch is not enough, the returned value being -rte_errno if no
> > error is detected by the function it cannot set rte_errno nor return it.
> 
> We may need to refactor this kind of code (saving and restoring rte_errno). I
> still don't understand why we should preserve rte_errno like this.
>
> Even if this function returns success, there's no obligation to preserve
> rte_errno in the function. Once it is called, the ownership of rte_errno belongs
> to this function.
>
> I can't find how we define this per-lcore variable but, from
> the man page of errno,
> 
>        The  <errno.h>  header  file  defines  the integer variable errno, which
>        is set by system calls and some library functions in the event of an
>        error to indicate what went wrong.  Its value is significant only when
>        the return value of the call indicated an error (i.e., -1 from most
>        system calls; -1 or NULL from most library  functions);
>        a function that succeeds is allowed to change errno.
>
> So, I still think an API can change rte_errno even if it succeeds, no need to
> preserve it. If needed, the caller has to save it.

Functions in this PMD are defined as is:

  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.

Which means rte_errno is only modified in case of error.

This fix does not respect the documentation of the function or any other
function of the PMD which can return errors.  rte_errno is only set if
an error is encountered and contains only the error code of the first
error sub-sequent ones are considered consequences of the first one and
thus not preserved.

Not preserving the rte_errno in roll backs is equivalent to not setting
it at all as a function called by the rollback may also set it, example:

 {
    void * a;

    foo_do();
    a  = malloc(10);
    if (!a) {
    	rte_errno = ENOMEM;
	foo_undo();
	return -rte_errno;
    }
 }

If foo_undo() also encounter an error it will modify the rte_errno which
may have a value different from ENOMEM, for the callee won't be informed
the error is due to a memory issue and thus cannot make counter parts.
In such situation the rte_errno must be preserved to keep the ENOMEM
error code.
This is also the main reason almost all system function only update
errno when no error is encountered.

Regards,
  
Yongseok Koh June 6, 2018, 6:39 p.m. UTC | #4
On Wed, Jun 06, 2018 at 08:55:01AM +0200, Nélio Laranjeiro wrote:
> On Tue, Jun 05, 2018 at 09:36:32PM +0000, Yongseok Koh wrote:
> > > On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> > > 
> > > On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> > >> rte_errno should be saved only if error has occurred because rte_errno
> > >> could have garbage value.
> > >> 
> > >> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> > >> Cc: stable@dpdk.org
> > >> 
> > >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > >> ---
> > >> drivers/net/mlx5/mlx5_flow.c | 3 ++-
> > >> 1 file changed, 2 insertions(+), 1 deletion(-)
> > >> 
> > >> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > >> index 994be05be..eaffe7495 100644
> > >> --- a/drivers/net/mlx5/mlx5_flow.c
> > >> +++ b/drivers/net/mlx5/mlx5_flow.c
> > >> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
> > >> 		/* The flow does not match. */
> > >> 		continue;
> > >> 	}
> > >> -	ret = rte_errno; /* Save rte_errno before cleanup. */
> > >> +	if (ret)
> > >> +		ret = rte_errno; /* Save rte_errno before cleanup. */
> > >> 	if (flow)
> > >> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
> > >> exit:
> > >> -- 
> > >> 2.11.0
> > > 
> > > This patch is not enough, the returned value being -rte_errno if no
> > > error is detected by the function it cannot set rte_errno nor return it.
> > 
> > We may need to refactor this kind of code (saving and restoring rte_errno). I
> > still don't understand why we should preserve rte_errno like this.
> >
> > Even if this function returns success, there's no obligation to preserve
> > rte_errno in the function. Once it is called, the ownership of rte_errno belongs
> > to this function.
> >
> > I can't find how we define this per-lcore variable but, from
> > the man page of errno,
> > 
> >        The  <errno.h>  header  file  defines  the integer variable errno, which
> >        is set by system calls and some library functions in the event of an
> >        error to indicate what went wrong.  Its value is significant only when
> >        the return value of the call indicated an error (i.e., -1 from most
> >        system calls; -1 or NULL from most library  functions);
> >        a function that succeeds is allowed to change errno.
> >
> > So, I still think an API can change rte_errno even if it succeeds, no need to
> > preserve it. If needed, the caller has to save it.
> 
> Functions in this PMD are defined as is:
> 
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
> 
> Which means rte_errno is only modified in case of error.
> 
> This fix does not respect the documentation of the function or any other
> function of the PMD which can return errors.

That's logically a wrong interpretation. According to the description, if
returning error, rte_errno is set but the opposite isn't always true. Even if
rte_errno is set, it doesn't mean there's an error. So the description coincides
with that of errno. If you want to enforce preserving rte_errno in case of
success, you should amend the documentation.

> rte_errno is only set if an error is encountered and contains only the error
> code of the first error sub-sequent ones are considered consequences of the
> first one and thus not preserved.
> 
> Not preserving the rte_errno in roll backs is equivalent to not setting
> it at all as a function called by the rollback may also set it, example:
> 
>  {
>     void * a;
> 
>     foo_do();
>     a  = malloc(10);
>     if (!a) {
>     	rte_errno = ENOMEM;
> 	foo_undo();

This example is weird. You can simply set rte_errno after foo_undo() in this
case.

> 	return -rte_errno;
>     }
>  }
> 
> If foo_undo() also encounter an error it will modify the rte_errno which
> may have a value different from ENOMEM, for the callee won't be informed
> the error is due to a memory issue and thus cannot make counter parts.
> In such situation the rte_errno must be preserved to keep the ENOMEM
> error code.

I knew it. That's why rte_errno is saved before calling another API which may
change the rte_errno inside. But, we are talking about a case where an API
returns success. If caller is supposed to save rte_errno (when it's needed), why
does callee have to put some effort to preserve it even in case of success? If
rte_errno must be preserved even in case of success, we have to make a big
change to preserve rte_errno for cases where a void function is called (or cases
where we don't check its return value of non-void function).

> This is also the main reason almost all system function only update
> errno when no error is encountered.

'Almost' doesn't mean 'all", does it? It is true that such functions must update
errno when it returns error but it doesn't care about the value when it returns
success. Like the man page I attached above, the errno is significant only when
it returns an error. And "a function that succeeds is allowed to change errno."


So, the decision point is whether we want to preserve rte_errno in case of
success? My opinion is no.


Thanks,
Yongseok
  
Nélio Laranjeiro June 7, 2018, 7:39 a.m. UTC | #5
On Wed, Jun 06, 2018 at 11:39:27AM -0700, Yongseok Koh wrote:
> On Wed, Jun 06, 2018 at 08:55:01AM +0200, Nélio Laranjeiro wrote:
> > On Tue, Jun 05, 2018 at 09:36:32PM +0000, Yongseok Koh wrote:
> > > > On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> > > > 
> > > > On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> > > >> rte_errno should be saved only if error has occurred because rte_errno
> > > >> could have garbage value.
> > > >> 
> > > >> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> > > >> Cc: stable@dpdk.org
> > > >> 
> > > >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > >> ---
> > > >> drivers/net/mlx5/mlx5_flow.c | 3 ++-
> > > >> 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >> 
> > > >> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > > >> index 994be05be..eaffe7495 100644
> > > >> --- a/drivers/net/mlx5/mlx5_flow.c
> > > >> +++ b/drivers/net/mlx5/mlx5_flow.c
> > > >> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
> > > >> 		/* The flow does not match. */
> > > >> 		continue;
> > > >> 	}
> > > >> -	ret = rte_errno; /* Save rte_errno before cleanup. */
> > > >> +	if (ret)
> > > >> +		ret = rte_errno; /* Save rte_errno before cleanup. */
> > > >> 	if (flow)
> > > >> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
> > > >> exit:
> > > >> -- 
> > > >> 2.11.0
> > > > 
> > > > This patch is not enough, the returned value being -rte_errno if no
> > > > error is detected by the function it cannot set rte_errno nor return it.
> > > 
> > > We may need to refactor this kind of code (saving and restoring rte_errno). I
> > > still don't understand why we should preserve rte_errno like this.
> > >
> > > Even if this function returns success, there's no obligation to preserve
> > > rte_errno in the function. Once it is called, the ownership of rte_errno belongs
> > > to this function.
> > >
> > > I can't find how we define this per-lcore variable but, from
> > > the man page of errno,
> > > 
> > >        The  <errno.h>  header  file  defines  the integer variable errno, which
> > >        is set by system calls and some library functions in the event of an
> > >        error to indicate what went wrong.  Its value is significant only when
> > >        the return value of the call indicated an error (i.e., -1 from most
> > >        system calls; -1 or NULL from most library  functions);
> > >        a function that succeeds is allowed to change errno.
> > >
> > > So, I still think an API can change rte_errno even if it succeeds, no need to
> > > preserve it. If needed, the caller has to save it.
> > 
> > Functions in this PMD are defined as is:
> > 
> >   * @return
> >   *   0 on success, a negative errno value otherwise and rte_errno is set.
> > 
> > Which means rte_errno is only modified in case of error.
> > 
> > This fix does not respect the documentation of the function or any other
> > function of the PMD which can return errors.
> 
> That's logically a wrong interpretation. According to the description, if
> returning error, rte_errno is set but the opposite isn't always true. Even if
> rte_errno is set, it doesn't mean there's an error. So the description coincides
> with that of errno. If you want to enforce preserving rte_errno in case of
> success, you should amend the documentation.
> 
> > rte_errno is only set if an error is encountered and contains only the error
> > code of the first error sub-sequent ones are considered consequences of the
> > first one and thus not preserved.
> > 
> > Not preserving the rte_errno in roll backs is equivalent to not setting
> > it at all as a function called by the rollback may also set it, example:
> > 
> >  {
> >     void * a;
> > 
> >     foo_do();
> >     a  = malloc(10);
> >     if (!a) {
> >     	rte_errno = ENOMEM;
> > 	foo_undo();
> 
> This example is weird. You can simply set rte_errno after foo_undo() in this
> case.
> 
> > 	return -rte_errno;
> >     }
> >  }
> > 
> > If foo_undo() also encounter an error it will modify the rte_errno which
> > may have a value different from ENOMEM, for the callee won't be informed
> > the error is due to a memory issue and thus cannot make counter parts.
> > In such situation the rte_errno must be preserved to keep the ENOMEM
> > error code.
> 
> I knew it. That's why rte_errno is saved before calling another API which may
> change the rte_errno inside. But, we are talking about a case where an API
> returns success. If caller is supposed to save rte_errno (when it's needed), why
> does callee have to put some effort to preserve it even in case of success? If
> rte_errno must be preserved even in case of success, we have to make a big
> change to preserve rte_errno for cases where a void function is called (or cases
> where we don't check its return value of non-void function).
>
> > This is also the main reason almost all system function only update
> > errno when no error is encountered.
> 
> 'Almost' doesn't mean 'all", does it? It is true that such functions must update
> errno when it returns error but it doesn't care about the value when it returns
> success. Like the man page I attached above, the errno is significant only when
> it returns an error. And "a function that succeeds is allowed to change errno."

It is "almost" because a system function touching the errno when the
function succeed it not common.  But as the man page says it is not
impossible.

> So, the decision point is whether we want to preserve rte_errno in case of
> success? My opinion is no.

I did not understood it was only a concern about the success of the
function, even it is better to avoid as most as possible a useless
store, in this specific case, as errno (rte_errno) has a garbage value,
I fully agree with you.

Regards,
  
Yongseok Koh June 18, 2018, 5:06 p.m. UTC | #6
> On Jun 7, 2018, at 12:39 AM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> 
> On Wed, Jun 06, 2018 at 11:39:27AM -0700, Yongseok Koh wrote:
>> On Wed, Jun 06, 2018 at 08:55:01AM +0200, Nélio Laranjeiro wrote:
>>> On Tue, Jun 05, 2018 at 09:36:32PM +0000, Yongseok Koh wrote:
>>>>> On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
>>>>> 
>>>>> On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
>>>>>> rte_errno should be saved only if error has occurred because rte_errno
>>>>>> could have garbage value.
>>>>>> 
>>>>>> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
>>>>>> Cc: stable@dpdk.org
>>>>>> 
>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>>> ---
>>>>>> drivers/net/mlx5/mlx5_flow.c | 3 ++-
>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>> 
>>>>>> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
>>>>>> index 994be05be..eaffe7495 100644
>>>>>> --- a/drivers/net/mlx5/mlx5_flow.c
>>>>>> +++ b/drivers/net/mlx5/mlx5_flow.c
>>>>>> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
>>>>>> 		/* The flow does not match. */
>>>>>> 		continue;
>>>>>> 	}
>>>>>> -	ret = rte_errno; /* Save rte_errno before cleanup. */
>>>>>> +	if (ret)
>>>>>> +		ret = rte_errno; /* Save rte_errno before cleanup. */
>>>>>> 	if (flow)
>>>>>> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
>>>>>> exit:
>>>>>> -- 
>>>>>> 2.11.0
>>>>> 
>>>>> This patch is not enough, the returned value being -rte_errno if no
>>>>> error is detected by the function it cannot set rte_errno nor return it.
>>>> 
>>>> We may need to refactor this kind of code (saving and restoring rte_errno). I
>>>> still don't understand why we should preserve rte_errno like this.
>>>> 
>>>> Even if this function returns success, there's no obligation to preserve
>>>> rte_errno in the function. Once it is called, the ownership of rte_errno belongs
>>>> to this function.
>>>> 
>>>> I can't find how we define this per-lcore variable but, from
>>>> the man page of errno,
>>>> 
>>>>       The  <errno.h>  header  file  defines  the integer variable errno, which
>>>>       is set by system calls and some library functions in the event of an
>>>>       error to indicate what went wrong.  Its value is significant only when
>>>>       the return value of the call indicated an error (i.e., -1 from most
>>>>       system calls; -1 or NULL from most library  functions);
>>>>       a function that succeeds is allowed to change errno.
>>>> 
>>>> So, I still think an API can change rte_errno even if it succeeds, no need to
>>>> preserve it. If needed, the caller has to save it.
>>> 
>>> Functions in this PMD are defined as is:
>>> 
>>>  * @return
>>>  *   0 on success, a negative errno value otherwise and rte_errno is set.
>>> 
>>> Which means rte_errno is only modified in case of error.
>>> 
>>> This fix does not respect the documentation of the function or any other
>>> function of the PMD which can return errors.
>> 
>> That's logically a wrong interpretation. According to the description, if
>> returning error, rte_errno is set but the opposite isn't always true. Even if
>> rte_errno is set, it doesn't mean there's an error. So the description coincides
>> with that of errno. If you want to enforce preserving rte_errno in case of
>> success, you should amend the documentation.
>> 
>>> rte_errno is only set if an error is encountered and contains only the error
>>> code of the first error sub-sequent ones are considered consequences of the
>>> first one and thus not preserved.
>>> 
>>> Not preserving the rte_errno in roll backs is equivalent to not setting
>>> it at all as a function called by the rollback may also set it, example:
>>> 
>>> {
>>>    void * a;
>>> 
>>>    foo_do();
>>>    a  = malloc(10);
>>>    if (!a) {
>>>    	rte_errno = ENOMEM;
>>> 	foo_undo();
>> 
>> This example is weird. You can simply set rte_errno after foo_undo() in this
>> case.
>> 
>>> 	return -rte_errno;
>>>    }
>>> }
>>> 
>>> If foo_undo() also encounter an error it will modify the rte_errno which
>>> may have a value different from ENOMEM, for the callee won't be informed
>>> the error is due to a memory issue and thus cannot make counter parts.
>>> In such situation the rte_errno must be preserved to keep the ENOMEM
>>> error code.
>> 
>> I knew it. That's why rte_errno is saved before calling another API which may
>> change the rte_errno inside. But, we are talking about a case where an API
>> returns success. If caller is supposed to save rte_errno (when it's needed), why
>> does callee have to put some effort to preserve it even in case of success? If
>> rte_errno must be preserved even in case of success, we have to make a big
>> change to preserve rte_errno for cases where a void function is called (or cases
>> where we don't check its return value of non-void function).
>> 
>>> This is also the main reason almost all system function only update
>>> errno when no error is encountered.
>> 
>> 'Almost' doesn't mean 'all", does it? It is true that such functions must update
>> errno when it returns error but it doesn't care about the value when it returns
>> success. Like the man page I attached above, the errno is significant only when
>> it returns an error. And "a function that succeeds is allowed to change errno."
> 
> It is "almost" because a system function touching the errno when the
> function succeed it not common.  But as the man page says it is not
> impossible.
> 
>> So, the decision point is whether we want to preserve rte_errno in case of
>> success? My opinion is no.
> 
> I did not understood it was only a concern about the success of the
> function, even it is better to avoid as most as possible a useless
> store, in this specific case, as errno (rte_errno) has a garbage value,
> I fully agree with you.

Nelio,

Do you still want me to make any change for this patch?
Let me know if any.

Thanks,
Yongseok
  
Nélio Laranjeiro June 19, 2018, 11:48 a.m. UTC | #7
On Mon, Jun 18, 2018 at 05:06:41PM +0000, Yongseok Koh wrote:
> 
> > On Jun 7, 2018, at 12:39 AM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> > 
> > On Wed, Jun 06, 2018 at 11:39:27AM -0700, Yongseok Koh wrote:
> >> On Wed, Jun 06, 2018 at 08:55:01AM +0200, Nélio Laranjeiro wrote:
> >>> On Tue, Jun 05, 2018 at 09:36:32PM +0000, Yongseok Koh wrote:
> >>>>> On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> >>>>> 
> >>>>> On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> >>>>>> rte_errno should be saved only if error has occurred because rte_errno
> >>>>>> could have garbage value.
> >>>>>> 
> >>>>>> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> >>>>>> Cc: stable@dpdk.org
> >>>>>> 
> >>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>>> ---
> >>>>>> drivers/net/mlx5/mlx5_flow.c | 3 ++-
> >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>> 
> >>>>>> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> >>>>>> index 994be05be..eaffe7495 100644
> >>>>>> --- a/drivers/net/mlx5/mlx5_flow.c
> >>>>>> +++ b/drivers/net/mlx5/mlx5_flow.c
> >>>>>> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
> >>>>>> 		/* The flow does not match. */
> >>>>>> 		continue;
> >>>>>> 	}
> >>>>>> -	ret = rte_errno; /* Save rte_errno before cleanup. */
> >>>>>> +	if (ret)
> >>>>>> +		ret = rte_errno; /* Save rte_errno before cleanup. */
> >>>>>> 	if (flow)
> >>>>>> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
> >>>>>> exit:
> >>>>>> -- 
> >>>>>> 2.11.0
> >>>>> 
> >>>>> This patch is not enough, the returned value being -rte_errno if no
> >>>>> error is detected by the function it cannot set rte_errno nor return it.
> >>>> 
> >>>> We may need to refactor this kind of code (saving and restoring rte_errno). I
> >>>> still don't understand why we should preserve rte_errno like this.
> >>>> 
> >>>> Even if this function returns success, there's no obligation to preserve
> >>>> rte_errno in the function. Once it is called, the ownership of rte_errno belongs
> >>>> to this function.
> >>>> 
> >>>> I can't find how we define this per-lcore variable but, from
> >>>> the man page of errno,
> >>>> 
> >>>>       The  <errno.h>  header  file  defines  the integer variable errno, which
> >>>>       is set by system calls and some library functions in the event of an
> >>>>       error to indicate what went wrong.  Its value is significant only when
> >>>>       the return value of the call indicated an error (i.e., -1 from most
> >>>>       system calls; -1 or NULL from most library  functions);
> >>>>       a function that succeeds is allowed to change errno.
> >>>> 
> >>>> So, I still think an API can change rte_errno even if it succeeds, no need to
> >>>> preserve it. If needed, the caller has to save it.
> >>> 
> >>> Functions in this PMD are defined as is:
> >>> 
> >>>  * @return
> >>>  *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> 
> >>> Which means rte_errno is only modified in case of error.
> >>> 
> >>> This fix does not respect the documentation of the function or any other
> >>> function of the PMD which can return errors.
> >> 
> >> That's logically a wrong interpretation. According to the description, if
> >> returning error, rte_errno is set but the opposite isn't always true. Even if
> >> rte_errno is set, it doesn't mean there's an error. So the description coincides
> >> with that of errno. If you want to enforce preserving rte_errno in case of
> >> success, you should amend the documentation.
> >> 
> >>> rte_errno is only set if an error is encountered and contains only the error
> >>> code of the first error sub-sequent ones are considered consequences of the
> >>> first one and thus not preserved.
> >>> 
> >>> Not preserving the rte_errno in roll backs is equivalent to not setting
> >>> it at all as a function called by the rollback may also set it, example:
> >>> 
> >>> {
> >>>    void * a;
> >>> 
> >>>    foo_do();
> >>>    a  = malloc(10);
> >>>    if (!a) {
> >>>    	rte_errno = ENOMEM;
> >>> 	foo_undo();
> >> 
> >> This example is weird. You can simply set rte_errno after foo_undo() in this
> >> case.
> >> 
> >>> 	return -rte_errno;
> >>>    }
> >>> }
> >>> 
> >>> If foo_undo() also encounter an error it will modify the rte_errno which
> >>> may have a value different from ENOMEM, for the callee won't be informed
> >>> the error is due to a memory issue and thus cannot make counter parts.
> >>> In such situation the rte_errno must be preserved to keep the ENOMEM
> >>> error code.
> >> 
> >> I knew it. That's why rte_errno is saved before calling another API which may
> >> change the rte_errno inside. But, we are talking about a case where an API
> >> returns success. If caller is supposed to save rte_errno (when it's needed), why
> >> does callee have to put some effort to preserve it even in case of success? If
> >> rte_errno must be preserved even in case of success, we have to make a big
> >> change to preserve rte_errno for cases where a void function is called (or cases
> >> where we don't check its return value of non-void function).
> >> 
> >>> This is also the main reason almost all system function only update
> >>> errno when no error is encountered.
> >> 
> >> 'Almost' doesn't mean 'all", does it? It is true that such functions must update
> >> errno when it returns error but it doesn't care about the value when it returns
> >> success. Like the man page I attached above, the errno is significant only when
> >> it returns an error. And "a function that succeeds is allowed to change errno."
> > 
> > It is "almost" because a system function touching the errno when the
> > function succeed it not common.  But as the man page says it is not
> > impossible.
> > 
> >> So, the decision point is whether we want to preserve rte_errno in case of
> >> success? My opinion is no.
> > 
> > I did not understood it was only a concern about the success of the
> > function, even it is better to avoid as most as possible a useless
> > store, in this specific case, as errno (rte_errno) has a garbage value,
> > I fully agree with you.
> 
> Nelio,
> 
> Do you still want me to make any change for this patch?
> Let me know if any.

With your modification the function documentation is no more accurate as
rte_errno is always set.
  
Yongseok Koh June 19, 2018, 11 p.m. UTC | #8
> On Jun 19, 2018, at 4:48 AM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
> 
> On Mon, Jun 18, 2018 at 05:06:41PM +0000, Yongseok Koh wrote:
>> 
>>> On Jun 7, 2018, at 12:39 AM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
>>> 
>>> On Wed, Jun 06, 2018 at 11:39:27AM -0700, Yongseok Koh wrote:
>>>> On Wed, Jun 06, 2018 at 08:55:01AM +0200, Nélio Laranjeiro wrote:
>>>>> On Tue, Jun 05, 2018 at 09:36:32PM +0000, Yongseok Koh wrote:
>>>>>>> On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro <nelio.laranjeiro@6wind.com> wrote:
>>>>>>> 
>>>>>>> On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
>>>>>>>> rte_errno should be saved only if error has occurred because rte_errno
>>>>>>>> could have garbage value.
>>>>>>>> 
>>>>>>>> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
>>>>>>>> Cc: stable@dpdk.org
>>>>>>>> 
>>>>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>>>>> ---
>>>>>>>> drivers/net/mlx5/mlx5_flow.c | 3 ++-
>>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>> 
>>>>>>>> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
>>>>>>>> index 994be05be..eaffe7495 100644
>>>>>>>> --- a/drivers/net/mlx5/mlx5_flow.c
>>>>>>>> +++ b/drivers/net/mlx5/mlx5_flow.c
>>>>>>>> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
>>>>>>>> 		/* The flow does not match. */
>>>>>>>> 		continue;
>>>>>>>> 	}
>>>>>>>> -	ret = rte_errno; /* Save rte_errno before cleanup. */
>>>>>>>> +	if (ret)
>>>>>>>> +		ret = rte_errno; /* Save rte_errno before cleanup. */
>>>>>>>> 	if (flow)
>>>>>>>> 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
>>>>>>>> exit:
>>>>>>>> -- 
>>>>>>>> 2.11.0
>>>>>>> 
>>>>>>> This patch is not enough, the returned value being -rte_errno if no
>>>>>>> error is detected by the function it cannot set rte_errno nor return it.
>>>>>> 
>>>>>> We may need to refactor this kind of code (saving and restoring rte_errno). I
>>>>>> still don't understand why we should preserve rte_errno like this.
>>>>>> 
>>>>>> Even if this function returns success, there's no obligation to preserve
>>>>>> rte_errno in the function. Once it is called, the ownership of rte_errno belongs
>>>>>> to this function.
>>>>>> 
>>>>>> I can't find how we define this per-lcore variable but, from
>>>>>> the man page of errno,
>>>>>> 
>>>>>>      The  <errno.h>  header  file  defines  the integer variable errno, which
>>>>>>      is set by system calls and some library functions in the event of an
>>>>>>      error to indicate what went wrong.  Its value is significant only when
>>>>>>      the return value of the call indicated an error (i.e., -1 from most
>>>>>>      system calls; -1 or NULL from most library  functions);
>>>>>>      a function that succeeds is allowed to change errno.
>>>>>> 
>>>>>> So, I still think an API can change rte_errno even if it succeeds, no need to
>>>>>> preserve it. If needed, the caller has to save it.
>>>>> 
>>>>> Functions in this PMD are defined as is:
>>>>> 
>>>>> * @return
>>>>> *   0 on success, a negative errno value otherwise and rte_errno is set.
>>>>> 
>>>>> Which means rte_errno is only modified in case of error.
>>>>> 
>>>>> This fix does not respect the documentation of the function or any other
>>>>> function of the PMD which can return errors.
>>>> 
>>>> That's logically a wrong interpretation. According to the description, if
>>>> returning error, rte_errno is set but the opposite isn't always true. Even if
>>>> rte_errno is set, it doesn't mean there's an error. So the description coincides
>>>> with that of errno. If you want to enforce preserving rte_errno in case of
>>>> success, you should amend the documentation.
>>>> 
>>>>> rte_errno is only set if an error is encountered and contains only the error
>>>>> code of the first error sub-sequent ones are considered consequences of the
>>>>> first one and thus not preserved.
>>>>> 
>>>>> Not preserving the rte_errno in roll backs is equivalent to not setting
>>>>> it at all as a function called by the rollback may also set it, example:
>>>>> 
>>>>> {
>>>>>   void * a;
>>>>> 
>>>>>   foo_do();
>>>>>   a  = malloc(10);
>>>>>   if (!a) {
>>>>>   	rte_errno = ENOMEM;
>>>>> 	foo_undo();
>>>> 
>>>> This example is weird. You can simply set rte_errno after foo_undo() in this
>>>> case.
>>>> 
>>>>> 	return -rte_errno;
>>>>>   }
>>>>> }
>>>>> 
>>>>> If foo_undo() also encounter an error it will modify the rte_errno which
>>>>> may have a value different from ENOMEM, for the callee won't be informed
>>>>> the error is due to a memory issue and thus cannot make counter parts.
>>>>> In such situation the rte_errno must be preserved to keep the ENOMEM
>>>>> error code.
>>>> 
>>>> I knew it. That's why rte_errno is saved before calling another API which may
>>>> change the rte_errno inside. But, we are talking about a case where an API
>>>> returns success. If caller is supposed to save rte_errno (when it's needed), why
>>>> does callee have to put some effort to preserve it even in case of success? If
>>>> rte_errno must be preserved even in case of success, we have to make a big
>>>> change to preserve rte_errno for cases where a void function is called (or cases
>>>> where we don't check its return value of non-void function).
>>>> 
>>>>> This is also the main reason almost all system function only update
>>>>> errno when no error is encountered.
>>>> 
>>>> 'Almost' doesn't mean 'all", does it? It is true that such functions must update
>>>> errno when it returns error but it doesn't care about the value when it returns
>>>> success. Like the man page I attached above, the errno is significant only when
>>>> it returns an error. And "a function that succeeds is allowed to change errno."
>>> 
>>> It is "almost" because a system function touching the errno when the
>>> function succeed it not common.  But as the man page says it is not
>>> impossible.
>>> 
>>>> So, the decision point is whether we want to preserve rte_errno in case of
>>>> success? My opinion is no.
>>> 
>>> I did not understood it was only a concern about the success of the
>>> function, even it is better to avoid as most as possible a useless
>>> store, in this specific case, as errno (rte_errno) has a garbage value,
>>> I fully agree with you.
>> 
>> Nelio,
>> 
>> Do you still want me to make any change for this patch?
>> Let me know if any.
> 
> With your modification the function documentation is no more accurate as
> rte_errno is always set.

I still don't agree with that but will send out v2. It's not a big deal.

Thanks,
Yongseok
  
Nélio Laranjeiro June 20, 2018, 7:05 a.m. UTC | #9
On Tue, Jun 19, 2018 at 11:00:25PM +0000, Yongseok Koh wrote:
>[...]
> >>> I did not understood it was only a concern about the success of the
> >>> function, even it is better to avoid as most as possible a useless
> >>> store, in this specific case, as errno (rte_errno) has a garbage value,
> >>> I fully agree with you.
> >> 
> >> Nelio,
> >> 
> >> Do you still want me to make any change for this patch?
> >> Let me know if any.
> > 
> > With your modification the function documentation is no more accurate as
> > rte_errno is always set.
> 
> I still don't agree with that but will send out v2. It's not a big deal.

What I meant is, you could have only changed the function documentation.

 @return
   0 on success with rte_errno always set, negative errno otherwise.

letting the function documentation saying rte_errno is only modified in
case of error whereas it is not is a bug or in the documentation or in
the code, but as a function must respect its documentation, it would
have raised a bug in the code itself.

Regards,
  

Patch

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 994be05be..eaffe7495 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -3561,7 +3561,8 @@  mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
 		/* The flow does not match. */
 		continue;
 	}
-	ret = rte_errno; /* Save rte_errno before cleanup. */
+	if (ret)
+		ret = rte_errno; /* Save rte_errno before cleanup. */
 	if (flow)
 		mlx5_flow_list_destroy(dev, &priv->flows, flow);
 exit: