[3/4] eal: don't crash if alarm set fails

Message ID	20180725182019.31518-4-stephen@networkplumber.org (mailing list archive)
State	Superseded, archived
Delegated to:	Thomas Monjalon
Headers	From: Stephen Hemminger <stephen@networkplumber.org>
	To: dev@dpdk.org
	Cc: Stephen Hemminger <stephen@networkplumber.org>, Stephen Hemminger <sthemmin@microsoft.com>
	Date: Wed, 25 Jul 2018 11:20:18 -0700
	Message-Id: <20180725182019.31518-4-stephen@networkplumber.org>
	In-Reply-To: <20180725182019.31518-1-stephen@networkplumber.org>
	References: <20180725182019.31518-1-stephen@networkplumber.org>
	Subject: [dpdk-dev] [PATCH 3/4] eal: don't crash if alarm set fails
	Precedence: list
	Errors-To: dev-bounces@dpdk.org
	Sender: "dev" <dev-bounces@dpdk.org>
Series	small cleanups
	[0/4] small cleanups
	[1/4] arm: remove profanity in comment
	[2/4] bnx2x: remove profanity
	[3/4] eal: don't crash if alarm set fails
	[4/4] ixgbe: remove mild profanity

Checks

Context	Check	Description
ci/checkpatch	success	coding style OK
ci/Intel-compilation	success	Compilation OK

Commit Message

Stephen Hemminger July 25, 2018, 6:20 p.m. UTC

There is no need to call rte_panic and crash the application here;
better to let the application handle the error itself.

Remove the gratuitous profanity which would be visible if
the rte_panic was still there.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 lib/librte_eal/common/eal_common_proc.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Anatoly Burakov July 26, 2018, 9:34 a.m. UTC | #1

On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
> There is no need to call rte_exit and crash the application here;
> better to let the application handle the error itself.
> 
> Remove the gratuitous profanity which would be visible if
> the rte_exit was still there.
> 
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---

Oops, this was a "debug" message I accidentally left in :( My apologies!

Anatoly Burakov July 26, 2018, 9:41 a.m. UTC | #2

On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
> There is no need to call rte_exit and crash the application here;
> better to let the application handle the error itself.
> 
> Remove the gratuitous profanity which would be visible if
> the rte_exit was still there.
> 
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>   lib/librte_eal/common/eal_common_proc.c | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
> index 9fcb9121908d..07b7579c565a 100644
> --- a/lib/librte_eal/common/eal_common_proc.c
> +++ b/lib/librte_eal/common/eal_common_proc.c
> @@ -841,14 +841,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
>   
>   	param->user_reply.nb_sent++;
>   
> -	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> -			      async_reply_handle, pending_req) < 0) {
> +	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> +				async_reply_handle, pending_req);
> +	if (ret < 0)
>   		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
>   			dst, req->name);
> -		rte_panic("Fix the above shit to properly free all memory\n");

Profanity aside, I think the message was trying to tell me something:
if alarm_set fails, we risk leaking this memory if the reply from the
peer never comes, and we risk leaving the application hanging because
the timeout never triggers. I'm not sure that leaving this "to the user"
is the right choice, because there is no way for the user to free
IPC-internal memory if it leaks.

So I think the proper way to handle this would be to set the alarm
first and then, if that fails, not send the message in the first place.

Thomas Monjalon Sept. 18, 2018, 9:43 a.m. UTC | #3

26/07/2018 11:41, Burakov, Anatoly:
> On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
> > There is no need to call rte_exit and crash the application here;
> > better to let the application handle the error itself.
> > 
> > Remove the gratuitous profanity which would be visible if
> > the rte_exit was still there.
> > 
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > ---
> > --- a/lib/librte_eal/common/eal_common_proc.c
> > +++ b/lib/librte_eal/common/eal_common_proc.c
> > @@ -841,14 +841,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
> >   
> >   	param->user_reply.nb_sent++;
> >   
> > -	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> > -			      async_reply_handle, pending_req) < 0) {
> > +	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> > +				async_reply_handle, pending_req);
> > +	if (ret < 0)
> >   		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
> >   			dst, req->name);
> > -		rte_panic("Fix the above shit to properly free all memory\n");
> 
> Profanity aside, i think the message was trying to tell me something - 
> namely, that if alarm_set fails, we're risking to leak this memory if 
> reply from the peer never comes, and we're risking leaving the 
> application hanging because the timeout never triggers. I'm not sure if 
> leaving this "to the user" is the right choice, because there is no way 
> for the user to free IPC-internal memory if it leaks.
> 
> So i think the proper way to handle this would've been to set the alarm 
> first, then, if it fails, don't sent the message in the first place.

What should be done here? OK to remove rte_panic for now?

Anatoly Burakov Sept. 18, 2018, 10:16 a.m. UTC | #4

On 18-Sep-18 10:43 AM, Thomas Monjalon wrote:
> 26/07/2018 11:41, Burakov, Anatoly:
>> On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
>>> There is no need to call rte_exit and crash the application here;
>>> better to let the application handle the error itself.
>>>
>>> Remove the gratuitous profanity which would be visible if
>>> the rte_exit was still there.
>>>
>>> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
>>> ---
>>> --- a/lib/librte_eal/common/eal_common_proc.c
>>> +++ b/lib/librte_eal/common/eal_common_proc.c
>>> @@ -841,14 +841,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
>>>    
>>>    	param->user_reply.nb_sent++;
>>>    
>>> -	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
>>> -			      async_reply_handle, pending_req) < 0) {
>>> +	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
>>> +				async_reply_handle, pending_req);
>>> +	if (ret < 0)
>>>    		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
>>>    			dst, req->name);
>>> -		rte_panic("Fix the above shit to properly free all memory\n");
>>
>> Profanity aside, i think the message was trying to tell me something -
>> namely, that if alarm_set fails, we're risking to leak this memory if
>> reply from the peer never comes, and we're risking leaving the
>> application hanging because the timeout never triggers. I'm not sure if
>> leaving this "to the user" is the right choice, because there is no way
>> for the user to free IPC-internal memory if it leaks.
>>
>> So i think the proper way to handle this would've been to set the alarm
>> first, then, if it fails, don't sent the message in the first place.
> 
> What should be done here? OK to remove rte_panic for now?
> 

As I said, the above fix is wrong because it leaks memory (however
unlikely that may be).

The alarm set call should be moved to before the send_msg() call (with a
goto fail; on failure). That way, even if the alarm triggers too early
(i.e. immediately), the requests tailq will still be locked until we
complete our request sends, so we appropriately free memory on response,
on timeout, or in our failure handler if the alarm set has failed.
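
The ordering described above can be sketched in isolation. This is only an illustration, not the eventual fix: the stubs stand in for rte_eal_alarm_set(), the alarm cancel call and send_msg(), and every name in it (request_async_sketch, struct pending_request, the stub_* helpers and the counters) is hypothetical. It assumes a single request slot in place of the requests tailq and leaves out the locking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the IPC-internal pending request. */
struct pending_request {
	const char *dst;
};

static int alarm_should_fail;           /* test knob: make the alarm stub fail */
static int alarm_armed;                 /* 1 while a (fake) timeout is armed */
static uint64_t last_alarm_us;          /* timeout passed to the alarm stub */
static int messages_sent;               /* number of calls to the send stub */
static struct pending_request *queued;  /* stands in for the requests tailq */

/* Stub in place of rte_eal_alarm_set(): 0 on success, negative on failure. */
static int
stub_alarm_set(uint64_t us)
{
	if (alarm_should_fail)
		return -1;
	alarm_armed = 1;
	last_alarm_us = us;
	return 0;
}

/* Stub in place of the alarm cancel call. */
static void
stub_alarm_cancel(void)
{
	alarm_armed = 0;
}

/* Stub in place of send_msg(); it always succeeds in this sketch. */
static int
stub_send_msg(const struct pending_request *req)
{
	(void)req;
	messages_sent++;
	return 0;
}

/* Arm the timeout first, send second, and free locally on any failure. */
static int
request_async_sketch(const char *dst, const struct timespec *ts)
{
	struct pending_request *req;
	uint64_t us;

	assert(dst != NULL && ts != NULL);

	req = calloc(1, sizeof(*req));
	if (req == NULL)
		return -1;
	req->dst = dst;

	/* Same conversion as in the patch: seconds and nanoseconds to microseconds. */
	us = (uint64_t)ts->tv_sec * 1000000 + (uint64_t)ts->tv_nsec / 1000;

	/* Step 1: arm the timeout while nothing has been sent yet. */
	if (stub_alarm_set(us) < 0) {
		fprintf(stderr, "Fail to set alarm for request %s\n", dst);
		goto fail;
	}

	/* Step 2: send only after the timeout is in place. */
	if (stub_send_msg(req) < 0) {
		stub_alarm_cancel();
		goto fail;
	}

	/* From here on, the reply or the timeout path owns and frees req. */
	queued = req;
	return 0;
fail:
	free(req);
	return -1;
}
```

With this ordering a failed alarm set returns an error before anything is sent, so the caller can handle it and no request is left waiting for a timeout that will never fire.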

Thomas Monjalon Oct. 24, 2018, 11:51 p.m. UTC | #5

18/09/2018 12:16, Burakov, Anatoly:
> On 18-Sep-18 10:43 AM, Thomas Monjalon wrote:
> > 26/07/2018 11:41, Burakov, Anatoly:
> >> On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
> >>> There is no need to call rte_exit and crash the application here;
> >>> better to let the application handle the error itself.
> >>>
> >>> Remove the gratuitous profanity which would be visible if
> >>> the rte_exit was still there.
> >>>
> >>> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> >>> ---
> >>> --- a/lib/librte_eal/common/eal_common_proc.c
> >>> +++ b/lib/librte_eal/common/eal_common_proc.c
> >>> @@ -841,14 +841,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
> >>>    
> >>>    	param->user_reply.nb_sent++;
> >>>    
> >>> -	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> >>> -			      async_reply_handle, pending_req) < 0) {
> >>> +	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> >>> +				async_reply_handle, pending_req);
> >>> +	if (ret < 0)
> >>>    		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
> >>>    			dst, req->name);
> >>> -		rte_panic("Fix the above shit to properly free all memory\n");
> >>
> >> Profanity aside, i think the message was trying to tell me something -
> >> namely, that if alarm_set fails, we're risking to leak this memory if
> >> reply from the peer never comes, and we're risking leaving the
> >> application hanging because the timeout never triggers. I'm not sure if
> >> leaving this "to the user" is the right choice, because there is no way
> >> for the user to free IPC-internal memory if it leaks.
> >>
> >> So i think the proper way to handle this would've been to set the alarm
> >> first, then, if it fails, don't sent the message in the first place.
> > 
> > What should be done here? OK to remove rte_panic for now?
> > 
> 
> As i said, the above fix is wrong because it leaks memory (however 
> unlikely it may be).
> 
> The alarm set call should be moved to before we do send_msg() call (and 
> goto fail; on failure). That way, even if alarm triggers too early (i.e. 
> immediately), the requests tailq will still be locked until we complete 
> our request sends - so we appropriately free memory on response, on 
> timeout or in our failure handler if alarm set has failed.

Could someone fix it, please?

Anatoly Burakov Oct. 25, 2018, 2:04 p.m. UTC | #6

On 25-Oct-18 12:51 AM, Thomas Monjalon wrote:
> 18/09/2018 12:16, Burakov, Anatoly:
>> On 18-Sep-18 10:43 AM, Thomas Monjalon wrote:
>>> 26/07/2018 11:41, Burakov, Anatoly:
>>>> On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
>>>>> There is no need to call rte_exit and crash the application here;
>>>>> better to let the application handle the error itself.
>>>>>
>>>>> Remove the gratuitous profanity which would be visible if
>>>>> the rte_exit was still there.
>>>>>
>>>>> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
>>>>> ---
>>>>> --- a/lib/librte_eal/common/eal_common_proc.c
>>>>> +++ b/lib/librte_eal/common/eal_common_proc.c
>>>>> @@ -841,14 +841,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
>>>>>     
>>>>>     	param->user_reply.nb_sent++;
>>>>>     
>>>>> -	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
>>>>> -			      async_reply_handle, pending_req) < 0) {
>>>>> +	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
>>>>> +				async_reply_handle, pending_req);
>>>>> +	if (ret < 0)
>>>>>     		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
>>>>>     			dst, req->name);
>>>>> -		rte_panic("Fix the above shit to properly free all memory\n");
>>>>
>>>> Profanity aside, i think the message was trying to tell me something -
>>>> namely, that if alarm_set fails, we're risking to leak this memory if
>>>> reply from the peer never comes, and we're risking leaving the
>>>> application hanging because the timeout never triggers. I'm not sure if
>>>> leaving this "to the user" is the right choice, because there is no way
>>>> for the user to free IPC-internal memory if it leaks.
>>>>
>>>> So i think the proper way to handle this would've been to set the alarm
>>>> first, then, if it fails, don't sent the message in the first place.
>>>
>>> What should be done here? OK to remove rte_panic for now?
>>>
>>
>> As i said, the above fix is wrong because it leaks memory (however
>> unlikely it may be).
>>
>> The alarm set call should be moved to before we do send_msg() call (and
>> goto fail; on failure). That way, even if alarm triggers too early (i.e.
>> immediately), the requests tailq will still be locked until we complete
>> our request sends - so we appropriately free memory on response, on
>> timeout or in our failure handler if alarm set has failed.
> 
> Someone to fix it, please?
> 

I'll do it.

Patch

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index 9fcb9121908d..07b7579c565a 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -841,14 +841,12 @@  mp_request_async(const char *dst, struct rte_mp_msg *req,
 
 	param->user_reply.nb_sent++;
 
-	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
-			      async_reply_handle, pending_req) < 0) {
+	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
+				async_reply_handle, pending_req);
+	if (ret < 0)
 		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
 			dst, req->name);
-		rte_panic("Fix the above shit to properly free all memory\n");
-	}
-
-	return 0;
+	return ret;
 fail:
 	free(pending_req);
 	free(reply_msg);