[dpdk-dev] [PATCH v3 2/3] eal: add synchronous multi-process communication

Tan, Jianfeng jianfeng.tan at intel.com
Thu Jan 25 18:10:34 CET 2018



On 1/26/2018 12:22 AM, Burakov, Anatoly wrote:
> On 25-Jan-18 3:03 PM, Ananyev, Konstantin wrote:
>>
>>
>>> -----Original Message-----
>>> From: Burakov, Anatoly
>>> Sent: Thursday, January 25, 2018 1:10 PM
>>> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Tan, 
>>> Jianfeng <jianfeng.tan at intel.com>; dev at dpdk.org
>>> Cc: Richardson, Bruce <bruce.richardson at intel.com>; thomas at monjalon.net
>>> Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous 
>>> multi-process communication
>>>
>>> On 25-Jan-18 1:05 PM, Burakov, Anatoly wrote:
>>>> On 25-Jan-18 1:00 PM, Ananyev, Konstantin wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Burakov, Anatoly
>>>>>> Sent: Thursday, January 25, 2018 12:26 PM
>>>>>> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Tan, 
>>>>>> Jianfeng
>>>>>> <jianfeng.tan at intel.com>; dev at dpdk.org
>>>>>> Cc: Richardson, Bruce <bruce.richardson at intel.com>; 
>>>>>> thomas at monjalon.net
>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process
>>>>>> communication
>>>>>>
>>>>>> On 25-Jan-18 12:19 PM, Ananyev, Konstantin wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Burakov, Anatoly
>>>>>>>> Sent: Thursday, January 25, 2018 12:00 PM
>>>>>>>> To: Tan, Jianfeng <jianfeng.tan at intel.com>; dev at dpdk.org
>>>>>>>> Cc: Richardson, Bruce <bruce.richardson at intel.com>; Ananyev,
>>>>>>>> Konstantin <konstantin.ananyev at intel.com>; thomas at monjalon.net
>>>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process
>>>>>>>> communication
>>>>>>>>
>>>>>>>> On the overall patch,
>>>>>>>>
>>>>>>>> Reviewed-by: Anatoly Burakov <anatoly.burakov at intel.com>
>>>>>>>>
>>>>>>>> For request(), returning the number of replies received actually
>>>>>>>> makes sense, because we can then use that value to read our replies
>>>>>>>> if we are a primary process sending messages to secondary processes.
>>>>>>>
>>>>>>> Yes, I also think it is good to return the number of sends.
>>>>>>> Then the caller can compare the number of sent requests with the
>>>>>>> number of received replies and decide whether it should be
>>>>>>> considered a failure or not.
>>>>>>>
>>>>>>
>>>>>> Well, OK, that might make sense. However, i think it would be of
>>>>>> more value to make the API consistent (0/-1 on success/failure) and
>>>>>> put the number of sent messages into the reply, like the number
>>>>>> received, i.e. something like
>>>>>>
>>>>>> struct reply {
>>>>>>       int nb_sent;
>>>>>>       int nb_received;
>>>>>> };
>>>>>>
>>>>>> We do it for the latter already, so why not the former?
>>>>>
>>>>> The question is what to treat as success/failure.
>>>>> Let's say we sent 2 requests (of 3 possible) and got back 1 response...
>>>>> Should we consider that a success or a failure?
>>>>>
>>>>
>>>> I think "failure" is "something went wrong", not "secondary processes
>>>> didn't respond". For example, invalid parameters, or our socket 
>>>> suddenly
>>>> being closed, or some other error that prevents us from sending 
>>>> requests
>>>> to secondaries.
>>>>
>>>> As far as i can tell from the code, there's no way to know if the
>>>> secondary process is running other than by attempting to connect to
>>>> it and getting a response. So, a failed connection should not be a
>>>> failure condition, because we can't know if we *can* connect to the
>>>> process until we do. The process may have ended, but its socket files
>>>> will still be around, and there's nothing we can do about that. So i
>>>> wouldn't consider inability to send a message a failure condition.
>>>>
>>>
>>> Just to clarify - i'm suggesting leaving this decision up to the user.
>>> If a user expects there to be "n" processes running, but only "m"
>>> responses were received, they could treat it as an error. Another user
>>> might simply send periodic updates/polls to secondaries, for whatever
>>> reason (say, stats display), and won't really care if one of them just
>>> died, so there's no error for that user.
>>>
>>> However, all of this has nothing to do with the API. If we're able to
>>> send messages - it's not a failure. If we can't - it is. That's the
>>> part the API should be concerned about, and that's what the return
>>> value should indicate, IMO.
>>
>> OK, so to clarify, you are suggesting:
>> we have N peers; if send_msg() returns success for all N, return
>> success (no matter whether we got a reply or not).
>> Otherwise, return a failure?
>> Konstantin
>
> More along the lines of, return -1 if and only if something went 
> wrong. That might be invalid parameters, or that might be an error 
> with our own socket,

To check whether the error is caused by our own socket, do we check the
errno after sendmsg()?

That is, for remote socket errors, we would check for:
- ECONNRESET
- ECONNREFUSED
- ENOBUFS

Right?
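
As a rough sketch of that check, and not code from the patch: the helper
names mp_errno_is_remote() and mp_request_status() are hypothetical, and
treating exactly the three errno values listed above as "remote" is an
assumption that would still need confirming. The second helper applies the
proposed convention of returning -1 only for problems on our own side.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if an errno left by sendmsg() is assumed to
 * point at the peer (e.g. a secondary that has exited) rather than at our
 * own socket. The list mirrors the one in this mail and is an assumption.
 */
static bool
mp_errno_is_remote(int err)
{
	switch (err) {
	case ECONNRESET:
	case ECONNREFUSED:
	case ENOBUFS:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: send_errs[i] is 0 if sendmsg() to peer i succeeded,
 * otherwise the errno it left behind. Returns -1 for invalid parameters or
 * for any error not classified as remote; otherwise returns 0 and stores
 * the number of successful sends in *nb_sent.
 */
static int
mp_request_status(const int *send_errs, int nb_peers, int *nb_sent)
{
	int i, sent = 0;

	if (send_errs == NULL || nb_sent == NULL || nb_peers < 0)
		return -1; /* invalid parameters: our side */

	for (i = 0; i < nb_peers; i++) {
		if (send_errs[i] == 0)
			sent++;
		else if (!mp_errno_is_remote(send_errs[i]))
			return -1; /* assumed to be our own socket */
		/* remote error: the peer is probably gone, skip it */
	}

	*nb_sent = sent;
	return 0;
}
```

With this shape, a caller that expects a fixed number of secondaries can
compare nb_sent (and the number of replies) against that count itself,
while a caller that only polls for stats can ignore the difference.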

Thanks,
Jianfeng


> or something else to that effect. In all other cases, return 0 (that
> includes cases where we sent N messages but received M replies, where
> N != M). So, in other words, return 0 if we *could have succeeded* if
> nothing went wrong on the other side, and only return -1 if something
> went wrong on our side.
>
>>
>>
>>>
>>> -- 
>>> Thanks,
>>> Anatoly
>
>


