[dpdk-dev] [PATCH v3 2/3] eal: add synchronous multi-process communication
Tan, Jianfeng
jianfeng.tan at intel.com
Thu Jan 25 18:10:34 CET 2018
On 1/26/2018 12:22 AM, Burakov, Anatoly wrote:
> On 25-Jan-18 3:03 PM, Ananyev, Konstantin wrote:
>>
>>
>>> -----Original Message-----
>>> From: Burakov, Anatoly
>>> Sent: Thursday, January 25, 2018 1:10 PM
>>> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Tan,
>>> Jianfeng <jianfeng.tan at intel.com>; dev at dpdk.org
>>> Cc: Richardson, Bruce <bruce.richardson at intel.com>; thomas at monjalon.net
>>> Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous
>>> multi-process communication
>>>
>>> On 25-Jan-18 1:05 PM, Burakov, Anatoly wrote:
>>>> On 25-Jan-18 1:00 PM, Ananyev, Konstantin wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Burakov, Anatoly
>>>>>> Sent: Thursday, January 25, 2018 12:26 PM
>>>>>> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Tan,
>>>>>> Jianfeng
>>>>>> <jianfeng.tan at intel.com>; dev at dpdk.org
>>>>>> Cc: Richardson, Bruce <bruce.richardson at intel.com>;
>>>>>> thomas at monjalon.net
>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process
>>>>>> communication
>>>>>>
>>>>>> On 25-Jan-18 12:19 PM, Ananyev, Konstantin wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Burakov, Anatoly
>>>>>>>> Sent: Thursday, January 25, 2018 12:00 PM
>>>>>>>> To: Tan, Jianfeng <jianfeng.tan at intel.com>; dev at dpdk.org
>>>>>>>> Cc: Richardson, Bruce <bruce.richardson at intel.com>; Ananyev,
>>>>>>>> Konstantin <konstantin.ananyev at intel.com>; thomas at monjalon.net
>>>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process
>>>>>>>> communication
>>>>>>>>
>>>>>>>> On the overall patch,
>>>>>>>>
>>>>>>>> Reviewed-by: Anatoly Burakov <anatoly.burakov at intel.com>
>>>>>>>>
>>>>>>>> For request(), returning the number of replies received actually
>>>>>>>> makes sense, because we can then use that value to read our replies
>>>>>>>> if we are a primary process sending messages to secondary processes.
>>>>>>>
>>>>>>> Yes, I also think it is good to return the number of requests sent.
>>>>>>> The caller can then compare the number of requests sent with the
>>>>>>> number of replies received and decide whether to treat it as a
>>>>>>> failure.
>>>>>>>
>>>>>>
>>>>>> Well, OK, that might make sense. However, i think it would be of
>>>>>> more value to make the API consistent (0/-1 on success/failure) and
>>>>>> put the number of sent messages into the reply, like the number
>>>>>> received, i.e. something like
>>>>>>
>>>>>> struct reply {
>>>>>> int nb_sent;
>>>>>> int nb_received;
>>>>>> };
>>>>>>
>>>>>> We do it for the latter already, so why not the former?
>>>>>
>>>>> The question is what to treat as success/failure.
>>>>> Let's say we sent 2 requests (of 3 possible) and got back 1 response...
>>>>> Should we consider that a success or a failure?
>>>>>
>>>>
>>>> I think "failure" is "something went wrong", not "secondary processes
>>>> didn't respond". For example, invalid parameters, or our socket
>>>> suddenly being closed, or some other error that prevents us from
>>>> sending requests to secondaries.
>>>>
>>>> As far as i can tell from the code, there's no way to know if the
>>>> secondary process is running other than by attempting to connect to
>>>> it and get a response. So, a failed connection should not be a
>>>> failure condition, because we can't know if we *can* connect to the
>>>> process until we do. The process may have ended, but its socket files
>>>> will still be around, and there's nothing we can do about that. So i
>>>> wouldn't consider inability to send a message a failure condition.
>>>>
>>>
>>> Just to clarify - i'm suggesting leaving this decision up to the user.
>>> If a user expects there to be "n" processes running, but only "m"
>>> responses were received, he could treat it as an error. Another user
>>> might simply send periodic updates/polls to secondaries, for whatever
>>> reason (say, stats display), and won't really care if one of them just
>>> died, so there's no error for that user.
>>>
>>> However, all of this has nothing to do with the API. If we're able to
>>> send messages - it's not a failure. If we can't - it is. That's the
>>> part the API should be concerned about, and that's what the return
>>> value should indicate, IMO.
>>
>> Ok so to clarify, you are suggesting:
>> we have N peers - if send_msg() returns success for all N - return
>> success (no matter whether we got a reply or not).
>> Otherwise return a failure.
>> ?
>> Konstantin
>
> More along the lines of, return -1 if and only if something went
> wrong. That might be invalid parameters, or that might be an error
> with our own socket,
To check whether the error is caused by our own socket, do we check errno
after sendmsg()?
For example, would we treat the following as remote socket errors:
- ECONNRESET
- ECONNREFUSED
- ENOBUFS
Right?
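Something like the helper below is what I have in mind. It is only a
minimal sketch: the name mp_errno_is_remote() is made up for illustration
and is not in the patch, and whether ENOBUFS really belongs with the
remote-side errors is part of the question above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): given the errno left by a
 * failed sendmsg(), report whether it is one of the values listed above,
 * i.e. one we would blame on the peer rather than on our own socket.
 */
static bool
mp_errno_is_remote(int err)
{
	switch (err) {
	case ECONNRESET:   /* peer reset the connection */
	case ECONNREFUSED: /* nobody is listening on the peer socket */
	case ENOBUFS:      /* assumption: counted as remote, per the list above */
		return true;
	default:
		return false;  /* everything else counts as an error on our side */
	}
}
```

The caller would skip the peer when this returns true and fail the whole
request otherwise.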
Thanks,
Jianfeng
> or something else to that effect. In all other cases, return 0 (that
> includes cases where we sent N messages but M replies where N != M).
> So, in other words, return 0 if we *could have succeeded* if nothing
> went wrong on the other side, and only return -1 if something went
> wrong on our side.
>
>>
>>
>>>
>>> --
>>> Thanks,
>>> Anatoly
>
>