[dpdk-dev] [PATCH] vhost: fix connect hang in client mode

Ilya Maximets i.maximets at samsung.com
Thu Jul 21 14:13:14 CEST 2016



On 21.07.2016 15:10, Ilya Maximets wrote:
> On 21.07.2016 14:40, Yuanhan Liu wrote:
>> On Thu, Jul 21, 2016 at 02:14:59PM +0300, Ilya Maximets wrote:
>>>> Hmm, how about this fixup:
>>>> ------------------------------------------------------------------------------
>>>> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
>>>> index 8626d13..b0f45e6 100644
>>>> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
>>>> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
>>>> @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>>>>  	errno = EINVAL;
>>>>  
>>>>  	ret = connect(fd, un, sz);
>>>> -	if (ret == -1 && errno != EINPROGRESS)
>>>> -		return -1;
>>>> -	if (ret == 0)
>>>> -		goto connected;
>>>> -
>>>> -	FD_ZERO(&fdset);
>>>> -	FD_SET(fd, &fdset);
>>>> -
>>>> -	ret = select(fd + 1, NULL, &fdset, NULL, &tv);
>>>> -	if (!ret)
>>>> -		errno = ETIMEDOUT;
>>>> -	if (ret != 1)
>>>> +	if (ret < 0 && errno != EISCONN)
>>>>  		return -1;
>>>>  
>>>>  	ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len);
>>>> @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>>>>  		return -1;
>>>>  	}
>>>>  
>>>> -connected:
>>>>  	flags = fcntl(fd, F_GETFL, 0);
>>>>  	if (flags < 0) {
>>>>  		RTE_LOG(ERR, VHOST_CONFIG,
>>>> ------------------------------------------------------------------------------
>>>> ?
>>>>
>>>> We will not check for EINPROGRESS, but a subsequent 'connect()' will return
>>>> EISCONN if the connection is already established. getsockopt() is kept just
>>>> in case. The subsequent 'connect()' will happen on the next iteration of
>>>> the reconnection cycle (1 second sleep).
>>>
>>> I've sent v2 with these changes.
>>
>> Thanks. But still, it doesn't look clean to me. I was thinking the following
>> might be cleaner?
>>
>>     diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c
>>     b/lib/librte_vhost/vhost_user/vhost-net-user.
>>     index f0f92f8..c0ef290 100644
>>     --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
>>     +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
>>     @@ -532,6 +532,10 @@ vhost_user_client_reconnect(void *arg __rte_unused)
>>                          reconn != NULL; reconn = next) {
>>                             next = TAILQ_NEXT(reconn, next);
>>     
>>     +                       if (reconn->conn_inprogress) {
>>     +                               /* do connect check here */
>>     +                       }
>>     +
>>                             if (connect(reconn->fd, (struct sockaddr *)&reconn->un,
>>                                         sizeof(reconn->un)) < 0)
>>                                     continue;
>>     @@ -605,6 +609,7 @@ vhost_user_create_client(struct vhost_user_socket *vsocket)
>>             reconn->un = un;
>>             reconn->fd = fd;
>>             reconn->vsocket = vsocket;
>>     +       reconn->conn_inprogress = errno == EINPROGRESS;
>>             pthread_mutex_lock(&reconn_list.mutex);
>>             TAILQ_INSERT_TAIL(&reconn_list.head, reconn, next);
>>             pthread_mutex_unlock(&reconn_list.mutex);
>>
>> It's just a rough diff, hopefully it shows my idea clearly. And of
>> course, we should not call connect() anymore when conn_inprogress
>> is set.
>>
>> What do you think of it?
> 
> I found that we can't check the connection status without select/poll
> on it. 'getsockopt()' will return 0 with no errors if the connection
> is not yet established, just as it would if it were.
> So, I think, the first version of this patch is the only
> acceptable solution.

Sorry, v2 is acceptable too, because it always calls 'connect()'.
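
To illustrate the point about always calling 'connect()', here is a minimal
standalone sketch of that check. It is not the code from the patch: the
helper names make_nonblock_unix_socket() and try_connect() are hypothetical
and exist only for this example. The one thing taken from the fixup above is
the condition "ret < 0 && errno != EISCONN".

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a non-blocking AF_UNIX stream socket. */
static int
make_nonblock_unix_socket(void)
{
	int fd = socket(AF_UNIX, SOCK_STREAM, 0);
	int flags;

	if (fd < 0)
		return -1;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper mirroring the check in the fixup: call connect()
 * unconditionally. A return of 0 means the connection was made by this
 * call; EISCONN means an earlier call already completed it. Any other
 * error is reported as "not connected", so the caller can retry later.
 */
static int
try_connect(int fd, const struct sockaddr_un *un)
{
	int ret = connect(fd, (const struct sockaddr *)un, sizeof(*un));

	if (ret < 0 && errno != EISCONN)
		return -1;
	return 0;
}
```

A reconnection loop could then call try_connect() once per iteration (for
example after each 1 second sleep) and treat the first 0 as "connected",
with no need to remember whether an earlier attempt was still in progress.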

