[dpdk-dev] pthread_barrier_deadlock in -rc1

Tan, Jianfeng jianfeng.tan at intel.com
Wed May 2 11:32:30 CEST 2018


Hi Maxime and Olivier,

[...]
>>> Below patch can fix another strange sigsegv issue in my VM. Please check
>>> if it works for you. I doubt it's use-after-free problem which could
>>> lead to different issues in different env. Please have a try.
>>>
>>>
>>> diff --git a/lib/librte_eal/common/eal_common_thread.c
>>> b/lib/librte_eal/common/eal_common_thread.c
>>> index de69452..d91b67d 100644
>>> --- a/lib/librte_eal/common/eal_common_thread.c
>>> +++ b/lib/librte_eal/common/eal_common_thread.c
>>> @@ -205,6 +205,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char
>>> *name,
>>>                   goto fail;
>>>
>>>           pthread_barrier_wait(&params->configured);
>>> +       pthread_barrier_destroy(&params->configured);
>> Thanks Jianfeng, that fixes my issue.
>> For correctness, I wonder whether we should check pthread_barrier_wait
>> return, and only call destroy() if PTHREAD_BARRIER_SERIAL_THREAD?
>> And so also do same the same thing in rte_thread_init().
>>
>> What do you think?
>> Thanks,
>> Maxime
>
> Thanks for the update. I also have a patch that replaces the barrier by
> a lock which could also work, but if Jianfeng's one fixes the issue, I
> think it is better.
>
> About the PTHREAD_BARRIER_SERIAL_THREAD, not sure it will change
> something:
>
>         Upon successful completion, the pthread_barrier_wait() function
>         shall return PTHREAD_BARRIER_SERIAL_THREAD for a single
>         (arbitrary) thread synchronized at the barrier and zero for each
>         of the other threads. Otherwise, an error number shall be
>         returned to indicate the error.
>
> I understand that it will ensure that only one barrier will return
> PTHREAD_BARRIER_SERIAL_THREAD, but not necessarily the last one. So
> if destroy() is called in the parent thread, it should be the same, no?
>
> By the way, there is also a small memory leak that was introduced by
> the previous patch, maybe you can add the fix too:
>
> -       if (ret != 0)
> +       if (ret != 0) {
> +               free(params);
>                  return ret;
> +       }

How about: the thread who gets PTHREAD_BARRIER_SERIAL_THREAD returned, 
is responsible for the destroy and free(params)?

Thanks,
Jianfeng


More information about the dev mailing list