[dpdk-stable] [PATCH v16.11 LTS] vhost: protect active rings from async ring changes

Maxime Coquelin maxime.coquelin at redhat.com
Mon Mar 5 15:25:07 CET 2018



On 03/05/2018 02:05 PM, Luca Boccassi wrote:
> On Mon, 2018-03-05 at 13:34 +0100, Maxime Coquelin wrote:
>> With up-to-date Yuanhan address.
>>
>> On 03/05/2018 01:32 PM, Maxime Coquelin wrote:
>>>
>>>
>>> On 03/02/2018 06:28 PM, Luca Boccassi wrote:
>>>> On Fri, 2018-03-02 at 18:10 +0100, Maxime Coquelin wrote:
>>>>> From: Victor Kaplansky <victork at redhat.com>
>>>>>
>>>>> [ backported from upstream commit
>>>>> a3688046995f88c518fa27c45b39ae389260b18d ]
>>>>>
>>>>> When performing live migration or memory hot-plugging,
>>>>> the changes to the device and vrings made by message handler
>>>>> done independently from vring usage by PMD threads.
>>>>>
>>>>> This causes for example segfaults during live-migration
>>>>> with MQ enable, but in general virtually any request
>>>>> sent by qemu changing the state of device can cause
>>>>> problems.
>>>>>
>>>>> These patches fixes all above issues by adding a spinlock
>>>>> to every vring and requiring message handler to start operation
>>>>> only after ensuring that all PMD threads related to the device
>>>>> are out of critical section accessing the vring data.
>>>>>
>>>>> Each vring has its own lock in order to not create contention
>>>>> between PMD threads of different vrings and to prevent
>>>>> performance degradation by scaling queue pair number.
>>>>>
>>>>> See https://bugzilla.redhat.com/show_bug.cgi?id=1450680
>>>>>
>>>>> Cc: stable at dpdk.org
>>>>> Signed-off-by: Victor Kaplansky <victork at redhat.com>
>>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>>> Acked-by: Yuanhan Liu <yliu at fridaylinux.org>
>>>>>
>>>>> Backport conflicts:
>>>>>      lib/librte_vhost/vhost.c
>>>>>      lib/librte_vhost/vhost.h
>>>>>      lib/librte_vhost/vhost_user.c
>>>>>      lib/librte_vhost/virtio_net.c
>>>>>
>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>>> ---
>>>>>
>>>>> Hi Luca, All,
>>>>>
>>>>> This is the v16.11 backport for Victor's patch already
>>>>> available in
>>>>> master and v17.11 LTS. It needed some rework to be applied to
>>>>> v16.11.
>>>>
>>>> Thank you, applied and pushed to dpdk-stable/16.11.
>>>>
>>>
>>> Thanks Luca,
>>>
>>> There is another patch that would be applied on top of it, as
>>> Victor's
>>> patch introduce a regression with Virtio-user. I see it is neither
>>> in
>>> 16.11 nor 17.11 LTS:
>>>
>>> commit 9fce5d0b401fc2c13a860bbbfdebcf85080334e1
>>> Author: Maxime Coquelin <maxime.coquelin at redhat.com>
>>> Date:   Mon Feb 12 16:46:12 2018 +0100
>>>
>>>       vhost: do not take lock on owner reset
>>>
>>>       A deadlock happens when handling VHOST_USER_RESET_OWNER
>>> request
>>>       for the same reason the lock is not taken for
>>>       VHOST_USER_GET_VRING_BASE.
>>>
>>>       It is safe not to take the lock, as the queues are no more
>>> used
>>>       by the application when the virtqueues and the device are
>>> reset.
>>>
>>>       Fixes: a3688046995f ("vhost: protect active rings from async
>>> ring
>>> changes")
>>>       Cc: stable at dpdk.org
>>>
>>>       Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>       Reviewed-by: Tiwei Bie <tiwei.bie at intel.com>
>>>       Reviewed-by: Jianfeng Tan <jianfeng.tan at intel.com>
>>>
>>>
>>> Let me know if you want me to post the backport to stable at dpdk.org,
>>> or if you can pick it directly from upstream master.
>>>
>>> Cheers,
>>> Maxime
> 
> I can take care of that for 16.11 - have you tested it on top of the
> current dpdk-stable/16.11 ? We are in the 11th hour so I want to make
> sure any new patches that I pick are tested :-)

I understand!
So I just tried to test the patch with virtio-usr, but it is not enabled 
in default config, and I didn't made it to work when enabled.

The issue it solves can be reproduced with old QEMU that sends
_RESET_OWNER request (v2.4 and earlier). With this, I manage to
reproduce the bug, and once patch is applied, the deadlock no more
appear, so:

Tested-by: Maxime Coquelin <maxime.coquelin at redhat.com>

Thanks!
Maxime


More information about the stable mailing list