[dpdk-dev] vhost_user: protect active rings from async ring changes

From: Maxime Coquelin <maxime.coquelin@redhat.com>

  Hi Jianfeng,

On 12/08/2017 09:51 AM, Tan, Jianfeng wrote:
> 
> 
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Friday, December 8, 2017 4:36 PM
>> To: Tan, Jianfeng; Victor Kaplansky; dev@dpdk.org; yliu@fridaylinux.org; Bie,
>> Tiwei
>> Cc: stable@dpdk.org; jfreiman@redhat.com
>> Subject: Re: [PATCH] vhost_user: protect active rings from async ring
>> changes
>>
>>
>>
>> On 12/08/2017 03:14 AM, Tan, Jianfeng wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>>>> Sent: Thursday, December 7, 2017 6:02 PM
>>>> To: Tan, Jianfeng; Victor Kaplansky; dev@dpdk.org; yliu@fridaylinux.org;
>> Bie,
>>>> Tiwei
>>>> Cc: stable@dpdk.org; jfreiman@redhat.com
>>>> Subject: Re: [PATCH] vhost_user: protect active rings from async ring
>>>> changes
>>>>
>>>>
>>>>
>>>> On 12/07/2017 10:33 AM, Tan, Jianfeng wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Victor Kaplansky [mailto:vkaplans@redhat.com]
>>>>>> Sent: Wednesday, December 6, 2017 9:56 PM
>>>>>> To: dev@dpdk.org; yliu@fridaylinux.org; Bie, Tiwei; Tan, Jianfeng;
>>>>>> vkaplans@redhat.com
>>>>>> Cc: stable@dpdk.org; jfreiman@redhat.com; Maxime Coquelin
>>>>>> Subject: [PATCH] vhost_user: protect active rings from async ring
>> changes
>>>>>>
>>>>>> When performing live migration or memory hot-plugging,
>>>>>> the changes to the device and vrings made by message handler
>>>>>> done independently from vring usage by PMD threads.
>>>>>>
>>>>>> This causes for example segfauls during live-migration
>>>>>
>>>>> segfauls ->segfaults?
>>>>>
>>>>>> with MQ enable, but in general virtually any request
>>>>>> sent by qemu changing the state of device can cause
>>>>>> problems.
>>>>>>
>>>>>> These patches fixes all above issues by adding a spinlock
>>>>>> to every vring and requiring message handler to start operation
>>>>>> only after ensuring that all PMD threads related to the divece
>>>>>
>>>>> Another typo: divece.
>>>>>
>>>>>> are out of critical section accessing the vring data.
>>>>>>
>>>>>> Each vring has its own lock in order to not create contention
>>>>>> between PMD threads of different vrings and to prevent
>>>>>> performance degradation by scaling queue pair number.
>>>>>
>>>>> Also wonder how much overhead it brings.
>>>>>
>>>>> Instead of locking each vring, can we just, waiting a while (10us for
>> example)
>>>> after call destroy_device() callback so that every PMD thread has enough
>>>> time to skip out the criterial area?
>>>>
>>>> No, because we are not destroying the device when it is needed.
>>>> Actually, once destroy_device() is called, it is likely that the
>>>> application has taken care the ring aren't being processed anymore
>>>> before returning from the callback (This is at least the case with Vhost
>>>> PMD).
>>>
>>> OK, I did not put it right way as there are multiple cases above: migration
>> and memory hot plug. Let me try again:
>>>
>>> Whenever a vhost thread handles a message affecting PMD threads, (like
>> SET_MEM_TABLE, GET_VRING_BASE, etc) we can remove the dev flag -
>> VIRTIO_DEV_RUNNING, and wait for a while so that PMD threads skip out of
>> those criterial area. After message handling, reset the flag -
>> VIRTIO_DEV_RUNNING.
>>
>> I think you mean clearing vq's enabled flag, because PMD threads never
>> check the VIRTIO_DEV_RUNNING flag.
> 
> Ah, yes.
> 
>>
>>> I suppose it can work, basing on an assumption that PMD threads work in
>> polling mode and can skip criterial area quickly and inevitably.
>>
>> That sounds very fragile, because if the CPU aren't perfectly isolated,
>> your PMD thread can be preempted for interrupt handling for example.
>>
>> Or what if for some reason the PMD thread CPU stalls for a short while?
>>
>> The later is unlikely, but if it happens, it will be hard to debug.
>>
>> Let's see first the performance impact of using the spinlock. It might
>> not be that important because 99.9999% of the times, it will not even
>> spin.
> 
> Fair enough.

I did some benchmarks on my Broadwell test bench (see patch below), and
it seems that depending on the benchmark, perfs are on par, or better
with the spinlock! I guess it explains because with the spinlock there
is a better batching and less concurrent accesses on the rings, but I'm
not sure.

Please find my results below (CPU E5-2667 v4 @ 3.20GHz):

       Bench         v17.11     v17.11 + spinlock
  ---------------- ----------- -------------------
   PVP Bidir run1   19.29Mpps   19.26Mpps
   PVP Bidir run2   19.26Mpps   19.28Mpps
   TxOnly           18.47Mpps   18.83Mpps
   RxOnly           13.83Mpps   13.83Mpps
   IO Loopback      7.94Mpps    7.97Mpps

Patch:
---------------------------------------------------------

 From 7e18cefce682235558fed66a1fb87ab937ec9297 Mon Sep 17 00:00:00 2001
From: Maxime Coquelin <maxime.coquelin@redhat.com>
Date: Fri, 8 Dec 2017 04:21:25 -0500
Subject: [PATCH] vhost: spinlock benchmarking

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
  lib/librte_vhost/vhost.c      |  2 ++
  lib/librte_vhost/vhost.h      |  2 ++
  lib/librte_vhost/virtio_net.c | 12 ++++++++++++
  3 files changed, 16 insertions(+)

Context	Check	Description
ci/checkpatch	warning	coding style issues
ci/Intel-compilation	fail	apply patch file failure

[dpdk-dev] vhost_user: protect active rings from async ring changes

Checks

Commit Message

Comments

Patch