[dpdk-stable] [PATCH v16.11 LTS] vhost: protect active rings from async ring changes

Luca Boccassi bluca at debian.org
Mon Mar 5 16:54:50 CET 2018


On Mon, 2018-03-05 at 15:25 +0100, Maxime Coquelin wrote:
> 
> On 03/05/2018 02:05 PM, Luca Boccassi wrote:
> > On Mon, 2018-03-05 at 13:34 +0100, Maxime Coquelin wrote:
> > > With up-to-date Yuanhan address.
> > > 
> > > On 03/05/2018 01:32 PM, Maxime Coquelin wrote:
> > > > 
> > > > 
> > > > On 03/02/2018 06:28 PM, Luca Boccassi wrote:
> > > > > On Fri, 2018-03-02 at 18:10 +0100, Maxime Coquelin wrote:
> > > > > > From: Victor Kaplansky <victork at redhat.com>
> > > > > > 
> > > > > > [ backported from upstream commit
> > > > > > a3688046995f88c518fa27c45b39ae389260b18d ]
> > > > > > 
> > > > > > When performing live migration or memory hot-plugging,
> > > > > > the changes to the device and vrings made by message
> > > > > > handler
> > > > > > done independently from vring usage by PMD threads.
> > > > > > 
> > > > > > This causes for example segfaults during live-migration
> > > > > > with MQ enable, but in general virtually any request
> > > > > > sent by qemu changing the state of device can cause
> > > > > > problems.
> > > > > > 
> > > > > > These patches fixes all above issues by adding a spinlock
> > > > > > to every vring and requiring message handler to start
> > > > > > operation
> > > > > > only after ensuring that all PMD threads related to the
> > > > > > device
> > > > > > are out of critical section accessing the vring data.
> > > > > > 
> > > > > > Each vring has its own lock in order to not create
> > > > > > contention
> > > > > > between PMD threads of different vrings and to prevent
> > > > > > performance degradation by scaling queue pair number.
> > > > > > 
> > > > > > See https://bugzilla.redhat.com/show_bug.cgi?id=1450680
> > > > > > 
> > > > > > Cc: stable at dpdk.org
> > > > > > Signed-off-by: Victor Kaplansky <victork at redhat.com>
> > > > > > Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> > > > > > Acked-by: Yuanhan Liu <yliu at fridaylinux.org>
> > > > > > 
> > > > > > Backport conflicts:
> > > > > >      lib/librte_vhost/vhost.c
> > > > > >      lib/librte_vhost/vhost.h
> > > > > >      lib/librte_vhost/vhost_user.c
> > > > > >      lib/librte_vhost/virtio_net.c
> > > > > > 
> > > > > > Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> > > > > > ---
> > > > > > 
> > > > > > Hi Luca, All,
> > > > > > 
> > > > > > This is the v16.11 backport for Victor's patch already
> > > > > > available in
> > > > > > master and v17.11 LTS. It needed some rework to be applied
> > > > > > to
> > > > > > v16.11.
> > > > > 
> > > > > Thank you, applied and pushed to dpdk-stable/16.11.
> > > > > 
> > > > 
> > > > Thanks Luca,
> > > > 
> > > > There is another patch that would be applied on top of it, as
> > > > Victor's
> > > > patch introduce a regression with Virtio-user. I see it is
> > > > neither
> > > > in
> > > > 16.11 nor 17.11 LTS:
> > > > 
> > > > commit 9fce5d0b401fc2c13a860bbbfdebcf85080334e1
> > > > Author: Maxime Coquelin <maxime.coquelin at redhat.com>
> > > > Date:   Mon Feb 12 16:46:12 2018 +0100
> > > > 
> > > >       vhost: do not take lock on owner reset
> > > > 
> > > >       A deadlock happens when handling VHOST_USER_RESET_OWNER
> > > > request
> > > >       for the same reason the lock is not taken for
> > > >       VHOST_USER_GET_VRING_BASE.
> > > > 
> > > >       It is safe not to take the lock, as the queues are no
> > > > more
> > > > used
> > > >       by the application when the virtqueues and the device are
> > > > reset.
> > > > 
> > > >       Fixes: a3688046995f ("vhost: protect active rings from
> > > > async
> > > > ring
> > > > changes")
> > > >       Cc: stable at dpdk.org
> > > > 
> > > >       Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.co
> > > > m>
> > > >       Reviewed-by: Tiwei Bie <tiwei.bie at intel.com>
> > > >       Reviewed-by: Jianfeng Tan <jianfeng.tan at intel.com>
> > > > 
> > > > 
> > > > Let me know if you want me to post the backport to stable at dpdk.
> > > > org,
> > > > or if you can pick it directly from upstream master.
> > > > 
> > > > Cheers,
> > > > Maxime
> > 
> > I can take care of that for 16.11 - have you tested it on top of
> > the
> > current dpdk-stable/16.11 ? We are in the 11th hour so I want to
> > make
> > sure any new patches that I pick are tested :-)
> 
> I understand!
> So I just tried to test the patch with virtio-usr, but it is not
> enabled 
> in default config, and I didn't made it to work when enabled.
> 
> The issue it solves can be reproduced with old QEMU that sends
> _RESET_OWNER request (v2.4 and earlier). With this, I manage to
> reproduce the bug, and once patch is applied, the deadlock no more
> appear, so:
> 
> Tested-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> 
> Thanks!
> Maxime

Great, thank you, applied & pushed to dpdk-stable/16.11

-- 
Kind regards,
Luca Boccassi


More information about the stable mailing list