[dpdk-dev] [PATCH] vhost: Clean iotlb cache on vring stop

Xia, Chenbo chenbo.xia at intel.com
Tue Aug 31 04:02:36 CEST 2021


Hi Eugenio,

> -----Original Message-----
> From: Eugenio Perez Martin <eperezma at redhat.com>
> Sent: Tuesday, August 31, 2021 2:10 AM
> To: Xia, Chenbo <chenbo.xia at intel.com>
> Cc: Maxime Coquelin <maxime.coquelin at redhat.com>; dev at dpdk.org; Pei Zhang
> <pezhang at redhat.com>; Jason Wang <jasowang at redhat.com>
> Subject: Re: [PATCH] vhost: Clean iotlb cache on vring stop
> 
> On Mon, Aug 30, 2021 at 1:58 PM Xia, Chenbo <chenbo.xia at intel.com> wrote:
> >
> > Hi Eugenio,
> >
> 
> Hi Chenbo,
> 
> > > -----Original Message-----
> > > From: Eugenio Pérez <eperezma at redhat.com>
> > > Sent: Saturday, August 28, 2021 12:13 AM
> > > To: Maxime Coquelin <maxime.coquelin at redhat.com>; Xia, Chenbo
> > > <chenbo.xia at intel.com>
> > > Cc: dev at dpdk.org; Pei Zhang <pezhang at redhat.com>; Jason Wang
> > > <jasowang at redhat.com>
> > > Subject: [PATCH] vhost: Clean iotlb cache on vring stop
> >
> > Clean -> clean
> >
> 
> Is that something I need to send a new revision for, or is it ok to
> apply on the maintainer side?
> 
> > >
> > > Old IOVA cache entries are left when there is a change on virtio driver
> > > in VM. In case that all these old entries have iova addresses lesser
> > > than new iova entries, vhost code will need to iterate all the cache to
> > > find the new ones. In case of just a new iova entry needed for the new
> > > translations, this condition will last forever.
> > >
> > > This has been observed in virtio-net to testpmd's vfio-pci driver
> > > transition, reducing the performance from more than 10Mpps to less than
> > > 0.07Mpps if the hugepage address was higher than the networking
> > > buffers. Since all new buffers are contained in this new gigantic page,
> > > vhost needs to scan IOTLB_CACHE_SIZE - 1 for each translation at worst.
> >
> > I'm curious why QEMU will not invalidate iotlb when virtio-net driver is
> removed
> > (dma region should be unmapped).
> >
> 
> I'm going to investigate this more, but qemu iommu notifier callback
> (vhost_iommu_unmap_notify) is never called through all the test. Also,
> guest kernel code calls dma_unmap_page for each buffer and vqs, but it
> never generates an iotlb flush.
> 
> Or do you mean that qemu should also flush all iotlb entries on vhost
> device stop?

I think as you said, the driver unmapped all entries, so theoretically the iotlb
entries should all be invalidated. The iotlb invalidation should be triggered by
iommu entry unmapping. Make sense to you?

> 
> > And since the perf drop is huge, why not cc to stable and add fix tag?
> >
> 
> I was not sure if it was worth it to backport, but I would say that
> the issue can be reproduced with enough bad luck. Since translations
> have always been saved in a linked list:
> 
> Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
> 
> Same question as before, if no changes to the code are needed for the
> patch, do I need to send a second revision?

I think it sounds a bug to me. So no matter how often it could be reproduced, it's
worth to backport. I suggest to send a v2 with the title fixed and Fixes & cc
stable tag added so that it will show in stable mailing list and save Maxime's effort
when applying.

Thanks,
Chenbo

> 
> Thanks!
> 
> 
> > Thanks,
> > Chenbo
> >
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma at redhat.com>
> > > Reported-by: Pei Zhang <pezhang at redhat.com>
> > > ---
> > >  lib/vhost/vhost_user.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> > > index 29a4c9af60..7de48f5333 100644
> > > --- a/lib/vhost/vhost_user.c
> > > +++ b/lib/vhost/vhost_user.c
> > > @@ -2113,6 +2113,8 @@ vhost_user_get_vring_base(struct virtio_net **pdev,
> > >       msg->size = sizeof(msg->payload.state);
> > >       msg->fd_num = 0;
> > >
> > > +     vhost_user_iotlb_flush_all(vq);
> > > +
> > >       vring_invalidate(dev, vq);
> > >
> > >       return RTE_VHOST_MSG_RESULT_REPLY;
> > > --
> > > 2.27.0
> >



More information about the dev mailing list