[dpdk-dev] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue

Xia, Chenbo chenbo.xia at intel.com
Thu Jun 24 12:49:08 CEST 2021


Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> Sent: Friday, June 18, 2021 4:48 PM
> To: Xia, Chenbo <chenbo.xia at intel.com>; dev at dpdk.org;
> david.marchand at redhat.com
> Cc: stable at dpdk.org
> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> 
> 
> 
> On 6/18/21 10:21 AM, Xia, Chenbo wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> >> Sent: Friday, June 18, 2021 4:01 PM
> >> To: Xia, Chenbo <chenbo.xia at intel.com>; dev at dpdk.org;
> >> david.marchand at redhat.com
> >> Cc: stable at dpdk.org
> >> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>
> >>
> >>
> >> On 6/18/21 6:34 AM, Xia, Chenbo wrote:
> >>> Hi Maxime,
> >>>
> >>>> -----Original Message-----
> >>>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> >>>> Sent: Thursday, June 17, 2021 11:38 PM
> >>>> To: dev at dpdk.org; david.marchand at redhat.com;
> >>>> Xia, Chenbo <chenbo.xia at intel.com>
> >>>> Cc: Maxime Coquelin <maxime.coquelin at redhat.com>; stable at dpdk.org
> >>>> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>>>
> >>>> Since the Vhost-user device initialization has been reworked,
> >>>> enabling the application to start using the device as soon as
> >>>> the first queue pair is ready, NUMA reallocation no longer
> >>>> happens on queue pairs other than the first one, because
> >>>> numa_realloc() returns early if the device is running.
> >>>>
> >>>> This patch fixes this issue by only preventing the device
> >>>> metadata from being reallocated if the device is running. For
> >>>> the virtqueues, a vring state change notification is sent to
> >>>> notify the application that the queue is disabled. Since the
> >>>> callback is supposed to be blocking, it is safe to reallocate
> >>>> the virtqueue afterwards.
> >>>
> >>> Is there a corner case? numa_realloc() may happen during the
> >>> vhost-user msgs set_vring_addr/kick and set_mem_table, and during
> >>> the iotlb msg. The iotlb msg does not take the vq access lock. Could
> >>> a problem occur when numa_realloc() happens on an iotlb msg while
> >>> the app accesses the vq in the meantime?
> >>
> >> I think we are safe wrt numa_realloc(), because the app's
> >> .vring_state_changed() callback only returns once it is no longer
> >> processing the rings.
> >
> > Yes, I think it should be. But in this iotlb msg case (take vhost pmd
> > for example), can't vhost pmd still access the vq, since the vq access
> > lock is not taken? Am I missing something?
> 
> Vhost PMD sends RTE_ETH_EVENT_QUEUE_STATE, and my assumption was that
> the application would stop processing the rings when handling this
> event and only return from the callback when it's done, but it seems
> that's not done, at least in testpmd. So we may not rely on that after
> all :/.
> 
> We cannot rely on the VQ's access lock since the goal of numa_realloc is
> to reallocate the vhost_virtqueue itself, which contains the access_lock.
> Relying on it would cause a use-after-free.
> 
> Maybe the safest thing to do is to just skip the reallocation if
> vq->ready == true.
> 
> Having vq->ready == true means we already received all the vrings info
> from QEMU, which means the driver has already initialized the device.
> 
> It should not change runtime behavior compared to this patch since it
> would not reallocate anyway.
> 
> What do you think?

That sounds good to me 😊
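
To make sure we mean the same thing, here is a minimal standalone sketch
of the check as I understand it. It is not the actual patch: struct
mock_vq, mock_vq_create() and mock_numa_realloc() are made-up names for
illustration, the "node" field is only pretend bookkeeping, and malloc()
stands in for rte_malloc_socket().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vhost_virtqueue; only the fields
 * needed for the illustration are modelled. */
struct mock_vq {
	bool ready; /* true once all vring info has been received */
	int node;   /* NUMA node the structure is assumed to live on */
};

/* Allocate a mock virtqueue on the given (pretend) node. */
static struct mock_vq *
mock_vq_create(int node)
{
	struct mock_vq *vq = calloc(1, sizeof(*vq));

	if (vq != NULL)
		vq->node = node;
	return vq;
}

/*
 * Mock of the proposed guard: if the queue is already ready, return it
 * untouched, since the datapath may be using it. Otherwise move it to
 * new_node. malloc() stands in for rte_malloc_socket().
 */
static struct mock_vq *
mock_numa_realloc(struct mock_vq *vq, int new_node)
{
	struct mock_vq *new_vq;

	assert(vq != NULL);

	/* Skip reallocation for ready queues or when already on new_node. */
	if (vq->ready || vq->node == new_node)
		return vq;

	new_vq = malloc(sizeof(*new_vq));
	if (new_vq == NULL)
		return vq; /* keep the old queue on allocation failure */

	*new_vq = *vq;
	new_vq->node = new_node;
	free(vq);
	return new_vq;
}
```

With this, a queue that is already ready keeps its pointer (and so its
access_lock) valid, and only queues that are not yet ready get moved.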

Thanks,
Chenbo

> 
> > Thanks,
> > Chenbo
> >
> >>
> >>
> >>> Thanks,
> >>> Chenbo
> >>>
> >>>>
> >>>> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
> >>>> Cc: stable at dpdk.org
> >>>>
> >>>> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> >>>> ---
> >>>>  lib/vhost/vhost_user.c | 11 ++++++++---
> >>>>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> >>>> index 0e9e26ebe0..6e7b327ef8 100644
> >>>> --- a/lib/vhost/vhost_user.c
> >>>> +++ b/lib/vhost/vhost_user.c
> >>>> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  	struct batch_copy_elem *new_batch_copy_elems;
> >>>>  	int ret;
> >>>>
> >>>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
> >>>> -		return dev;
> >>>> -
> >>>>  	old_dev = dev;
> >>>>  	vq = old_vq = dev->virtqueue[index];
> >>>>
> >>>> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  		return dev;
> >>>>  	}
> >>>>  	if (oldnode != newnode) {
> >>>> +		if (vq->ready) {
> >>>> +			vq->ready = false;
> >>>> +			vhost_user_notify_queue_state(dev, index, 0);
> >>>> +		}
> >>>> +
> >>>>  		VHOST_LOG_CONFIG(INFO,
> >>>>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
> >>>>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
> >>>> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  		rte_free(old_vq);
> >>>>  	}
> >>>>
> >>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
> >>>> +		goto out;
> >>>> +
> >>>>  	/* check if we need to reallocate dev */
> >>>>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
> >>>>  			    MPOL_F_NODE | MPOL_F_ADDR);
> >>>> --
> >>>> 2.31.1
> >>>
> >


