[dpdk-dev,v2,4/4] vhost: destroy unused virtqueues when multiqueue not negotiated

Message ID: 20171205083434.14292-5-maxime.coquelin@redhat.com
State: Superseded, archived

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Maxime Coquelin Dec. 5, 2017, 8:34 a.m. UTC
QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
declared on the QEMU command line before the guest is started.
As a result, the DPDK vhost-user backend allocates vrings for
all queues declared by QEMU.

If the first driver being used does not support multiqueue,
the device never changes to the VIRTIO_DEV_RUNNING state, as only
the first queue pair is initialized. One driver impacted by
this bug is the iPXE virtio-net driver, which does not support
the VIRTIO_NET_F_MQ feature.

It is safe to destroy unused virtqueues in the SET_FEATURES request
handler, as the device is guaranteed not to be in the running state
at this stage, so the virtqueues aren't being processed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
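
The decision described in the commit message (keep only the first RX/TX
queue pair unless VIRTIO_NET_F_MQ was negotiated) can be sketched in
isolation. This is a minimal illustration, not code from the patch: the
sketch_* names are hypothetical, and only the bit position 22 for
VIRTIO_NET_F_MQ is taken from the virtio specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position of VIRTIO_NET_F_MQ in the virtio specification. */
#define SKETCH_VIRTIO_NET_F_MQ 22

/* Hypothetical helper: test one feature bit in a 64-bit feature mask.
 * The shift is done on a 64-bit constant so that bits >= 32 work too. */
static bool
sketch_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & (UINT64_C(1) << bit)) != 0;
}

/* Hypothetical helper: how many vrings should survive SET_FEATURES.
 * Without MQ, at most one RX/TX pair (two vrings) is kept. */
static unsigned int
sketch_vrings_to_keep(uint64_t features, unsigned int nr_vring)
{
	if (sketch_has_feature(features, SKETCH_VIRTIO_NET_F_MQ))
		return nr_vring;
	return nr_vring < 2 ? nr_vring : 2;
}
```

With eight vrings allocated and no MQ bit in the negotiated features,
sketch_vrings_to_keep() reports that only two should remain.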
  

Comments

Laszlo Ersek Dec. 5, 2017, 2:40 p.m. UTC | #1
Hi Maxime,

On 12/05/17 09:34, Maxime Coquelin wrote:
> QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
> declared on the QEMU command line before the guest is started.
> As a result, the DPDK vhost-user backend allocates vrings for
> all queues declared by QEMU.
>
> If the first driver being used does not support multiqueue,
> the device never changes to the VIRTIO_DEV_RUNNING state, as only
> the first queue pair is initialized. One driver impacted by
> this bug is the iPXE virtio-net driver, which does not support
> the VIRTIO_NET_F_MQ feature.
>
> It is safe to destroy unused virtqueues in the SET_FEATURES request
> handler, as the device is guaranteed not to be in the running state
> at this stage, so the virtqueues aren't being processed.
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index a5e1f2482..b17080215 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -173,6 +173,7 @@ vhost_user_get_features(struct virtio_net *dev)
>  static int
>  vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>  {
> +	int i;
>  	uint64_t vhost_features = 0;
>  
>  	rte_vhost_driver_get_features(dev->ifname, &vhost_features);
> @@ -216,6 +217,24 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>  		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
>  		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
>  
> +	if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
> +		/*
> +		 * Remove all but first queue pair if MQ hasn't been
> +		 * negotiated. This is safe because the device is not
> +		 * running at this stage.
> +		 */
> +		for (i = dev->nr_vring; i > 1; i--) {
> +			struct vhost_virtqueue *vq = dev->virtqueue[i];

Sorry, I don't have any experience with dpdk.

But if "dev->virtqueue" has "dev->nr_vring" elements, then this loop is
off by one; dev->virtqueue[dev->nr_vring] should never be accessed. For
example, if you have three queues, numbered 0, 1 and 2, this loop will
access/release virtqueue[3] (bad) and virtqueue[2] (good).

Instead, I suggest:

  i = dev->nr_vring;
  while (i > 2) {
    struct vhost_virtqueue *vq;

    vq = dev->virtqueue[--i];
    /* the rest here */
  }

The loop body is entered with "i" standing for "how many queues are left
that should be freed".

Thanks
Laszlo

> +
> +			if (!vq)
> +				continue;
> +
> +			cleanup_vq(vq, 1);
> +			free_vq(vq);
> +			dev->nr_vring--;
> +		}
> +	}
> +
>  	return 0;
>  }
>  
>
  
Maxime Coquelin Dec. 5, 2017, 2:56 p.m. UTC | #2
Hi Laszlo,

On 12/05/2017 03:40 PM, Laszlo Ersek wrote:
> Hi Maxime,
> 
> On 12/05/17 09:34, Maxime Coquelin wrote:
>> QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues
>> declared on the QEMU command line before the guest is started.
>> As a result, the DPDK vhost-user backend allocates vrings for
>> all queues declared by QEMU.
>>
>> If the first driver being used does not support multiqueue,
>> the device never changes to the VIRTIO_DEV_RUNNING state, as only
>> the first queue pair is initialized. One driver impacted by
>> this bug is the iPXE virtio-net driver, which does not support
>> the VIRTIO_NET_F_MQ feature.
>>
>> It is safe to destroy unused virtqueues in the SET_FEATURES request
>> handler, as the device is guaranteed not to be in the running state
>> at this stage, so the virtqueues aren't being processed.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>   lib/librte_vhost/vhost_user.c | 19 +++++++++++++++++++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>> index a5e1f2482..b17080215 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -173,6 +173,7 @@ vhost_user_get_features(struct virtio_net *dev)
>>   static int
>>   vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>>   {
>> +	int i;
>>   	uint64_t vhost_features = 0;
>>   
>>   	rte_vhost_driver_get_features(dev->ifname, &vhost_features);
>> @@ -216,6 +217,24 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>>   		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
>>   		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
>>   
>> +	if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
>> +		/*
>> +		 * Remove all but first queue pair if MQ hasn't been
>> +		 * negotiated. This is safe because the device is not
>> +		 * running at this stage.
>> +		 */
>> +		for (i = dev->nr_vring; i > 1; i--) {
>> +			struct vhost_virtqueue *vq = dev->virtqueue[i];
> 
> Sorry, I don't have any experience with dpdk.
> 
> But if "dev->virtqueue" has "dev->nr_vring" elements, then this loop is
> off by one; dev->virtqueue[dev->nr_vring] should never be accessed. For
> example, if you have three queues, numbered 0, 1 and 2, this loop will
> access/release virtqueue[3] (bad) and virtqueue[2] (good).

Right, thanks for spotting this.

I didn't notice my mistake while testing it because of the NULL check
in the loop.

> Instead, I suggest:
> 
>    i = dev->nr_vring;
>    while (i > 2) {
>      struct vhost_virtqueue *vq;
> 
>      vq = dev->virtqueue[--i];
>      /* the rest here */
>    }
> 
> The loop body is entered with "i" standing for "how many queues are left
> that should be freed".

Yes, that sounds cleaner. I think dev->nr_vring can safely be
decremented directly, so "i" could be skipped.
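
A loop that decrements the ring count directly, without a separate "i",
might look like the sketch below. This is not the follow-up patch: the
sketch_* types are hypothetical stand-ins for struct virtio_net and
struct vhost_virtqueue, and free() stands in for cleanup_vq() and
free_vq().

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_MAX_VRINGS 8	/* hypothetical bound for this sketch */

/* Hypothetical stand-in for struct vhost_virtqueue. */
struct sketch_vq {
	int unused;
};

/* Hypothetical stand-in for the relevant fields of struct virtio_net. */
struct sketch_dev {
	unsigned int nr_vring;
	struct sketch_vq *virtqueue[SKETCH_MAX_VRINGS];
};

/* Allocate a mock device with nr_vring queues; NULL on failure. */
static struct sketch_dev *
sketch_dev_create(unsigned int nr_vring)
{
	struct sketch_dev *dev;
	unsigned int i;

	assert(nr_vring <= SKETCH_MAX_VRINGS);
	dev = calloc(1, sizeof(*dev));
	if (dev == NULL)
		return NULL;
	for (i = 0; i < nr_vring; i++)
		dev->virtqueue[i] = calloc(1, sizeof(struct sketch_vq));
	dev->nr_vring = nr_vring;
	return dev;
}

/* Drop every queue beyond the first pair, using nr_vring as the cursor. */
static void
sketch_drop_extra_queues(struct sketch_dev *dev)
{
	while (dev->nr_vring > 2) {
		/* Pre-decrement, so the highest index read is nr_vring - 1. */
		struct sketch_vq *vq = dev->virtqueue[--dev->nr_vring];

		free(vq);	/* free(NULL) is a no-op, so no NULL check */
		dev->virtqueue[dev->nr_vring] = NULL;
	}
}
```

Starting from six queues, sketch_drop_extra_queues() leaves nr_vring at 2
with slots 0 and 1 intact and every higher slot set to NULL.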

Thanks!
Maxime

> Thanks
> Laszlo
> 
>> +
>> +			if (!vq)
>> +				continue;
>> +
>> +			cleanup_vq(vq, 1);
>> +			free_vq(vq);
>> +			dev->nr_vring--;
>> +		}
>> +	}
>> +
>>   	return 0;
>>   }
>>   
>>
>
  

Patch

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index a5e1f2482..b17080215 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -173,6 +173,7 @@  vhost_user_get_features(struct virtio_net *dev)
 static int
 vhost_user_set_features(struct virtio_net *dev, uint64_t features)
 {
+	int i;
 	uint64_t vhost_features = 0;
 
 	rte_vhost_driver_get_features(dev->ifname, &vhost_features);
@@ -216,6 +217,24 @@  vhost_user_set_features(struct virtio_net *dev, uint64_t features)
 		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
 		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
 
+	if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
+		/*
+		 * Remove all but first queue pair if MQ hasn't been
+		 * negotiated. This is safe because the device is not
+		 * running at this stage.
+		 */
+		for (i = dev->nr_vring; i > 1; i--) {
+			struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+			if (!vq)
+				continue;
+
+			cleanup_vq(vq, 1);
+			free_vq(vq);
+			dev->nr_vring--;
+		}
+	}
+
 	return 0;
 }