[dpdk-dev,2/2] vhost: only drop vqs with built-in virtio_net.c driver

Message ID 20180131174651.6386-3-stefanha@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Stefan Hajnoczi Jan. 31, 2018, 5:46 p.m. UTC
  Commit e29109323595beb3884da58126ebb3b878cb66f5 ("vhost: destroy unused
virtqueues when multiqueue not negotiated") broke vhost-scsi by removing
virtqueues when the virtio-net-specific VIRTIO_NET_F_MQ feature bit is
missing.

The vhost_user.c code shouldn't assume all devices are vhost net device
backends.  Use the new VIRTIO_DEV_BUILTIN_VIRTIO_NET flag to check
whether virtio_net.c is being used.

This fixes examples/vhost_scsi.

Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 lib/librte_vhost/vhost_user.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Maxime Coquelin Jan. 31, 2018, 6:07 p.m. UTC | #1
Hi Stefan,

On 01/31/2018 06:46 PM, Stefan Hajnoczi wrote:
> Commit e29109323595beb3884da58126ebb3b878cb66f5 ("vhost: destroy unused
> virtqueues when multiqueue not negotiated") broke vhost-scsi by removing
> virtqueues when the virtio-net-specific VIRTIO_NET_F_MQ feature bit is
> missing.
> 
> The vhost_user.c code shouldn't assume all devices are vhost net device
> backends.  Use the new VIRTIO_DEV_BUILTIN_VIRTIO_NET flag to check
> whether virtio_net.c is being used.
> 
> This fixes examples/vhost_scsi.
> 
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   lib/librte_vhost/vhost_user.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 1dd1a61b6..65ee33919 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -187,7 +187,8 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>   		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
>   		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
>   
> -	if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
> +	if ((dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) &&
> +	    !(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {

If we had an external net backend using the library, we would also need 
to remove extra queues if it advertised VIRTIO_NET_F_MQ, but the feature
isn't negotiated.

In this case, the fix I suggested yesterday would work:
if ((vhost_features & (1ULL << VIRTIO_NET_F_MQ)) &&
     !(dev->features & (1ULL << VIRTIO_NET_F_MQ)) {
...
}

For any backend that does not advertise the feature, no queues will be
destroyed.

Thanks,
Maxime
>   		/*
>   		 * Remove all but first queue pair if MQ hasn't been
>   		 * negotiated. This is safe because the device is not
>
  
Maxime Coquelin Feb. 1, 2018, 12:49 p.m. UTC | #2
On 02/01/2018 11:24 AM, Stefan Hajnoczi wrote:
> On Wed, Jan 31, 2018 at 07:07:50PM +0100, Maxime Coquelin wrote:
>> Hi Stefan,
>>
>> On 01/31/2018 06:46 PM, Stefan Hajnoczi wrote:
>>> Commit e29109323595beb3884da58126ebb3b878cb66f5 ("vhost: destroy unused
>>> virtqueues when multiqueue not negotiated") broke vhost-scsi by removing
>>> virtqueues when the virtio-net-specific VIRTIO_NET_F_MQ feature bit is
>>> missing.
>>>
>>> The vhost_user.c code shouldn't assume all devices are vhost net device
>>> backends.  Use the new VIRTIO_DEV_BUILTIN_VIRTIO_NET flag to check
>>> whether virtio_net.c is being used.
>>>
>>> This fixes examples/vhost_scsi.
>>>
>>> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>>    lib/librte_vhost/vhost_user.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>> index 1dd1a61b6..65ee33919 100644
>>> --- a/lib/librte_vhost/vhost_user.c
>>> +++ b/lib/librte_vhost/vhost_user.c
>>> @@ -187,7 +187,8 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features)
>>>    		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
>>>    		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
>>> -	if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
>>> +	if ((dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) &&
>>> +	    !(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
>>
>> If we had an external net backend using the library, we would also need to
>> remove extra queues if it advertised VIRTIO_NET_F_MQ, but the feature
>> isn't negotiated.
> 
> A net-specific fix inside librte_vhost is not enough since non-net
> drivers may only initialize a subset of the virtqueues.

You are right, and even with net backend, the driver may decide to only
initialize a subset of the virtqueues even if VIRTIO_NET_F_MQ is
negotiated.

This is not the case today with virtio-net Linux kernel driver and
DPDK's Virtio PMD, but Windows virtio-net driver only initialize as much
queue pairs as online vCPUs.

> I suggest starting over.  Get rid of the net-specific fix and instead
> look at when new_device() needs to be called.

I agree, and already started to work on it. But my understanding is that
we will need a vhost-user protocol update. So I implemented this
workaround to support the VIRTIO_NET_F_MQ not negotiated case, which
happens with iPXE.

> Virtqueues will not be changed after the DRIVER_OK status bit has been
> set.  The VIRTIO 1.0 specification says, "The device MUST NOT consume
> buffers before DRIVER_OK, and the driver MUST NOT notify the device
> before it sets DRIVER_OK" (3.1 Device Initialization).
> 
> http://docs.oasis-open.org/virtio/virtio/v1.0/csprd01/virtio-v1.0-csprd01.html#x1-230001
> 
> However, it also says "legacy device implementations often used the
> device before setting the DRIVER_OK bit" (3.1.1 Legacy Interface: Device
> Initialization).
> 
> VIRTIO 1.0 can be supported fine by the current librte_vhost API.
> Legacy cannot be supported without API changes - there is no magic way
> to detect when new_device() can be invoked if the driver might require
> some virtqueue processing before the device is fully initialized.
> 
> I think this is the main discussion that needs to happen.  This patch
> series and the original VIRTIO_NET_F_MQ fix are just workarounds for the
> real problem.

Yes.

>> In this case, the fix I suggested yesterday would work:
>> if ((vhost_features & (1ULL << VIRTIO_NET_F_MQ)) &&
>>      !(dev->features & (1ULL << VIRTIO_NET_F_MQ)) {
>> ...
>> }
>>
>> For any backend that does not advertise the feature, no queues will be
>> destroyed.
> 
> The feature bit space is shared by all device types.  Another device can
> use bit 22 (VIRTIO_NET_F_MQ) for another purpose.  This code would
> incorrectly assume it's a net device.

Thanks for pointing this out, I missed that.

> No other device type in VIRTIO 1.0 uses bit 22 yet, but this solution is
> not future-proof.  If you decide to use your fix, please include a
> comment in the code so whoever needs to debug this again in the future
> can spot the problem more quickly.

No, I agree this is not future proof. I now think your patch is better.

Thanks for the insights!
Maxime

> Stefan
>
  

Patch

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 1dd1a61b6..65ee33919 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -187,7 +187,8 @@  vhost_user_set_features(struct virtio_net *dev, uint64_t features)
 		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
 		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
 
-	if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
+	if ((dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) &&
+	    !(dev->features & (1ULL << VIRTIO_NET_F_MQ))) {
 		/*
 		 * Remove all but first queue pair if MQ hasn't been
 		 * negotiated. This is safe because the device is not