lib/librte_vhost: fix vid allocation race

Message ID 20210129073547.80108-1-hepeng.0320@bytedance.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Series: lib/librte_vhost: fix vid allocation race

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-testing fail Testing issues

Commit Message

贺鹏 Jan. 29, 2021, 7:35 a.m. UTC
  From: "chenwei.0515" <chenwei.0515@bytedance.com>

vhost_new_devcie might be called in different threads at the same time.
thread 1(config thread)
            rte_vhost_driver_start
               ->vhost_user_start_client
                   ->vhost_user_add_connection
                     -> vhost_new_device

thread 2(vhost-events)
	vhost_user_read_cb
           ->vhost_user_msg_handler (return value < 0)
             -> vhost_user_start_client
                 -> vhost_new_device

So there could be a case that a same vid has been allocated twice, or
some vid might be lost in DPDK lib however still held by the upper
applications.

Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
---
 lib/librte_vhost/vhost.c | 6 ++++++
 1 file changed, 6 insertions(+)
  

Comments

Chenbo Xia Feb. 1, 2021, 6:27 a.m. UTC | #1
Hi Peng & Fei,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Peng He
> Sent: Friday, January 29, 2021 3:36 PM
> To: dev@dpdk.org
> Cc: maxime.coquelin@redhat.com
> Subject: [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race

Fix the title to 'vhost: XXXXX'

> 
> From: "chenwei.0515" <chenwei.0515@bytedance.com>

This should not be here. You can just delete it, as the author is already
in the commit message.

> 
> vhost_new_devcie might be called in different threads at the same time.

s/devcie/device

> thread 1(config thread)
>             rte_vhost_driver_start
>                ->vhost_user_start_client
>                    ->vhost_user_add_connection
>                      -> vhost_new_device
> 
> thread 2(vhost-events)
> 	vhost_user_read_cb
>            ->vhost_user_msg_handler (return value < 0)
>              -> vhost_user_start_client
>                  -> vhost_new_device
> 
> So there could be a case that a same vid has been allocated twice, or
> some vid might be lost in DPDK lib however still held by the upper
> applications.

Good catch! I checked the code and found that there are cases where different threads
may allocate a vhost slot.

I also found that other functions that use the global vhost_devices[] may
hit the same problem. For example, vhost_destroy_device() could be called by different
threads. So I suggest fixing the problem completely in all related functions, such as
vhost_destroy_device() and get_device(). What do you think?

If you agree with the above, note that the title should also be changed.

Besides, please also add 'Fixes: $COMMIT_ID' and Cc stable@dpdk.org so it can be
fixed in LTS. You can check other commits for reference.
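
To make the suggestion above concrete, here is a minimal standalone sketch of what taking one lock in the allocate, look-up and destroy paths might look like. It is not DPDK code: `toy_dev`, `devs`, `devs_lock`, `MAX_DEVS` and the `toy_*` functions are hypothetical stand-ins for `struct virtio_net`, `vhost_devices[]`, `vhost_dev_lock`, `MAX_VHOST_DEVICE` and the real functions, and the real functions do considerably more work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_DEVS 4                 /* hypothetical stand-in for MAX_VHOST_DEVICE */

struct toy_dev { int vid; };       /* hypothetical stand-in for struct virtio_net */

static struct toy_dev *devs[MAX_DEVS];
static pthread_mutex_t devs_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate the lowest free slot, or return -1 if none is free. */
int
toy_new_device(void)
{
	int i, vid = -1;

	pthread_mutex_lock(&devs_lock);
	for (i = 0; i < MAX_DEVS; i++) {
		if (devs[i] != NULL)
			continue;
		devs[i] = calloc(1, sizeof(*devs[i]));
		if (devs[i] != NULL) {
			devs[i]->vid = i;
			vid = i;
		}
		break;
	}
	pthread_mutex_unlock(&devs_lock);
	return vid;
}

/* Look up a device; the table read happens under the lock. */
struct toy_dev *
toy_get_device(int vid)
{
	struct toy_dev *dev;

	if (vid < 0 || vid >= MAX_DEVS)
		return NULL;
	pthread_mutex_lock(&devs_lock);
	dev = devs[vid];
	pthread_mutex_unlock(&devs_lock);
	return dev;
}

/* Clear the slot under the lock, then free the memory outside it. */
void
toy_destroy_device(int vid)
{
	struct toy_dev *dev;

	assert(vid >= 0 && vid < MAX_DEVS);
	pthread_mutex_lock(&devs_lock);
	dev = devs[vid];
	devs[vid] = NULL;
	pthread_mutex_unlock(&devs_lock);
	free(dev);
}
```

Note that locking the look-up only protects the read of the table entry. A pointer returned by `toy_get_device()` can still be freed by a later `toy_destroy_device()` in another thread, so the lifetime of the device would need separate handling.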

> 
> Reported-by: Peng He <hepeng.0320@bytedance.com>
> Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
> Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>  lib/librte_vhost/vhost.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> index efb136edd1..db11d293d2 100644
> --- a/lib/librte_vhost/vhost.c
> +++ b/lib/librte_vhost/vhost.c
> @@ -26,6 +26,7 @@
>  #include "vhost_user.h"
> 
>  struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
> +pthread_mutex_t  vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;

There's a double space between 'pthread_mutex_t' and 'vhost_dev_lock'.
Let's just leave one 😊

Thanks,
Chenbo

> 
>  /* Called with iotlb_lock read-locked */
>  uint64_t
> @@ -645,6 +646,7 @@ vhost_new_device(void)
>  	struct virtio_net *dev;
>  	int i;
> 
> +	pthread_mutex_lock(&vhost_dev_lock);
>  	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
>  		if (vhost_devices[i] == NULL)
>  			break;
> @@ -653,6 +655,7 @@ vhost_new_device(void)
>  	if (i == MAX_VHOST_DEVICE) {
>  		VHOST_LOG_CONFIG(ERR,
>  			"Failed to find a free slot for new device.\n");
> +		pthread_mutex_unlock(&vhost_dev_lock);
>  		return -1;
>  	}
> 
> @@ -660,10 +663,13 @@ vhost_new_device(void)
>  	if (dev == NULL) {
>  		VHOST_LOG_CONFIG(ERR,
>  			"Failed to allocate memory for new dev.\n");
> +		pthread_mutex_unlock(&vhost_dev_lock);
>  		return -1;
>  	}
> 
>  	vhost_devices[i] = dev;
> +	pthread_mutex_unlock(&vhost_dev_lock);
> +
>  	dev->vid = i;
>  	dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
>  	dev->slave_req_fd = -1;
> --
> 2.23.0
  
贺鹏 Feb. 1, 2021, 8:53 a.m. UTC | #2
Hi, Chenbo,

Thanks for the detailed review!


On Mon, Feb 1, 2021 at 2:27 PM, Xia, Chenbo <chenbo.xia@intel.com> wrote:
>
> Hi Peng & Fei,
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Peng He
> > Sent: Friday, January 29, 2021 3:36 PM
> > To: dev@dpdk.org
> > Cc: maxime.coquelin@redhat.com
> > Subject: [dpdk-dev] [PATCH] lib/librte_vhost: fix vid allocation race
>
> Fix the title to 'vhost: XXXXX'
>
> >
> > From: "chenwei.0515" <chenwei.0515@bytedance.com>
>
> This should not be here. You can just delete it, as the author is already
> in the commit message.
>
> >
> > vhost_new_devcie might be called in different threads at the same time.
>
> s/devcie/device
>

will fix it in v2.

> > thread 1(config thread)
> >             rte_vhost_driver_start
> >                ->vhost_user_start_client
> >                    ->vhost_user_add_connection
> >                      -> vhost_new_device
> >
> > thread 2(vhost-events)
> >       vhost_user_read_cb
> >            ->vhost_user_msg_handler (return value < 0)
> >              -> vhost_user_start_client
> >                  -> vhost_new_device
> >
> > So there could be a case that a same vid has been allocated twice, or
> > some vid might be lost in DPDK lib however still held by the upper
> > applications.
>
> Good catch! I checked the code and found that there are cases where different threads
> may allocate a vhost slot.
>
> I also found that other functions that use the global vhost_devices[] may
> hit the same problem. For example, vhost_destroy_device() could be called by different
> threads. So I suggest fixing the problem completely in all related functions, such as
> vhost_destroy_device() and get_device(). What do you think?
>
> If you agree with the above, note that the title should also be changed.
>

Yes, we also investigated the other places where a race could exist.

In *vhost_destroy_device*, the only access to vhost_devices is to set
the specific slot to NULL. If the vids differ, there is no race, and two
threads will not destroy the same vid at the same time.

We will add these notes to the commit message for clarity.


> Besides, please also add 'Fixes: $COMMIT_ID' and Cc stable@dpdk.org so it can be
> fixed in LTS. You can check other commits for reference.

Will do it in v2.

>
> >
> > Reported-by: Peng He <hepeng.0320@bytedance.com>
> > Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
> > Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> > ---
> >  lib/librte_vhost/vhost.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> > index efb136edd1..db11d293d2 100644
> > --- a/lib/librte_vhost/vhost.c
> > +++ b/lib/librte_vhost/vhost.c
> > @@ -26,6 +26,7 @@
> >  #include "vhost_user.h"
> >
> >  struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
> > +pthread_mutex_t  vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
>
> There's a double space between 'pthread_mutex_t' and 'vhost_dev_lock'.
> Let's just leave one

will fix it in v2.

>
> Thanks,
> Chenbo
>
> >
> >  /* Called with iotlb_lock read-locked */
> >  uint64_t
> > @@ -645,6 +646,7 @@ vhost_new_device(void)
> >       struct virtio_net *dev;
> >       int i;
> >
> > +     pthread_mutex_lock(&vhost_dev_lock);
> >       for (i = 0; i < MAX_VHOST_DEVICE; i++) {
> >               if (vhost_devices[i] == NULL)
> >                       break;
> > @@ -653,6 +655,7 @@ vhost_new_device(void)
> >       if (i == MAX_VHOST_DEVICE) {
> >               VHOST_LOG_CONFIG(ERR,
> >                       "Failed to find a free slot for new device.\n");
> > +             pthread_mutex_unlock(&vhost_dev_lock);
> >               return -1;
> >       }
> >
> > @@ -660,10 +663,13 @@ vhost_new_device(void)
> >       if (dev == NULL) {
> >               VHOST_LOG_CONFIG(ERR,
> >                       "Failed to allocate memory for new dev.\n");
> > +             pthread_mutex_unlock(&vhost_dev_lock);
> >               return -1;
> >       }
> >
> >       vhost_devices[i] = dev;
> > +     pthread_mutex_unlock(&vhost_dev_lock);
> > +
> >       dev->vid = i;
> >       dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
> >       dev->slave_req_fd = -1;
> > --
> > 2.23.0
>
  

Patch

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index efb136edd1..db11d293d2 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -26,6 +26,7 @@ 
 #include "vhost_user.h"
 
 struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
+pthread_mutex_t  vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
 
 /* Called with iotlb_lock read-locked */
 uint64_t
@@ -645,6 +646,7 @@  vhost_new_device(void)
 	struct virtio_net *dev;
 	int i;
 
+	pthread_mutex_lock(&vhost_dev_lock);
 	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
 		if (vhost_devices[i] == NULL)
 			break;
@@ -653,6 +655,7 @@  vhost_new_device(void)
 	if (i == MAX_VHOST_DEVICE) {
 		VHOST_LOG_CONFIG(ERR,
 			"Failed to find a free slot for new device.\n");
+		pthread_mutex_unlock(&vhost_dev_lock);
 		return -1;
 	}
 
@@ -660,10 +663,13 @@  vhost_new_device(void)
 	if (dev == NULL) {
 		VHOST_LOG_CONFIG(ERR,
 			"Failed to allocate memory for new dev.\n");
+		pthread_mutex_unlock(&vhost_dev_lock);
 		return -1;
 	}
 
 	vhost_devices[i] = dev;
+	pthread_mutex_unlock(&vhost_dev_lock);
+
 	dev->vid = i;
 	dev->flags = VIRTIO_DEV_BUILTIN_VIRTIO_NET;
 	dev->slave_req_fd = -1;
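
For readers who want to try the pattern outside DPDK, the following is a minimal standalone sketch of the locked slot allocation in the patch above, plus a small helper that runs several allocations concurrently. Everything here is an illustrative assumption rather than DPDK code: `fake_dev`, `slots`, `slots_lock`, `MAX_SLOTS`, `NUM_THREADS`, `slot_alloc()` and `concurrent_alloc_is_unique()` are hypothetical names. One small deviation from the patch: the sketch sets `vid` before publishing the pointer in the table, whereas the patch sets `dev->vid` after releasing the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_SLOTS 8            /* hypothetical stand-in for MAX_VHOST_DEVICE */
#define NUM_THREADS 4          /* number of concurrent allocators in the demo */

struct fake_dev { int vid; };  /* hypothetical stand-in for struct virtio_net */

static struct fake_dev *slots[MAX_SLOTS];
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find a free slot and publish the new device while holding one lock. */
int
slot_alloc(void)
{
	struct fake_dev *dev;
	int i;

	pthread_mutex_lock(&slots_lock);
	for (i = 0; i < MAX_SLOTS; i++) {
		if (slots[i] == NULL)
			break;
	}
	if (i == MAX_SLOTS) {
		pthread_mutex_unlock(&slots_lock);
		return -1;
	}
	dev = calloc(1, sizeof(*dev));
	if (dev == NULL) {
		pthread_mutex_unlock(&slots_lock);
		return -1;
	}
	dev->vid = i;          /* set before publishing; see the note above */
	slots[i] = dev;
	pthread_mutex_unlock(&slots_lock);
	return i;
}

/* Thread entry point: store the allocated id in the caller's int. */
static void *
alloc_worker(void *arg)
{
	assert(arg != NULL);
	*(int *)arg = slot_alloc();
	return NULL;
}

/* Run NUM_THREADS allocations at once; return 1 if all ids are valid and distinct. */
int
concurrent_alloc_is_unique(void)
{
	pthread_t tid[NUM_THREADS];
	int vid[NUM_THREADS];
	int seen[MAX_SLOTS] = {0};
	int i, ok = 1;

	for (i = 0; i < NUM_THREADS; i++) {
		if (pthread_create(&tid[i], NULL, alloc_worker, &vid[i]) != 0)
			return 0;
	}
	for (i = 0; i < NUM_THREADS; i++)
		pthread_join(tid[i], NULL);
	for (i = 0; i < NUM_THREADS; i++) {
		if (vid[i] < 0 || vid[i] >= MAX_SLOTS || seen[vid[i]]++)
			ok = 0;
	}
	return ok;
}
```

Build with `-pthread`. Without the lock around the scan and the store, two threads could both observe the same NULL slot and return the same id, which is the duplicate-vid case described in the commit message.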