[dpdk-dev] [PATCH] virtio: fix the vq size issue

Stephen Hemminger stephen at networkplumber.org
Mon Jul 20 17:47:17 CEST 2015


On Sat, 18 Jul 2015 12:11:11 +0000
"Ouyang, Changchun" <changchun.ouyang at intel.com> wrote:

> Hi Stephen,
> 
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > Sent: Saturday, July 18, 2015 12:28 AM
> > To: Ouyang, Changchun
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> > 
> > On Wed,  1 Jul 2015 15:48:50 +0800
> > Ouyang Changchun <changchun.ouyang at intel.com> wrote:
> > 
> > > This commit breaks basic virtio packet rx functionality:
> > >   d78deadae4dca240e85054bf2d604a801676becc
> > >
> > > QEMU uses 256 as the default vring size, and also uses this default
> > > value to calculate the virtio avail ring and used ring base
> > > addresses; vhost in the backend uses those ring base addresses to do
> > > packet IO.
> > >
> > > The virtio spec also says the queue size in the PCI configuration is
> > > read-only, so the virtio front end can't change it; it just needs to
> > > use the read-only value to allocate space for the vring and to
> > > calculate the avail and used ring base addresses. Otherwise, the
> > > avail and used ring base addresses will differ between host and
> > > guest, and accordingly packet IO can't work normally.
> > >
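(For reference: in the legacy virtio-PCI layout, the avail and used ring
addresses are derived from the queue size, which is why guest and host
must agree on that value. Below is a minimal sketch of the size
calculation, following the vring layout in the virtio spec; the optional
event-index fields are omitted for brevity.)

#include <stddef.h>

#define VRING_ALIGN 4096	/* legacy virtio: used ring is page-aligned */

/*
 * Total vring memory for a queue of num entries. The used ring
 * offset depends on num, so a queue-size mismatch between guest
 * and host puts the used ring at different addresses.
 */
static inline size_t
vring_mem_size(unsigned int num)
{
	size_t sz;

	sz  = (size_t)num * 16;			/* descriptor table: 16 B per entry */
	sz += 2 + 2 + (size_t)num * 2;		/* avail ring: flags, idx, ring slots */
	sz  = (sz + VRING_ALIGN - 1) & ~(size_t)(VRING_ALIGN - 1);
	sz += 2 + 2 + (size_t)num * 8;		/* used ring: flags, idx, used elems */
	return sz;
}
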
> > > Signed-off-by: Changchun Ouyang <changchun.ouyang at intel.com>
> > > ---
> > >  drivers/net/virtio/virtio_ethdev.c | 14 +++-----------
> > >  1 file changed, 3 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> > > index fe5f9a1..d84de13 100644
> > > --- a/drivers/net/virtio/virtio_ethdev.c
> > > +++ b/drivers/net/virtio/virtio_ethdev.c
> > > @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
> > >  	 */
> > >  	vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
> > >  	PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size, nb_desc);
> > > -	if (nb_desc == 0)
> > > -		nb_desc = vq_size;
> > 
> > The command queue is set up with nb_desc = 0.
> 
> nb_desc is not used in the rest of the function, so why do we need such an assignment here?
> Why is the command queue set up with nb_desc = 0?
> Even if that is the case, what does this code change break?
> 
> > 
> > >  	if (vq_size == 0) {
> > >  		PMD_INIT_LOG(ERR, "%s: virtqueue does not exist", __func__);
> > >  		return -EINVAL;
> > > @@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > -	if (nb_desc < vq_size) {
> > > -		if (!rte_is_power_of_2(nb_desc)) {
> > > -			PMD_INIT_LOG(ERR,
> > > -				     "nb_desc(%u) size is not powerof 2",
> > > -				     nb_desc);
> > > -			return -EINVAL;
> > > -		}
> > > -		vq_size = nb_desc;
> > > -	}
> > > +	if (nb_desc != vq_size)
> > > +		PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to vq size (%d), fall to vq size",
> > > +			nb_desc, vq_size);
> > 
> > Nack. This breaks on Google Compute Engine, where the vring size is 16K.
> 
> 
> As I mentioned before, commit d78deadae4dca240e85054bf2d604a801676becc breaks the basic
> functionality of the virtio PMD, and I don't think keeping it broken is a good way forward for us.
> We have to revert it first to restore its functionality on QEMU!
> Why do we need to break existing functionality just to meet a new use case's requirements?
> 
> > 
> > An application that wants to work on both QEMU and GCE will want to pass a
> > reasonable size and have the negotiation resolve to the best value.
> 
> Do you already have a patch that reverts the mistake and supports both QEMU and GCE?
> If you do, then please send it out and let's review it.
> 
> > 
> > For example, vRouter passes 512 as the Rx ring size.
> > On QEMU this gets rounded down to 256, and on GCE only 512 elements are
> > used.
> > 
> > This is what the Linux kernel virtio does.
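(A minimal sketch of that negotiation as described here; the clamping
logic and variable names are illustrative, not the actual patch.)

	uint16_t vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
	uint16_t nb_used = nb_desc;

	if (nb_used == 0 || nb_used > vq_size)
		nb_used = vq_size;	/* QEMU: 512 requested -> 256 used */
	while (!rte_is_power_of_2(nb_used))
		nb_used &= nb_used - 1;	/* round down to a power of two */
	/* GCE: the ring itself stays 16K; only nb_used (512) entries are filled */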

The part in dev_queue_setup is correct, but there is a different problem
if the user has requested a smaller number of descriptors: the receive
start process runs the mbuf pool out of space by allocating buffers for
more packets than the application expected. Imagine an application that
expects 512 packets in the rx ring, but the full 16K are allocated.

I am working on a fix to the rx initialization logic that takes this into
account.
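(A sketch of the shape such a fix could take; the refill helper and the
struct field names are hypothetical, not the eventual patch.)

	/*
	 * At rx queue start, pre-fill at most the number of descriptors the
	 * application asked for instead of the whole ring, so a 16K vring
	 * does not drain an mbuf pool sized for 512 buffers.
	 */
	static void
	virtio_rx_prefill(struct virtqueue *vq, uint16_t nb_desc)
	{
		uint16_t fill = RTE_MIN(nb_desc, vq->vq_nentries);

		while (fill-- > 0) {
			struct rte_mbuf *m = rte_pktmbuf_alloc(vq->mpool);

			if (m == NULL)
				break;	/* pool exhausted: stop refilling */
			/* hypothetical helper: post m as a receive buffer */
			virtqueue_enqueue_recv_refill(vq, m);
		}
	}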


