[dpdk-dev] [PATCH 4/8] net/virtio: allocate queue at init stage

Yuanhan Liu yuanhan.liu at linux.intel.com
Sat Nov 5 07:15:33 CET 2016


On Fri, Nov 04, 2016 at 08:30:00PM +0000, Kevin Traynor wrote:
> On 11/04/2016 03:21 PM, Kevin Traynor wrote:
> > On 11/03/2016 04:09 PM, Yuanhan Liu wrote:
> >> Queue allocation should be done once, since the queue related info (such
> >> as the vring address) is communicated to the vhost-user backend only once,
> >> unless the virtio device is reset.
> >>
> >> That means if you allocate queues again after the vhost-user negotiation,
> >> the vhost-user backend will not be informed any more, leading to a state
> >> where the vring info mismatches between the virtio PMD driver and the
> >> vhost backend: the driver switches to the newly allocated address, while
> >> the vhost backend still sticks to the old address assigned in the init
> >> stage.
> >>
> >> Unfortunately, that is exactly how the virtio driver is coded so far: queue
> >> allocation is done at the queue_setup stage (when rte_eth_tx/rx_queue_setup
> >> is invoked). This is wrong, because queue_setup can be invoked several times.
> >> For example,
> >>
> >>     $ start_testpmd.sh ... --txq=1 --rxq=1 ...
> >>     > port stop 0
> >>     > port config all txq 1 # just trigger the queue_setup callback again
> >>     > port config all rxq 1
> >>     > port start 0
> >>
> >> The right way is to allocate the queues in the init stage, so that the
> >> vring info stays consistent with the vhost-user backend.
> >>
> >> Besides that, we should allocate the max queue pairs the device supports,
> >> not just the number of queue pairs configured initially, to make the
> >> following case work.
> >>
> >>     $ start_testpmd.sh ... --txq=1 --rxq=1 ...
> >>     > port stop 0
> >>     > port config all txq 2
> >>     > port config all rxq 2
> >>     > port start 0
> > 
> > hi Yuanhan, firstly - thanks for this patchset. It is certainly needed
> > to fix the silent failure after increasing the number of queues.
> > 
> > I tried a few tests and I'm seeing an issue. I can stop the port,
> > increase the number of queues and traffic is ok, but if I try to
> > decrease the number of queues it hangs on port start. I'm running the
> > head of master with your patches in the guest and 16.07 in the host.
> > 
> > $ testpmd -c 0x5f -n 4 --socket-mem 1024 -- --burst=64 -i
> > --disable-hw-vlan --rxq=2 --txq=2 --rxd=256 --txd=256 --forward-mode=io
> >> port stop all
> >> port config all rxq 1
> >> port config all txq 1
> >> port start all
> > Configuring Port 0 (socket 0)
> > (hang here)
> > 
> > I've tested a few different scenarios and anytime the queues are
> > decreased from the previous number the hang occurs.
> > 
> > I can debug further but wanted to report early, as maybe the issue is an
> > obvious one?
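
Just to make the allocate-at-init idea from the quoted commit message concrete,
the plan is roughly the following (names are simplified for illustration and
may not match the final patch exactly):

    /* Called once from eth_virtio_dev_init(), before any queue_setup. */
    static int
    virtio_alloc_queues(struct rte_eth_dev *dev)
    {
        struct virtio_hw *hw = dev->data->dev_private;
        /* Allocate for the max queue pairs the device supports (plus the
         * control queue, if negotiated), not only for the number of
         * queue pairs configured right now.
         */
        uint16_t nr_vq = hw->max_queue_pairs * 2;
        uint16_t i;

        for (i = 0; i < nr_vq; i++) {
            /* Allocate the vring for queue i; its address is communicated
             * to the vhost-user backend only once, so later queue_setup
             * calls merely reuse it.
             */
            if (virtio_init_queue(dev, i) < 0)
                return -1;
        }

        return 0;
    }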

Kevin, thanks for testing! Hmm, it's a case I missed: I was thinking/testing
more about increasing (but not shrinking) the number of queues :(

> virtio_dev_start() is getting stuck as soon as it needs to send a

That's because the connection is closed (for a bad reason, see details
below). You can figure it out quickly from the vhost log:

    testpmd> VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
    PMD: Connection closed

Those messages show up immediately after you execute "port start all".

It's actually triggered by the "rx_queue_release" callback, which in turn
invokes the "del_queue" virtio-pci method. QEMU then resets the device once
it receives that message, hence the log above.
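
The problematic path looks roughly like this (simplified, with some names
abbreviated rather than copied verbatim from the code):

    /* Old behaviour of the .rx_queue_release / .tx_queue_release dev ops
     * (simplified): freeing the queue also tells the device to drop it.
     */
    static void
    virtio_dev_queue_release(struct virtqueue *vq)
    {
        struct virtio_hw *hw;

        if (vq == NULL)
            return;

        hw = vq->hw;
        /* "del_queue" clears the vring address in the device; on the
         * QEMU/vhost-user side this ends up resetting the device and
         * closing the connection, which is the hang seen above.
         */
        hw->vtpci_ops->del_queue(hw, vq);

        rte_free(vq);
    }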

Since we now allocate the queues only once, it no longer makes sense to
free them and invoke the "del_queue" method in the rx/tx_queue_release
callbacks. Those callbacks are the ones invoked when we shrink the number
of queues.
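
With the queues owned by the device from init onwards, the release callback
can simply become a no-op; something along these lines (the actual freeing
moves to device close/uninit):

    static void
    virtio_dev_queue_release(void *queue __rte_unused)
    {
        /* Queues are allocated once at init and live for the whole
         * device lifetime: do not free them or call "del_queue" here,
         * otherwise the vhost-user connection gets torn down.
         */
    }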

With that change, this case works like a charm.

I'm about to catch a train soon, but I will try to re-post v2 today, with
another minor fix I noticed while checking this issue: we should also send
the VIRTIO_NET_CTRL_MQ message when the number of queue pairs shrinks from
2 to 1.
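
Roughly, that extra fix means doing something like this at dev_start time
(the control-queue send helper below is only a stand-in for whatever the PMD
uses internally; VIRTIO_NET_CTRL_MQ and VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET are
the standard virtio-net control constants):

    /* Tell the backend how many queue pairs are active now, also when
     * the count goes down (e.g. from 2 back to 1).
     */
    static int
    virtio_set_vq_pairs(struct rte_eth_dev *dev, uint16_t nb_pairs)
    {
        struct virtio_hw *hw = dev->data->dev_private;
        struct virtio_pmd_ctrl ctrl;
        int dlen[1];

        ctrl.hdr.class = VIRTIO_NET_CTRL_MQ;
        ctrl.hdr.cmd = VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET;
        memcpy(ctrl.data, &nb_pairs, sizeof(nb_pairs));
        dlen[0] = sizeof(nb_pairs);

        /* virtio_send_command() stands in for the PMD's control-queue
         * helper; error handling kept minimal here.
         */
        return virtio_send_command(hw->cvq, &ctrl, dlen, 1);
    }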

	--yliu

