[dpdk-dev] Issues met while running openvswitch/dpdk/virtio inside the VM

Traynor, Kevin kevin.traynor at intel.com
Mon May 11 14:10:14 CEST 2015


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pravin Shelar
> Sent: Friday, May 8, 2015 2:20 AM
> To: Oleg Strikov
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Issues met while running openvswitch/dpdk/virtio
> inside the VM
> 
> On Thu, May 7, 2015 at 9:22 AM, Oleg Strikov <oleg.strikov at canonical.com>
> wrote:
> > Hi DPDK users and developers,
> >
> > A few weeks ago I came up with the idea to run openvswitch with a dpdk backend
> > inside a qemu-kvm virtual machine. I don't have enough supported NICs yet, and
> > my plan was to start experimenting inside the virtualized environment,
> > get all the components into a functional state and then switch to the real
> > hardware. A useful side effect of doing things inside the vm is
> > that issues can be easily reproduced by someone else in a different
> > environment.
> >
> > I (fondly) hoped that running openvswitch/dpdk inside the vm would be
> > simpler than running the same set of components on the real hardware.
> > Unfortunately I ran into a bunch of issues on the way. All these issues lie on
> > the borderline between dpdk and openvswitch but I think that you might be
> > interested in my story. Please note that I still don't have
> > openvswitch/dpdk working inside the vm. I have definitely made some progress
> > though.
> >
> Thanks for summarizing all the issues.
> DPDK testing is done on real hardware and we are planning to test it
> in a VM. This will certainly help in fixing issues sooner.
> 
> > Q: Does it sound okay from a functional (not performance) standpoint to run
> > openvswitch/dpdk inside the vm? Do we want to be able to do this? Does
> > anyone from the dpdk development team do this?
> >
> > ## Issue 1 ##
> >
> > Openvswitch requires the backend pmd driver to provide N_CORES tx queues where
> > N_CORES is the number of cores available on the machine (openvswitch counts
> > the number of cpu* entries inside the /sys/devices/system/node/node0/ folder).
> > To my understanding it doesn't take into account the actual number of cores
> > used by dpdk and just allocates a tx queue for each available core. You may
> > refer to this chunk of code for details:
> > https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L1067
> >
> In the case of OVS DPDK, there is no dpdk thread. Therefore all polling
> cores are managed by OVS and there is no need to account for cores used
> by DPDK. You can assign specific cores to OVS to limit the number of
> cores it uses.
> 
> > This approach works fine on the real hardware but causes issues when we
> > run openvswitch/dpdk inside the virtual machine. I tried both the emulated
> > e1000 NIC and the virtio NIC and neither of them worked out of the box.
> > The emulated e1000 NIC doesn't support multiple tx queues at all (see
> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
> > the virtio NIC doesn't support multiple tx queues by default. To enable
> > multiple tx queues for the virtio NIC I had to add the following line to the
> > interface section of my libvirt config: '<driver name="vhost" queues="4"/>'
> >
> Good point. We should document this. Can you send a patch to update
> README.DPDK?

Daniele's patch http://openvswitch.org/pipermail/dev/2015-March/052344.html
also allows for having a limited set of queues available. The documentation
patch is a good idea too.
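
For anyone reproducing Issue 1, here is a minimal sketch of the kind of
counting Oleg describes (one tx queue per cpu* entry under node0). It is not
the OVS code; the helper names is_cpu_entry and count_cpu_entries are
hypothetical, and it assumes only "cpu<digits>" names should count, since the
same folder also holds entries such as cpulist and cpumap.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Return 1 if name looks like "cpu<digits>" (e.g. "cpu0"), else 0.
 * This skips siblings such as "cpulist" and "cpumap". */
static int is_cpu_entry(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0')
        return 0;
    for (const char *p = name + 3; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}

/* Count cpu<N> entries in dir; return -1 if the directory cannot be opened. */
static int count_cpu_entries(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *ent;
    int n = 0;

    if (d == NULL)
        return -1;
    while ((ent = readdir(d)) != NULL) {
        if (is_cpu_entry(ent->d_name))
            n++;
    }
    closedir(d);
    return n;
}
```

Calling count_cpu_entries("/sys/devices/system/node/node0") inside the guest
and comparing the result with the queues= value in the libvirt config should
show quickly whether the NIC exposes fewer tx queues than would be requested.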

> 
> > ## Issue 2 ##
> >
> > Openvswitch calls rte_eth_tx_queue_setup() twice for the same
> > port_id/queue_id. First call takes place during device initialization (see
> > call to dpdk_eth_dev_init() inside netdev_dpdk_init():
> > https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L522).
> > Second call takes place when openvswitch tries to add more tx queues to the
> > device (see call to dpdk_eth_dev_init() inside netdev_dpdk_set_multiq():
> > https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L697).
> > The second call not only initializes new queues but also tries to
> > re-initialize existing ones.
> >
> > Unfortunately the virtio driver can't handle a second call to
> > rte_eth_tx_queue_setup() and returns an error here:
> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_ethdev.c#n316
> > This happens because a memzone with the name portN_tvqN already exists when
> > the second call takes place (the memzone was created during the first call).
> > To deal with this issue I had to manually add an rte_memzone_lookup-based
> > check for this situation and avoid allocating a new memzone if it
> > already exists.
> >
> This sounds like an issue with the virtIO driver. I think we need to fix DPDK
> upstream for this to work correctly.
> 
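
The lookup-before-reserve workaround described for Issue 2 can be sketched
as follows. This is a self-contained toy, not the virtio PMD code: the zone
registry and the functions zone_lookup, zone_reserve and zone_get_or_reserve
are hypothetical stand-ins for rte_memzone_lookup() and rte_memzone_reserve(),
and the size check on reuse is an assumption about what a safe fix would want.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ZONES 8
#define ZONE_NAME_LEN 32

/* Toy stand-in for a named memory zone; not the DPDK rte_memzone struct. */
struct zone {
    char name[ZONE_NAME_LEN];
    size_t len;
    int in_use;
};

static struct zone zones[MAX_ZONES];

/* Stand-in for rte_memzone_lookup(): return the zone with this name, or NULL. */
static struct zone *zone_lookup(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < MAX_ZONES; i++) {
        if (zones[i].in_use && strcmp(zones[i].name, name) == 0)
            return &zones[i];
    }
    return NULL;
}

/* Stand-in for rte_memzone_reserve(): refuses a name that already exists. */
static struct zone *zone_reserve(const char *name, size_t len)
{
    if (strlen(name) >= ZONE_NAME_LEN || zone_lookup(name) != NULL)
        return NULL;
    for (size_t i = 0; i < MAX_ZONES; i++) {
        if (!zones[i].in_use) {
            strcpy(zones[i].name, name); /* length checked above */
            zones[i].len = len;
            zones[i].in_use = 1;
            return &zones[i];
        }
    }
    return NULL; /* registry full */
}

/* Look the zone up first and reserve only when it is absent.
 * An existing zone is reused only if it is large enough. */
static struct zone *zone_get_or_reserve(const char *name, size_t len)
{
    struct zone *z = zone_lookup(name);

    if (z != NULL)
        return z->len >= len ? z : NULL;
    return zone_reserve(name, len);
}
```

With this pattern a repeated setup for "port0_tvq0" gets the zone created by
the first call back instead of an error. Whether the ring contents inside the
reused zone are still valid is a separate question.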
> > Q: Is it okay that openvswitch calls rte_eth_tx_queue_setup() twice? Right
> > now I can't tell whether this is an issue with the virtio pmd driver or
> > incorrect API usage by openvswitch. Could someone shed some light on this
> > so I can move forward and maybe propose a fix?
> >
> > ## Issue 3 ##
> >
> > This issue is also (somehow) related to the fact that openvswitch calls
> > rte_eth_tx_queue_setup() twice. I fixed the previous issue using the method
> > described above and initialization now finishes. The whole machinery starts
> > to work but crashes at the very beginning (maybe while fetching the first
> > packet from the NIC). The crash happens here:
> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n588
> > It takes place because the vq_ring structure contains zeros instead of
> > correct values:
> > vq_ring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}
> > My understanding is that vq_ring gets initialized after the first call to
> > rte_eth_tx_queue_setup() and is then overwritten by the second call to
> > rte_eth_tx_queue_setup(), this time without appropriate initialization.
> > I'm trying to fix this issue right now.
> >
> This also sounds like a DPDK issue.
> 
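
While debugging Issue 3, a guard along these lines may help catch the zeroed
ring before it is dereferenced. This is only a sketch: struct ring_view is a
hypothetical mirror of the four fields in the debugger output quoted above
(the real DPDK structure uses typed pointers), and ring_is_usable is not an
existing DPDK function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the fields printed by the debugger:
 * num, desc, avail, used. */
struct ring_view {
    unsigned int num;
    void *desc;
    void *avail;
    void *used;
};

/* Return 1 only when every field needed to walk the ring is populated. */
static int ring_is_usable(const struct ring_view *r)
{
    assert(r != NULL);
    return r->num != 0 && r->desc != NULL &&
           r->avail != NULL && r->used != NULL;
}
```

A check like this at the end of queue setup would turn the later crash into an
early, explicit failure, which makes it easier to see which of the two setup
calls leaves the ring empty.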
> > Q: Does it sound like a realistic goal to make the virtio driver work in
> > openvswitch-like scenarios? I'm definitely not an expert in the area of
> > dpdk and can't estimate the time and resources required. Maybe it's better
> > to wait until I get proper hardware?
> >
> It would be nice to make OVS-DPDK work in a VM. As I said, I am also
> planning to work on it. Thanks for the heads up.

