[dpdk-dev] outw() in virtio_ring_doorbell() in DPDK+virtio consume 40% of the CPU in oprofile

James Yu ypyu2011 at gmail.com
Tue Dec 17 00:35:27 CET 2013


(A) The packets I sent are 64 bytes, not big packets, so I am not sure GSO
will help. For bigger packets, it would help.

(B) What do you mean by "multiple packets per send"? Do you mean multi-queue
support, so that send/receive run in parallel on multiple cores to speed it up?
Is it supported in DPDK 1.3.1r2?
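
To make the question concrete: another reading is batching, i.e. queue the
whole burst into the ring and ring the doorbell once instead of once per
packet. Below is a minimal sketch of that reading only. All names (sim_txq,
sim_doorbell, sim_tx_one_by_one, sim_tx_burst, RING_SIZE) are hypothetical
stand-ins and not DPDK or virtio APIs; the doorbell is just a counter here, so
no real outw() is issued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 256u /* hypothetical ring size (power of two) */

/* Hypothetical stand-in for a virtio TX queue; not the DPDK structure. */
struct sim_txq {
    uint32_t slots[RING_SIZE]; /* packet ids placed in the ring */
    uint16_t avail_idx;        /* free-running index of the next slot */
    unsigned doorbells;        /* number of notifications sent to the host */
};

/* Stand-in for the outw() notify; in a VM each call would be a VM exit. */
static void sim_doorbell(struct sim_txq *q)
{
    q->doorbells++;
}

/* Naive path: one notification per packet. */
static void sim_tx_one_by_one(struct sim_txq *q, const uint32_t *pkts, size_t n)
{
    assert(n <= RING_SIZE);
    for (size_t i = 0; i < n; i++) {
        q->slots[q->avail_idx++ % RING_SIZE] = pkts[i];
        sim_doorbell(q);
    }
}

/* Batched path: fill the ring for the whole burst, then notify once. */
static void sim_tx_burst(struct sim_txq *q, const uint32_t *pkts, size_t n)
{
    assert(n <= RING_SIZE);
    for (size_t i = 0; i < n; i++)
        q->slots[q->avail_idx++ % RING_SIZE] = pkts[i];
    if (n > 0)
        sim_doorbell(q);
}
```

With a burst of 32 packets, the first path counts 32 doorbells and the second
counts 1. Whether the real PMD can be changed this way is exactly what I am
asking.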

(C)
There are two places using dpdk_ring_doorbell() in virtio_user.c:
eth_tx_burst(), and virtio_alloc_rxq(), which is called in virtio_recv_buf().
I looked at them further using "perf top -C 0". outw() can occupy as much as
80% of logical core 0 on a CentOS 32-bit VM. Here is the implementation of
outw() as produced by gcc preprocessing (-E):
static void outw(unsigned short int value, unsigned short int __port){
  __asm__ __volatile__ ("outw %w0,%w1": :"a" (value), "Nd" (__port));
}
Is the outw instruction a blocking call?
Based on this link, http://wiki.osdev.org/Inline_Assembly/Examples, I am not
sure whether it blocks or waits.

The question is: what is causing it to block during the outw operation?
Is that normal?
If it is simply because of the IO virtualization mapping guest port
addresses to host physical port addresses, how can it be improved by using
VT-d? By using MMIO as described below?

There are two components in vmxnet3: the userspace PMD code and the kernel
driver. It actually uses MMIO to access the memory.

vmxnet3.ko

vmxnet3_alloc_pci_resources -> compat_pci_resource_start -> ioremap



userspace -> PMD

vmxnet3_init_adapter:: adapter->hw_addr1 = (unsigned char*) mmap()
On the first fault on that virtual address, the vmxnet3 driver finds the
mapped IO region and caches it. Subsequent accesses to the virtual address
will be faster.

I wonder how virtio, using outw(), handles access to the IO port address.
Does it have to map the IO port address in the VM to the physical
port address on the host for EVERY access? If that is the case, could some
improvement be made by using an approach similar to the vmxnet3 model?
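
For comparison with the outw() above, here is a minimal sketch of what an
MMIO-style notify would look like from userspace once a region has been
mapped: a plain 16-bit store through a volatile pointer. The names
(mmio_write16, mmio_read16, sim_notify_queue) and the register offset
SIM_NOTIFY_OFF are hypothetical and chosen only for illustration; the base
pointer is assumed to come from an earlier mmap() of the device region. Per
the reply quoted below, the hypervisor would still have to be signalled, so
this may well not be any faster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_NOTIFY_OFF 16u /* hypothetical byte offset of a notify register */

/* Store a 16-bit value into a register inside a memory-mapped region. */
static void mmio_write16(volatile uint8_t *base, size_t offset, uint16_t value)
{
    assert(offset % sizeof(uint16_t) == 0); /* keep the access aligned */
    *(volatile uint16_t *)(base + offset) = value;
}

/* Load a 16-bit value from a register inside a memory-mapped region. */
static uint16_t mmio_read16(volatile uint8_t *base, size_t offset)
{
    assert(offset % sizeof(uint16_t) == 0);
    return *(volatile uint16_t *)(base + offset);
}

/* Notify the (simulated) device that the given queue has new buffers. */
static void sim_notify_queue(volatile uint8_t *base, uint16_t queue_index)
{
    mmio_write16(base, SIM_NOTIFY_OFF, queue_index);
}
```

In a test, an ordinary uint16_t array can stand in for the mapped region, so
the helpers can be exercised without a device.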


Thanks

James




On Fri, Dec 13, 2013 at 3:01 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Fri, 13 Dec 2013 14:04:35 -0800
> James Yu <ypyu2011 at gmail.com> wrote:
>
> > Resending it due to missing [dpdk-dev] in the subject line.
> >
> > I am using Spirent to send 2Gbps of traffic to a 10G port that is looped
> > back by l2fwd+DPDK+virtio in a CentOS 32-bit guest, and I receive on the
> > other port at only 700 Mbps. The CentOS 32-bit guest is on a Fedora 18 KVM
> > host. The virtual interfaces are configured as virtio port type, not e1000.
> > vhost-net was automatically used in qemu-kvm when virtio ports are used in
> > the guest.
> > The questions are
> > A. Why can it only reach 2Gbps?
> > B. Why is outw() using 40% of the entire measurement when it only tries to
> > write 2 bytes to the IO port using the assembly outw instruction? Is it a
> > blocking call? Or does it waste time mapping from the IO address of the
> > guest to the physical address of the IO port on the host?
> > C. Is there any way to improve it?
> > D. The vmxnet PMD code uses memory-mapped IO addresses, not port IO
> > addresses. Will it be faster to use memory-mapped IO addresses?
> >
> > Any pointers or feedback will help.
> > Thanks
> >
> > James
>
> The outw is a VM exit to the hypervisor. It informs the hypervisor that data
> is ready to send, and the hypervisor then runs. To really get better
> performance, virtio needs to be able to do multiple packets per send. For
> bulk throughput, GSO support would help, but that is a generic DPDK issue.
>
> Virtio uses I/O to signal the hypervisor (there is talk of using MMIO in
> later versions, but it won't be faster).
>
>

