[dpdk-users] Dpdk poor performance on virtual machine

Stephen Hemminger stephen at networkplumber.org
Tue Dec 27 19:52:53 CET 2016


On Tue, 27 Dec 2016 15:59:08 +0000
edgar helmut <helmut.edgar100 at gmail.com> wrote:

> A short explanation of how to read the comparison:
> the first row is the packet length in bytes.
> Throughput is half duplex, meaning:
> the second row is the VM throughput from port 1 to 2 (port 2 to 1 has
> approximately the same throughput) in gbps;
> the third row is the host throughput from port 1 to 2 (port 2 to 1 has
> approximately the same throughput) in gbps.
> 
> I.e. at a 1500-byte packet size testpmd on the host delivers ~9.82 gbps from
> port 1 to 2 and another ~9.82 gbps from port 2 to 1, while on the VM it only
> delivers ~3.9 gbps in each direction.
> 
> 
> On Tue, Dec 27, 2016 at 5:52 PM edgar helmut <helmut.edgar100 at gmail.com>
> wrote:
> 
> > Thanks. That's the document I am following.
> > At best I can only request that the hugepages not be shared with
> > others, but it never reserves them from the host's pre-allocated hugepages.
> > Did you have a chance to use hugepages for a guest?
> >
> > As for the interfaces, I am using virtio/vhost, which creates the
> > macvtap:
> >     <interface type='direct' managed='yes'>
> >         <source dev='ens6f0' mode='passthrough'/>
> >         <model type='virtio'/>
> >         <driver name='vhost' queues='2'/>
> >         <address type='pci' domain='0x0000' bus='0x04' slot='0x09' function='0x0'/>
> >     </interface>
> >
> > The following is a performance comparison of host vs. VM using testpmd. As
> > you can see, the VM performance is poor.
> >
> > (sudo x86_64-native-linuxapp-gcc/app/testpmd -c 0x1f -n 3 -m 1024 --
> > --coremask=0x1e --portmask=3 -i)
> >
> >
> > pkt size (bytes)   64     128    256    500    800    1000   1500
> > vm (gbps)          0.23   0.42   0.75   1.3    2.3    2.7    3.9
> > host (gbps)        3.6    6.35   8.3    9.5    9.7    9.8    9.82
> >
> > I have to improve it dramatically.
> >
> >
> >
> > On Mon, Dec 26, 2016 at 2:52 AM Hu, Xuekun <xuekun.hu at intel.com> wrote:
> >
> > Searching “hugepages” in https://libvirt.org/formatdomain.html
> >
> >
> >
> > If you are looking to measure in and out packets through the host, maybe
> > you can also look at the vhost/virtio interface.
> >
> >
> >
> > After your testing, if you can report the performance you get with macvtap,
> > that also helps us. :)
> >
> >
> >
> >
> >
> > *From:* edgar helmut [mailto:helmut.edgar100 at gmail.com]
> > *Sent:* Saturday, December 24, 2016 11:53 PM
> >
> >
> > *To:* Hu, Xuekun <xuekun.hu at intel.com>
> > *Cc:* Wiles, Keith <keith.wiles at intel.com>; users at dpdk.org
> > *Subject:* Re: [dpdk-users] Dpdk poor performance on virtual machine
> >
> >
> >
> > Any idea how to reserve hugepages for a guest (and not
> > transparent/anonymous hugepages)?
> >
> > I am using libvirt, and any backing method I try results in
> > anonymous hugepages.
> >
> > Disabling transparent hugepages resulted in no hugepages at all.
> >
> > Thanks
> >
> >
> >
> > On Sat, Dec 24, 2016 at 10:06 AM edgar helmut <helmut.edgar100 at gmail.com>
> > wrote:
> >
> > I am looking for a means to measure packets in and out of the VM
> > (without asking the VM itself). While pure passthrough doesn't expose an
> > interface to query for in/out packets, macvtap exposes such an interface.
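> >
> > (Concretely, the kind of counters I mean are the standard netdev statistics
> > the host exposes for the macvtap device; the device name below is just an
> > example:)
> >
> >     # on the host, per-direction packet counters of the macvtap device:
> >     cat /sys/class/net/macvtap0/statistics/rx_packets
> >     cat /sys/class/net/macvtap0/statistics/tx_packets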
> >
> > As for the anonymous hugepages, I was looking for a more flexible method
> > and assumed there was not much difference.
> >
> > I will make the test with reserved hugepages.
> >
> > However, is there any known issue with macvtap performance when
> > delivering 5-6 gbps?
> >
> >
> >
> > Thanks
> >
> >
> >
> >
> >
> > On 24 Dec 2016 9:06 AM, "Hu, Xuekun" <xuekun.hu at intel.com> wrote:
> >
> > Now your setup has a new thing, “macvtap”. I don’t know what the
> > performance of macvtap is. I only know it has much worse performance than
> > “real” pci pass-through.
> >
> >
> >
> > I also don’t know why you selected such a config for your setup, anonymous
> > hugepages and macvtap. Any specific purpose?
> >
> >
> >
> > I think you should get a baseline first, then measure how much performance
> > drops when using anonymous hugepages or macvtap:
> >
> > 1. Baseline: real hugepages + real pci pass-through (a configuration sketch
> > follows below)
> >
> > 2. Anonymous hugepages vs. hugepages
> >
> > 3. Real pci pass-through vs. macvtap
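> >
> > (For the baseline, the “real” pass-through side would be a plain hostdev
> > entry in the domain XML instead of the macvtap interface, along these lines;
> > the PCI address is only a placeholder for the 82599's actual address:)
> >
> >     <hostdev mode='subsystem' type='pci' managed='yes'>
> >       <source>
> >         <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
> >       </source>
> >     </hostdev>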
> >
> >
> >
> > *From:* edgar helmut [mailto:helmut.edgar100 at gmail.com]
> > *Sent:* Saturday, December 24, 2016 3:23 AM
> > *To:* Hu, Xuekun <xuekun.hu at intel.com>
> > *Cc:* Wiles, Keith <keith.wiles at intel.com>; users at dpdk.org
> >
> >
> > *Subject:* Re: [dpdk-users] Dpdk poor performance on virtual machine
> >
> >
> >
> > Hello,
> >
> > I changed the setup but performance is still poor :( and I need your help
> > to understand the root cause.
> >
> > The setup is (sorry for the long description):
> >
> > (test equipment is pktgen using dpdk, installed on a second physical
> > machine connected with 82599 NICs)
> >
> > host: Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz, single socket,
> > ubuntu 16.04, with 4 hugepages of 1G each.
> >
> > hypervisor (kvm): QEMU emulator version 2.5.0
> >
> > guest: same cpu as host, created with 3 vcpus, using ubuntu 16.04
> >
> > dpdk: tried 2.2, 16.04, 16.07, 16.11 - using testpmd and 512 pages of 2M
> > each.
> >
> > guest total memory is 2G and all of it is backed by the host with
> > transparent hugepages (I can see the AnonHugePages consumed at guest
> > creation). This memory includes the 512 hugepages for the testpmd
> > application.
> >
> > I pinned and isolated the guest's vcpus (using the kernel option isolcpus),
> > and could see clearly that the isolation functions well.
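> >
> > (For reference, the pinning was done along these lines; the core numbers
> > below are illustrative, not necessarily the exact ones used here:)
> >
> >     # host kernel command line, keeping the guest cores away from the
> >     # general scheduler:
> >     isolcpus=2,3,4
> >
> >     # libvirt domain XML, pinning each of the 3 vcpus to one isolated core:
> >     <cputune>
> >       <vcpupin vcpu='0' cpuset='2'/>
> >       <vcpupin vcpu='1' cpuset='3'/>
> >       <vcpupin vcpu='2' cpuset='4'/>
> >     </cputune>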
> >
> >
> >
> > 2 x 82599 NICs are connected to the guest as passthrough using macvtap
> > interfaces, so the guest receives and forwards packets from one interface
> > to the second and vice versa.
> >
> > At the guest I bind its interfaces using igb_uio.
> >
> > The testpmd at the guest starts dropping packets at about ~800 mbps between
> > both ports bi-directionally, using two vcpus for forwarding (of the three
> > vcpus, one is for application management and two are for forwarding).
> >
> > At 1.2 gbps it drops a lot of packets.
> >
> > The same testpmd configuration on the host (between both 82599 NICs)
> > forwards about 5-6 gbps on both ports bi-directionally.
> >
> > I assumed that forwarding ~5-6 gbps between two ports should be trivial,
> > so it would be great if someone could share their configuration for a
> > tested setup.
> >
> > Any further ideas will be highly appreciated.
> >
> >
> >
> > Thanks.
> >
> >
> >
> > On Sat, Dec 17, 2016 at 2:56 PM edgar helmut <helmut.edgar100 at gmail.com>
> > wrote:
> >
> > That's what I was afraid of.
> >
> > In fact I need the host to back the entire guest's memory with hugepages.
> >
> > I will find a way to do that and test again.
> >
> >
> >
> >
> >
> > On 16 Dec 2016 3:14 AM, "Hu, Xuekun" <xuekun.hu at intel.com> wrote:
> >
> > You said the VM’s memory was 6G, while transparent hugepages only covered
> > ~4G (4360192 KB). So some of it was mapped to 4K pages.
> >
> >
> >
> > BTW, the memory used by transparent hugepages is not the hugepages you
> > reserved in the kernel boot option.
> >
> >
> >
> > *From:* edgar helmut [mailto:helmut.edgar100 at gmail.com]
> > *Sent:* Friday, December 16, 2016 1:24 AM
> > *To:* Hu, Xuekun
> > *Cc:* Wiles, Keith; users at dpdk.org
> > *Subject:* Re: [dpdk-users] Dpdk poor performance on virtual machine
> >
> >
> >
> > In fact the VM was created with 6G RAM, and its kernel boot args define
> > 4 hugepages of 1G each, though when starting the VM I noted that
> > AnonHugePages increased.
> >
> > The relevant qemu process id is 6074, and the following sums the amount of
> > allocated AnonHugePages:
> > sudo grep AnonHugePages /proc/6074/smaps \
> >   | awk '{if ($2 > 0) print $2}' | awk '{s += $1} END {print s}'
> >
> > which results in 4360192.
> >
> > So not all the memory is backed with transparent hugepages, though it is
> > more than the amount of hugepages the guest is supposed to boot with.
> >
> > How can I be sure that the required 4G of hugepages is really allocated,
> > and not, for example, only 2G out of the 4G (with the remaining 2G mapped
> > as default 4K pages)?
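> >
> > (One possible cross-check I can think of, a sketch only:)
> >
> >     # host side: if the guest is really backed by the reserved pool,
> >     # HugePages_Free drops by the guest size; THP shows up as AnonHugePages:
> >     grep -E 'HugePages_(Total|Free)|AnonHugePages' /proc/meminfo
> >
> >     # per-mapping view of the qemu process: hugetlbfs mappings report their
> >     # real page size here, while THP-backed anonymous memory still reports 4 kB:
> >     grep KernelPageSize /proc/6074/smaps | sort | uniq -c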
> >
> >
> >
> > thanks
> >
> >
> >
> > On Thu, Dec 15, 2016 at 4:33 PM, Hu, Xuekun <xuekun.hu at intel.com> wrote:
> >
> > Are you sure the AnonHugePages size was equal to the total VM's memory
> > size?
> > Sometimes the transparent hugepage mechanism doesn't guarantee that the app
> > is using real hugepages.
> >
> >
> >
> > -----Original Message-----
> > From: users [mailto:users-bounces at dpdk.org] On Behalf Of edgar helmut
> > Sent: Thursday, December 15, 2016 9:32 PM
> > To: Wiles, Keith
> > Cc: users at dpdk.org
> > Subject: Re: [dpdk-users] Dpdk poor performance on virtual machine
> >
> > I have one single socket which is Intel(R) Xeon(R) CPU E5-2640 v4 @
> > 2.40GHz.
> >
> > I just took two more steps:
> > 1. setting iommu=pt for better use of igb_uio (boot parameters below)
> > 2. using taskset and isolcpus, so now it looks like the relevant dpdk
> > threads run on dedicated cores.
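> >
> > (For completeness, the host boot parameters behind step 1; on an Intel box
> > iommu=pt is normally paired with intel_iommu=on:)
> >
> >     intel_iommu=on iommu=pt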
> >
> > It improved the performance, though I still see a significant difference
> > between the VM and the host which I can't fully explain.
> >
> > Any further ideas?
> >
> > Regards,
> > Edgar
> >
> >
> > On Thu, Dec 15, 2016 at 2:54 PM, Wiles, Keith <keith.wiles at intel.com>
> > wrote:
> >  
> > >  
> > > > On Dec 15, 2016, at 1:20 AM, edgar helmut <helmut.edgar100 at gmail.com> wrote:
> > > >
> > > > Hi.
> > > > Some help is needed to understand performance issue on virtual machine.
> > > >
> > > > Running testpmd over the host functions well (testpmd forwards 10g
> > > > between two 82599 ports).
> > > > However the same application running on a virtual machine over the same
> > > > host results in a huge degradation in performance.
> > > > The testpmd is then not even able to read 100 mbps from the NIC without
> > > > drops, and from a profile I made it looks like the DPDK application runs
> > > > more than 10 times slower than on the host…
> > >
> > > Not sure I understand the overall setup, but did you make sure the NIC/PCI
> > > bus is on the same socket as the VM, if you have multiple sockets on your
> > > platform? If you have to access the NIC across the QPI it could explain
> > > some of the performance drop. Not sure that much of a drop is this problem.
> > >  
> > > >
> > > > Setup is ubuntu 16.04 for the host and ubuntu 14.04 for the guest.
> > > > Qemu is 2.3.0 (though I tried a newer one as well).
> > > > NICs are connected to the guest using pci passthrough, and the guest's
> > > > cpu is set as passthrough (same as host).
> > > > On guest start the host allocates transparent hugepages (AnonHugePages),
> > > > so I assume the guest memory is backed with real hugepages on the host.
> > > > I tried binding with igb_uio and with uio_pci_generic but both result in
> > > > the same performance.
> > > >
> > > > Due to the performance difference I guess I am missing something.
> > > >
> > > > Please advise: what may I be missing here?
> > > > Is this a native penalty of qemu?
> > > >
> > > > Thanks
> > > > Edgar  

Did you set up the KVM host to run the guest in huge pages?

https://access.redhat.com/solutions/36741
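
For what it's worth, a minimal sketch of backing a libvirt guest with the
host's reserved 1G pages (the page count matches the 4 x 1G pool you
described; the hugetlbfs mount point is illustrative, and a reasonably
recent libvirt is assumed for the per-size <page> element):

    # host kernel command line, reserving four 1G pages at boot:
    default_hugepagesz=1G hugepagesz=1G hugepages=4

    # make sure a hugetlbfs mount for that page size exists:
    mount -t hugetlbfs -o pagesize=1G none /dev/hugepages-1G

    # in the guest's domain XML, request hugepage backing (1G pages) for its RAM:
    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB'/>
      </hugepages>
    </memoryBacking>

    # after the guest starts, HugePages_Free on the host should drop accordingly:
    grep -E 'HugePages_(Total|Free)' /proc/meminfo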

