[dpdk-dev] Performance regression in DPDK 1.8/2.0

Paul Emmerich emmericp at net.in.tum.de
Sun Apr 26 20:50:01 CEST 2015


Hi,

I'm working on a DPDK-based packet generator [1] and I recently tried to
upgrade from DPDK 1.7.1 to 2.0.0.
However, I noticed that DPDK 1.7.1 is about 25% faster than 2.0.0 for my use
case.

So I ran some basic performance tests on the l2fwd example with DPDK 1.7.1,
1.8.0 and 2.0.0.
I used an Intel Xeon E5-2620 v3 CPU clocked down to 1.2 GHz in order to
ensure that the CPU and not the network bandwidth is the bottleneck.
I configured l2fwd to forward between two interfaces of an X540 NIC using
only a single CPU core (-q2) and measured the following throughput under
full bidirectional load:


Version  TP [Mpps]  Cycles/Pkt
1.7.1    18.84      84.93
1.8.0    16.78      95.35
2.0.0    16.40      97.56
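
For reference, cycles per packet is simply the core clock divided by the
packet rate. The snippet below is a minimal sketch of that arithmetic; the
function name is made up for illustration. Note that the Cycles/Pkt column
matches a 1.6 GHz clock (1600 / 18.84 is about 84.93), whereas a 1.2 GHz
clock at the same rate would work out to about 63.7 cycles per packet, so
the clock frequency behind that column is worth double-checking.

```python
def cycles_per_packet(clock_hz: float, packets_per_second: float) -> float:
    """Average CPU cycles spent per forwarded packet on one core."""
    return clock_hz / packets_per_second


# Throughput values from the l2fwd table, in packets per second.
l2fwd_throughput = {"1.7.1": 18.84e6, "1.8.0": 16.78e6, "2.0.0": 16.40e6}

for version, pps in l2fwd_throughput.items():
    at_1_6_ghz = cycles_per_packet(1.6e9, pps)  # reproduces the table column
    at_1_2_ghz = cycles_per_packet(1.2e9, pps)  # clock stated in the text
    print(f"{version}: {at_1_6_ghz:.2f} cycles/pkt @ 1.6 GHz, "
          f"{at_1_2_ghz:.2f} cycles/pkt @ 1.2 GHz")
```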

DPDK 1.7.1 is about 15% faster in this scenario. The obvious suspect is the
new mbuf structure introduced in DPDK 1.8, so I profiled L1 cache misses:

Version   L1 miss ratio
1.7.1     6.5%
1.8.0    13.8%
2.0.0    13.4%


FWIW the performance results with my packet generator on the same 1.2 GHz
CPU core are:

Version  TP [Mpps]  L1 cache miss ratio
1.7.1    11.77      4.3%
2.0.0    9.5        8.4%
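
When comparing these numbers, "X% faster" and "X% loss" are different
ratios, since they use different baselines. The following is a small
sketch of both calculations using the throughput values from the two
tables; the helper names are hypothetical.

```python
def faster_pct(old_mpps: float, new_mpps: float) -> float:
    """How much faster the old version is, relative to the new one."""
    return (old_mpps / new_mpps - 1.0) * 100.0


def loss_pct(old_mpps: float, new_mpps: float) -> float:
    """How much throughput is lost by upgrading, relative to the old one."""
    return (1.0 - new_mpps / old_mpps) * 100.0


# (old, new) throughput in Mpps: DPDK 1.7.1 vs. 2.0.0
measurements = {
    "l2fwd": (18.84, 16.40),
    "packet generator": (11.77, 9.5),
}

for name, (old, new) in measurements.items():
    print(f"{name}: 1.7.1 is {faster_pct(old, new):.1f}% faster, "
          f"upgrading loses {loss_pct(old, new):.1f}%")
```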


The discussion of the original patch [2], which introduced the new mbuf
structure, addresses this potential performance degradation and mentions
that it is mitigated in some way.
It even claims a 20% *increase* in performance in a specific scenario.
However, that doesn't seem to be the case for either l2fwd or my packet
generator.

Any ideas how to fix this? Losing roughly 20% of my packet generator's
throughput (1.7.1 is about 25% faster) prevents me from upgrading to DPDK
2.0.0. I need the new lcore features and the 40 Gbit driver updates, so I
can't stay on 1.7.1 forever.

Paul


[1] https://github.com/emmericp/MoonGen
[2] http://comments.gmane.org/gmane.comp.networking.dpdk.devel/5155

