[dpdk-users] Mellanox ConnectX-4, DPDK and extreme latency issues

Arjun Roy arroy at eng.ucsd.edu
Thu Jun 22 20:39:39 CEST 2017


Greetings all.

I have a weird issue regarding excessive latency when forwarding packets with
Mellanox ConnectX-4 100GbE cards and DPDK. Specifically: running the l3fwd
and basicfwd DPDK example programs yields ping latencies of several (5-8)
milliseconds. I tried the same test using an Intel X540-AT2 card on the
same systems, and the latency was on the order of 4-5 microseconds.

Setup:

I have three systems: SysA, SysB, and SysC. Each runs Ubuntu 16.04 with
kernel 4.4.0-78-generic.
Each system is a dual-socket NUMA machine, where each socket is a 12-core
(+12 with hyperthreading enabled) Xeon E5-2650.
SysA and SysB each have a single Mellanox ConnectX-4 card, attached to
NUMA node 1, showing up as enp129s0f0 and enp129s0f1.
SysC has two ConnectX-4 cards, attached to node 0 and node 1. Node 0 has
enp4s0f0 and enp4s0f1, while node 1 has enp129s0f0 and enp129s0f1.
All machines also have a single dual-port Intel X540-AT2 10GbE NIC that
also supports DPDK.
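
(For reference, the NIC-to-node placement above can be read straight from
sysfs; interface names as listed:)

cat /sys/class/net/enp129s0f0/device/numa_node   # expect 1 per the layout above
cat /sys/class/net/enp4s0f0/device/numa_node     # expect 0 on SysC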


SysC forwards packets between SysA and SysB. SysA is connected to
enp129s0f0 on SysC, while SysB is connected to enp4s0f0 on SysC. (Note: I
tried a variety of configurations, including connecting SysA and SysB to
the same physical card on SysC, and the latency issue persists.) No
switches involved; all direct connect.

If it helps, the driver is from OFED 4.0-2 and the card firmware
is 12.18.2000.
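
(Both can be read off the cards; ethtool reports the mlx5 driver and
firmware, and ofed_info comes with the OFED install:)

ofed_info -s            # should report something like MLNX_OFED_LINUX-4.0-2.x
ethtool -i enp129s0f0   # driver: mlx5_core, firmware-version: 12.18.2000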

Now, with this setup and normal Linux kernel forwarding, I get 0.095 ms
ping on average from SysA to SysB (or vice versa).
However, if I run the DPDK forwarding apps, I get about 5-8 ms.
I'm testing with ping in both regular mode (1 second between pings) and
flood mode (sending pings as fast as possible). In either case the
latency is 5-8 ms per ping.
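
Concretely, the two test modes look like this (10.0.0.2 is a stand-in for
the peer's address, which I haven't included here):

ping -c 100 10.0.0.2             # regular mode: default 1 second interval
sudo ping -f -c 100000 10.0.0.2  # flood mode: back-to-back pings, needs root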

I have been running l3fwd with this command line:
sudo ./l3fwd -l 2,3 -n 4 -w 81:00.0 -w 04:00.0 --socket-mem=1024,1024 -- -p 0x3 -P --config="(1,0,2),(0,0,3)"
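
To spell out the arguments (everything before the "--" goes to the EAL,
everything after to l3fwd; the (port,queue,lcore) tuple format is from the
l3fwd sample app docs):

# -l 2,3         run on lcores 2 and 3
# -n 4           number of memory channels
# -w <pci>       whitelist only the two ConnectX-4 ports under test
# --socket-mem   1024 MB of hugepage memory per NUMA socket
# -p 0x3         port mask enabling ports 0 and 1
# -P             promiscuous mode on all enabled ports
# --config       (port,queue,lcore): port 1 queue 0 -> lcore 2,
#                port 0 queue 0 -> lcore 3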

In this case, I have verified that the cores and NUMA nodes line up; i.e.,
I'm assigning each port to a core on its local NUMA node.
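
(The check is just cross-referencing the PCI devices' NUMA nodes against
the lcore-to-node mapping:)

cat /sys/bus/pci/devices/0000:81:00.0/numa_node   # node of the first whitelisted port
cat /sys/bus/pci/devices/0000:04:00.0/numa_node   # node of the second
lscpu | grep -i numa                              # which CPU ids live on each node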


As a sanity check, I tried the same test with the Intel X540-AT2 cards,
wired with the same topology (SysA connects to one port on SysC, SysB
connects to the other port; note both ports are on the same physical
card), and in flood mode I get just 4-5 microseconds per ping.
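
One procedural difference, in case it's relevant: the mlx5 PMD sits on the
bifurcated kernel driver, so the Mellanox ports stay bound to mlx5_core,
while the X540 ports have to be bound to a DPDK driver before the run.
Something like the following (the PCI address is just an example, not the
real one from my boxes; igb_uio loaded beforehand):

sudo ./usertools/dpdk-devbind.py --status            # show which driver each NIC is bound to
sudo ./usertools/dpdk-devbind.py -b igb_uio 03:00.0  # bind one X540 port for DPDK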

Any ideas what might be causing multiple milliseconds of latency on the
Mellanox cards?

Thanks,
-Arjun Roy

