[dpdk-dev] [PATCH v2 00/12] net/virtio: add offload support

Olivier Matz olivier.matz at 6wind.com
Mon Oct 3 11:00:11 CEST 2016


This patchset, targeted for 16.11, introduces support for rx and tx
offloads in the virtio pmd.  To achieve this, some new mbuf flags must be
introduced, as discussed in [1].

It applies on top of:
- software packet type [2]
- testpmd enhancements [3]

The new mbuf checksum flags are backward compatible for current
applications that assume that unknown_csum = good_csum (since there
was only a bad_csum flag). But if the patchset is integrated, we
should consider updating the PMDs to match the new API for 16.11.
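
To illustrate the compatibility argument, here is a minimal sketch. The
EX_RX_L4_CKSUM_* names and bit values are hypothetical and chosen for the
example only; they are not the definitions from the patchset. A legacy
application tests only the "bad" bit, so "unknown" and "good" look the same
to it, while an updated application can tell them apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only; the real definitions
 * live in rte_mbuf.h and may differ. */
#define EX_RX_L4_CKSUM_UNKNOWN 0x0u       /* driver gives no information */
#define EX_RX_L4_CKSUM_BAD     (1u << 0)  /* checksum verified, wrong */
#define EX_RX_L4_CKSUM_GOOD    (1u << 1)  /* checksum verified, correct */
#define EX_RX_L4_CKSUM_MASK    (EX_RX_L4_CKSUM_BAD | EX_RX_L4_CKSUM_GOOD)

static_assert((EX_RX_L4_CKSUM_BAD & EX_RX_L4_CKSUM_GOOD) == 0,
              "bad and good must use distinct bits");

/* Legacy logic: only a "bad" flag is known, everything else is accepted. */
bool ex_legacy_accepts(uint64_t ol_flags)
{
    return (ol_flags & EX_RX_L4_CKSUM_BAD) == 0;
}

/* Updated logic: software verification is needed only when the driver
 * reports neither good nor bad. */
bool ex_needs_sw_verify(uint64_t ol_flags)
{
    return (ol_flags & EX_RX_L4_CKSUM_MASK) == EX_RX_L4_CKSUM_UNKNOWN;
}
```

In this sketch, ex_legacy_accepts() returns true for both the unknown and
the good state, which is why an unmodified application keeps its behavior.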

[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-October/048073.html
[3] http://dpdk.org/ml/archives/dev/2016-September/046443.html

changes v1 -> v2
- make mbuf checksum calculation static inline
- fix checksum calculation for protocols where csum=0 means no csum
- move mbuf checksum calculation to librte_net
- use RTE_MIN() to set max rx/tx queue
- rebase on top of head
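
For readers unfamiliar with the csum=0 case mentioned in the changelog, the
idea can be sketched as follows. This is a standalone illustration in the
style of RFC 1071, not the code from the patchset; the ex_* function names
and the zero_means_none parameter are hypothetical. For a protocol such as
UDP over IPv4, a transmitted checksum of 0 means "no checksum", so a
computed value of 0 has to be sent as 0xffff instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of a buffer, folded to 16 bits. Bytes are combined
 * as big-endian 16-bit words; an odd trailing byte is padded with zero. */
uint16_t ex_raw_cksum(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Final checksum. If zero_means_none is set, a computed 0 is replaced by
 * 0xffff, which is equivalent in ones' complement arithmetic. */
uint16_t ex_cksum(const uint8_t *buf, size_t len, int zero_means_none)
{
    uint16_t cksum = (uint16_t)~ex_raw_cksum(buf, len);

    if (zero_means_none && cksum == 0)
        cksum = 0xffff;
    return cksum;
}
```

A caller would pass zero_means_none=1 for UDP over IPv4 and 0 for TCP,
where a checksum of 0 is a valid value.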

Olivier Matz (12):
  virtio: move device initialization in a function
  virtio: setup and start cq in configure callback
  virtio: reinitialize the device in configure callback
  net: add function to calculate a checksum in a mbuf
  mbuf: add new Rx checksum mbuf flags
  app/testpmd: fix checksum stats in csum engine
  mbuf: new flag for LRO
  app/testpmd: display lro segment size
  virtio: add Rx checksum offload support
  virtio: add Tx checksum offload support
  virtio: add Lro support
  virtio: add Tso support

 app/test-pmd/csumonly.c                |   8 +-
 doc/guides/rel_notes/release_16_11.rst |  16 ++
 drivers/net/virtio/virtio_ethdev.c     | 182 +++++++++++++---------
 drivers/net/virtio/virtio_ethdev.h     |  18 +--
 drivers/net/virtio/virtio_pci.h        |   4 +-
 drivers/net/virtio/virtio_rxtx.c       | 270 ++++++++++++++++++++++++++++++---
 drivers/net/virtio/virtqueue.h         |   1 +
 lib/librte_mbuf/rte_mbuf.c             |  18 ++-
 lib/librte_mbuf/rte_mbuf.h             |  58 ++++++-
 lib/librte_net/rte_ip.h                |  60 ++++++++
 10 files changed, 526 insertions(+), 109 deletions(-)

Test plan
=========

(not fully replayed on v2, but no major change)

Platform description
--------------------

  guest (dpdk)
  +----------------+
  |                |
  |                |
  |         port0  +-----<---+
  |       ixgbe /  |         |
  |       directio |         |
  |                |         |
  |    port1       |         ^ flow1
  +----------------+         | (flow2 is the reverse)
         |                   |
         | virtio            |
         v                   |
  +----------------+         |
  |     tap0   /   |         |
  |1.1.1.1   /     |         |
  |ns-tap  /       |         |
  |      /         |         |
  |    /   ixgbe2  +------>--+
  |  /    1.1.1.2  |
  |/      ns-ixgbe |
  +----------------+
  host (linux, vhost-net)


flow1:
  host -(ixgbe)-> guest -(virtio)-> host
  1.1.1.2 -> 1.1.1.1

flow2:
  host -(virtio)-> guest -(ixgbe)-> host
  1.1.1.1 -> 1.1.1.2

Host configuration
------------------

Start qemu with:

- a ne2k management interface to avoid any conflict with dpdk
- 2 ixgbe interfaces given to the vm through vfio
- a virtio net device, connected to a tap interface through vhost-net

  /usr/bin/qemu-system-x86_64 -k fr -daemonize --enable-kvm -m 1G -cpu host \
    -smp 3 -serial telnet::40564,server,nowait -serial null \
    -qmp tcp::44340,server,nowait -monitor telnet::49229,server,nowait \
    -device ne2k_pci,mac=de:ad:de:01:02:03,netdev=user.0,addr=03 \
    -netdev user,id=user.0,hostfwd=tcp::34965-:22 \
    -device vfio-pci,host=0000:04:00.0 -device vfio-pci,host=0000:04:00.1 \
    -netdev type=tap,id=vhostnet0,script=no,vhost=on,queues=8 \
    -device virtio-net-pci,netdev=vhostnet0,ioeventfd=on,mq=on,vectors=17 \
    -hda "/path/to/ubuntu-14.04-template.qcow2" \
    -snapshot -vga none -display none

Move the tap interface into a netns, and configure it:

  ip netns add ns-tap
  ip netns exec ns-tap ip l set lo up
  ip link set tap0 netns ns-tap
  ip netns exec ns-tap ip l set tap0 down
  ip netns exec ns-tap ip l set addr 02:00:00:00:00:01 dev tap0
  ip netns exec ns-tap ip l set tap0 up
  ip netns exec ns-tap ip a a 1.1.1.1/24 dev tap0
  ip netns exec ns-tap arp -s 1.1.1.2 02:00:00:00:00:00
  ip netns exec ns-tap ip a

Move the ixgbe interface into a netns, and configure it:

  IXGBE=ixgbe2
  ip netns add ns-ixgbe
  ip netns exec ns-ixgbe ip l set lo up
  ip link set ${IXGBE} netns ns-ixgbe
  ip netns exec ns-ixgbe ip l set ${IXGBE} down
  ip netns exec ns-ixgbe ip l set addr 02:00:00:00:00:00 dev ${IXGBE}
  ip netns exec ns-ixgbe ip l set ${IXGBE} up
  ip netns exec ns-ixgbe ip a a 1.1.1.2/24 dev ${IXGBE}
  ip netns exec ns-ixgbe arp -s 1.1.1.1 02:00:00:00:00:01
  ip netns exec ns-ixgbe ip a

Guest configuration
-------------------

List of pci devices:

  00:02.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
  00:03.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8029(AS) [10ec:8029]
  00:04.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
  00:05.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000]

Compile dpdk:

  cd dpdk.org
  make config T=x86_64-native-linuxapp-gcc
  make -j4

Prepare environment:

  mkdir -p /mnt/huge
  mount -t hugetlbfs nodev /mnt/huge
  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
  modprobe uio_pci_generic
  python tools/dpdk_nic_bind.py -b uio_pci_generic 0000:00:02.0
  python tools/dpdk_nic_bind.py -b uio_pci_generic 0000:00:05.0

Run test
========

The test uses iperf to validate connectivity between the 2 netns of the
host, through the guest.

Iperf is run with:

  # flow1: host -(ixgbe)-> guest -(virtio)-> host
  ip netns exec ns-tap iperf -s
  ip netns exec ns-ixgbe iperf -c 1.1.1.1 -t 10

  # flow2: host -(virtio)-> guest -(ixgbe)-> host
  ip netns exec ns-ixgbe iperf -s
  ip netns exec ns-tap iperf -c 1.1.1.2 -t 10

The guest runs testpmd with the csum forward engine; its configuration
depends on the test case.

test1: large packets (lro/tso)
------------------------------

Configuration of testpmd:

  ./build/app/testpmd -l 0,1 --log-level 8 -- --total-num-mbufs=16384 \
    -i --port-topology=chained --disable-hw-vlan-filter \
    --disable-hw-vlan-strip --enable-rx-cksum --enable-lro \
    --crc-strip --txqflags=0

  set fwd csum
  tso set 1440 0
  csum set ip hw 0
  csum set tcp hw 0
  tso set 1440 1
  #csum set ip hw 1 # not supported by virtio
  csum set tcp hw 1
  start

Iperf log:

  root@ubuntu1404:~# ip netns exec ns-ixgbe iperf -c 1.1.1.1 -t 10
  ------------------------------------------------------------
  Client connecting to 1.1.1.1, TCP port 5001
  TCP window size: 85.0 KByte (default)
  ------------------------------------------------------------
  [  3] local 1.1.1.2 port 54460 connected with 1.1.1.1 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  6.14 GBytes  5.27 Gbits/sec
  root@ubuntu1404:~# ip netns exec ns-tap iperf -c 1.1.1.2 -t 10
  ------------------------------------------------------------
  Client connecting to 1.1.1.2, TCP port 5001
  TCP window size: 85.0 KByte (default)
  ------------------------------------------------------------
  [  3] local 1.1.1.1 port 58312 connected with 1.1.1.2 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  6.70 GBytes  5.76 Gbits/sec
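
As a sanity check when comparing the logs of the different test cases, the
Bandwidth column can be recomputed from the Transfer column. The sketch
below assumes that iperf reports GBytes in units of 2^30 bytes and Gbits in
units of 10^9 bits, which is consistent with the figures above; the helper
name is hypothetical.

```c
#include <assert.h>

/* Convert an iperf "Transfer" value in GBytes (assumed 2^30 bytes) over a
 * given duration into Gbits/sec (assumed 10^9 bits per second). */
double ex_gbits_per_sec(double gbytes, double seconds)
{
    assert(seconds > 0.0);
    return gbytes * 1073741824.0 * 8.0 / seconds / 1e9;
}
```

With the values above, 6.14 GBytes over 10 seconds gives about 5.27
Gbits/sec and 6.70 GBytes gives about 5.76 Gbits/sec.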

Example of what we see with "set verbose 1" in testpmd:

  -- flow1: ixgbe2 -> port0 (ixgbe) -> testpmd -> port1 (virtio) <-> tap0
  port=0, mbuf=0x7f968ad9fdc0, pkt_len=24682, nb_segs=13:
  rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=32 flags=PKT_RX_L4_CKSUM_UNKNOWN PKT_RX_IP_CKSUM_UNKNOWN
  tx: m->l2_len=14 m->l3_len=20 m->l4_len=32
  tx: m->tso_segsz=1440
  tx: flags=PKT_TX_IP_CKSUM PKT_TX_L4_NO_CKSUM PKT_TX_TCP_SEG PKT_TX_IPV4

  -- flow2: tap0 -> port1 (virtio)-> testpmd -> port0 (ixgbe) -> ixgbe2
  port=1, mbuf=0x7f968acc9f40, pkt_len=42058, nb_segs=21:
  rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=32 flags=PKT_RX_L4_CKSUM_NONE PKT_RX_IP_CKSUM_UNKNOWN PKT_RX_LRO
  rx: m->lro_segsz=1440
  tx: m->l2_len=14 m->l3_len=20 m->l4_len=32
  tx: m->tso_segsz=1440
  tx: flags=PKT_TX_IP_CKSUM PKT_TX_L4_NO_CKSUM PKT_TX_TCP_SEG PKT_TX_IPV4
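
To relate pkt_len and tso_segsz in the output above: the number of TCP
segments expected on the wire after segmentation can be estimated from the
payload length. This is a rough sketch with a hypothetical helper name; it
assumes that the l2/l3/l4 headers are counted once in pkt_len and that
every segment except the last carries exactly tso_segsz bytes of payload.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate the number of segments produced by TSO for one large packet:
 * ceil((pkt_len - headers) / segsz). */
uint32_t ex_tso_nb_segs(uint32_t pkt_len, uint32_t l2_len, uint32_t l3_len,
                        uint32_t l4_len, uint32_t segsz)
{
    uint32_t hdr_len = l2_len + l3_len + l4_len;
    uint32_t payload;

    assert(segsz > 0 && pkt_len >= hdr_len);
    payload = pkt_len - hdr_len;
    return (payload + segsz - 1) / segsz;
}
```

For the flow1 packet (pkt_len=24682, headers 14+20+32, tso_segsz=1440) this
gives 18 segments, and 30 for the flow2 packet (pkt_len=42058). This is not
the same thing as nb_segs, which counts the mbuf segments of the chain.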

test2: hardware checksum only
-----------------------------

Configuration of testpmd:

  ./build/app/testpmd -l 0,1 --log-level 8 -- --total-num-mbufs=16384 \
    -i --port-topology=chained --disable-hw-vlan-filter \
    --disable-hw-vlan-strip --enable-rx-cksum --crc-strip --txqflags=0

  set fwd csum
  csum set ip hw 0
  csum set tcp hw 0
  csum set tcp hw 1
  start

Iperf log:

  root@ubuntu1404:~# ip netns exec ns-ixgbe iperf -c 1.1.1.1 -t 10
  ------------------------------------------------------------
  Client connecting to 1.1.1.1, TCP port 5001
  TCP window size: 85.0 KByte (default)
  ------------------------------------------------------------
  [  3] local 1.1.1.2 port 54462 connected with 1.1.1.1 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  4.49 GBytes  3.86 Gbits/sec
  root@ubuntu1404:~# ip netns exec ns-tap iperf -c 1.1.1.2 -t 10
  ------------------------------------------------------------
  Client connecting to 1.1.1.2, TCP port 5001
  TCP window size: 85.0 KByte (default)
  ------------------------------------------------------------
  [  3] local 1.1.1.1 port 58314 connected with 1.1.1.2 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  6.66 GBytes  5.72 Gbits/sec

Example of what we see with "set verbose 1" in testpmd:

  -- flow1: ixgbe2 -> port0 (ixgbe) -> testpmd -> port1 (virtio) <-> tap0
  port=0, mbuf=0x7f0adca89b40, pkt_len=1514, nb_segs=1:
  rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=32 flags=PKT_RX_L4_CKSUM_UNKNOWN PKT_RX_IP_CKSUM_UNKNOWN
  tx: m->l2_len=14 m->l3_len=20 m->l4_len=32
  tx: flags=PKT_TX_TCP_CKSUM PKT_TX_IPV4

  -- flow2: tap0 -> port1 (virtio)-> testpmd -> port0 (ixgbe) -> ixgbe2
  port=1, mbuf=0x7f0adcb98d80, pkt_len=1514, nb_segs=1:
  rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=32 flags=PKT_RX_L4_CKSUM_NONE PKT_RX_IP_CKSUM_UNKNOWN
  tx: m->l2_len=14 m->l3_len=20 m->l4_len=32
  tx: flags=PKT_TX_IP_CKSUM PKT_TX_TCP_CKSUM PKT_TX_IPV4

test3: no offload
-----------------

Configuration of testpmd:

  ./build/app/testpmd -l 0,1 --log-level 8 -- --total-num-mbufs=16384 \
    -i --port-topology=chained --disable-hw-vlan-filter --disable-hw-vlan-strip

  set fwd csum
  start

Iperf log:

  root@ubuntu1404:~# ip netns exec ns-ixgbe iperf -c 1.1.1.1 -t 10
  ------------------------------------------------------------
  Client connecting to 1.1.1.1, TCP port 5001
  TCP window size: 85.0 KByte (default)
  ------------------------------------------------------------
  [  3] local 1.1.1.2 port 54466 connected with 1.1.1.1 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  4.29 GBytes  3.68 Gbits/sec
  root@ubuntu1404:~# ip netns exec ns-tap iperf -c 1.1.1.2 -t 10
  ------------------------------------------------------------
  Client connecting to 1.1.1.2, TCP port 5001
  TCP window size: 85.0 KByte (default)
  ------------------------------------------------------------
  [  3] local 1.1.1.1 port 58316 connected with 1.1.1.2 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  6.66 GBytes  5.72 Gbits/sec

Example of what we see with "set verbose 1" in testpmd:

  -- flow1: ixgbe2 -> port0 (ixgbe) -> testpmd -> port1 (virtio) <-> tap0
  port=0, mbuf=0x7faf38b3e700, pkt_len=1514, nb_segs=1:
  rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=32 flags=PKT_RX_L4_CKSUM_UNKNOWN PKT_RX_IP_CKSUM_UNKNOWN
  tx: flags=PKT_TX_L4_NO_CKSUM PKT_TX_IPV4

  -- flow2: tap0 -> port1 (virtio)-> testpmd -> port0 (ixgbe) -> ixgbe2
  port=1, mbuf=0x7faf38b71500, pkt_len=1514, nb_segs=1:
  rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=32 flags=PKT_RX_L4_CKSUM_UNKNOWN PKT_RX_IP_CKSUM_UNKNOWN
  tx: flags=PKT_TX_L4_NO_CKSUM PKT_TX_IPV4

-- 
2.8.1