[dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver

贾学涛 jiaxt at sinogrid.com
Thu Apr 9 05:43:26 CEST 2015


Hi Cunming,
     I applied the bifurcated driver patches and tested them following
your example, but I can't receive packets with testpmd or l2fwd.
    The kernel stack can receive packets from 10.0.0.2 before I run
"ethtool -N XGE4.1 flow-type ip4 src-ip 10.0.0.2 action 12". After I run
"ethtool -N XGE4.1 flow-type ip4 src-ip 10.0.0.2 action 12", the kernel
stack no longer receives packets from 10.0.0.2, but testpmd and l2fwd do
not receive any packets either.
   Queues 0-11 are used by the kernel and queue 12 is used by the
bifurcated driver.
   How can I make it work?

2014-11-25 22:11 GMT+08:00 Cunming Liang <cunming.liang at intel.com>:

>
> This is a RFC patch set to support "bifurcated driver" in DPDK.
>
>
> What is "bifurcated driver"?
> ===========================
>
> The "bifurcated driver" stands for the kernel NIC driver that supports:
>
> 1. on-demand rx/tx queue pairs split-off and assignment to user space
>
> 2. direct NIC resource (e.g. rx/tx queue registers) access from user space
>
> 3. distributing packets to kernel or user space rx queues by
>    NIC's flow director according to the filter rules
>
> Here's the kernel patch set that adds this support:
> http://comments.gmane.org/gmane.linux.network/333615
>
>
> Usage scenario
> =================
>
> It's well accepted in the industry to use DPDK to process fast path
> packets in user space in a high performance fashion. Meanwhile, processing
> slow path control packets in kernel space is still needed, as those
> packets usually rely on in-kernel TCP/IP stacks and/or the socket
> programming interface.
>
> The KNI (Kernel NIC Interface) mechanism in DPDK is designed to meet this
> requirement, with the limitations below:
>
>   1) Software classifies packets and distributes them to the kernel via
>      DPDK software rings, at the cost of significant CPU cycles and memory
>      bandwidth.
>
>   2) Copying packets between the kernel's socket buffer and the mbuf has a
>      significant negative impact on KNI performance.
>
> The bifurcated driver provides an alternative approach that not only
> offloads flow classification and distribution to the NIC but also supports
> packet zero-copy.
>
> Users can use standard ethtool to add filter rules to the NIC in order to
> distribute specific flows to the queues accessed only by the kernel driver
> and stack, and add other rules to distribute packets to the queues assigned
> to user space.
>
> For those rx/tx queue pairs that are directly accessed from user space,
> DPDK takes over packet rx/tx as well as the corresponding DMA operations
> for high performance packet I/O.
>
>
> What's the impact and change to DPDK
> ======================================
>
> DPDK usually binds PCIe NIC devices by leveraging the kernel's user space
> driver mechanisms, UIO or VFIO, to map the NIC's entire PCIe I/O space to
> user space.
> The bifurcated driver PMD talks to a NIC interface using raw socket APIs
> and only mmap()s limited I/O space (e.g. certain 4K pages) for accessing
> the involved rx/tx queue pairs. So the impact and changes mainly come from
> the items below:
>
> - netdev
>     DPDK needs to create an af_packet socket and bind it to a bifurcated
>     netdev. The socket fd will be used to request 'queue pairs info',
>     'split/return queue pairs', etc. The PCIe device ID, netdev MAC
>     address and numa info also come from the netdev response.
>
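
As a rough illustration of the netdev step above, the sketch below opens a
raw AF_PACKET socket and binds it to a named interface using only the
standard Linux socket API. The function names fill_bind_addr() and
open_bound_socket() are made up for this example and are not the EAL API
from the patch set; the bifurcated-driver requests issued on the fd (queue
pair info, split/return) are not shown because their interface is defined
by the kernel patch. Opening the socket normally requires CAP_NET_RAW.

```c
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/if.h>           /* if_nametoindex */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill the link-layer address used for bind(). */
void fill_bind_addr(struct sockaddr_ll *sll, unsigned int ifindex)
{
    assert(sll != NULL);
    memset(sll, 0, sizeof(*sll));
    sll->sll_family = AF_PACKET;
    sll->sll_protocol = htons(ETH_P_ALL);
    sll->sll_ifindex = (int)ifindex;
}

/*
 * Hypothetical helper: open an AF_PACKET socket bound to `iface`.
 * Returns the socket fd, or -1 if the interface does not exist or the
 * socket cannot be created or bound.
 */
int open_bound_socket(const char *iface)
{
    struct sockaddr_ll sll;
    unsigned int ifindex;
    int fd;

    ifindex = if_nametoindex(iface);
    if (ifindex == 0)
        return -1;            /* no such interface */

    fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    fill_bind_addr(&sll, ifindex);
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use something like open_bound_socket("eth0") and keep the
returned fd for the later requests and for the mmap() of the queue pages.
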
> - PCIe device scan and driver probe
>     netdev provides the PCIe device ID information. The correct driver
>     is selected according to the device ID. For such a netdev device, the
>     PCIe device is no longer created by a scan but by on-demand assignment.
>
> - PCIe BAR mapping
>     "bifurcated driver" maps several pages for the queue pairs.
>     The rest of the BAR register space maps to a fake page. The BAR mapping
>     goes through mmap on sockfd, which is a little different from what
>     UIO/VFIO does.
>
> - PMD
>     The PMD will no longer really initialize and configure the NIC.
>     Instead, it only takes care of the queue pair setup, rx_burst and
>     tx_burst.
>
> The patch uses the eal '--vdev' parameter to assign the netdev iface name
> and the number of queue pairs. Here's an example of how to configure the
> bifurcated driver and run DPDK testpmd with the bifurcated PMD.
>
>   1. Set promisc mode
>   > ifconfig eth0 promisc
>
>   2. Turn on fdir
>   > ethtool -K eth0 ntuple on
>
>   3. Setup a flow director rule to distribute packets with source ip
>      0.0.0.0 to rxq No.0
>   > ethtool -N eth0  flow-type udp4 src-ip 0.0.0.0 action 0
>
>   4. Run testpmd on netdev 'eth0' with 1 queue pair.
>   > ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 \
>   >  --vdev=rte_bifurc,iface=eth0,qpairs=1 -- \
>   >  -i --rxfreet=32 --txfreet=32 --txrst=32
>   Note:
>     The iface and qpairs arguments above specify the netdev interface name
>     and the number of qpairs that user space requests from the "bifurcated
>     driver", respectively.
>
>   5. Setup a flow director rule to distribute packets with source ip
>      1.1.1.1 to rxq No.32. This needs to be done after testpmd starts.
>   > ethtool -N eth0 flow-type udp4 src-ip 1.1.1.1 action 32
>
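
To make the '--vdev' argument from step 4 concrete, here is a minimal
sketch of splitting a string such as "rte_bifurc,iface=eth0,qpairs=1" into
its two parameters. It is only an illustration under stated assumptions:
struct bifurc_args and parse_bifurc_args() are hypothetical names, the
16-byte interface buffer and the 65535 upper bound on qpairs are arbitrary
choices for the example, and the actual patch may parse its devargs
differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the two vdev parameters. */
struct bifurc_args {
    char iface[16];        /* netdev interface name, e.g. "eth0" */
    unsigned int qpairs;   /* number of queue pairs requested */
};

/* Return the next comma-separated token, splitting the buffer in place. */
static char *next_token(char **cursor)
{
    char *tok = *cursor;
    char *comma;

    if (tok == NULL)
        return NULL;
    comma = strchr(tok, ',');
    if (comma != NULL) {
        *comma = '\0';
        *cursor = comma + 1;
    } else {
        *cursor = NULL;
    }
    return tok;
}

/*
 * Parse "rte_bifurc,iface=<name>,qpairs=<n>".
 * Returns 0 on success, -1 if the string is malformed or incomplete.
 */
int parse_bifurc_args(const char *devargs, struct bifurc_args *out)
{
    char buf[128];
    char *cursor = buf;
    char *tok;
    int have_iface = 0, have_qpairs = 0;

    assert(out != NULL);
    if (devargs == NULL || strlen(devargs) >= sizeof(buf))
        return -1;
    strcpy(buf, devargs);

    /* The first token must be the vdev driver name. */
    tok = next_token(&cursor);
    if (tok == NULL || strcmp(tok, "rte_bifurc") != 0)
        return -1;

    while ((tok = next_token(&cursor)) != NULL) {
        if (strncmp(tok, "iface=", 6) == 0) {
            const char *val = tok + 6;
            if (*val == '\0' || strlen(val) >= sizeof(out->iface))
                return -1;
            strcpy(out->iface, val);
            have_iface = 1;
        } else if (strncmp(tok, "qpairs=", 7) == 0) {
            const char *val = tok + 7;
            char *end;
            unsigned long n;

            if (*val < '0' || *val > '9')
                return -1;            /* reject empty or signed values */
            n = strtoul(val, &end, 10);
            if (*end != '\0' || n == 0 || n > 0xffff)
                return -1;
            out->qpairs = (unsigned int)n;
            have_qpairs = 1;
        } else {
            return -1;                /* unknown key */
        }
    }
    return (have_iface && have_qpairs) ? 0 : -1;
}
```

With the string from step 4, the sketch yields iface "eth0" and qpairs 1;
a string that omits either key is rejected.
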
> The detailed changes in this patch set are described below.
>
> eal
> --------
> The first two patches are all about the eal API declaration and Linux
> version definition to support the af_packet socket and the verbs of the
> bifurcated netdev. Those APIs include verbs like open, bind, (un)map,
> split/return and map_umem, and other APIs like set_pci, get_ifinfo and
> get/put_devargs, which help to generate a pci device from the bifurcated
> netdev and get basic netdev info.
>
> The third patch allows probing a driver on the PCIe VDEV created from a
> NIC interface driven by "bifurcated driver". It defines a new flag,
> 'RTE_PCI_DRV_BIFURC', used for the direct ring access PMD.
>
> librte_bifurc
> ---------------
> The library is used as a VDEV bus driver to scan '--vdev=rte_bifurc' VDEV
> from the eal command line. It generates the PCIe VDEV device ready for
> further driver probe. It maintains the bifurcated device information,
> including sockfd, hwaddr, mtu, qpairs and iface_name. Other direct ring
> access PMDs use it to obtain bifurcated device info.
>
> direct ring access PMD
> -------------------------
> The patch provides a direct ring access PMD for ixgbe. Compared to the
> normal ixgbe PMD, it uses the 'RTE_PCI_DRV_BIFURC' flag during self
> registration. It mostly reuses the existing PMD ops to avoid
> re-implementing everything from scratch. It also modifies
> rx/tx_queue_setup to allow queue setup from any queue offset.
>
> Supported NIC driver
> ========================
>
> The "bifurcated driver" kernel patch only supports the "ixgbe" driver at
> the moment, so this RFC patch also provides an "ixgbe" PMD via
> direct-mapped rings as a sample.
> The support for 40GE (i40e) will be added in the future.
>
> In addition, for multi-queue NICs with flow director capability to perform
> packet classification and distribution, there's no special technical gap
> in providing bifurcated driver support.
>
> Limitation
> ============
>
> By using "bifurcated driver", user space only takes over the DMA operation.
> The NIC configuration settings are outside the control of the user space
> PMD. All the NIC settings, including adding/deleting filter rules, need to
> be done with standard Linux network tools (e.g. ethtool).
> So the feature support really depends on how much is supported by ethtool.
>
>
> Any questions, comments and feedback are welcome.
>
>
> -END-
>
> Signed-off-by: Cunming Liang <cunming.liang at intel.com>
> Signed-off-by: Danny Zhou <danny.zhou at intel.com>
>
> Cunming Liang (6):
>   eal: common direct ring access API
>   eal: direct ring access support by linux af_packet
>   pci: allow VDEV as pci device during device driver probe
>   bifurc: add driver to scan bifurcated netdev
>   ixgbe: rx/tx queue stop bug fix
>   ixgbe: PMD for bifurc ixgbe net device
>
>  config/common_linuxapp                         |   5 +
>  lib/Makefile                                   |   1 +
>  lib/librte_bifurc/Makefile                     |  58 +++++
>  lib/librte_bifurc/rte_bifurc.c                 | 284 +++++++++++++++++++++
>  lib/librte_bifurc/rte_bifurc.h                 |  90 +++++++
>  lib/librte_eal/common/Makefile                 |   5 +
>  lib/librte_eal/common/include/rte_pci.h        |   4 +
>  lib/librte_eal/common/include/rte_pci_bifurc.h | 186 ++++++++++++++
>  lib/librte_eal/linuxapp/eal/Makefile           |   1 +
>  lib/librte_eal/linuxapp/eal/eal_pci.c          |  42 ++--
>  lib/librte_eal/linuxapp/eal/eal_pci_bifurc.c   | 336 +++++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.c                  |   3 +-
>  lib/librte_pmd_ixgbe/Makefile                  |  13 +-
>  lib/librte_pmd_ixgbe/ixgbe_bifurcate.c         | 303 ++++++++++++++++++++++
>  lib/librte_pmd_ixgbe/ixgbe_bifurcate.h         |  57 +++++
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c              |  44 +++-
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.h              |  10 +
>  mk/rte.app.mk                                  |   6 +
>  18 files changed, 1421 insertions(+), 27 deletions(-)
>  create mode 100644 lib/librte_bifurc/Makefile
>  create mode 100644 lib/librte_bifurc/rte_bifurc.c
>  create mode 100644 lib/librte_bifurc/rte_bifurc.h
>  create mode 100644 lib/librte_eal/common/include/rte_pci_bifurc.h
>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_pci_bifurc.c
>  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.c
>  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.h
>
> --
> 1.8.1.4
>
>
