Hemant Agrawal (Intel), Jerin Jacom (Cavium), Kannan Babu Ramia (Intel), Sujata Tibrewala (Networking Evangelist, Intel)
Zhihong Wang
With Partial Reconfiguration (PR) of parts of the bitstream, a Field Programmable Gate Array (FPGA) can provide not just one kind of accelerator but many types of accelerators at the same time. However, the lack of a standard software framework and APIs for integrating various FPGA devices is hindering FPGA integration with standard frameworks such as DPDK, and with it mass deployment. In this presentation, we introduce FPGA-BUS, which provides an FPGA management software framework that hides the hardware differences among various FPGA devices.
Hemant Agrawal (Software Architect, NXP AG), Akhil Goyal (Software Engineer, NXP Semiconductors)
In this talk we present a security framework for offloading cryptographic operations and specific protocol processing, such as IPsec, to hardware. This helps reduce the CPU cycles spent on packet processing. We provide a brief overview of the rte_security APIs and their implementation for inline and lookaside offload hardware.
Shally Verma (Manager, Project Management, Cavium)
Handles DPDK / ODP specification definition and development for the compression and crypto modules of Cavium's networking and storage family of devices.
Zhihong Wang
This talk will help developers improve virtual switches through a better understanding of recent and upcoming improvements in DPDK virtio/vhost, in both features and performance. Some best practices are also shared for both dev and ops.
Ashrut Ambastha (Sr. Staff Engineer, Mellanox Technologies)
Telcos and cloud providers are looking for higher performance and scalability when building next-generation datacenters for NFV and SDN deployments. While running OVS over DPDK reduces the CPU overhead of interrupt-driven packet processing, CPU cores are still not completely freed from polling packet queues.
To solve this challenge, OVS-DPDK is further accelerated through HW offloads.
We introduce a classification methodology that enables a split data plane between OVS-DPDK and the NIC hardware. A flow tag that represents the rule matched in hardware is passed to OVS, which saves the CPU cycles otherwise consumed by flow lookups. We present the open source work being done in the DPDK, OVS and Linux kernel communities and the significant performance gains achieved. We also present how this work can be extended to VXLAN traffic.
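As a rough illustration of why a hardware-supplied flow tag saves lookup work, the sketch below contrasts a software rule scan with a direct index by tag. It is not the OVS or DPDK implementation; the rule table, the `flow_tag` field name and both function names are hypothetical and chosen only for this example.

```python
# Hypothetical rule table: (match fields, action). A rule's position in the
# list doubles as the tag that the hardware would report for a match.
RULES = [
    ({"dst_ip": "10.0.0.1", "dst_port": 80}, "output:1"),
    ({"dst_ip": "10.0.0.2"}, "output:2"),
]


def software_lookup(packet):
    """Slow path: compare the packet's fields against every rule in order."""
    for tag, (match, action) in enumerate(RULES):
        if all(packet.get(key) == value for key, value in match.items()):
            return tag, action
    return None, "drop"


def classify(packet):
    """Use the hardware-supplied tag when present, else fall back to software."""
    tag = packet.get("flow_tag")  # assumed to be set by the NIC on a hardware match
    if tag is not None and 0 <= tag < len(RULES):
        return RULES[tag][1]      # direct index, no field comparison needed
    return software_lookup(packet)[1]
```

A packet that arrives with `flow_tag` set skips the rule scan entirely; untagged packets still take the software path, which is what makes the data plane split.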
Vipin Varghese (System Application Engineer, Intel)
Debugging memory corruption in DPDK applications can be difficult – particularly if multiple processes are accessing huge pages simultaneously. Given a machine with stripped binaries and no GDB instance – where do you start debugging?
Amol Patel (Sys App Eng Manager, Intel)
DPDK worker threads are Linux threads, and thread-level MMU protection does not exist. All worker threads of the primary process have access to the stack and heap memory of all the other worker threads, so one thread can corrupt another thread’s stack and heap-allocated memory. Corruption of a worker thread’s stack can be prevented and detected by provisioning the stack memory from a mem-zone; dynamically allocated objects can likewise be allocated from a mem-zone instead of the heap. This protects the thread’s stack and allows corruption to be checked by dumping the mem-zones at runtime. Accidental corruption of data-plane tables can be prevented by using some general C programming features and by centralizing data-plane object updates. Data-plane tables can be safeguarded by placing each of them in its own mem-zone and surrounding it with memory guard bands. This allows the user to dump the complete table in one go and easily identify table corruption, and the guard bands allow the user to identify any out-of-bounds access to the tables.
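The guard-band idea can be sketched in a few lines. This is only an illustration of the technique, not DPDK code: the buffer stands in for a mem-zone, and the guard pattern, class name and method names are all hypothetical.

```python
GUARD = b"\xDE\xAD\xBE\xEF" * 4  # hypothetical 16-byte guard pattern


class GuardedTable:
    """A fixed-size table stored between two guard bands in one buffer."""

    def __init__(self, entry_size, num_entries):
        self.entry_size = entry_size
        self.num_entries = num_entries
        self.start = len(GUARD)
        self.end = self.start + entry_size * num_entries
        # Layout: [front guard][table entries][back guard]
        self.buf = bytearray(GUARD) + bytearray(self.end - self.start) + bytearray(GUARD)

    def write_raw(self, offset, data):
        """Unchecked write relative to the table start (models a stray write)."""
        pos = self.start + offset
        self.buf[pos:pos + len(data)] = data

    def write_entry(self, index, data):
        """Centralized, bounds-checked update path for table entries."""
        if not 0 <= index < self.num_entries:
            raise IndexError("table index out of range")
        if len(data) != self.entry_size:
            raise ValueError("entry has wrong size")
        self.write_raw(index * self.entry_size, data)

    def guards_intact(self):
        """Return (front_ok, back_ok) by comparing both bands with the pattern."""
        front_ok = bytes(self.buf[:self.start]) == GUARD
        back_ok = bytes(self.buf[self.end:]) == GUARD
        return front_ok, back_ok
```

Updates that go through `write_entry` cannot leave the table, while a stray write that runs past the last entry damages the back guard band, so a periodic call to `guards_intact` reveals the out-of-bounds access.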
M. Jayakumar (Software Application Engineer, Intel)
WHAT IS THE PROBLEM STATEMENT: Vendor-agnostic DPDK runs with multiple open software components. When picking and choosing different software modules, ecosystem and open software developers need a checklist that is applicable across all of their platforms to ensure their product is tuned for best performance.
HOW DOES THIS PRESENTATION ADDRESS THIS?: The presentation explains each potential bottleneck in the system along with the tools to identify those issues. In addition, for each performance deterrent, it gives vendor-neutral tuning steps to achieve optimal performance. Since the steps are vendor neutral, the solutions are scalable to multiple platforms, in terms of both development and deployment.
CAN YOU GIVE SOME OUTLINE AND SAMPLE FLOW OF THE PRESENTATION?:
Although DPDK is a user-space process, it still coexists with the kernel, the OS scheduler, kernel drivers and kernel applications, and each can potentially impact performance. Take the OS scheduler as an example. It can take a DPDK core away from its network polling task and “steal” it to schedule other tasks. The tuning checklist gives steps to isolate the core from such disturbance. Similarly, an optimization made for mice flows may have to differ from one made for elephant flows. The checklist gives balanced, optimal guidelines.
Shashi Kant Singh (System Architect, Altiostar Networks India Pvt Ltd)
For optimal VM performance in cloud networks, the dimensioning of the VM plays an important role. Specifically, the CPU and RAM assignment affects not just workload performance but also operational aspects. VMs handling line-rate traffic need a DPDK-enabled framework and enough cores for workload processing, but this makes the VMs bulky from an operations perspective: handling live migrations and failures is difficult in such cases. The number of CPUs cannot be reduced beyond a certain level, as that would lead to sub-optimal performance from DPDK’s standpoint. Similarly, edge networks have a different set of challenges for VM dimensioning. Edge cloudlets consist of a mix of bare metal servers, dual-socket servers, single controller/compute nodes or a full-fledged chassis. Each of these has different constraints and needs to be handled separately for optimal VM dimensioning. This presentation brings out the factors that need to be considered for optimal VM dimensioning from an overall performance perspective.
Magnus Karlsson (Intel), Nikhil Rao (Software Engineer, Intel), Bjorn Topel (Intel)
Deep Packet Inspection (DPI) and other specialized packet processing workloads are often run in user space due to their complexity and/or specialization. With the increase of Ethernet speeds from 40 to 100 to 200 Gbit/s, the need for high-speed raw Ethernet frame delivery to Linux user space is ever increasing. In this talk, we present AF_XDP (formerly known as AF_PACKET V4), designed to scale to these high networking speeds through the use of true zero-copy, lock-less data structures, elimination of syscalls and other techniques, while still abiding by the isolation and security rules of Linux. AF_XDP is currently an RFC on the Linux netdev mailing list with the goal of getting it accepted upstream.
In our evaluation, AF_XDP provides a performance increase of up to 40x for some microbenchmarks and 20x for tcpdump compared to previous AF_PACKET (raw socket) versions in Linux. To illustrate the approach, we have implemented support for Intel I40E NICs and veth, but it should be easy to port to other NICs and virtual devices as well. AF_XDP is designed as an extension to the existing XDP support in Linux so that XDP-enabled devices will be able to use it. We also show how SW networking libraries and SDKs such as DPDK can benefit from AF_XDP to achieve increased robustness, ease of use and HW independence.
Masco Kaliyamoorthy (Software Engineer, Red Hat), Venkata Anil Kumar (Red Hat), Numan Siddique (Red Hat), Yogananth Subramanian (Red Hat)
Skydive is a real-time network topology, flow and protocol analyzer. Skydive can be used with OVS deployments, on both kernel and DPDK datapaths, to do on-demand port, payload and statistical analysis, which helps in monitoring and troubleshooting complex OpenStack/NFV/SDN environments. This talk covers using it in OVS-DPDK deployments. The talk also covers OVN, which provides virtual network abstractions (L2 and L3) on top of Open vSwitch, and using it with Skydive in OVS-DPDK environments.
Vishnu Itta (Senior Architect, MayaData), Mayank Patel (MayaData)
The Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high-performance, scalable, user-mode storage applications. New applications requiring fast access to storage can be built on top of SPDK; however, they need to adhere to principles that are a fundamental part of the SPDK/DPDK framework. To name a few: exactly one thread runs on each CPU core, and it never blocks but constantly polls for new events and executes the corresponding handlers in a loop. Blocking locks in a poller’s loop are not acceptable, since they could delay execution of other handlers on the reactor. Memory is allocated from pinned huge pages, which makes DMA transfers to and from the device possible and avoids copying data between buffers. These are fundamental design changes compared to how applications were built 5, 10 or more years ago (legacy applications). Legacy applications usually have many threads that use classic synchronization primitives (mutexes, reader/writer locks, etc.), which don’t scale with the number of CPUs, and rely on the kernel to synchronize and schedule the threads, with well-known ill effects. Redesigning legacy applications is often not feasible, especially if they are complex and have matured over many years. On the other hand, reimplementing them from scratch while preserving their stability and quality can take years. What we suggest instead is a compromise in which the legacy application runs with minimal changes while, to some degree, leveraging the performance that SPDK has to offer for IO. The legacy application runs in a separate process from SPDK, and VirtIO with the vhost-user protocol is used for passing data between them. VirtIO is a well-proven technology that allows moving data between two processes using shared memory without copying it, without heavyweight synchronization and without involving the kernel.
The vhost-user client-server protocol allows a VirtIO data channel to be established between two processes using a Unix domain socket. The SPDK framework already comes with a vhost-user server implementation. The missing part is a vhost-user client implementation in the form of a simple library that can be easily embedded into a legacy application. A library for allocating IO buffers from huge pages is necessary too, although DPDK’s rte_eal library seems sufficient for a proof of concept. A legacy application could completely bypass the kernel for disk IO, assuming SPDK runs a disk driver in userland, and although the application cannot unleash the full potential of SPDK due to its legacy design, it is expected to perform faster. The most important question, which we are trying to answer, is how much speed can be gained with this compromise solution compared to the traditional way of reading and writing data blocks through a block device file. While comparing IOPS from the fio benchmark tool is surely interesting, we also focus on and compare the performance of a real-world storage application, one of the most advanced local file systems of the present day: ZFS. It is perhaps less well known that ZFS can run in userspace, a mode that was initially introduced only for testing. It is exciting to observe how much the performance of the file system as a whole can be further improved by using SPDK as a storage backend.
In this talk, Mayank and Vishnu, active contributors to the open source project OpenEBS who worked on making ZFS SPDK-enabled, will share their learnings about:
– Integration of DPDK/SPDK libraries with the project
– Use of the mempool library for memory allocation of frequently used objects
– Use of the ring library for message passing between threads
– Experience with developing a vhost-user client library
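To give a feel for ring-based message passing between threads, here is a minimal single-producer, single-consumer ring sketch. It is not DPDK's ring library and does not reproduce its API; the class and method names are hypothetical. Python also has no explicit memory-ordering controls, so the sketch shows only the index arithmetic, not a production lock-free queue.

```python
class SpscRing:
    """Fixed-size ring for one producer thread and one consumer thread."""

    def __init__(self, size):
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.mask = size - 1
        self.slots = [None] * size
        self.head = 0  # count of items written; only the producer advances it
        self.tail = 0  # count of items read; only the consumer advances it

    def enqueue(self, item):
        """Producer side: store one item, or return False if the ring is full."""
        if self.head - self.tail == len(self.slots):
            return False
        self.slots[self.head & self.mask] = item
        self.head += 1  # publish only after the slot has been filled
        return True

    def dequeue(self):
        """Consumer side: return the oldest item, or None if the ring is empty."""
        if self.tail == self.head:
            return None
        index = self.tail & self.mask
        item = self.slots[index]
        self.slots[index] = None  # drop the reference held by the ring
        self.tail += 1
        return item
```

Because each index is advanced by exactly one side, neither side needs a blocking lock, which fits the never-block polling model described in the abstract. Both operations return immediately on a full or empty ring, so a poller can simply retry on its next iteration.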
Ziye Yang (Senior Software Engineer, Intel)
NVMe over Fabrics (NVMe-oF) extends the NVMe protocol from PCIe to fabrics and aims to provide high-performance access to remote NVMe devices. In this talk, an accelerated NVMe-oF target built with SPDK (Storage Performance Development Kit) is introduced. SPDK provides a set of tools and libraries for writing high-performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into user space and operating in a polled mode (similar to the idea in DPDK) instead of relying on interrupts, which avoids kernel context switches and eliminates interrupt handling overhead. The accelerated NVMe-oF target relies on SPDK’s framework, the user-space NVMe driver, an environment library that encapsulates DPDK’s EAL library (e.g., thread and memory management) and a standard fabrics library (e.g., ibverbs) to provide a high-performance block service. Compared with the Linux kernel’s NVMe-oF target, our solution is much more efficient, with a 10X improvement in per-CPU-core performance.
George Zhao (Director OSS & Ecosystem, Huawei)
With the development of cloud networks, the networking stack needs to be reinvented. Although user applications have more options for constructing high-performance solutions with varied stacks, there are many challenges:
* Legacy TCP is best effort based and provides no performance guarantee.
* A one-size-fits-all protocol or algorithm is less feasible.
* Network environments are complicated and heterogeneous.
* Concern about network security is growing.
We want to share Huawei’s practice: an open source protocol kit, DMM (Dual-domain, Multi-protocol, Multi-instance), that provides an extensible transport protocol framework and runtime. It enables application-transparent, dynamic engagement of new protocols. New protocols can be added on demand and managed dynamically. DMM is a new project in FD.io that has achieved great success in packet forwarding and provides flexible interfaces for user applications and protocols.
Sujata Tibrewala (Networking Evangelist, Intel)