[dpdk-dev] Network Stack discussion notes from 2015 DPDK Userspace

Avi Kivity avi at scylladb.com
Mon Oct 12 10:50:54 CEST 2015


On 10/10/2015 02:19 AM, Wiles, Keith wrote:
> Here are some notes from the DPDK Network Stack discussion, as far as I can remember them. Please help me fill in anything I missed.
>
> Items I remember we talked about:
>
>    *   The only reason for a DPDK TCP/IP stack is for performance and possibly lower latency
>       *   Meaning the developer is willing to re-write or write his application to get the best performance.
>    *   A TCP/IPv4/v6 stack is the minimum stack we need to support applications linked with DPDK.
>       *   SCTP is another protocol that may be required
>       *   TCP is the primary protocol and usage model for most use cases
>       *   Stack must be able to terminate TCP traffic to an application linked to DPDK
>    *   For DPDK, the customer is looking for fast applications and is willing to write the application specifically for the DPDK network stack
>       *   Converting an existing application could be done, but the design is for performance and may require a lot of changes to an application
>       *   Using an application API that is not Socket is fine for high performance and may be the only way to get the best performance.
>       *   Need to supply a Socket layer interface as an option if the customer is willing to take a performance hit instead of rewriting the application
>    *   Native application acceleration is desired, but not required when using DPDK network stack
>    *   We have two projects related to network stack in DPDK
>       *   The first one is porting an existing TCP/IP stack to DPDK; it needs to give a reasonable performance increase over native Linux applications
>          *   The stack code needs to be BSD/MIT like licensed (Open Sourced)
>          *   The stack should be up to date with the latest RFCs or at least close
>          *   A stack could be written for DPDK and its environment (not using an existing code base) for best performance
>          *   Need to be able to configure the DPDK stack(s) from the Linux command line tools if possible
>          *   Need a DPDK-specific application layer API for applications to interface with the network stack
>          *   Could have a socket layer API on top of the specific API for applications needing to use sockets (not expected to be the best performance)
>       *   The second item is figuring out a new IPC for East/West traffic within the same system.
>          *   The design needs to improve performance between applications and be transparent to the application when the remote end is not on the same system.
>          *   The new IPC path should be agnostic to local or remote end points
>          *   Needs to be very fast compared to current Linux IPC designs. (Will OVS work here?)

Basically, Seastar [1] matches this exactly.  Its TCP stack, unlike most 
stacks, is sharded -- there is a separate stack running on each core 
(but with a single IP address), no locking, zero-copy for both transmit 
and receive.  It has a fast IPC between cores (all data sharing in 
Seastar is via IPC queues; locks or atomic RMW operations are not 
used).  There is also an RPC subsystem that can be used for inter-node 
communications.  We've seen 7X performance improvements over the Linux 
TCP stack when coding a simple HTTP server.
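
To make the "queues instead of locks" point concrete, here is a minimal 
sketch of a single-producer/single-consumer ring that uses only plain 
atomic loads and stores (no locks, no read-modify-write operations), 
plus a toy function that maps a connection to a core.  This is not 
Seastar's actual code: the names spsc_queue, flow and shard_for are 
hypothetical, and the hash is a placeholder chosen for illustration.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical SPSC ring: exactly one core pushes, exactly one core pops.
// Each index is written by only one side, so plain acquire/release
// loads and stores suffice; no lock and no atomic RMW is needed.
template <typename T, std::size_t Capacity>
class spsc_queue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    std::array<T, Capacity> slots_{};
    alignas(64) std::atomic<std::size_t> head_{0};  // written by consumer only
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by producer only

public:
    // Producer side. Returns false if the ring is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail - head == Capacity) {
            return false;
        }
        slots_[tail & (Capacity - 1)] = value;
        // Publish the slot to the consumer.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns std::nullopt if the ring is empty.
    std::optional<T> try_pop() {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) {
            return std::nullopt;
        }
        T value = slots_[head & (Capacity - 1)];
        // Hand the slot back to the producer.
        head_.store(head + 1, std::memory_order_release);
        return value;
    }
};

// Hypothetical connection identifier (IPv4 4-tuple).
struct flow {
    std::uint32_t src_ip;
    std::uint32_t dst_ip;
    std::uint16_t src_port;
    std::uint16_t dst_port;
};

// Placeholder hash that pins a flow to one shard (core), so every packet
// of a connection is handled by the same per-core stack instance.
inline unsigned shard_for(const flow& f, unsigned n_shards) {
    std::uint64_t h = f.src_ip;
    h = h * 31 + f.dst_ip;
    h = h * 31 + f.src_port;
    h = h * 31 + f.dst_port;
    return static_cast<unsigned>(h % n_shards);
}
```

In a sharded design along these lines, each pair of cores would own one 
such ring per direction, and try_push/try_pop must each be called from 
a single thread only.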

Of course, it's not all roses. Seastar is written in C++, and the higher 
layers are asynchronous, so there's a high barrier to entry for DPDK 
developers.  Maybe it can't be merged outright, but perhaps it can 
provide some inspiration.

(Seastar supports subsets of TCP, UDP, ICMP, and DHCP over IPv4; no IPv6 
support)

[1] https://github.com/scylladb/seastar

> Did I miss any details or comments? Please reply and help me correct the notes or my understanding.
>
> Thanks to everyone for attending and packing into a small space.
>
> Regards,
> ++Keith Wiles
> Intel Corporation
