Feb 13

Unlocking Secure High-Performance Networking with DPDK’s Security Framework

By Ben Thomas Blog

The DPDK project is key to rapidly developing high-speed data packet networking applications and stands as a leading open-source project that has strategically developed a robust security framework through open development and open governance, setting a standard in high-performance networking.

This framework is crucial not just as an added feature but as a core requirement, enhancing the security of packet processing across a wide range of environments, from data centers to network edges, and from infrastructure to experimental applications globally.

The lack of such a framework could escalate risks like data breaches and denial of service attacks, threatening both system integrity and compliance. However, DPDK’s ongoing commitment to developing this framework and security measures not only safeguard data and optimize system performance but also build trust and ensure compliance, fostering innovation and expanding its adoption in regulated and sensitive environments.

The DPDK community relies on the collaborative efforts of its members to enhance and maintain its security framework, a critical component given the high-performance networking demands it supports. Contributors from around the world actively participate in discussions, patch reviews, and the development of new features through open governance structures.

This collective engagement ensures that the DPDK remains at the forefront of network technology by incorporating the latest security protocols and responding swiftly to emerging vulnerabilities. Community involvement is crucial not only for the development of robust security measures but also for fostering an ecosystem where knowledge and best practices are shared openly, enhancing the reliability and security of the applications powered by DPDK.

This document outlines the security protocols and implementations supported within DPDK, serving as a guide for both users and contributors. It provides:

For Users: An in-depth understanding of DPDK’s security capabilities and practical guidance for implementation in real-world applications.
For Contributors: An overview of existing implementations, identification of areas for improvement, and opportunities to extend the framework, ensuring DPDK evolves to meet emerging security requirements.

The Imperative of Security in High-Performance Networking

High-performance networks handle vast amounts of data at extremely low latencies. However, integrating security measures in such environments presents unique challenges:

Encryption Overhead: Implementing cryptographic operations impacts system throughput and latency.
Key Management: Efficient cryptographic key management is crucial but complex, especially across distributed systems.
Secure Environment Setup: Configuring hardware and software to thwart unauthorized access is critical but demands meticulous effort.
Expanding Threat Landscape: High-speed networks face increased risks from data breaches and attacks due to their complexity and broader attack surfaces.

A Closer Look at DPDK’s Security Features

DPDK enhances high-performance network security through a blend of hardware and software solutions:

Hardware-Level Security: Utilizing features like Secure Memory Encryption and Trusted Execution Environments helps isolate and protect sensitive operations.
Software-Level Security: DPDK provides extensive support for security protocols and cryptographic operations, ensuring secure data handling across networks.

Supported Protocols and Features

DPDK’s security capabilities are robust, supporting various protocols essential for modern networking:

Transport Layer Security (TLS): Includes support for TLS 1.2 and 1.3 through hardware-accelerated cryptographic operations, although current support is primarily available on specific hardware like Marvell.
IPSec: Facilitates secure IP communications through DPDK’s crypto-dev and security API, enhancing data integrity and confidentiality.
MACSec: Implements Layer 2 security to encrypt and authenticate network traffic efficiently.
WireGuard and QUIC: Although native support is limited, DPDK’s high-performance capabilities can be leveraged to optimize these protocols, enhancing VPN and low-latency communications.

Identifying Areas for Improvement

While DPDK’s Security Framework is comprehensive, there are areas needing enhancement to keep pace with evolving security demands:

Extending TLS Support: Broadening hardware support for TLS offloading can enhance performance and encourage wider adoption. Adding TLS Handshake support would also be beneficial.
Integrating Modern Protocols: Better support for protocols like WireGuard and QUIC can position DPDK as a more versatile tool for secure, high-throughput networking.

Opportunities for Contributors

The ongoing development of DPDK’s Security Framework offers numerous opportunities for contributors:

Hardware Integration: Collaborating with hardware vendors to expand support for cryptographic offloading.
Protocol Development: Enhancing support for emerging protocols, ensuring DPDK remains relevant in an ever-evolving landscape.

Contribute today!

The DPDK project is community-driven, and the security domain presents numerous opportunities for old and new contributors to make an impact.

Read the full document here: 10. Security Support — Data Plane Development Kit 25.03.0-rc0 documentation

Learn how to start contributing here: https://www.dpdk.org/contribute/

Dec 10

DPDK 24.11: Another Step Forward for Performance Networking

By Ben Thomas Blog

DPDK has unveiled its latest release, DPDK 24.11, marking a significant step forward in performance, flexibility, and hardware integration for packet processing. Here’s what you need to know about the latest advancements, new features, and what they mean for developers and system integrators.

A Busy Release Cycle: By the Numbers

This release was the product of an impressive collaborative effort:

1329 commits by 196 authors.
Changes across 2557 files, with 376,587 insertions and 177,108 deletions.
Contributions from a wide range of organizations, ensuring a diverse and robust codebase.

The new release introduces some API/ABI compatibility breakages, marking a clear departure from previous versions. The new ABI version (25) means that while 24.11 will be supported for three years, its successors (25.03 and 25.07) will maintain ABI compatibility with this release. This makes 24.11 an excellent foundation for long-term system integration and deployment.

Key Highlights of DPDK 24.11

The new features and improvements span various areas, from power management and IPv6 handling to enhanced cryptography and logging. Below are some of the most notable updates:

1. Performance and Resource Management

Lcore Variable Allocation: Improved CPU core management allows for more dynamic and efficient resource utilization.
Bit Set and Atomic Bit Manipulation: Streamlined bit operations enhance concurrency and reduce overhead.
Power Management Enhancements:
- AMD uncore power management improves energy efficiency.
- Per-CPU power management QoS for resume latency enables finer control of power-performance trade-offs.

2. Networking Features

IPv6 Address API: Simplifies the handling of IPv6 addresses in applications.
RSS Hash Key Generation: Automates generation of RSS keys, enhancing load balancing capabilities.
Ethernet Link Lanes: Improves support for high-speed Ethernet configurations.
Flow Table Index Action: Adds more control for flow-based operations, aiding advanced packet processing.

3. Hardware and Driver Support

New drivers and enhancements to existing ones extend DPDK’s hardware compatibility:

New Hardware Integrations:
- Cisco ENIC VF, Marvell CN20K, Napatech NTN flow engine.
- Realtek R8169 and ZTE drivers expand support for a broader range of network interfaces.
Enhanced Cryptography Support:
- Symmetric Crypto SM4 and Asymmetric Crypto EdDSA bolster security capabilities.

4. Event and Logging Improvements

Event Device Features: Pre-scheduling and independent enqueue capabilities enhance event-driven processing.
Logging Rework: Revamped logging adds timestamps, color-coded outputs, syslog, and journal support, improving debugging and operational insights.

Welcoming New Contributors

The DPDK community grew stronger with 50 new contributors, spanning roles such as authors, reviewers, and testers. This diversity reflects the open source ethos and ensures continuous innovation.

Key contributors include individuals from Intel, Marvell, NVIDIA, Red Hat, and many others, with Intel leading in the number of commits.

A special thanks to the top reviewers who played a vital role in maintaining code quality and collaboration. Their efforts highlight the often-underappreciated task of reviewing contributions.

Looking Ahead: What’s Next?

The next release, DPDK 25.03, is slated for March 2025. Development for this version is already underway, with feature submissions open through December. This cadence ensures a predictable and collaborative development process.

For those planning deployments, DPDK 24.11’s three-year support period makes it the recommended version for stability and long-term integration.

Download DPDK 24.11 here

Oct 09

Explore the latest Innovations and Insights from this year’s DPDK North America Summit

By Ben Thomas Blog

This year’s DPDK North America Summit highlighted the projects ongoing technical excellence and innovation in high-performance networking. The event gathered experts and enthusiasts from around the globe, including project pioneers and new community contributors. They engaged in discussions and demonstrations focused on recent code developments, the technical board’s plans, and notably, exciting use cases and applications.

Over the past 14 years, DPDK has methodically developed an open stack that meets a broad variety of user requirements. The project’s open development approach and adaptability to community needs have been invaluable. They showcase the project’s commitment to open source principles. However, these practices have also led to a more expansive and less streamlined code base. Nevertheless, the technical board skillfully manages the necessary compromises for both core developers and end users, highlighting some exciting developments at the summit.

The review of new technology integrations and organic implementation alongside the community’s evolution has been impressive. The project’s development, impact, and extensive application across a global infrastructure have been significant. This includes not just data centers, enterprise cloud services, and network equipment, but also transportation networks, telecom systems, financial trading platforms, industrial control systems, and even particle processors, and astronomical data processing!

The project has reached a pivotal stage of maturity, with use cases and applications expanding dynamically. This evolution presents an opportune moment to explore and showcase real-world applications, extending far beyond its conventional roles in routers and firewalls.

One highlight of the summit was the presentation by Robin Jarry and David Marchand from Red Hat, who introduced “Grout,” a graph router based on DPDK. This tool is designed to simulate network functions and physical routers to replicate the behavior of typically closed-source VNFs and CNFs using an open source tool. They provided a detailed explanation of the rte_graph library’s role in data path processing and showcased Grout’s capabilities.

Another notable session was led by Dr. John Romein from the Netherlands Institute for Radio Astronomy (ASTRON). He discussed how advanced GPU technologies, specifically the NVIDIA Grace Hopper Superchip, are being utilized to process vast amounts of data from radio telescopes. This session not only emphasized the integration of DPDK with GPU technology but also demonstrated its real-world applications in astronomical data processing, pushing the boundaries of modern hardware capabilities.

Each session, from discussions on the challenges of implementing DPDK on non-cache coherent platforms by Hemant Agrawal and Gagandeep Singh from NXP, to insights into machine learning inference within network processing by Srikanth Yalavarthi from Marvell, highlighted the versatility and robustness of DPDK. These discussions underscored its ability to meet the increasingly complex demands of network performance solutions.

In summary the latest DPDK summit provided a platform for learning and sharing, reinforcing the community’s commitment to driving innovation in network performance through open development and governance, as highlighted by the ongoing initiatives of the project.

Watch all the presentations on the DPDK youtube here

Have a use case you’d like to share and feature on the website? Email marketing@dpdk.org

Aug 06

Highlights from DPDK Summit APAC

By Ben Thomas Blog

Welcome & Opening Remarks – Thomas Monjalon, Maintainer, NVIDIA

Thomas Monjalon, a maintainer at NVIDIA, opened the summit in Bangkok by emphasizing the importance of the community in the project. He highlighted the role of contributors like Ben Thomas, who handles marketing, and encouraged attendees to share their stories and get involved.

Thomas provided logistical details about and discussed the project’s history, noting its growth since its inception in 2010 and the support from the Linux Foundation. Looking ahead, Monjalon outlined priorities such as better public cloud integration, enhanced security protocols, and contributions to AI.

He stressed the long-term benefits of contributing to open source, including thorough documentation and community support, and noted that being part of the community can help individuals find new job opportunities.

Introducing UACCE Bus of DPDK – Feng Chengwen, Huawei Technologies Co., Ltd

Feng Chengwen from Huawei Technologies presented on the UACCE (Unified Accelerator Framework) integrated into DPDK. UACCE was designed to simplify usage and enhance performance and security for user space I/O and DMA operations without system involvement. It was upstreamed in version 5.7 of the Linux kernel and the latest DPDK release 24.03, allowing accelerators to access memory regions directly and eliminating address translation.

Key objectives include high performance, simplified usage, and security, with support for multiprocess memory acceleration and on-demand resource usage. UACCE addresses performance issues such as page faults and NUMA balancing using techniques like CPU pre-access and memory binding.

It is used in both host systems and virtual machines, though some features for virtual machines are still in development. Feng encouraged other developers to adopt UACCE, highlighting its broader application potential, and discussed future enhancements to integrate more devices into DPDK using the UACCE framework.

ZXDH DPU Adapter and It’s Application – Lijie Shan & Wang Junlong, ZTE

The presentation introduces the ZXDH DPU driver, highlighting its features, applications, and product portfolio. The DPU system framework includes modules for high-speed network interfaces, PCI connectivity, and advanced packet processing, supporting RDMA, NVMe protocols, and multiple accelerators for security and storage.

It enhances network and storage performance by offloading tasks from the host CPU, supporting virtualization, AI, and edge computing, with capabilities like TLS encryption. An example of offloading security group functions to the DPU demonstrates reduced CPU load and increased processing efficiency.

The product portfolio supports up to 5 million IOPS and 100 million packets per second, with ongoing development to improve TCP protocol handling and storage acceleration.

Libtpa Introduction – Yuanhan Liu, Bytedance

Yuanhan Liu from ByteDance introduces Libtpa, a user-space TCP/IP stack developed from scratch. The presentation discusses its background, design, testing, and performance.

Traditional kernel-based TCP/IP stacks have inefficiencies and overhead, and existing user-space TCP stacks face problems like breaking the kernel stack, limited zero-copy support, and inadequate testing and debugging tools.

Libtpa addresses these issues by allowing coexistence with the kernel stack, supporting multiple user-space instances, optimizing performance with zero-copy, and providing extensive testing and debugging capabilities.

Its architecture supports high throughput and low latency, achieving significant performance improvements. Libtpa includes over 200 unit tests and advanced debugging tools to ensure stability and ease of troubleshooting in production environments.

Telecom Packet Processing and Correlation Engine Using DPDK – Ilan Raman, Aviz Networks

Ilan Raman and his colleagues from Aviz Networks developed 5G packet processing applications using DPDK to manage complex 4G and 5G traffic on commodity hardware efficiently. Aviz Networks, founded in 2019 and operating in the USA, India, and Japan, focuses on providing open-source solutions for telecom operators.

They address challenges in monitoring evolving mobile technologies, scaling solutions horizontally, and reducing TCO through software-driven approaches. A primary use case is 5G correlation, which enhances network performance monitoring by correlating control and user traffic. Deployment involves DPDK-based applications on commodity hardware, processing high-bandwidth traffic, and extracting valuable metadata for insights and capacity planning.

The architecture uses a run-to-completion model, distributing functions across dedicated cores to handle various traffic types, with scalability achieved through RSS functionality in NICs. Practical learnings include configuring RSS for different packet types, ensuring symmetric load balancing, using per-core hash tables, isolating DPDK cores from the Linux kernel, and performing deep packet parsing in software.

Aviz Networks leveraged DPDK’s packet manipulation libraries for handling custom headers and achieved better performance through memory optimizations and CPU isolation.

Unified Representor with Large Scale Ports – Suanming Mou, NVIDIA Semiconductor

The presentation by Suanming Mou from NVIDIA focused on optimizing the unified representer in large-scale ports within DBK switch models. Initially, the high memory and CPU usage due to the need to poll all represent ports when packets missed hardware flow rules was a significant challenge.

The optimization approach involved setting “represent matching” to zero, directing all packets to a single uplink represent port, and copying the source port ID to packet metadata for identification in the hypervisor. This change reduced the need for extensive memory allocation and CPU polling as traffic was handled through a single proxy port.

The implementation of new flow rules for this setup resulted in substantial memory savings, decreasing from over 800 MB to around 332 MB, and improved packets per second (PPS) performance, increasing from 20 Mega PPS to 27.5 Mega PPS due to optimized polling and reduced cache misses. Overall, the optimization streamlined the polling process and significantly enhanced resource efficiency and performance in managing large-scale port traffic.

Troubleshooting Low Latency Application on CNF Deployment – Vipin Varghese & Sivaprasad Tummala, AMD

The presentation addresses the challenges encountered when transitioning applications from bare metal to container environments, emphasizing issues like reduced throughput, increased packet processing time, fluctuating latencies, and unpredictable performance with multiple container instances.

Root causes of these problems include limited access to hardware resources, library and compiler version mismatches, lack of specific patches, and performance variations based on hardware architecture and deployment models. Through several case studies, Vipin and Sivaprasad underscore the importance of profiling applications on bare metal before deployment, using tools like flame graphs and perf, and understanding hardware details such as cache domains and PCI bus partitioning for optimization.

They call for enhanced telemetry and observability in DPDK for containerized environments, noting that current tools and documentation are inadequate for complex troubleshooting. Recommendations include extending DPDK’s telemetry infrastructure, utilizing eBPF hooks for improved runtime data collection, and ensuring consistent performance through better documentation, custom plugins for CPU isolation, and awareness of hardware-specific optimizations.

Suggestions to Enhance DPDK to Enable Migration of User Space Networking Applications to DPDK – Vivek Gupta, Benison Technologies Pvt Ltd

The presentation by Vivek Gupta delves into enhancing DPDK to facilitate the migration of various user space networking applications, pinpointing a critical issue: advancements in CPU, IO, and memory technologies are not benefiting these applications. Despite significant improvements in infrastructure, user space networking applications often fail to utilize these advancements effectively. This gap highlights the need for a framework that can bring the benefits of these technological improvements to user space frameworks, ensuring better performance and efficiency.

Customers face numerous challenges when attempting to migrate their existing user space applications to DPDK or VPP environments without rewriting them. These applications, which traditionally rely on Linux kernel methods, encounter significant hurdles during migration. The proposed solution is to create a unified framework that integrates various technologies, such as EF VI and VPP, to enhance the performance of these applications. This framework would support different levels of packet processing, from L2 to L4, and provide essential mechanisms for encryption, decryption, deep packet inspection, and proxy functions.

To meet customer needs, the framework should enable applications to capture and inject packets at various levels, from the interface to higher layers. It should support state management and route updates from control applications, ensuring that applications always operate with the most current data. Additionally, the framework must offer accelerators for cryptographic and AI/ML-based processing to handle the complex requirements of modern applications. By addressing issues related to threading, caching, and reducing contention, the framework aims to significantly improve the performance of user space applications.

Practical examples underscore the potential benefits of this framework. For instance, enhancing web servers, proxy servers, and video streaming applications using the proposed framework could lead to substantial performance gains. By tackling issues such as blocking operations and optimizing thread management, applications can achieve higher throughput and better resource utilization. The framework should also cater to the needs of high-speed applications and support flexible application architectures, enabling user space applications to become more efficient and faster.

In conclusion, the proposed enhancements to DPDK aim to bridge the gap between advancements in infrastructure and the performance of user space networking applications. By providing a comprehensive framework that supports various processing levels, state management, and cryptographic acceleration, the solution promises to improve application performance, reduce contention, and enhance resource utilization. This approach will help customers migrate their applications more effectively and realize the full benefits of technological advancements in CPU, IO, and memory technologies.

Welcome Back – Prasun Kapoor, Associate Vice President, Marvell

The Asia Pacific (APAC) region, particularly India and China, has established a strong and dynamic community around the Data Plane Development Kit (DPDK). Recognizing this, the decision was made to hold the DPDK APAC Summit in Thailand, a geopolitically neutral location that facilitates easy participation from various APAC countries without visa complications.

The DPDK project is witnessing robust growth in multiple areas, including technical contributions, marketing outreach, and the number of active contributors. This growth is further bolstered by increasing interest from new prospective corporate members, indicating a healthy and expanding ecosystem.

Significant updates have been made to the University of New Hampshire (UNH) lab, which has recently incorporated the Marvell CN10K Data Processing Unit (DPU) into its testing suite. The lab now reports Data Test Suite (DTS) results for a variety of tests, and has established a community dashboard for code coverage, releasing monthly reports. Additionally, the lab has been proactive in submitting patches and bug fixes and is running compilation tests for Open vSwitch (OVS) with each DPDK patch, with future plans to include performance testing.

Marketing efforts for DPDK have seen a considerable boost, with increased engagement on platforms like LinkedIn and a notable rise in YouTube views, which is seen as a leading indicator of the project’s growing interest. The steady increase in DPDK downloads further underscores the project’s rising popularity.

Enhancements to the DPDK documentation have also been a focal point, with updates to the Poll Mode Driver (PMD) guidelines, security protocol documentation, and multiple sections of the programmer’s guide and contributor guidelines. Financially, the DPDK project is in a strong position with a healthy budget and substantial reserves. This financial stability ensures that key activities such as summits, community labs, and marketing efforts are well-supported for the foreseeable future.

Coupling Eventdev Usage with Traffic Metering & Policing (QoS) – Sachin Saxena & Apeksha Gupta, NXP

Sachin Saxena and Apeksha Gupta from NXP presented on integrating Eventdev with Traffic Metering and Policing to enhance Quality of Service (QoS). They discussed the various requirements from customers and the comprehensive solution they developed to meet these demands. Their goal was to share their extensive work and experiences with the community, offering insights into how similar challenges can be addressed effectively.

They highlighted different customer requirements, such as the need for traffic classification and scheduling in hardware, reducing CPU cycle usage, and implementing custom schedulers. By leveraging the DPDK framework, they managed to consolidate these varied needs into a generic solution. This approach not only met the specific requirements but also provided a reference for others in the community who might face similar challenges.

The technical approach of their solution involved utilizing DPDK’s metering, policing, and Eventdev frameworks. They explained how these components interact to meet the specified use cases, enhancing overall efficiency and performance. By breaking down complex use cases into manageable components and mapping these to corresponding RT library elements, they ensured a robust end-to-end functionality.

In their implementation details, they described the method of segmenting use cases into multiple components and aligning these with the appropriate RT library components. This strategy ensured that each part of the system worked seamlessly together, achieving the desired outcomes effectively and efficiently.

To illustrate their points, they shared practical use cases, including the management of scheduling priorities, grouping multiple ports, and applying markers and policers at the priority group level. These examples demonstrated how to optimize CPU cycles and prevent data loss, showcasing the practical applications of their solution in real-world scenarios.

GRO Library Enhancements – Kumara Parameshwaran Rathinavel, Microsoft

Kumara Parameshwaran Rathinavel from Microsoft has been working on enhancing the Generic Receive Offload (GRO) library. GRO is a widely used software technique that optimizes packet processing by merging multiple TCP segments into a single large segment. Kumara has been contributing to this project since his time at VMware and continues to do so at Microsoft. His work aims to improve the efficiency and performance of GRO, particularly in the context of network traffic handling.

The current implementation of GRO, which involves iteratively checking a table for flow matches, has been identified as suboptimal for handling packets received in multiple bursts. This method can lead to inefficiencies, especially as the timeout intervals increase. Kumara highlighted that the existing approach struggles with scalability and performance under these conditions, necessitating a more efficient solution.

To address these limitations, Kumara proposed a hash-based method for flow matching. This new approach significantly enhances the efficiency of the GRO process. In tests, the hash-based method demonstrated substantial performance improvements, reducing the CPU utilization of the GRO reassemble function. This method not only optimizes the flow matching process but also ensures better handling of packet bursts, leading to overall improved performance.

Recognizing the varying latency requirements of different applications, Kumara suggested implementing tuple-specific timeouts within the GRO framework. This approach allows for more flexible and optimized GRO settings tailored to the specific needs of various applications. For instance, applications with low latency requirements, such as banking transactions, can have shorter timeouts, while those with less stringent latency needs can benefit from longer timeouts. This customization ensures that all applications can operate efficiently without compromising on performance.

To validate these enhancements, Kumara used a setup involving a virtual machine as a test proxy, demonstrating notable performance gains. The improvements in GRO are particularly beneficial for network applications like load balancers, where reducing CPU utilization and improving packet processing efficiency are critical. Kumara’s work on GRO library enhancements showcases significant advancements in optimizing network traffic handling, contributing to more efficient and scalable network performance.

Refactor Power Library for Vendor Agnostic Uncore APIs – Sivaprasad Tummala & Vipin Varghese, AMD

The presentation focuses on the critical need for improved power management and efficiency for Telco operators, emphasizing the importance of vendor-agnostic solutions for scalability across different platforms. This is particularly relevant as power has become a significant concern, with the need to optimize performance per watt and manage power effectively.

AMD’s power library within the DPDK (Data Plane Development Kit) aims to address these concerns by balancing power consumption and performance. The library optimizes core and uncore frequency management and introduces adaptive algorithms for real-time monitoring and idle state management. This ensures that while cores are busy polling, they consume power efficiently without compromising on performance.

Currently, the power library is tightly coupled with Linux and requires specific modifications to accommodate new drivers, leading to inefficiencies and increased code size. Each new driver introduction necessitates changes to the core library, increasing the complexity and effort required for maintenance and updates. This approach is not scalable as the number of drivers and their capabilities grow.

To address these challenges, the refactoring efforts aim to modularize the power library, enabling plug-and-play capabilities for new drivers and reducing dependencies. This modular approach will simplify the addition of new drivers, improve performance, and enhance scalability by minimizing the library’s footprint and code complexity.

The proposed enhancements include vendor-agnostic uncore APIs to manage interconnect bus frequencies and dynamic link width management. These APIs promote a standardized interface for power management across different hardware vendors, making it easier for applications to develop power management solutions without being tied to specific vendors. This approach not only reduces complexity but also ensures compatibility and scalability across various platforms.

Q&A with the Governing Board & Technical Board – Wang Yong, Thomas Monjalon, Jerin Jacob

The Technical Board (TBoard) and Governing Board (GBoard) of the project play distinct but complementary roles in steering the community. The TBoard consists of 11 members who meet bi-weekly to discuss and resolve technical issues. These meetings, conducted via Zoom, involve all community members and focus on consensus-driven decision-making. When consensus cannot be reached, the TBoard votes on issues, requiring prior email submissions for agenda inclusion. This structured approach ensures thorough consideration and discussion before decisions are made.

The GBoard, on the other hand, sets the project’s broad direction, encompassing administrative tasks, marketing strategies, and budgeting. This board meets monthly and includes a permanent chairperson along with representatives from the Linux Foundation. The GBoard comprises 12 members: 10 from golden member companies, one from a silver member, and one from the TBoard. Every six weeks, the GBoard convenes, and every three months, they hold joint meetings with the TBoard to align on financial plans, marketing efforts, and major project decisions.

Membership in the GBoard is tiered, with gold members contributing $50,000 annually and silver members contributing $10,000 annually. These funds are crucial for project initiatives, such as summits and acquiring new servers for the lab. Gold members play a significant role in decision-making due to their financial contributions, ensuring their interests and investments are aligned with project goals.

Community involvement is a cornerstone of both boards’ operations. TBoard meetings are open to all, fostering transparency and inclusivity in technical discussions. Issues are raised via email, ensuring that all voices can be heard. The GBoard, while more focused on strategic direction, includes representatives from various companies to bring diverse perspectives to the table. This collaborative approach allows for comprehensive planning and execution of project initiatives.

Currently, the boards are prioritizing several key areas: enhancing security protocols and documentation, improving continuous integration (CI) performance testing, and integrating more functional testing in the Data Plane Development Kit (DTS). Future plans include creating a performance dashboard and requiring contributors to add tests for new features. These efforts aim to maintain high standards of performance and security, ensuring the project’s robustness and reliability for all users.

Rte_flow Match with Comparison Result – Suanming Mou, NVIDIA Semiconductor

The presentation introduces a new feature for rte_flow, which focuses on comparison operations to enhance the flexibility of flow rules. This feature allows for comparisons between fields or between a field and an immediate value, providing more dynamic and versatile rule configurations. The presenter assumes familiarity with rte_flow from previous sessions and emphasizes the advantages of this new capability.

Traditional rte_flow rules are limited to matching immediate values, which can be restrictive in certain scenarios. For instance, in TCP connection tracking, the termination of connections often goes unnoticed by software if the reset packet is handled by hardware directly. Similarly, for packet payload evaluation, users may want to skip cryptographic operations on packets without payloads. These examples highlight the need for more advanced comparison methods in flow rules.

The new feature supports a range of comparison operations, including greater than, less than, and equal comparisons. It has been initially implemented in ConnectX-7 and BlueField-3 NICs, specifically within the template API. However, there are limitations, such as the inability to mix comparison items with other items and restricted field support. The feature is designed to be flexible but currently has hardware constraints that limit its full potential.

Users can configure these comparison rules using a new `item compare` structure in the API. This involves specifying the fields to compare, the immediate values, and the desired operations, such as equal, not equal, greater than, and so forth. The configuration also supports specific bit-width comparisons, providing detailed control over how comparisons are executed. This structure aims to offer a robust framework for implementing dynamic and complex flow rules.

Several examples demonstrate the use of the new comparison item in flow rules, illustrating its practical application. Despite its benefits, the feature currently supports only single comparison rules within flow tables and a limited range of fields. The requirement for both spec and mask in the configuration is due to the template API structure, which mandates these elements even if they might not be necessary for all comparisons. Suanming Mou concludes by encouraging other developers to integrate support for this feature in their PMDs, recognizing its potential to significantly enhance rte_flow’s capabilities.

DPDK PMD Live Upgrade – Rongwei Liu, Nvidia

The DPDK PMD live upgrade process is designed to meet the critical need for upgrading or downgrading PMD versions seamlessly without disrupting ongoing services. This process ensures the transfer of user configurations while minimizing downtime to nearly zero, making it essential for applications requiring continuous operation.

Two primary approaches are detailed for conducting these upgrades. The first approach involves a graceful exit of the old PMD followed by the restart of the new PMD, during which there is a brief period of service unavailability. The second approach utilizes a standby mode, where the new PMD is prepared with the necessary configurations but remains inactive until the old PMD exits. This method ensures that there is no service disruption as the traffic seamlessly switches to the new PMD once the old one exits.

To facilitate this process, two modes are introduced: active and standby. In the active mode, the PMD manages traffic and hardware configuration directly. In standby mode, configurations are set up but do not affect traffic immediately. Instead, they become active only when the old PMD gracefully exits, ensuring that the traffic handling transitions smoothly without any interruption.

A crucial aspect of the upgrade process is the use of group zero as a dispatcher for traffic processing rules. This mechanism ensures that all configurations are synchronized and become effective immediately, eliminating any downtime or disruption in traffic flow. By inserting and managing these rules efficiently, the system can transition from the old PMD to the new one seamlessly.

Finally, the process is designed to be highly scalable, allowing for adaptable resource usage to accommodate various deployment scales. It also emphasizes the importance of a user-friendly API, ensuring that users can access and utilize the upgrade features quickly and easily, thus enhancing the overall efficiency and effectiveness of the live upgrade process.

Monitoring 400G Traffic in DPDK Using FPGA-Based SmartNIC with RTE Flow – David Vodák, Cesnet

David Vodák from Cesnet presented their journey to enable 400G traffic monitoring using DPDK and FPGA-based SmartNICs, a project initiated due to the lack of suitable FPGA cards in the market. Cesnet, a national research and educational network, designed the FPGA SmartNIC which utilizes the Intel HX I7 chip with 400G Ethernet support and PCIe gen 4/5 compatibility. This card is engineered for high-speed processing, making it ideal for their needs.

The cornerstone of their solution is the NDK platform, an open-source framework that supports up to 400G throughput. NDK facilitates parallel processing, filtering, and metadata export, which are crucial for handling high-speed network traffic. It is designed to be highly adaptable, allowing users to create new components or use existing ones to build custom firmware for various applications.

NDK’s versatility extends beyond monitoring; it is also used for high-frequency trading and CL testing. One of the open source tools developed by Cesnet, the IPFIXPROBE, supports DPDK and is employed to create detailed traffic flows from input packets. This probe exemplifies the practical applications of NDK in real-world scenarios, demonstrating its robustness and flexibility.

To ensure the reliability of their solutions, Cesnet employs rigorous testing and verification methods. Functional testing is conducted using tools like testpmd or custom DPDK applications in loopback setups. For benchmarking, they utilize external traffic generators such as Spent and Flow Test, with the latter capable of simulating realistic network traffic to provide more accurate testing results.

Looking ahead, Cesnet plans to expand the capabilities of the NDK platform to support various cards and use cases beyond traffic monitoring. Their commitment to open source development is evident, as they provide extensive resources on GitHub for the community to collaborate and innovate further. This open source approach not only fosters community involvement but also drives continuous improvement and adaptation of their technology.

Lessons Learnt from Reusing QDMA NIC to Base Band PMD – Vipin Varghese & Sivaprasad Tummala, AMD

AMD undertook a project to repurpose its QDMA NIC for Forward Error Correction (FEC) offloading in virtual RAN environments using an FPGA-based prototype. The goal was to support LDPC encode/decode functionalities without developing the BBDEV PMD from scratch. Instead, the team adapted existing QDMA NIC code, incorporating necessary modifications to create a functional BB PMD. This approach allowed for rapid prototyping and integration within a short time frame.

Throughout the project, several challenges arose, including mismatched use cases, software latencies, and inadequate thread handling. To address these, the team implemented solutions such as using selective builds and applying compiler pragmas. These strategies helped optimize the RX/TX burst functionalities and reduce instruction cache misses, which in turn minimized overall latency.

Significant efforts were made to reduce latencies, which were initially high due to multiple factors including software and PMD-related overheads. By minimizing instruction cache misses and optimizing critical functions to fit within smaller memory pages, the team achieved notable latency reductions. Further improvements were realized by implementing lockless mechanisms using RT rings, ensuring efficient NQ/DQ operations.

To meet the specific requirements of the customer for low-latency and high-throughput, the team had to go beyond simple test scenarios and adapt the software implementations to better match real-world use cases. This involved modifying the BBDEV PMD to handle multiple threads and ensuring proper mapping and distribution of LDPC encode and decode requests, which significantly improved performance and reliability.

The project highlighted several important lessons. It underscored the value of reusing existing PMDs when feasible, as well as the need to reduce code bloating and align PMD examples with actual customer use cases. The team recommended updates for the DPDK community to focus more on low latency and stress testing, and to improve lockless implementations. These insights and improvements contributed to a more robust and efficient solution, ultimately enhancing the overall performance of the system.

Closing Remarks – Nathan Southern, Sr. Projects Coordinator, The Linux Foundation

The DPDK conference in Bangkok marked a significant milestone as the first APAC event since COVID-19. With a total of 63 participants, including 30 in person and 33 online, the event exceeded expectations. This conference, considered an experiment in a geopolitically neutral location, was deemed successful and has paved the way for potential future APAC events in various locations.

DPDK, a project now 14 years old, has shown remarkable growth and resilience, countering previous perceptions of being in its sunset phase. Since Nathan joined the Linux Foundation in April 2022, the project has maintained nearly perfect member retention and continued technological advancement. This longevity and sustained momentum underscore the project’s vitality and relevance in the tech community.

Strategic efforts in marketing and documentation have significantly enhanced the project’s visibility and usability. Under the direction of Ben Thomas, marketing initiatives have been robust, and tech writer Nini Purad has overhauled the project’s documentation. These efforts aim to foster community growth and engagement, ensuring that DPDK remains a valuable resource for its users.

The DPDK project is evolving in critical areas such as security, cloud hyperscaling, and AI. This evolution is driven by community input and the guidance of the tech board, including leaders like Thomas Monjalon. Continuous community involvement is essential for future advancements, highlighting the importance of active participation from all stakeholders.

Nathan emphasized the importance of community engagement in driving DPDK’s development forward. He encouraged attendees to participate through Slack, tech board calls, and contributions to the OSS code. Additionally, the project is actively creating dynamic content, including end-user stories and developer spotlights, to promote mutual growth and expand the membership base. This focus on community and content creation is key to sustaining and growing the DPDK project.

Watch all the summit videos here.

Jun 05

Microsoft Azure Mana DPDK Q&A

By Ben Thomas Blog

In today’s rapidly evolving digital landscape, the demand for high-speed, reliable, and scalable network solutions is greater than ever. Enterprises are constantly seeking ways to optimize their network performance to handle increasingly complex workloads. The integration of the Data Plane Development Kit (DPDK) with Microsoft Azure’s Network Adapter (MANA) is a groundbreaking development in this domain.

Building on our recent user story, “Unleashing Network Performance with Microsoft Azure MANA and DPDK,” this blog post delves deeper into how this integration is revolutionizing network performance for virtual machines on Azure. DPDK’s high-performance packet processing capabilities, combined with MANA’s advanced hardware offloading and acceleration features, enable users to achieve unprecedented levels of throughput and reliability.

In this technical Q&A, Brian Denton Senior Program Manager, at Microsoft Azure Core further illuminates the technical intricacies of DPDK and MANA, including the specific optimizations implemented to ensure seamless compatibility and high performance. He also elaborates on the tools and processes provided by Microsoft to help developers leverage this powerful integration, simplifying the deployment of network functions virtualization (NFV) and other network-centric applications.

1. How does Microsoft’s MANA integrate with DPDK to enhance the packet processing capabilities of virtual machines on Azure, and what specific optimizations are implemented to ensure compatibility and high performance?

[Brian]: MANA is a critical part of our hardware offloading and acceleration effort. The end goal is to maximize workloads in hardware and minimize the host resources needed to service virtual machines. Network Virtual Appliance (NVA) partner products and large customers leverage DPDK to achieve the highest possible network performance in Azure. We are working closely with these partners and customers to ensure their products and services take advantage of DPDK on our new hardware platforms.

2. In what ways does the integration of DPDK with Microsoft’s Azure services improve the scalability and efficiency of network-intensive applications, and what are the measurable impacts on latency and throughput?

[Brian]: Network Virtual Appliances are choke points in customers networks and are often chained together to protect, deliver, and scale applications. Every application in the network path adds processing and latency between the endpoints communicating. Therefore, NVA products are heavily focused on speeds and feeds and designed to be as close to wire-speed as possible. DPDK is the primary tool used by firewalls, WAF, routers, Application Delivery Controllers (ADC), and other networking applications to reduce the impact of their products on network latency. In a virtualized environment, this becomes even more critical.

3. What tools and processes has Microsoft provided for developers to leverage DPDK within the Azure ecosystem, and how does this integration simplify the deployment of network functions virtualization (NFV) and other network-centric applications?

[Brian]: We provide documentation on running testpmd in Azure: https://aka.ms/manadpdk. Most NVA products are on older LTS Linux kernels and require backporting kernel drivers, so having a working starting point is crucial for integrating DPDK application with new Azure hardware.

4. How does DPDK integrate with the MANA hardware and software, especially considering the need for stable forward-compatible device drivers in Windows and Linux?

[Brian]: The push for hardware acceleration in a virtualized environment comes with the drawback that I/O devices are exposed to the virtual machine guests through SR-IOV. Introducing the next generation of network card often requires the adoption of new network drivers in the guest. For DPDK, this depends on the Linux kernel which may not have drivers available for new hardware, especially in older long-term support versions of Linux distros. Our goal with the MANA driver is to have a common, long-lived driver interface that will be compatible with future networking hardware in Azure. This means that DPDK applications will be forward-compatible and long-lived in Azure.

5. What steps were taken to ensure DPDK’s compatibility with both Mellanox and MANA NICs in Azure environments?

[Brian]: We introduced SR-IOV through Accelerated Networking early 2018 with the Mellanox ConnectX-3 card. Since then, we’ve added ConnectX-4 Lx, ConnectX-5, and now the Microsoft Azure Network Adapter (MANA). All these network cards still exist in the Azure fleet, and we will continue to support DPDK products leveraging Azure hardware. The introduction of new hardware does not impact the functionality of prior generations of hardware, so it’s a matter of ensuring new hardware and drivers are supported and tested prior to release.

6. How does DPDK contribute to the optimization of TCP/IP performance and VM network throughput in Azure?

[Brian]: See answer to #2. DPDK is necessary to maximize network performance for applications in Azure, especially for latency sensitive applications and heavy network processing.

7. How does DPDK interact with different operating systems supported by Azure MANA, particularly with the requirement of updating kernels in Linux distros for RDMA/InfiniBand support?

[Brian]: DPDK applications require a combination of supported kernel and user space drivers including both Ethernet and RDMA/InfiniBand. Therefore, the underlying Linux kernel must include MANA drivers to support DPDK. The latest versions of Red Hat and Ubuntu support both the Ethernet and InfiniBand Linux kernel drivers required for DPDK.

8. Can you provide some examples or case studies of real-world deployments where DPDK has been used effectively with Azure MANA?

[Brian]: DPDK applications in Azure are primarily firewall, network security, routing, and ADC products provided by our third-party Network Virtual Appliance (NVA) partners through the Marketplace. With our most recent Azure Boost preview running on MANA, we’ve seen additional interest by some of our large customers in integrating DPDK into their own proprietary services.

9. How do users typically manage the balance between using the hypervisor’s virtual switch and DPDK for network connectivity in scenarios where the operating system doesn’t support MANA?

[Brian]: In the case where the guest does not have the appropriate network drivers for the VF, the netvsc driver will automatically forward traffic to the software vmbus. The DPDK application developer needs to ensure that they support the netvsc PMD to make this work.

10.What future enhancements or features are being considered for DPDK in the context of Azure MANA, especially with ongoing updates and improvements in Azure’s cloud networking technology?

[Brian]: The supported feature list is published in the DPDK documentation: 1. Overview of Networking Drivers — Data Plane Development Kit 24.03.0-rc4 documentation (dpdk.org). We will release with the current set of features and get feedback from partners and customers on demand for any new features.

11. How does Microsoft plan to address the evolving needs of network performance and scalability in Azure with the continued development of DPDK and MANA?

[Brian]: We are focused on hardware acceleration to drive the future performance and scalability in Azure. DPDK is critical for the most demanding networking customers and we will continue to ensure that it’s supported on the next generations of hardware in Azure.

12. How does Microsoft support the community and provide documentation regarding the use of DPDK with Azure MANA, especially for new users or those transitioning from other systems?

[Brian]: Feature documentation is generated out of the codebase and results in the following:

Documentation for MANA DPDK, including running testpmd, can be found here: https://aka.ms/manadpdk

13. Are there specific resources or training modules that focus on the effective use of DPDK in Azure MANA environments?

[Brian]: We do not have specific training resources for customers to use DPDK in Azure, but that’s a good idea. Typically, DPDK is used by key partners and large customers that work directly with our development teams.

14. Will MANA provide functionality for starting and stopping queues?

[Brian]: TBD. What’s the use case and have you seen a need for this? Customers will be able to change the number of queues, but I will have to find out whether they can be stopped/started individually.

15. Is live configuration of Receive Side Scaling (RSS) possible with MANA?

[Brian]: Yes. RSS is supported by MANA.

16. Does MANA support jumbo frames?

[Brian]: Jumbo frames and MTU size tuning are available as of DPDK 24.03 and rdma-core v49.1

17. Will Large Receive Offload (LRO) and TCP Segmentation Offload (TSO) be enabled with MANA?

[Brian]: LRO in hardware (also referred to as Receive Segment Coalescing) is not supported (software should work fine)

18. Are there specific flow offloads that MANA will implement? If so, which ones?

[Brian]: MANA does not initially support DPDK flows. We will evaluate the need as customers request it.

19. How is low migration downtime achieved with DPDK?

[Brian]: This is a matter of reducing the amount of downtime during servicing events and supporting hotplugging. Applications will need to implement the netvsc PMD to service traffic while the VF is revoked and fall back to the synthetic vmbus.

20. How will you ensure feature parity with mlx4/mlx5, which support a broader range of features?

[Brian]: Mellanox creates network cards for a broad customer base that includes all the major public cloud platforms as well as retail. Microsoft does not sell the MANA NIC to retail customers and does not have to support features that are not relevant to Azure. One of the primary benefits of MANA is we can keep functionality specific to the needs of Azure and iterate quickly.

21. Is it possible to select which NIC is used in the VM (MANA or mlx), and for how long will mlx support be available?

[Brian]: No, you will never see both MANA and Mellanox NICs on the same VM instance. Additionally, when a VM is allocated (started) it will select a node from a pool of hardware configurations available for that VM size. Depending on the VM size, you could get allocated on ConnectX-3, ConnectX-4 Lx, ConnectX-5, or eventually MANA. VMs will need to support mlx4, mlx5, and mana drivers till hardware is retired from the fleet to ensure they are compatible with Accelerated Networking.

22. Will there be support for Windows and FreeBSD with DPDK for MANA?

[Brian]: There are currently no plans to support DPDK on Windows or FreeBSD. However, there is interest within Microsoft to run DPDK on Windows.

23. What applications are running on the SoC?

[Brian]: The SoC is used for hardware offloading of host agents that were formerly ran in software on the host and hypervisor. This ultimately frees up memory and CPU resources from the host that can be utilized for VMs and reduces impact of neighbor noise, jitter, and blackout times for servicing events.

24. What applications are running on the FPGA?

[Brian]: This is initially restricted to I/O hardware acceleration such as RDMA, the MANA NIC, as well as host-side security features.

Read the full user story ‘Unleashing Network Performance with Microsoft Azure MANA and DPDK’

Jun 05

Cache Awareness in DPDK Mempool

By Ben Thomas Blog

Author: Kamalakshitha Aligeri – Senior Software Engineer at Arm

The objective of DPDK is to accelerate packet processing by transferring the packets from the NIC to the application directly, bypassing the kernel. The performance of DPDK relies on various factors such as memory access latency, I/O throughput, CPU performance, etc.

Efficient packet processing relies on ensuring that packets are readily accessible in the hardware cache. Additionally, since the memory access latency of the cache is small, the packet processing performance increases if more packets can fit into the hardware cache. Therefore, it is important to know how the packet buffers are allocated in hardware cache and how it can be utilized to get the maximum performance.

With the default buffer size in DPDK, hardware cache is utilized to its full capacity, but it is not clear if this is being done intentionally. Therefore, this blog helps in understanding how the buffer size can have an impact on the performance and things to remember when changing the default buffer size in DPDK in future.

In this blog, I will describe,

1. Problem with contiguous buffers

2. Allocation of buffers with cache awareness

3. Cache awareness in DPDK mempool

4. l3fwd performance results with and without cache awareness

Problem with contiguous buffers

The mempool in DPDK is created from a large chunk of contiguous memory. The packets from the network are stored in packet buffers of fixed size (objects in mempool). The problem with contiguous buffers is when the CPU accesses only a portion of the buffer, such as in cases like DPDK’s L3 forwarding application where only metadata and packet headers are accessed. Rest of the buffer is not brought into the cache. This results in inefficient cache utilization. To gain a better understanding of this problem, its essential to understand how the buffers are allocated in hardware cache.

How are buffers mapped in Hardware Cache?

Consider a 1KB, 4-way set-associative cache with 64 bytes cache line size. The total number of cache lines would be 1KB/64B = 16. For a 4-way cache, each set will have 4 cache lines. Therefore, there will be a total of 16/4 = 4 sets.

As shown in Figure1, each memory address is divided into three parts: tag, set and offset.

• The offset bits specify the position of a byte within a cache line (Since each cache line is 64 bytes, 6 bits are needed to select a byte in a single cache line).

• The set bits determine which set the cache line belongs to (2 bits are needed to identify the set among 4 ways).

• The tag bits uniquely identify the memory block. Once the set is identified with set bits, the tag bits of the 4 ways in that set is compared against the tag bits of the memory address, to check if the address is already present in the cache.

Figure 1 Memory Address

In Figure 2, each square represents a cache line of 64 bytes. Each row represents a set. Since it’s a 4-way cache, each set contains 4 cache lines in it – C0 to C3.

Figure 2 Hardware Cache

Let’s consider a memory area that can used to create a pool of buffers. Each buffer is 128 bytes, hence occupies 2 cache lines. Assuming the first buffer address starts at 0x0, the addresses of the buffers are as shown below.

Figure 3 Contiguous buffers in memory

In the above figure the offset bits are highlighted in orange, set bits in green and tag bits in blue. Consider buffer 1’s address, where set bits “00” means the buffer maps to set0. Assuming initially all the sets are empty, buffer 1 occupies the first cache line of 2 contiguous sets.

Since buffer 1 address is 0x0 and the cache line size is 64 bytes, the first 64 bytes of the buffer occupy the cache line in set0. For the next 64 bytes, the address becomes 0x40 (0b01000000) indicating set1 because the set bits are “01”. As a result, the last 64 bytes of the buffer occupy the cache line in set1. Thus, the buffer is mapped into cache lines (S0, C0) and (S1, C0).

Figure 4 Hardware cache with buffer 1

Similarly, buffer 2 will occupy the first cache line of next two sets (S2, C0) and (S3, C0).

Figure 5 Hardware cache with 2 buffers

The set bits in buffer 3 address “00” show that the buffer 3 maps to set 0 again. Since the first cache line of set0 and set1 is occupied, buffer 3 occupies second cache line of set 0 and 1 (S0, C1) and (S1, C1).

Figure 6 Hardware cache with 3 buffers

Similarly buffer 4 occupies the second cache-line of sets 2 and 3 and so on. Each buffer is represented with a different color and a total of 8 buffers can occupy the hardware cache without any evictions.

Figure 7 Allocation of buffers in hardware cache

Although the buffer size is 128 bytes, CPU might not access all the bytes. For example, for 64 bytes packets, only the first 64 bytes of the buffer are consumed by the CPU (i.e. one cache line worth of data).

Since the buffers are two cache lines long, and are contiguous, and only the first 64 bytes of each buffer is accessed, only sets 0 and sets 2 are populated with data. Sets 1 and 3 go unused (unused sets are shown with pattern in Figure 8).

Figure 8 Unused sets in hardware cache

When buffer 9 needs to be cached, it maps to set 0 since set bits are “00”. Considering a LRU replacement policy, the least recently used cache line of 4 ways (buffer 1, 3, 5 or 7) in set0 will be evicted to accommodate buffer 9 even though set 1 and set 3 are empty.

This is highly inefficient, as we are not utilizing the cache capacity to the full.

Solution – Allocation of buffers with Cache awareness

In the above example, if the ununsed cache sets can be utilized to allocate the subsequent buffers (buffers 9 – 16), we would utilize the cache in a more efficient manner.

To accomplish this, the memory addresses of the buffers can be manipulated during the creation of mempool. This can be achieved by inserting one cache line padding after every 8 buffers, effectively aligning the buffer addresses in a way that utilizes the cache more efficiently. Let’s take the above example of contiguous buffer addresses and then compare it with same buffers but with cache line padding.

Figure 9 Without cache lines padding Figure 10 With cache lines padding

From figure 9 and 10, we can see that the buffer 9 address has changed from 0x400 to 0x440. With 0x440 address, the buffer 9 maps to set1. So, there is no need to evict any cache line from set0 and we are utilizing the unused cache set 1.

Similarly, buffer 10 maps to set3 instead of set2 and so on. This way buffer 9 to buffer 16, can occupy the sets1 and 3 that are unused by buffers1 to 8.

Figure 11 Hardware cache with cache awareness

This approach effectively distributes the allocation of buffers to better utilize the hardware cache. Since for 64-byte packets, only the first cache line of each buffer contains useful data, we are effectively utilizing the hardware cache capacity by accommodating useful packet data from 16 buffers instead of 8. This doubles the cache utilization, enhancing the overall performance of the system.

Padding of cache lines is necessary primarily when the cache size is exactly divisible by the buffer size (which means buffer size is a power of 2). In cases where the buffer size does not divide evenly into the cache size, part of the buffer is left unmapped. This residual portion effectively introduces an offset like the one achieved through padding.

Cache Awareness in DPDK Mempool

In DPDK mempool, each buffer typically has a size of 2368 bytes and consists of several distinct fields – header, object and trailer. Let’s look at each one of them.

Figure 13 Mempool buffer fields

Header: This portion of the buffer contains metadata and control information needed by DPDK to manage buffer efficiently. It includes information such as buffer length, buffer state or type and helps to iterate on mempool objects. The size of the object header is 64 bytes. Object: This section contains actual payload or data. Within the object section, there are additional fileds such as mbuf, headroom and packet data. The mbuf of 128 bytes contains metadata such as message type, offset to start of the packet data and pointer to additional mbuf structures. Then there is a headroom of 128 bytes. The packet data is 2048 bytes that contains packet headers and payload.

Trailer: The object trailer is 0 bytes, but a cookie of 8 bytes is added in debug mode. This cookie acts as a marker to prevent corruptions.

With a buffer size of 2368 bytes (not a power of 2), the buffers are inherently aligned with cache awareness without the need for cache line padding. In other words, the buffer size is such that it optimizes cache utilization without the need for additional padding.

The buffer size of 2368 bytes does not include the padding added to distribute buffers across memory channels.

To prove how the performance can vary with a buffer size that is power of 2, I ran an experiment with 2048 buffer size and compared it against the default buffer size of mempool in DPDK. In the experiment 8192 buffers are allocated in the mempool and a histogram of cache sets for all the buffers was plotted. The histogram illustrates the number buffers allocated in each cache set.

Figure 14 Histogram of buffers – 2048 bytes

With a buffer size of 2048 bytes, the same sets in the hardware cache are hit repeatedly, whereas other sets are not utilized (we can see that from the gaps in the histogram)

Figure 15 Histogram of buffers – 2368 bytes

With a buffer size of 2368 bytes, each set is being accessed only around 400 times. There are no gaps in the above histogram, indicating that the cache is being utilized efficiently.

DPDK l3fwd Performance

The improved cache utilization observed in the histogram, attributed to cache awareness, is further corroborated by the throughput numbers of the l3fwd application. The application is run on a system with 64KB 4-way set associative cache.

Below chart shows the throughput in MPPS for single core l3fwd test with 2048 and 2368 buffer sizes

Figure 16 l3fwd throughput comparison

There is a 17% performance increase with the 2368 buffer size.

Conclusion

Contiguous buffer allocation in memory with cache awareness enhances performance by minimizing cache evictions and maximizing hardware cache utilization. In scenarios where the buffer size is exactly divisible by the cache size (e.g., 2048 bytes), padding cache lines creates a offset in the memory addresses and better distribution of buffers in the cache. This led to a 17% increase in performance for DPDK l3fwd application.

However, with buffer sizes not precisely divisible by the cache size, as is the default in DPDK, padding of cache lines already occurs because of the offset in the buffer addresses, resulting in an improved performance.

For more information visit the programmers guide

May 01

DPDK Long Term Stable (LTS) Release 22.11.05

By Ben Thomas Blog

The latest DPDK Long Term Stable (LTS) Release 22.11.05 includes several updates and enhancements across various components of the DPDK framework. Significant changes in this release involve numerous file modifications, which are indicated by a large number of insertions and deletions across the codebase. The release notes document extensive additions, suggesting improvements and new features in the areas of network interface controllers (NICs), cryptographic devices, event devices, and baseband processing.

Notable adjustments were made to the build system, documentation, and driver support for various hardware. Improvements in error handling, memory management, and device operation stability are also reflected in the release notes. The release also addresses various bug fixes and performance enhancements to ensure better stability and efficiency.

Contributors

The update involved 246 files with a total of 3,235 insertions and 2,053 deletions. A big shout out to all the contributors to this release including:

Ajit Khaparde, Akhil Goyal, Akshay Dorwat, Alan Elder, Alex Vesker, Ali Alnubani, Andrew Boyer, Anoob Joseph, Arkadiusz Kusztal, Bing Zhao, Bruce Richardson, Chaoyong He, Chengwen Feng, Ciara Power, Dariusz Sosnowski, David Marchand, Dengdui Huang, Edwin Brossette, Eli Britstein, Emi Aoki, Erez Shitrit, Ferruh Yigit, Fidel Castro, Flore Norceide, Ganapati Kundapura, Gregory Etelson, Hamdan Igbaria, Hanumanth Pothula, Hao Chen, Harman Kalra, Hernan Vargas, Holly Nichols, Huisong Li, Jie Hai, Jonathan Erb, Joyce Kong, Kaiwen Deng, Kalesh AP, Kevin Traynor, Kiran Kumar K, Kishore Padmanabha, Kommula Shiva Shankar, Konstantin Ananyev, Kumara Parameshwaran, Long Li, Luca Boccassi, Maayan Kashani, Masoumeh Farhadi Nia, Maxime Coquelin, Michael Baum, Mingjin Ye, Morten Brørup, Mário Kuka, Neel Patel, Nithin Dabilpuram, Pavan Nikhilesh, Pengfei Sun, Qi Zhang, Qian Hao, Radu Nicolau, Rahul Bhansali, Rakesh Kudurumalla, Robin Jarry, Rongwei Liu, Satheesh Paul, Shai Brandes, Shaowei Sun, Shihong Wang, Shun Hao, Simei Su, Sivaprasad Tummala, Sivaramakrishnan Venkat, Stephen Hemminger, Suanming Mou, Sunil Kumar Kori, Sunyang Wu, Tom Jones, Viacheslav Ovsiienko, Wathsala Vithanage, Weiguo Li, Yajun Wu, and Yunjian Wang.

These contributors addressed various aspects from software fixes and performance enhancements to security improvements across multiple components of the system.

Download it here: DPDK 22.11.5

The git tree for this version can be accessed here: DPDK Stable 22.11

May 01

DPDK’s Role in Hyperscaling

By Ben Thomas Blog

In the rapidly evolving digital landscape, hyperscaling in the cloud has emerged as a critical strategy for businesses aiming to scale their operations efficiently. The webinar, “Hyperscaling in the Cloud,” hosted by Honnappa Nagarahalli (Arm), from the DPDK Tech Board, brings together industry experts to discuss how the Data Plane Development Kit (DPDK) is revolutionizing hyperscale cloud environments.

The Webinar Panelists

The webinar featured three distinguished panelists:

1. Brian Denton: A Senior Program Manager at Microsoft Azure, Brian brings a wealth of experience in Azure’s host networking. He shared insights into Azure’s implementation of DPDK, emphasizing its use in enhancing Ethernet, and overall network performance.

2. Rushil Gupta: As a Senior Software Engineer at Google, Rushil highlighted the critical role of DPDK in financial technology (Fintech) applications on Google Cloud Platform (GCP). His discussion focused on achieving consistency, performance, and reliability in high-frequency trading platforms.

3. Jim Thompson: Co-founder of Netgate, Jim delved into the use of DPDK in networking applications outside the traditional cloud domain. His contribution illuminated the versatility of DPDK across different cloud environments and its impact on virtual private networks (VPNs).

Insights from the Webinar

DPDK in Azure’s Cloud Networking

Brian Denton’s presentation offered a glimpse into how Microsoft Azure leverages DPDK to offload packet processing from the CPU to dedicated hardware. This approach significantly reduces latency and improves throughput, enabling Azure to offer enhanced performance for virtual machines (VMs) and networking services.

Brian shared valuable insights into how DPDK has been instrumental in Azure’s network infrastructure, particularly highlighting its impact on Azure’s host networking and the broader ecosystem of partners and customers. He explained that Azure has integrated DPDK to address the need for high-speed packet processing, which is crucial for a wide range of applications, from basic web services to complex, latency-sensitive tasks like real-time analytics and high-frequency trading.

One of the key points Brian made was about the technical architecture that enables Azure to leverage DPDK’s capabilities. He detailed how DPDK is used in conjunction with Azure’s hardware, such as SmartNICs, to offload and accelerate network functions traditionally handled by software. This hardware-software synergy, as Denton explained, not only reduces CPU overhead but also significantly decreases latency, providing Azure customers with improved network performance and efficiency.

Furthermore, Brian highlighted real-world applications of DPDK in Azure, illustrating how partners and customers utilize DPDK for scenarios that require minimal latency and maximum throughput. He also discussed the continuous evolution of Azure’s networking stack, underscored by the introduction of new hardware and the ongoing optimization of DPDK to meet the growing demands of cloud computing.

Some examples:

Clearent by Xplor used Azure SQL Database Hyperscale to revamp its merchant transaction reporting system. Previously operating on in-house systems, Clearent, which handles over 500 million transactions annually, shifted to a cloud-based setup. This move significantly boosted their ability to process and report data.

Protocall Services, a provider of telephonic crisis and behavioral health digital tools, embarked on a cloud migration journey to enhance the reliability, security, and scalability of its IT infrastructure.

DPDK’s Impact on Fintech Applications

Rushil Gupta’s discussion on the use of DPDK in financial technology (Fintech) applications, particularly in the realm of high-frequency trading (HFT) platforms on Google Cloud Platform (GCP), sheds light on how bleeding-edge network processing technologies are evolving financial markets.

In the fast-paced world of HFT, where milliseconds can equate to millions of dollars, the need for ultra-low latency and high throughput is paramount. Traditional cloud networking approaches may falter under such demanding requirements due to the involvement of kernel-based networking stacks that introduce additional latency. Here, DPDK’s bypass of the kernel networking stack, allowing direct access to network hardware, presents a compelling solution. This direct path significantly reduces latency and increases packet processing speed, enabling HFT platforms to operate at the speed required to capitalize on fleeting market opportunities.

Rushil illustrates how Google leverages DPDK to empower fintech customers on GCP, providing them with the infrastructure necessary to achieve the high throughput and low-latency communication essential for HFT platforms. One notable application is in the construction of complex event processing (CEP) systems, which are at the heart of many trading platforms. These systems analyze and act upon market data in real-time, necessitating the rapid processing capabilities that DPDK facilitates.

Rushil discusses the role of DPDK in enhancing data replication and recovery processes within fintech applications. In an industry where data integrity and availability are critical, DPDK’s efficiency in handling large volumes of data packets ensures that financial institutions can maintain robust data replication frameworks. This capability not only supports the high availability demands of trading platforms but also aids in achieving regulatory compliance related to data persistence and recovery.

Rushil explained how DPDK’s application in fintech on GCP demonstrates the technology’s pivotal role in enabling HFT and other financial services to meet their stringent performance and reliability criteria. With DPDK, Google provides a competitive edge to fintech applications, facilitating new levels of speed and efficiency in financial markets.

Some examples:

1. CME Group. As one of the world’s leading derivatives marketplaces, CME Group leverages GCP and DPDK for enhanced market data analytics and to facilitate high-speed trading. Their partnership aims to accelerate CME Group’s move to the cloud, transforming the global markets ecosystem with cloud-based innovation and scaling capacity dynamically to meet market demands.

2. Talos. Specializing in digital asset trading technology, Talos utilizes GCP’s infrastructure to support its trading platform. With DPDK, Talos benefits from reduced latency and increased throughput, essential for executing trades and managing orders across multiple exchanges and liquidity pools efficiently.

3. Clowd9. This cloud-based trading technology provider uses GCP to offer a scalable and secure platform for trading firms and financial institutions. DPDK supports Clowd9’s need for high performance and low latency in executing trades, managing risk, and processing real-time market data.

4. Freetrade. Freetrade, an investment platform, leverages GCP to power its app, offering users commission-free trading. GCP’s global infrastructure and DPDK’s network optimization capabilities ensure that Freetrade can manage high volumes of transactions and data analysis.

5. TD Securities Automated Trading (TDSAT): TDSAT uses GCP for trading fixed-income bonds, benefiting from DPDK’s high-performance packet processing capabilities. This enables TDSAT to execute trades at high speed and with precision, critical for maintaining competitiveness in the fixed income market.

These customers and use cases underscore the importance of DPDK in enhancing network performance on GCP, making it an ideal platform for capital market applications that demand high throughput, low latency, and scalability. By leveraging GCP and DPDK, capital market firms innovate and adapt quickly to market changes, manage risks more effectively, and unlock new opportunities for growth.

Broadening DPDK’s Application Scope in VPNs and Software Routers

Jim Thompson’s insights during the DPDK webinar shed light on how DPDK is leveraged in cloud networking through the lens of Netgate’s product, TNSR (pronounced ‘Tensor’). This serves as a case study of DPDK’s implementation outside its traditional use cases. TNSR, a virtual router developed by Netgate, underscores the adaptability and robustness of DPDK in addressing specific cloud networking challenges.

In cloud environments, networking demands can quickly escalate due to the sheer volume of data transfer and the need for secure connections. Traditional VPN solutions often fall short due to bandwidth limitations and the number of tunnels they can support. Jim highlighted how these constraints could hinder the scalability and performance of cloud-based services. This scenario is particularly relevant for large organizations that require extensive interconnectivity across various cloud environments.

The introduction of TNSR as a DPDK-powered solution exemplifies how DPDK’s high-performance packet processing capabilities can be extended beyond typical use cases to solve complex cloud networking problems. By utilizing DPDK’s efficient polling mode drivers (PMDs) for network and cryptography offload, TNSR significantly enhances throughput and reduces latency in VPN connections.

Jim explained how TNSR facilitates seamless connectivity between on-premise networks and cloud regions, highlighting the importance of VPN connections for secure data transfer. He underscored the limitations of existing cloud VPN solutions, such as bandwidth caps and tunnel number restrictions, which can significantly hamper large organizations’ networking needs. By leveraging DPDK, TNSR bypasses these limitations, providing a more flexible and scalable solution for cloud-based networking.

Take a look at Netgate’s customer stories here.

Conclusion

The webinar underscored DPDK’s pivotal role in enabling hyperscaling in the cloud. By providing a high-performance packet processing framework, DPDK not only enhances network efficiency but also opens new avenues for application development across various industries. As cloud architectures continue to evolve, the collaboration between cloud providers, technology firms, and the open source community will be vital in harnessing the full potential of DPDK.

Join in the Hyperscaling discussion and the community on slack here.

Feb 07

DPDK LTS 22.11.4

By Ben Thomas Blog

The latest DPDK release, version 22.11.4, brings several important updates, bug fixes, and improvements across various components of the DPDK framework. In this blog post, we’ll provide a brief summary of the key changes and highlights in this release.

Release Highlights:

1. Updated Git Tree: You can access the latest DPDK source code on the official Git repository at DPDK Stable Git Tree.

2. Bug Fixes and Backports: This release includes numerous bug fixes and patches, thanks to the efforts of the community, which contribute to the stability and reliability of the DPDK framework.

3. Security Improvements: The release addresses security-related issues and provides enhancements to improve the overall security of the DPDK.

4. Documentation Updates: The DPDK documentation has been updated, including improvements to guides for various NICs and platforms. The release also includes updates to the Security Guide and VDPA (Virtio Data Path Acceleration) documentation.

5. Performance Enhancements: DPDK is known for its high-performance networking capabilities, and this release continues to optimize and improve performance across different components.

6. Driver Updates: Multiple network drivers have been updated and improved in this release, including fixes for checksum offloading, packet handling, and performance tuning.

7. Eventdev Improvements: The eventdev subsystem has seen enhancements, including fixes related to device pointer management and driver names in the info struct.

8. Crypto Libraries: Various crypto libraries have been updated and fixed, including the addition of missing documentation for security context and memory leak fixes in the OpenSSL PMD.

9. Mempool Fixes: Several fixes have been applied to the mempool component, improving memory allocation and thread safety.

10. Other Component Updates: This release also includes fixes and improvements to other DPDK components, such as the test suite, Ethernet device drivers, and examples.

Overall, DPDK 22.11.4 is a stable release that brings a wide range of improvements, ensuring the continued reliability and performance of the DPDK framework. Users are encouraged to upgrade to this latest release to benefit from the bug fixes and enhancements provided by the DPDK community.

For detailed information about specific changes, you can refer to the official release notes here.

As always, it’s important to thoroughly test any new DPDK release in your specific networking environment to ensure compatibility and performance before deploying it in a production environment.

A Big Shoutout to Our Dedicated Contributors

We want to take this opportunity to express our gratitude to the hardworking individuals who contributed to this release. Their dedication and expertise have made DPDK even more robust and efficient.

Xueming Li, Aakash Sasidharan, Abdullah Sevincer, Akhil Goyal, Alex Vesker, Alexander Kozyrev, Amit Prakash Shukla, Anatoly Burakov, Anoob Joseph, Artemy Kovalyov, Bing Zhao, Brian Dooley, Bruce Richardson, Chaoyong He, Christian Ehrhardt, Ciara Power, David Christensen, David Marchand, Dengdui Huang, Ed Czeck, Eli Britstein, Feifei Wang, Fengjiang Liu, Ferruh Yigit, Gagandeep Singh, Ganapati Kundapura, Gregory Etelson, Harman Kalra, Harry van Haaren, Hemant Agrawal, Hernan Vargas, Huisong Li.

Jan 10

DPDK Long Term Stable Release Guide

By Ben Thomas Blog

Navigating the Data Plane Development Kit (DPDK) release landscape requires a thorough understanding of what a Long-Term Stable (LTS) release entails and why it may be the preferred choice for certain network environments. This guide dives into the nuances of DPDK LTS, its suitability for production, and the level of active support it receives.

Understanding DPDK Long Term Stable (LTS) Releases

DPDK LTS releases stand out in the networking world for their reliability over an extended period. Here’s what sets them apart:

Longevity of Support: DPDK LTS offers a commitment of three years’ worth of fixes, ensuring that a chosen release remains robust against issues found long after its initial deployment.

Consistent Improvements: A DPDK LTS release isn’t static. It evolves with a series of API/ABI compatible drop-in replacements that incorporate the latest fixes discovered in the subsequent years. For instance, a series based on a 2022 DPDK release will be refined with fixes identified during 2023-2025.

Sustainability: This approach guarantees that production environments can rely on a consistent, stable platform without the need to constantly adapt to new feature changes.

DPDK LTS releases are tailored for specific scenarios within the networking domain

Ideal for Production: LTS releases are the go-to for production environments where stability is paramount and the latest features are less of a priority.

Focus on Stability Over Novelty: Organizations that value long-term reliability over cutting-edge features will find DPDK LTS releases more suitable.

Active Maintenance and Support

The vibrancy of the DPDK LTS ecosystem is reflected in the following statistics:

Multiple Active Releases: As of now, three LTS releases are being actively maintained: 21.11, 22.11, and 23.11.

Frequent Updates: 2023 saw 9 releases across these maintained LTS versions.

Volume of Fixes: Approximately 1800 fixes have been backported to the active DPDK LTS releases in the last year alone.

Maintenance Span for DPDK LTS

The commitment to maintain a DPDK LTS release is clear and long-term:

General Fixes: All identified fixes will be backported to the LTS releases for a full three years from their release date.

Security Patches: Security-related updates may even extend beyond the three-year window, ensuring that LTS releases maintain a strong defense against vulnerabilities.

Choosing a DPDK LTS Release

When deciding on a DPDK LTS release, consider the following:

Maintenance Timeline: Evaluate whether longer support windows aligns with your deployment cycle and update capacity.

Active Maintenance Record: The volume and frequency of backported fixes provide an indication of the LTS version’s vitality and the community’s dedication to its upkeep.

Security Commitment: With a promise of three-plus years of security fixes, assess whether this meets your organization’s security and compliance requirements.

Preparing for DPDK LTS Deployment

Transitioning to or between DPDK LTS releases requires an organization to:

Stay Informed: Keep abreast of the DPDK LTS release and maintenance schedules to time updates strategically.

Test Thoroughly: Allocate resources for detailed testing to ensure the LTS version integrates seamlessly with your environment.

Anticipate Adjustments: Be prepared for any necessary changes that might arise from the introduction of backported fixes.

Conclusion

Selecting a DPDK LTS release is a strategic decision influenced by the need for stability, long-term support, and a maintenance schedule that ensures network applications remain secure and performant.

With the extended support and backporting of fixes, DPDK LTS releases offer a dependable foundation for organizations seeking a stable networking stack. This maintenance model continues to be a cornerstone of network reliability, allowing organizations to leverage stable and secure networking functions without the churn of constant feature updates.

For more information visit: https://doc.dpdk.org/guides/contributing/stable.html