Dec 05

Kamalakshitha’s Journey from Noise-Canceling Algorithms to Open Source Networking

By Ben Thomas Community Spotlight

In the fast-paced world of high-performance networking, DPDK (Data Plane Development Kit) stands as a powerful tool, and its success is due in no small part to the dedication of its community members.

One such contributor is Kamalakshitha, a talented developer whose journey took her from studying electronics and communications in India to making a notable impact in open source software at Arm.

This developer spotlight explores Kamalakshitha’s journey into tech, her entry into open source, and her contributions to the DPDK project.

From Academia to First Tech Role

Kamalakshitha’s journey in tech began with an academic foundation in Electronics and Communications, where she completed an integrated Bachelor’s and Master’s program in India. This program sparked her interest in technology, laying the groundwork for her career.

After graduation, she accepted a role as a software engineer at a startup in India. This company focused on developing noise cancellation algorithms, and her work centered on researching and implementing solutions that would filter background noise, allowing only the target sound to pass through.

This experience gave her a strong foundation in software development, algorithm design, and research-based problem-solving—skills that would later contribute to her open source career.

Master’s in Computer Engineering and Exposure to DPDK

Seeking to deepen her technical expertise, Kamalakshitha pursued a Master’s in Computer Engineering at Texas A&M University, where her interests broadened toward network performance and user-space networking.

It was during this period that she secured an internship with Arm, a leading semiconductor company. In this role, she joined Arm’s Open Source Software team, where she was introduced to DPDK and VPP (Vector Packet Processor).

Her internship with Arm marked a pivotal moment in her career. Kamalakshitha delved into the performance analysis of VPP, collecting and analyzing PMU (Performance Monitoring Unit) counters to pinpoint code hotspots and optimize their performance.

This work familiarized her with the fundamentals of user-space networking and performance optimization, opening her eyes to DPDK’s potential to improve data processing speeds in network applications by bypassing the traditional kernel-based networking stack.

Full-Time Role at Arm and Entry into Open Source

After her internship, Kamalakshitha was interested in a full-time role with Arm’s Open Source Software team. Although no positions were available in that team at the time, she secured a role in a different team, where she focused on performance analysis for networking applications.

This role concentrated on identifying performance bottlenecks without direct code contributions. However, her desire to contribute to the codebase led her to eventually rejoin the Open Source Software team when a position became available.

This transition was significant for Kamalakshitha, as it allowed her to fulfill her aspiration of not only identifying performance bottlenecks but also addressing them through code contributions. Joining the open-source team allowed her to actively engage with the DPDK community, sharing her insights and participating in open discussions.

Key Technical Contributions to DPDK

Kamalakshitha’s contributions to DPDK have been multifaceted. Here’s a look at some of the highlights of her work:

First Patch – Driver Fix
Her initial contribution to DPDK was a small but crucial driver fix. This patch taught her about the processes involved in open-source contributions and helped her become familiar with DPDK’s mailing list and review system.
Zero Copy API
Kamalakshitha volunteered to write test cases for DPDK’s Zero Copy API after spotting its development in the community mailing list. Her proactive approach and dedication to improvement led her to create test cases that enhanced the API’s robustness.
Cache-Aware Mempool Project and Blog Post
Recognizing that users could benefit from understanding the importance of buffer and cache line sizes for performance, Kamalakshitha undertook a project to create a blog post on cache-aware memory pooling. This piece provided an in-depth look at how certain buffer sizes and cache allocations could impact DPDK performance, transforming complex technical details into accessible knowledge for the community.
Multi-Packet Receive Queue (MPRQ)
Currently, Kamalakshitha is focused on improving the performance of DPDK’s multi-packet receive queue (MPRQ) on Arm systems. This project, centered on the Mellanox NIC, involves analyzing MPRQ configurations and their impact on packet processing, demonstrating her skill in both hardware-specific optimizations and cross-platform performance improvements.

Read Kamalakshitha’s blog Cache Awareness in DPDK Mempool for a detailed understanding of how buffer size and cache awareness influence packet processing in DPDK, including practical insights into buffer allocation strategies, cache utilization and performance benchmarks.
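To make the idea of buffer sizes and cache lines a little more concrete, here is a minimal sketch in Python. It is not taken from Kamalakshitha's blog or from DPDK itself: the 64-byte cache line and both helper names are assumptions chosen for illustration. It only shows the arithmetic of rounding an object up to whole cache lines and counting how many lines an object touches at a given offset.

```python
# Assumption: 64-byte cache lines, common on x86 and many Arm cores.
# Check the actual value for your platform before relying on it.
CACHE_LINE_BYTES = 64


def padded_size(obj_bytes: int, line: int = CACHE_LINE_BYTES) -> int:
    """Round an object size up to the next multiple of the cache line size."""
    return -(-obj_bytes // line) * line  # ceiling division, then scale back up


def lines_touched(offset: int, obj_bytes: int, line: int = CACHE_LINE_BYTES) -> int:
    """Count the cache lines covered by an object that starts at byte `offset`."""
    first = offset // line
    last = (offset + obj_bytes - 1) // line
    return last - first + 1
```

An object that starts on a cache-line boundary touches the minimum number of lines; the same object at a misaligned offset touches one more, which is the kind of effect buffer sizing and alignment try to avoid.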

Engaging with the DPDK Community

Kamalakshitha’s involvement in the DPDK community extends beyond code contributions. She first engaged through DPDK’s mailing list, where she reviewed patches and learned the nuances of community contribution.

She credits her former manager, Honnappa Nagarahalli, with encouraging her to join Arm’s open source team, which further facilitated her entry into the DPDK community.

In addition to her formal role, Kamalakshitha has found that the open source environment allows her to connect with diverse experts across different companies. This experience has not only expanded her technical expertise but has also developed her soft skills, such as presenting technical ideas, discussing optimization methods, and building consensus on improvements with a global audience.

A Methodical Approach to Programming

Kamalakshitha’s approach to programming is systematic and meticulous. She prefers to analyze code performance and identify hotspots, using tools like perf to monitor program execution. Before implementing a fix, she visualizes potential solutions and maps out her strategy on paper.

This analog approach allows her to organize her ideas clearly, giving her a structured plan for tackling complex coding tasks. She then methodically tests her solutions, adding only incremental changes to optimize performance.

DPDK’s Future in High-Speed Networking

Kamalakshitha is particularly excited about DPDK’s future role in emerging technologies. With the expansion of 5G networks and the rise of AI-driven applications, she sees DPDK as a foundational technology enabling high-speed data processing and low-latency communication.

Whether it’s supporting 5G’s data transfer needs or facilitating communication between distributed AI nodes, DPDK provides a versatile, high-performance toolkit.

She also envisions DPDK as a vital component in building data-plane stacks on top of smart NICs and other accelerators, thus broadening its applications in cutting-edge technology.

Championing Diversity in Open Source

As a woman in a traditionally male-dominated field, Kamalakshitha is aware of the challenges and opportunities that come with increasing diversity in tech.

She notes that while progress has been made, there’s still work to be done in creating an inclusive environment. She encourages women to explore open source as a platform for professional growth, as it provides unique opportunities for visibility, learning, and collaboration.

Reflecting on her experiences, Kamalakshitha emphasizes the importance of community and role models in motivating women in tech. Seeing other female engineers in open source helps create a sense of belonging and inspires more women to participate and contribute.

Work-Life Balance and Personal Interests

Balancing a demanding career with personal life is essential for Kamalakshitha, especially after recently becoming a mother. She manages her responsibilities by setting clear priorities and taking regular breaks to recharge.

Small rituals, like preparing healthy meals or going for short walks, help her unwind. The arrival of her child has further sharpened her time management skills, as she carefully balances work and family responsibilities.

Essential Tools and Advice for New Developers

For Kamalakshitha, perf is an indispensable tool, enabling her to perform in-depth performance analysis for her projects. On a personal level, her phone is an essential device, with applications like Slack keeping her connected to her team and enabling her to join meetings on the go when necessary. (Join the DPDK Slack channel here.)

One piece of career advice that has resonated with her is the importance of understanding the basics of any project before diving in.

She believes that a clear grasp of the fundamentals not only accelerates learning but also enables more impactful contributions. She recommends that new developers have a quick chat with mentors to clarify the broader picture before delving into details.

Final Thoughts

Kamalakshitha’s journey from noise-canceling algorithms to open-source networking at Arm illustrates the power of perseverance, curiosity, and community.

Through her contributions to DPDK, she is helping shape the future of high-performance networking.

Her story is an inspiration for other developers, particularly women in tech, highlighting the benefits of open-source collaboration and the exciting opportunities it offers.

As she continues her journey in DPDK, Kamalakshitha looks forward to new projects, deeper community engagement, and expanding DPDK’s role in supporting next-generation networking technologies.

Start contributing to DPDK here.

Nov 05

Qian Xu’s Leadership in High-Performance Networking at NVIDIA

By Ben Thomas Community Spotlight

Welcome to another edition of our DPDK Developer Spotlight, where we explore the careers and insights of key figures who have shaped the DPDK community. In this spotlight, we share the journey of Qian Xu, a Software Validation Manager at NVIDIA, who has been deeply involved in DPDK from its early days.

Early Days and Introduction to Open Source

Qian’s introduction to open source tech happened somewhat serendipitously. While working at Intel, she was assigned to the DPDK project, which Intel initiated in 2010 and released under a permissive open source license. The open source community for DPDK was established at dpdk.org in 2013 by 6WIND, and the project continued to grow until it was successfully transferred to the Linux Foundation in 2017.

In the project’s early days, Qian became fascinated by its community-driven aspects. The difference between corporate-led initiatives and the inclusive, collaborative nature of DPDK captivated her, leading her to fully engage with the community.

First Experiences in the Community

When Qian Xu first joined the DPDK community, her entry was marked by interactions that would shape her entire experience. Initially, she was primarily engaged in testing, not frequently submitting patches, which allowed her to observe the dynamics of the community from a unique vantage point.

Her initial period with the project was guided by Thomas Monjalon, one of the original maintainers of DPDK, whose reputation was well-known among developers in Shanghai and beyond. Contrary to the daunting tales of his strictness, Qian found Thomas to be incredibly helpful and supportive, a disposition that greatly eased her integration into the community.

Thomas and the other maintainers’ approachability was crucial as Qian navigated the new environment. He was not only receptive to feedback but actively encouraged Qian to voice her concerns about continuous integration practices or any other aspects of the project that might require improvement. This open dialogue fostered a collaborative relationship, proving that the community was responsive and genuinely interested in evolving based on its members’ input.

Qian’s interactions weren’t limited to informal discussions; they extended into more structured community engagements. She actively participated in various forums, from detailed email threads about release testing statuses to the DPDK Summit APAC and other international meetings, often communicating through emails and occasionally through IRC for more immediate concerns.

Technical Contributions

Qian Xu’s technical contributions have been vital to the growth of the DPDK community, especially in Continuous Integration (CI) processes, testing frameworks, and performance testing methodologies. Starting as a validation engineer, she played a key role in developing the DPDK Test Suite (DTS), one of her most notable achievements. The DTS has since become a cornerstone of the DPDK open source automation framework.

Development and Evolution of DTS

Under Qian’s guidance, DTS marked a significant advancement in how DPDK could be tested and validated across different environments. Her leadership in this project involved designing the architecture of the test suite, ensuring it was robust and flexible enough to handle a variety of networking scenarios. This tool allowed for the automated testing of DPDK components, significantly reducing manual testing efforts and accelerating the development feedback loop.

Launching the First Performance Report

Qian was also instrumental in publishing the first DPDK performance report. This initiative set a precedent within the community, providing a transparent and replicable method to benchmark the performance of DPDK implementations. Her work laid the groundwork for subsequent reports, which have become crucial resources for developers seeking to understand and optimize DPDK performance.

Collaboration and Lab Development

Beyond software development, Qian played a pivotal role in establishing a physical testing lab for DPDK. This lab, set up in collaboration with the University of New Hampshire, provided a shared resource for developers around the world to conduct more sophisticated testing scenarios that could not be easily replicated locally. She facilitated remote debugging and testing procedures, which were critical during the integration and continuous delivery phases of the CI/CD pipeline.

DPDK into the Future

Qian envisions a dynamic and evolving future for DPDK, particularly as it integrates more deeply with AI, machine learning, 5G, and cloud computing. Over the past decade, DPDK has undergone transformative changes, significantly enhancing networking performance and becoming a fundamental component across various infrastructure domains.

Integration with AI and Machine Learning

As DPDK ventures into AI, Qian foresees the framework enhancing its capabilities to support AI operations, particularly in managing and optimizing data flow at incredible speeds. This integration aims to reduce latency in AI data processing, making real-time analytics and decision-making more efficient.

Evolving Continuous Integration (CI) Practices

Qian also highlights the potential evolution of DPDK’s CI practices. Initially, the CI systems were less mature, but they have grown more robust over time, reducing false alarms and improving stability. With the infusion of AI into CI, Qian believes that testing processes can be further automated and refined, allowing for quicker identification and resolution of critical issues before they affect the main branches. This would involve less human intervention, making the testing process more efficient and less prone to error.

The Importance of a Diverse Community

Qian’s experience as a woman in tech highlights the evolving diversity within the DPDK community and NVIDIA. She has witnessed a positive shift towards greater gender equality, supported by inclusive practices and the Linux Foundation’s collaborative approach. Her advice to aspiring female engineers is to engage deeply, contribute meaningfully, and leverage the supportive environment increasingly present in open source tech communities.

In essence, Qian Xu sees a positive trajectory for diversity in tech, especially in the DPDK Asia Pacific community, driven by systemic changes in the region and a community commitment to equality. Her experience underlines the importance of inclusive practices and the continuous effort needed to maintain and expand these gains within the tech industry.

Best Advice Ever Given

Qian often reflects on a piece of advice that has influenced her professional trajectory and personal growth. Throughout her career, from her early days in networking to her current role at NVIDIA, where her responsibilities are more focused on AI, networking, and DPDK integration, she has adhered to this guiding principle: focus deeply on your chosen domain. This advice, garnered from numerous interviews with industry leaders and her own mentors, encouraged her to dedicate herself fully to mastering the intricacies of her field.

The Growth of the APAC Community and Getting Involved

Qian emphasizes the importance of active participation in the DPDK community, especially in the APAC region. She encourages newcomers to engage in discussions, attend regional summits, and contribute code. The APAC region has seen increased participation, with diverse representation from countries like China, Japan, and India, contributing significantly to the global DPDK landscape.

Forging the Future at NVIDIA

Qian Xu envisions a future where DPDK (Data Plane Development Kit) continues to be a pivotal element in the evolution of networking and computational technologies, particularly as these fields intersect with AI and cloud computing. At NVIDIA, her focus on enhancing solution-level testing allows her to channel DPDK’s capabilities into broader technological advancements, even if her work no longer revolves directly around DPDK’s core components.

This is particularly relevant as NVIDIA pushes the boundaries of AI computing and network performance, areas where efficient data handling and processing are critical.

As highlighted in NVIDIA’s recent initiatives, such as the development of AI-driven platforms and enhancements in GPU rendering, the underlying principles of high-speed data processing championed by DPDK are integral to these advancements.

By marrying DPDK’s network processing strengths with NVIDIA’s leadership in AI and GPU technology, Qian aims to cultivate a technology landscape that not only advances NVIDIA’s commercial goals but also propels the industry forward.

Balancing Work and Life

Qian values family time and enjoys bicycling and traveling with her family to maintain a healthy work-life balance. These activities help her stay grounded and focused.

Indispensable Technology

AI and networking technologies are indispensable to Qian. She highlights AI advancements, like ChatGPT, as crucial in both her professional and personal life.

Closing Thoughts

Qian champions the contributions of female engineers in China, expressing gratitude for their increasing recognition. Her journey offers valuable lessons in cultural adaptability and proactive community engagement, inspiring both current and aspiring DPDK developers.

We hope Qian’s story motivates you to engage with the DPDK community and contribute to the future of networking and infrastructure technologies.

Get involved by joining the mailing lists here: https://www.dpdk.org/contribute/#Mailing-Lists

Oct 09

Explore the latest Innovations and Insights from this year’s DPDK North America Summit

By Ben Thomas Blog

This year’s DPDK North America Summit highlighted the project’s ongoing technical excellence and innovation in high-performance networking. The event gathered experts and enthusiasts from around the globe, including project pioneers and new community contributors. They engaged in discussions and demonstrations focused on recent code developments, the technical board’s plans, and, notably, exciting use cases and applications.

Over the past 14 years, DPDK has methodically developed an open stack that meets a broad variety of user requirements. The project’s open development approach and adaptability to community needs have been invaluable. They showcase the project’s commitment to open source principles. However, these practices have also led to a more expansive and less streamlined code base. Nevertheless, the technical board skillfully manages the necessary compromises for both core developers and end users, highlighting some exciting developments at the summit.

The review of new technology integrations and organic implementation alongside the community’s evolution has been impressive. The project’s development, impact, and extensive application across a global infrastructure have been significant. This includes not just data centers, enterprise cloud services, and network equipment, but also transportation networks, telecom systems, financial trading platforms, industrial control systems, and even particle processors and astronomical data processing!

The project has reached a pivotal stage of maturity, with use cases and applications expanding dynamically. This evolution presents an opportune moment to explore and showcase real-world applications, extending far beyond its conventional roles in routers and firewalls.

One highlight of the summit was the presentation by Robin Jarry and David Marchand from Red Hat, who introduced “Grout,” a graph router based on DPDK. This tool is designed to simulate network functions and physical routers to replicate the behavior of typically closed-source VNFs and CNFs using an open source tool. They provided a detailed explanation of the rte_graph library’s role in data path processing and showcased Grout’s capabilities.

Another notable session was led by Dr. John Romein from the Netherlands Institute for Radio Astronomy (ASTRON). He discussed how advanced GPU technologies, specifically the NVIDIA Grace Hopper Superchip, are being utilized to process vast amounts of data from radio telescopes. This session not only emphasized the integration of DPDK with GPU technology but also demonstrated its real-world applications in astronomical data processing, pushing the boundaries of modern hardware capabilities.

Each session, from discussions on the challenges of implementing DPDK on non-cache coherent platforms by Hemant Agrawal and Gagandeep Singh from NXP, to insights into machine learning inference within network processing by Srikanth Yalavarthi from Marvell, highlighted the versatility and robustness of DPDK. These discussions underscored its ability to meet the increasingly complex demands of network performance solutions.

In summary, the latest DPDK summit provided a platform for learning and sharing, reinforcing the community’s commitment to driving innovation in network performance through open development and governance, as highlighted by the ongoing initiatives of the project.

Watch all the presentations on the DPDK YouTube channel here

Have a use case you’d like to share and feature on the website? Email marketing@dpdk.org

Aug 06

Highlights from DPDK Summit APAC

By Ben Thomas Blog

Welcome & Opening Remarks – Thomas Monjalon, Maintainer, NVIDIA

Thomas Monjalon, a maintainer at NVIDIA, opened the summit in Bangkok by emphasizing the importance of the community in the project. He highlighted the role of contributors like Ben Thomas, who handles marketing, and encouraged attendees to share their stories and get involved.

Thomas provided logistical details about the event and discussed the project’s history, noting its growth since its inception in 2010 and the support from the Linux Foundation. Looking ahead, Monjalon outlined priorities such as better public cloud integration, enhanced security protocols, and contributions to AI.

He stressed the long-term benefits of contributing to open source, including thorough documentation and community support, and noted that being part of the community can help individuals find new job opportunities.

Introducing UACCE Bus of DPDK – Feng Chengwen, Huawei Technologies Co., Ltd

Feng Chengwen from Huawei Technologies presented on the UACCE (Unified Accelerator Framework) integrated into DPDK. UACCE was designed to simplify usage and enhance performance and security for user space I/O and DMA operations without system involvement. It was upstreamed in version 5.7 of the Linux kernel and the latest DPDK release 24.03, allowing accelerators to access memory regions directly and eliminating address translation.

Key objectives include high performance, simplified usage, and security, with support for multiprocess memory acceleration and on-demand resource usage. UACCE addresses performance issues such as page faults and NUMA balancing using techniques like CPU pre-access and memory binding.

It is used in both host systems and virtual machines, though some features for virtual machines are still in development. Feng encouraged other developers to adopt UACCE, highlighting its broader application potential, and discussed future enhancements to integrate more devices into DPDK using the UACCE framework.

ZXDH DPU Adapter and Its Application – Lijie Shan & Wang Junlong, ZTE

The presentation introduces the ZXDH DPU driver, highlighting its features, applications, and product portfolio. The DPU system framework includes modules for high-speed network interfaces, PCI connectivity, and advanced packet processing, supporting RDMA, NVMe protocols, and multiple accelerators for security and storage.

It enhances network and storage performance by offloading tasks from the host CPU, supporting virtualization, AI, and edge computing, with capabilities like TLS encryption. An example of offloading security group functions to the DPU demonstrates reduced CPU load and increased processing efficiency.

The product portfolio supports up to 5 million IOPS and 100 million packets per second, with ongoing development to improve TCP protocol handling and storage acceleration.

Libtpa Introduction – Yuanhan Liu, ByteDance

Yuanhan Liu from ByteDance introduces Libtpa, a user-space TCP/IP stack developed from scratch. The presentation discusses its background, design, testing, and performance.

Traditional kernel-based TCP/IP stacks have inefficiencies and overhead, and existing user-space TCP stacks face problems like breaking the kernel stack, limited zero-copy support, and inadequate testing and debugging tools.

Libtpa addresses these issues by allowing coexistence with the kernel stack, supporting multiple user-space instances, optimizing performance with zero-copy, and providing extensive testing and debugging capabilities.

Its architecture supports high throughput and low latency, achieving significant performance improvements. Libtpa includes over 200 unit tests and advanced debugging tools to ensure stability and ease of troubleshooting in production environments.

Telecom Packet Processing and Correlation Engine Using DPDK – Ilan Raman, Aviz Networks

Ilan Raman and his colleagues from Aviz Networks developed 5G packet processing applications using DPDK to manage complex 4G and 5G traffic on commodity hardware efficiently. Aviz Networks, founded in 2019 and operating in the USA, India, and Japan, focuses on providing open-source solutions for telecom operators.

They address challenges in monitoring evolving mobile technologies, scaling solutions horizontally, and reducing TCO through software-driven approaches. A primary use case is 5G correlation, which enhances network performance monitoring by correlating control and user traffic. Deployment involves DPDK-based applications on commodity hardware, processing high-bandwidth traffic, and extracting valuable metadata for insights and capacity planning.

The architecture uses a run-to-completion model, distributing functions across dedicated cores to handle various traffic types, with scalability achieved through RSS functionality in NICs. Practical learnings include configuring RSS for different packet types, ensuring symmetric load balancing, using per-core hash tables, isolating DPDK cores from the Linux kernel, and performing deep packet parsing in software.
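As a rough illustration of two of those learnings, symmetric load balancing and per-core hash tables, here is a minimal Python sketch. It is not Aviz Networks' code: real NICs compute RSS in hardware (typically with a Toeplitz hash), whereas this sketch substitutes CRC32 in software, and the names `Flow`, `symmetric_key`, `pick_core`, and `record_packet` are all hypothetical. The point is only that ordering the two endpoints before hashing sends both directions of a flow to the same core, so each core can keep its own table without locks.

```python
import zlib
from typing import NamedTuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def symmetric_key(flow: Flow) -> bytes:
    """Build a key that is identical for both directions of a flow."""
    a = (flow.src_ip, flow.src_port)
    b = (flow.dst_ip, flow.dst_port)
    lo, hi = sorted((a, b))  # order the endpoints so A->B and B->A match
    return f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}".encode()


def pick_core(flow: Flow, num_cores: int) -> int:
    """Stand-in for NIC RSS: CRC32 of the symmetric key, modulo the core count."""
    return zlib.crc32(symmetric_key(flow)) % num_cores


NUM_CORES = 4  # illustrative value
per_core_tables = [dict() for _ in range(NUM_CORES)]  # one private table per core


def record_packet(flow: Flow) -> int:
    """Count the packet in the table owned by the core the flow hashes to."""
    core = pick_core(flow, NUM_CORES)
    key = symmetric_key(flow)
    per_core_tables[core][key] = per_core_tables[core].get(key, 0) + 1
    return core
```

Because a flow's state only ever lives in one core's table, correlating the two directions of a session needs no cross-core synchronization.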

Aviz Networks leveraged DPDK’s packet manipulation libraries for handling custom headers and achieved better performance through memory optimizations and CPU isolation.

Unified Representor with Large Scale Ports – Suanming Mou, NVIDIA Semiconductor

The presentation by Suanming Mou from NVIDIA focused on optimizing the unified representor for large numbers of ports in DPDK switch models. Initially, a significant challenge was the high memory and CPU usage caused by the need to poll all representor ports whenever packets missed hardware flow rules.

The optimization approach involved setting “represent matching” to zero, directing all packets to a single uplink representor port, and copying the source port ID to packet metadata for identification in the hypervisor. This change reduced the need for extensive memory allocation and CPU polling, as traffic was handled through a single proxy port.
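The single-proxy-port idea can be sketched in a few lines of Python. This is a toy model rather than the driver's implementation: the `Packet` type, its `src_port_id` field, and `poll_and_demux` are hypothetical names, and a `deque` stands in for the receive queue of the proxy port. It shows why one poll loop is enough once every packet carries its original source port in metadata.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class Packet(NamedTuple):
    payload: bytes
    src_port_id: int  # hypothetical metadata field holding the original source port


def poll_and_demux(proxy_queue: deque, burst: int = 32) -> dict:
    """Drain up to `burst` packets from the single proxy queue and group
    them by the source port recorded in their metadata."""
    by_port = defaultdict(list)
    for _ in range(min(burst, len(proxy_queue))):
        pkt = proxy_queue.popleft()
        by_port[pkt.src_port_id].append(pkt.payload)
    return dict(by_port)
```

Compared with looping over every representor port on each iteration, the work here scales with the number of packets received rather than the number of ports configured.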

The implementation of new flow rules for this setup resulted in substantial memory savings, decreasing from over 800 MB to around 332 MB, and improved packets per second (PPS) performance, increasing from 20 Mega PPS to 27.5 Mega PPS due to optimized polling and reduced cache misses. Overall, the optimization streamlined the polling process and significantly enhanced resource efficiency and performance in managing large-scale port traffic.
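For a sense of scale, the reported figures can be turned into relative changes with a short calculation. The inputs below are the numbers quoted above; since memory was "over 800 MB" before and "around 332 MB" after, the result is approximate, and the helper name `percent_change` is just for illustration.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures as quoted in the talk summary.
memory_change = round(percent_change(800.0, 332.0), 1)  # memory footprint, MB
pps_change = round(percent_change(20.0, 27.5), 1)       # throughput, Mega PPS

print(f"memory: {memory_change}%  throughput: {pps_change}%")
```

That works out to a memory reduction of roughly 58% and a throughput gain of 37.5%.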

Troubleshooting Low Latency Application on CNF Deployment – Vipin Varghese & Sivaprasad Tummala, AMD

The presentation addresses the challenges encountered when transitioning applications from bare metal to container environments, emphasizing issues like reduced throughput, increased packet processing time, fluctuating latencies, and unpredictable performance with multiple container instances.

Root causes of these problems include limited access to hardware resources, library and compiler version mismatches, lack of specific patches, and performance variations based on hardware architecture and deployment models. Through several case studies, Vipin and Sivaprasad underscore the importance of profiling applications on bare metal before deployment, using tools like flame graphs and perf, and understanding hardware details such as cache domains and PCI bus partitioning for optimization.

They call for enhanced telemetry and observability in DPDK for containerized environments, noting that current tools and documentation are inadequate for complex troubleshooting. Recommendations include extending DPDK’s telemetry infrastructure, utilizing eBPF hooks for improved runtime data collection, and ensuring consistent performance through better documentation, custom plugins for CPU isolation, and awareness of hardware-specific optimizations.

Suggestions to Enhance DPDK to Enable Migration of User Space Networking Applications to DPDK – Vivek Gupta, Benison Technologies Pvt Ltd

The presentation by Vivek Gupta delves into enhancing DPDK to facilitate the migration of various user space networking applications, pinpointing a critical issue: advancements in CPU, IO, and memory technologies are not benefiting these applications. Despite significant improvements in infrastructure, user space networking applications often fail to utilize these advancements effectively. This gap highlights the need for a framework that can bring the benefits of these technological improvements to user space frameworks, ensuring better performance and efficiency.

Customers face numerous challenges when attempting to migrate their existing user space applications to DPDK or VPP environments without rewriting them. These applications, which traditionally rely on Linux kernel methods, encounter significant hurdles during migration. The proposed solution is to create a unified framework that integrates various technologies, such as EF VI and VPP, to enhance the performance of these applications. This framework would support different levels of packet processing, from L2 to L4, and provide essential mechanisms for encryption, decryption, deep packet inspection, and proxy functions.

To meet customer needs, the framework should enable applications to capture and inject packets at various levels, from the interface to higher layers. It should support state management and route updates from control applications, ensuring that applications always operate with the most current data. Additionally, the framework must offer accelerators for cryptographic and AI/ML-based processing to handle the complex requirements of modern applications. By addressing issues related to threading, caching, and reducing contention, the framework aims to significantly improve the performance of user space applications.

Practical examples underscore the potential benefits of this framework. For instance, enhancing web servers, proxy servers, and video streaming applications using the proposed framework could lead to substantial performance gains. By tackling issues such as blocking operations and optimizing thread management, applications can achieve higher throughput and better resource utilization. The framework should also cater to the needs of high-speed applications and support flexible application architectures, enabling user space applications to become more efficient and faster.

In conclusion, the proposed enhancements to DPDK aim to bridge the gap between advancements in infrastructure and the performance of user space networking applications. By providing a comprehensive framework that supports various processing levels, state management, and cryptographic acceleration, the solution promises to improve application performance, reduce contention, and enhance resource utilization. This approach will help customers migrate their applications more effectively and realize the full benefits of technological advancements in CPU, IO, and memory technologies.

Welcome Back – Prasun Kapoor, Associate Vice President, Marvell

The Asia Pacific (APAC) region, particularly India and China, has established a strong and dynamic community around the Data Plane Development Kit (DPDK). Recognizing this, the decision was made to hold the DPDK APAC Summit in Thailand, a geopolitically neutral location that facilitates easy participation from various APAC countries without visa complications.

The DPDK project is witnessing robust growth in multiple areas, including technical contributions, marketing outreach, and the number of active contributors. This growth is further bolstered by increasing interest from new prospective corporate members, indicating a healthy and expanding ecosystem.

Significant updates have been made to the University of New Hampshire (UNH) lab, which has recently incorporated the Marvell CN10K Data Processing Unit (DPU) into its testing suite. The lab now reports DPDK Test Suite (DTS) results for a variety of tests, and has established a community dashboard for code coverage, releasing monthly reports. Additionally, the lab has been proactive in submitting patches and bug fixes and is running compilation tests for Open vSwitch (OVS) with each DPDK patch, with future plans to include performance testing.

Marketing efforts for DPDK have seen a considerable boost, with increased engagement on platforms like LinkedIn and a notable rise in YouTube views, which is seen as a leading indicator of the project’s growing interest. The steady increase in DPDK downloads further underscores the project’s rising popularity.

Enhancements to the DPDK documentation have also been a focal point, with updates to the Poll Mode Driver (PMD) guidelines, security protocol documentation, and multiple sections of the programmer’s guide and contributor guidelines. Financially, the DPDK project is in a strong position with a healthy budget and substantial reserves. This financial stability ensures that key activities such as summits, community labs, and marketing efforts are well-supported for the foreseeable future.

Coupling Eventdev Usage with Traffic Metering & Policing (QoS) – Sachin Saxena & Apeksha Gupta, NXP

Sachin Saxena and Apeksha Gupta from NXP presented on integrating Eventdev with Traffic Metering and Policing to enhance Quality of Service (QoS). They discussed the various requirements from customers and the comprehensive solution they developed to meet these demands. Their goal was to share their extensive work and experiences with the community, offering insights into how similar challenges can be addressed effectively.

They highlighted different customer requirements, such as the need for traffic classification and scheduling in hardware, reducing CPU cycle usage, and implementing custom schedulers. By leveraging the DPDK framework, they managed to consolidate these varied needs into a generic solution. This approach not only met the specific requirements but also provided a reference for others in the community who might face similar challenges.

The technical approach of their solution involved utilizing DPDK’s metering, policing, and Eventdev frameworks. They explained how these components interact to meet the specified use cases, enhancing overall efficiency and performance. By breaking down complex use cases into manageable components and mapping these to the corresponding RTE library elements, they ensured robust end-to-end functionality.

In their implementation details, they described the method of segmenting use cases into multiple components and aligning these with the appropriate RTE library components. This strategy ensured that each part of the system worked seamlessly together, achieving the desired outcomes effectively and efficiently.

To illustrate their points, they shared practical use cases, including the management of scheduling priorities, grouping multiple ports, and applying markers and policers at the priority group level. These examples demonstrated how to optimize CPU cycles and prevent data loss, showcasing the practical applications of their solution in real-world scenarios.
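As a rough illustration of what a marker and policer do at a priority-group level, the sketch below implements a color-blind single-rate three-color marker in the style of RFC 2697. It is plain Python, not the presenters' implementation and not the DPDK metering API; the class name `SrTcmMeter` and the helper `police` are assumptions made for this example.

```python
GREEN, YELLOW, RED = "green", "yellow", "red"


class SrTcmMeter:
    """Color-blind single-rate three-color marker (RFC 2697 style)."""

    def __init__(self, cir, cbs, ebs, now=0.0):
        self.cir = cir            # committed information rate, bytes/second
        self.cbs = cbs            # committed burst size, bytes
        self.ebs = ebs            # excess burst size, bytes
        self.tc = float(cbs)      # committed bucket starts full
        self.te = float(ebs)      # excess bucket starts full
        self.last = now

    def _refill(self, now):
        tokens = (now - self.last) * self.cir
        self.last = now
        # Fill the committed bucket first; overflow spills into the excess bucket.
        to_committed = min(tokens, self.cbs - self.tc)
        self.tc += to_committed
        self.te = min(self.ebs, self.te + tokens - to_committed)

    def color(self, now, pkt_len):
        self._refill(now)
        if self.tc >= pkt_len:
            self.tc -= pkt_len
            return GREEN
        if self.te >= pkt_len:
            self.te -= pkt_len
            return YELLOW
        return RED


def police(meter, now, pkt_len):
    """Hypothetical policer action: forward green and yellow, drop red."""
    return meter.color(now, pkt_len) != RED
```

Doing this classification in hardware, as the talk describes, means the CPU never spends cycles on packets that would be dropped anyway.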

GRO Library Enhancements – Kumara Parameshwaran Rathinavel, Microsoft

Kumara Parameshwaran Rathinavel from Microsoft has been working on enhancing the Generic Receive Offload (GRO) library. GRO is a widely used software technique that optimizes packet processing by merging multiple TCP segments into a single large segment. Kumara has been contributing to this project since his time at VMware and continues to do so at Microsoft. His work aims to improve the efficiency and performance of GRO, particularly in the context of network traffic handling.

The current implementation of GRO, which involves iteratively checking a table for flow matches, has been identified as suboptimal for handling packets received in multiple bursts. This method can lead to inefficiencies, especially as the timeout intervals increase. Kumara highlighted that the existing approach struggles with scalability and performance under these conditions, necessitating a more efficient solution.

To address these limitations, Kumara proposed a hash-based method for flow matching. This new approach significantly enhances the efficiency of the GRO process. In tests, the hash-based method demonstrated substantial performance improvements, reducing the CPU utilization of the GRO reassemble function. This method not only optimizes the flow matching process but also ensures better handling of packet bursts, leading to overall improved performance.
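The difference between the two lookup strategies can be sketched in a few lines. This is a simplified Python model, not the DPDK GRO library code: `FlowKey`, `find_flow_linear`, and `HashedFlowTable` are hypothetical names, and the sketch ignores the sequence-number checks a real GRO implementation performs before merging.

```python
from collections import namedtuple

# Hypothetical flow key; a real implementation would include more fields.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port")


def find_flow_linear(table, key):
    """Table scan: every packet walks the entries until one matches, O(n)."""
    for index, (entry_key, _segments) in enumerate(table):
        if entry_key == key:
            return index
    return None


class HashedFlowTable:
    """Hash lookup: the flow key indexes the entry directly, O(1) on average."""

    def __init__(self):
        self._flows = {}

    def add_segment(self, key, payload):
        self._flows.setdefault(key, []).append(payload)

    def merged(self, key):
        # Concatenate the buffered segments of one flow into a single payload.
        return b"".join(self._flows.get(key, []))
```

The longer packets are held across bursts, the more entries the table accumulates, which is where the scan cost grows and the hash lookup pays off.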

Recognizing the varying latency requirements of different applications, Kumara suggested implementing tuple-specific timeouts within the GRO framework. This approach allows for more flexible and optimized GRO settings tailored to the specific needs of various applications. For instance, applications with low latency requirements, such as banking transactions, can have shorter timeouts, while those with less stringent latency needs can benefit from longer timeouts. This customization ensures that all applications can operate efficiently without compromising on performance.
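A minimal sketch of the per-tuple timeout idea follows, assuming flows are flushed once they have been held longer than their own timeout. The class `TimedGroTable` and its parameters are hypothetical and only model the behavior described in the talk; they are not part of the DPDK GRO API.

```python
class TimedGroTable:
    """Buffers segments per flow and flushes each flow on its own timeout."""

    def __init__(self, default_timeout, per_flow_timeout=None):
        self.default_timeout = default_timeout
        # Optional overrides, e.g. a short timeout for latency-sensitive flows.
        self.per_flow_timeout = dict(per_flow_timeout or {})
        self._flows = {}  # key -> (time of first buffered segment, segments)

    def add(self, key, payload, now):
        _first_seen, segments = self._flows.setdefault(key, (now, []))
        segments.append(payload)

    def flush_expired(self, now):
        """Return merged payloads for every flow whose timeout has elapsed."""
        flushed = {}
        for key in list(self._flows):
            first_seen, segments = self._flows[key]
            timeout = self.per_flow_timeout.get(key, self.default_timeout)
            if now - first_seen >= timeout:
                flushed[key] = b"".join(segments)
                del self._flows[key]
        return flushed
```

With a default timeout of 10 time units and an override of 1 for a latency-sensitive flow, that flow is delivered after 1 unit while bulk flows keep aggregating.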

To validate these enhancements, Kumara used a setup involving a virtual machine as a test proxy, demonstrating notable performance gains. The improvements in GRO are particularly beneficial for network applications like load balancers, where reducing CPU utilization and improving packet processing efficiency are critical. Kumara’s work on GRO library enhancements showcases significant advancements in optimizing network traffic handling, contributing to more efficient and scalable network performance.

Refactor Power Library for Vendor Agnostic Uncore APIs – Sivaprasad Tummala & Vipin Varghese, AMD

The presentation focuses on the critical need for improved power management and efficiency for Telco operators, emphasizing the importance of vendor-agnostic solutions for scalability across different platforms. This is particularly relevant as power has become a significant concern, with the need to optimize performance per watt and manage power effectively.

The power library within DPDK aims to address these concerns by balancing power consumption and performance. The library optimizes core and uncore frequency management and introduces adaptive algorithms for real-time monitoring and idle state management. This ensures that while cores are busy polling, they consume power efficiently without compromising on performance.

Currently, the power library is tightly coupled with Linux and requires specific modifications to accommodate new drivers, leading to inefficiencies and increased code size. Each new driver introduction necessitates changes to the core library, increasing the complexity and effort required for maintenance and updates. This approach is not scalable as the number of drivers and their capabilities grow.

To address these challenges, the refactoring efforts aim to modularize the power library, enabling plug-and-play capabilities for new drivers and reducing dependencies. This modular approach will simplify the addition of new drivers, improve performance, and enhance scalability by minimizing the library’s footprint and code complexity.

The proposed enhancements include vendor-agnostic uncore APIs to manage interconnect bus frequencies and dynamic link width management. These APIs promote a standardized interface for power management across different hardware vendors, making it easier for applications to develop power management solutions without being tied to specific vendors. This approach not only reduces complexity but also ensures compatibility and scalability across various platforms.

Q&A with the Governing Board & Technical Board – Wang Yong, Thomas Monjalon, Jerin Jacob

The Technical Board (TBoard) and Governing Board (GBoard) of the project play distinct but complementary roles in steering the community. The TBoard consists of 11 members who meet bi-weekly to discuss and resolve technical issues. These meetings, conducted via Zoom, are open to all community members and focus on consensus-driven decision-making. When consensus cannot be reached, the TBoard votes on the issue. Topics must be submitted by email beforehand to be included on the agenda. This structured approach ensures thorough consideration and discussion before decisions are made.

The GBoard, on the other hand, sets the project’s broad direction, encompassing administrative tasks, marketing strategies, and budgeting. This board meets monthly and includes a permanent chairperson along with representatives from the Linux Foundation. The GBoard comprises 12 members: 10 from gold member companies, one from a silver member, and one from the TBoard. Every six weeks, the GBoard convenes, and every three months, they hold joint meetings with the TBoard to align on financial plans, marketing efforts, and major project decisions.

Membership in the GBoard is tiered, with gold members contributing $50,000 annually and silver members contributing $10,000 annually. These funds are crucial for project initiatives, such as summits and acquiring new servers for the lab. Gold members play a significant role in decision-making due to their financial contributions, ensuring their interests and investments are aligned with project goals.

Community involvement is a cornerstone of both boards’ operations. TBoard meetings are open to all, fostering transparency and inclusivity in technical discussions. Issues are raised via email, ensuring that all voices can be heard. The GBoard, while more focused on strategic direction, includes representatives from various companies to bring diverse perspectives to the table. This collaborative approach allows for comprehensive planning and execution of project initiatives.

Currently, the boards are prioritizing several key areas: enhancing security protocols and documentation, improving continuous integration (CI) performance testing, and integrating more functional testing in the DPDK Test Suite (DTS). Future plans include creating a performance dashboard and requiring contributors to add tests for new features. These efforts aim to maintain high standards of performance and security, ensuring the project’s robustness and reliability for all users.

Rte_flow Match with Comparison Result – Suanming Mou, NVIDIA Semiconductor

The presentation introduces a new feature for rte_flow, which focuses on comparison operations to enhance the flexibility of flow rules. This feature allows for comparisons between fields or between a field and an immediate value, providing more dynamic and versatile rule configurations. The presenter assumes familiarity with rte_flow from previous sessions and emphasizes the advantages of this new capability.

Traditional rte_flow rules are limited to matching immediate values, which can be restrictive in certain scenarios. For instance, in TCP connection tracking, the termination of connections often goes unnoticed by software if the reset packet is handled by hardware directly. Similarly, for packet payload evaluation, users may want to skip cryptographic operations on packets without payloads. These examples highlight the need for more advanced comparison methods in flow rules.

The new feature supports a range of comparison operations, including greater than, less than, and equal comparisons. It has been initially implemented in ConnectX-7 and BlueField-3 NICs, specifically within the template API. However, there are limitations, such as the inability to mix comparison items with other items and restricted field support. The feature is designed to be flexible but currently has hardware constraints that limit its full potential.

Users can configure these comparison rules using a new compare item structure (`rte_flow_item_compare`) in the API. This involves specifying the fields to compare, the immediate values, and the desired operation, such as equal, not equal, greater than, or less than. The configuration also takes a bit width for the comparison, giving precise control over how many bits of each operand are compared.
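The semantics of such a rule can be modeled in a few lines. The sketch below is a plain Python model of "compare a field against another field or an immediate value over a given bit width"; it does not use the rte_flow API, and `CompareItem` and the packet field names are hypothetical.

```python
import operator
from dataclasses import dataclass

OPS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


@dataclass(frozen=True)
class CompareItem:
    op: str        # one of the keys in OPS
    field_a: str   # name of the packet field on the left-hand side
    b: object      # another field name (str) or an immediate value (int)
    width: int     # number of low-order bits to compare

    def matches(self, packet):
        mask = (1 << self.width) - 1
        lhs = packet[self.field_a] & mask
        rhs = packet[self.b] if isinstance(self.b, str) else self.b
        return OPS[self.op](lhs, rhs & mask)


# Field vs. immediate: only packets that carry a payload need crypto work.
has_payload = CompareItem(op="gt", field_a="payload_len", b=0, width=16)

# Field vs. field: compare two values taken from the same packet.
seq_behind = CompareItem(op="lt", field_a="seq", b="expected_seq", width=32)
```

A rule like `has_payload` expresses the payload example from the talk: packets where it does not match can skip cryptographic processing.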

Several examples demonstrate the use of the new comparison item in flow rules, illustrating its practical application. Despite its benefits, the feature currently supports only single comparison rules within flow tables and a limited range of fields. The requirement for both spec and mask in the configuration is due to the template API structure, which mandates these elements even if they might not be necessary for all comparisons. Suanming Mou concludes by encouraging other developers to integrate support for this feature in their PMDs, recognizing its potential to significantly enhance rte_flow’s capabilities.

DPDK PMD Live Upgrade – Rongwei Liu, Nvidia

The DPDK PMD live upgrade process is designed to meet the critical need for upgrading or downgrading PMD versions seamlessly without disrupting ongoing services. This process ensures the transfer of user configurations while minimizing downtime to nearly zero, making it essential for applications requiring continuous operation.

Two primary approaches are detailed for conducting these upgrades. The first approach involves a graceful exit of the old PMD followed by the restart of the new PMD, during which there is a brief period of service unavailability. The second approach utilizes a standby mode, where the new PMD is prepared with the necessary configurations but remains inactive until the old PMD exits. This method ensures that there is no service disruption as the traffic seamlessly switches to the new PMD once the old one exits.

To facilitate this process, two modes are introduced: active and standby. In the active mode, the PMD manages traffic and hardware configuration directly. In standby mode, configurations are set up but do not affect traffic immediately. Instead, they become active only when the old PMD gracefully exits, ensuring that the traffic handling transitions smoothly without any interruption.
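The active/standby handover can be pictured as a small state machine. The following Python sketch models only the switchover behavior described in the talk; `Pmd`, `Nic`, and their methods are hypothetical names, and no real hardware or driver interface is involved.

```python
class Pmd:
    def __init__(self, version, mode):
        self.version = version
        self.mode = mode    # "active" or "standby"
        self.rules = []     # configuration installed by this PMD instance


class Nic:
    """Holds at most one active and one standby PMD instance."""

    def __init__(self):
        self.active = None
        self.standby = None

    def attach(self, pmd):
        if pmd.mode == "active":
            self.active = pmd
        else:
            self.standby = pmd  # rules are staged but do not affect traffic

    def graceful_exit(self):
        """Old PMD exits; a prepared standby takes over in the same step."""
        old = self.active
        self.active, self.standby = self.standby, None
        if self.active is not None:
            self.active.mode = "active"
        return old

    def handle(self, packet):
        """Return the version of the PMD that processed the packet, if any."""
        return None if self.active is None else self.active.version
```

Because the standby instance is fully configured before the exit, `handle` never observes a moment without an active PMD; without a standby, the same exit leaves a gap until the new PMD starts.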

A crucial aspect of the upgrade process is the use of group zero as a dispatcher for traffic processing rules. This mechanism ensures that all configurations are synchronized and become effective immediately, eliminating any downtime or disruption in traffic flow. By inserting and managing these rules efficiently, the system can transition from the old PMD to the new one seamlessly.

Finally, the process is designed to be highly scalable, allowing for adaptable resource usage to accommodate various deployment scales. It also emphasizes the importance of a user-friendly API, ensuring that users can access and utilize the upgrade features quickly and easily, thus enhancing the overall efficiency and effectiveness of the live upgrade process.

Monitoring 400G Traffic in DPDK Using FPGA-Based SmartNIC with RTE Flow – David Vodák, Cesnet

David Vodák from Cesnet presented their journey to enable 400G traffic monitoring using DPDK and FPGA-based SmartNICs, a project initiated due to the lack of suitable FPGA cards in the market. Cesnet, a national research and education network, designed the FPGA SmartNIC, which uses an Intel Agilex I-Series FPGA with 400G Ethernet support and PCIe Gen 4/5 compatibility. This card is engineered for high-speed processing, making it ideal for their needs.

The cornerstone of their solution is the NDK platform, an open-source framework that supports up to 400G throughput. NDK facilitates parallel processing, filtering, and metadata export, which are crucial for handling high-speed network traffic. It is designed to be highly adaptable, allowing users to create new components or use existing ones to build custom firmware for various applications.

NDK’s versatility extends beyond monitoring; it is also used for high-frequency trading and CL testing. One of the open source tools developed by Cesnet, ipfixprobe, supports DPDK and is used to build detailed flow records from input packets. This probe exemplifies the practical applications of NDK in real-world scenarios, demonstrating its robustness and flexibility.

To ensure the reliability of their solutions, Cesnet employs rigorous testing and verification methods. Functional testing is conducted using tools like testpmd or custom DPDK applications in loopback setups. For benchmarking, they use external traffic generators such as Spirent and FlowTest, with the latter capable of simulating realistic network traffic to provide more accurate testing results.

Looking ahead, Cesnet plans to expand the capabilities of the NDK platform to support various cards and use cases beyond traffic monitoring. Their commitment to open source development is evident, as they provide extensive resources on GitHub for the community to collaborate and innovate further. This open source approach not only fosters community involvement but also drives continuous improvement and adaptation of their technology.

Lessons Learnt from Reusing QDMA NIC to Base Band PMD – Vipin Varghese & Sivaprasad Tummala, AMD

AMD undertook a project to repurpose its QDMA NIC for Forward Error Correction (FEC) offloading in virtual RAN environments using an FPGA-based prototype. The goal was to support LDPC encode/decode functionalities without developing the BBDEV PMD from scratch. Instead, the team adapted existing QDMA NIC code, incorporating necessary modifications to create a functional BBDEV PMD. This approach allowed for rapid prototyping and integration within a short time frame.

Throughout the project, several challenges arose, including mismatched use cases, software latencies, and inadequate thread handling. To address these, the team implemented solutions such as using selective builds and applying compiler pragmas. These strategies helped optimize the RX/TX burst functionalities and reduce instruction cache misses, which in turn minimized overall latency.

Significant efforts were made to reduce latencies, which were initially high due to multiple factors including software and PMD-related overheads. By minimizing instruction cache misses and optimizing critical functions to fit within smaller memory pages, the team achieved notable latency reductions. Further improvements were realized by implementing lockless mechanisms using RTE rings, ensuring efficient enqueue/dequeue operations.

To meet the customer’s specific requirements for low latency and high throughput, the team had to go beyond simple test scenarios and adapt the software implementations to better match real-world use cases. This involved modifying the BBDEV PMD to handle multiple threads and ensuring proper mapping and distribution of LDPC encode and decode requests, which significantly improved performance and reliability.

The project highlighted several important lessons. It underscored the value of reusing existing PMDs when feasible, as well as the need to reduce code bloat and align PMD examples with actual customer use cases. The team recommended that the DPDK community focus more on low-latency and stress testing, and improve lockless implementations. These insights and improvements contributed to a more robust and efficient solution, ultimately enhancing the overall performance of the system.

Closing Remarks – Nathan Southern, Sr. Projects Coordinator, The Linux Foundation

The DPDK conference in Bangkok marked a significant milestone as the first APAC event since COVID-19. With a total of 63 participants, including 30 in person and 33 online, the event exceeded expectations. This conference, considered an experiment in a geopolitically neutral location, was deemed successful and has paved the way for potential future APAC events in various locations.

DPDK, a project now 14 years old, has shown remarkable growth and resilience, countering previous perceptions of being in its sunset phase. Since Nathan joined the Linux Foundation in April 2022, the project has maintained nearly perfect member retention and continued technological advancement. This longevity and sustained momentum underscore the project’s vitality and relevance in the tech community.

Strategic efforts in marketing and documentation have significantly enhanced the project’s visibility and usability. Under the direction of Ben Thomas, marketing initiatives have been robust, and tech writer Nandini Persad has overhauled the project’s documentation. These efforts aim to foster community growth and engagement, ensuring that DPDK remains a valuable resource for its users.

The DPDK project is evolving in critical areas such as security, cloud hyperscaling, and AI. This evolution is driven by community input and the guidance of the tech board, including leaders like Thomas Monjalon. Continuous community involvement is essential for future advancements, highlighting the importance of active participation from all stakeholders.

Nathan emphasized the importance of community engagement in driving DPDK’s development forward. He encouraged attendees to participate through Slack, tech board calls, and contributions to the OSS code. Additionally, the project is actively creating dynamic content, including end-user stories and developer spotlights, to promote mutual growth and expand the membership base. This focus on community and content creation is key to sustaining and growing the DPDK project.

Watch all the summit videos here.

Ben Thomas