Bug 1369 - net/mlx5: RX packet metadata altered/incorrect after reception
Status: UNCONFIRMED
Alias: None
Product: DPDK
Classification: Unclassified
Component: ethdev
Version: 22.11
Hardware: x86 Linux
Importance: Normal normal
Target Milestone: ---
Assignee: dev
URL:
Depends on:
Blocks:
 
Reported: 2024-01-25 11:44 CET by pt4hwmy
Modified: 2024-01-25 11:44 CET

Description pt4hwmy 2024-01-25 11:44:17 CET
Our platform operates with Mellanox ConnectX-5 (MT_0000000183) network adapters, equipped with the latest firmware (version 16.35.3006). It utilizes DPDK 22.11.3 and runs on Ubuntu 20.

Under heavy load, we occasionally receive packets whose mbuf metadata is incorrect on arrival or is altered at a later time. The affected fields appear to be "port" and "RSS".

For packet reception, we use: 
rte_eth_rx_burst(port ..)

To detect the problem, we loop over the received packets and check whether the "port" field in each mbuf matches the port on which the RX burst was executed.

Occasionally, the two do not match. Sometimes the port matches initially and then changes later during processing.
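The check described above can be sketched roughly as follows. This is only an illustration, not our production code: `struct mbuf_stub` and `count_port_mismatches` are hypothetical names, and the stub stands in for `struct rte_mbuf` so that the snippet compiles without DPDK headers.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for struct rte_mbuf, reduced to the one field this check reads.
 * Assumption: a real application would include <rte_mbuf.h> instead. */
struct mbuf_stub {
    uint16_t port; /* input port as recorded in the mbuf */
};

/* Return how many of the nb_rx packets carry a port other than rx_port,
 * logging each mismatch to stderr. */
static unsigned int
count_port_mismatches(struct mbuf_stub **pkts, uint16_t nb_rx, uint16_t rx_port)
{
    unsigned int mismatches = 0;

    for (uint16_t i = 0; i < nb_rx; i++) {
        if (pkts[i]->port != rx_port) {
            fprintf(stderr, "pkt %u: mbuf port %u, polled port %u\n",
                    (unsigned int)i, (unsigned int)pkts[i]->port,
                    (unsigned int)rx_port);
            mismatches++;
        }
    }
    return mismatches;
}
```

In the application, this would run on the array filled by rte_eth_rx_burst(), with rx_port set to the port ID passed to that call.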

Attempts to modify the packet metadata after reception, such as overwriting the port, do not stick: the field reverts to its previous value shortly afterwards.
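One way to catch such late changes is to snapshot the metadata right after reception and compare it again later in processing. The sketch below assumes only the two fields mentioned in this report. All type and function names are hypothetical, and the stub replaces `struct rte_mbuf` (where the RSS hash lives in `hash.rss`) so the snippet stands on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two struct rte_mbuf fields of interest.
 * Assumption: in a real mbuf these are m->port and m->hash.rss. */
struct rx_meta_stub {
    uint16_t port;
    uint32_t rss;
};

/* Copy of the metadata taken immediately after the RX burst. */
struct rx_meta_snapshot {
    uint16_t port;
    uint32_t rss;
};

/* Record the current port and RSS hash of a packet. */
static struct rx_meta_snapshot
rx_meta_capture(const struct rx_meta_stub *m)
{
    struct rx_meta_snapshot s = { .port = m->port, .rss = m->rss };
    return s;
}

/* True if the packet still carries the values recorded in the snapshot. */
static bool
rx_meta_unchanged(const struct rx_meta_snapshot *s,
                  const struct rx_meta_stub *m)
{
    return s->port == m->port && s->rss == m->rss;
}
```

Calling rx_meta_unchanged() at several points in the pipeline would narrow down when the fields change, which may help tell a late write by the driver apart from a problem in the application.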

The packet data for these anomalies appears to be correct. For instance, if RX is performed on port 1 and the packet mbuf indicates port 2, the IP addresses within the packet still fall within the IP ranges assigned to port 1.

Initially, we suspected a potential double free of packet buffers but found no evidence supporting this. Also, our implementation functions correctly with various other network adapters and drivers.

This implementation also worked with an older version of DPDK. We previously used DPDK 20.02 without issue, but the problem emerged after upgrading to DPDK 22.11. Upgrading further to DPDK 23.11 did not resolve the issue.

We have tried different Mellanox firmware versions and even switched to ConnectX-6 adapters, but the problem persisted. 

The only effective workaround so far has been to disable "hardware RX vector" (rx_vec_en=0), which seems to make the problem go away.
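For reference, rx_vec_en is an mlx5 device argument, so it is appended to the device's entry in the EAL allow list (the `-a` option). A minimal sketch of building that argument string is shown below. The helper name and the PCI address in the usage note are hypothetical and not taken from our setup.

```c
#include <stdio.h>

/* Write "<pci_addr>,rx_vec_en=0" into buf, for use as the value of the
 * EAL "-a" option. Returns 0 on success, -1 if buf is too small. */
static int
build_allow_arg(char *buf, size_t len, const char *pci_addr)
{
    int n = snprintf(buf, len, "%s,rx_vec_en=0", pci_addr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a placeholder address such as 0000:03:00.0, the resulting string is "0000:03:00.0,rx_vec_en=0", which would be placed after "-a" in the argv handed to rte_eal_init().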

Do you have any insights into what might be causing this problem, or suggestions on how to proceed in resolving it?
