Bug 1238

Summary: net/mlx5: RxQ failure with vectorized MPRQ
Product: DPDK
Reporter: Dmitry Kozlyuk (dmitry.kozliuk)
Component: ethdev
Assignee: dev
Status: UNCONFIRMED ---
Severity: normal
Priority: Normal
Version: 23.03
Target Milestone: ---
Hardware: x86
OS: Linux

Description Dmitry Kozlyuk 2023-05-26 02:31:02 CEST
CPU: Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz
NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
OFED: MLNX_OFED_LINUX-23.04-0.5.3.3
FW: 16.35.2000
Kernel: 5.10.0-21-amd64

Commit 633684e0 ("net/mlx5: fix error CQE dumping for vectorized Rx") presumably introduced a regression: Rx queues eventually stop, with "CQ error on CQN 0x57d, syndrome 0x1" in dmesg. Unfortunately, there is no reproducer with testpmd yet. The error occurs only under certain CPU load from the application, and after a different delay for each RxQ, which suggests a concurrency issue.
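
Since each RxQ fails after a different delay, it may help to pull the CQ number and syndrome out of every matching dmesg line so failures can be lined up per queue. The following is only a rough sketch: the function name parse_cq_errors is hypothetical, and the pattern assumes the message text is exactly as quoted above; whatever prefix the kernel prints before it is ignored.

```python
import re

# Matches the message text quoted in this report; anything the kernel
# prints before it (timestamp, device name) is ignored by using search().
CQ_ERROR_RE = re.compile(
    r"CQ error on CQN (0x[0-9a-fA-F]+), syndrome (0x[0-9a-fA-F]+)"
)


def parse_cq_errors(lines):
    """Return a list of (cqn, syndrome) integer pairs found in log lines.

    Hypothetical helper for triage, not part of DPDK or the mlx5 driver.
    """
    errors = []
    for line in lines:
        match = CQ_ERROR_RE.search(line)
        if match:
            cqn = int(match.group(1), 16)
            syndrome = int(match.group(2), 16)
            errors.append((cqn, syndrome))
    return errors


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 parse_cq_errors.py
    for cqn, syndrome in parse_cq_errors(sys.stdin):
        print(f"CQN {cqn:#x} syndrome {syndrome:#x}")
```

Feeding it the dmesg output after each run should give one line per failed CQ, which can then be compared against the time at which each RxQ stopped.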