[dpdk-dev] [Bug 746] net/mlx5: segfault on MPRQ pool exhaustion
bugzilla at dpdk.org
Wed Jun 23 14:44:54 CEST 2021
https://bugs.dpdk.org/show_bug.cgi?id=746
Bug ID: 746
Summary: net/mlx5: segfault on MPRQ pool exhaustion
Product: DPDK
Version: unspecified
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev at dpdk.org
Reporter: dmitry.kozliuk at gmail.com
Target Milestone: ---
When MPRQ is enabled and packets are delayed in the application, as happens
during IPv4 reassembly, a segfault occurs in the mlx5 PMD:
(gdb) bt
#0 0x000055b34ea047c6 in _mm256_storeu_si256 (__A=..., __P=0x80) at
/usr/lib/gcc/x86_64-linux-gnu/10/include/avxintrin.h:928
#1 rte_mov32 (src=0x2299c9140 "", dst=0x80 <error: Cannot access memory at
address 0x80>) at ../src/lib/librte_eal/x86/include/rte_memcpy.h:320
#2 rte_memcpy_aligned (n=60, src=0x2299c9140, dst=0x80) at
../src/lib/librte_eal/x86/include/rte_memcpy.h:847
#3 rte_memcpy (n=60, src=0x2299c9140, dst=0x80) at
../src/lib/librte_eal/x86/include/rte_memcpy.h:872
#4 mprq_buf_to_pkt (strd_cnt=1, strd_idx=0, buf=0x2299c8a00, len=60,
pkt=0x18345f0c0, rxq=0x18345ef40) at ../src/drivers/net/mlx5/mlx5_rxtx.h:820
#5 rxq_copy_mprq_mbuf_v (rxq=0x18345ef40, pkts=0x7f76e0ff6d18, pkts_n=5) at
../src/drivers/net/mlx5/mlx5_rxtx_vec.c:233
#6 0x000055b34ea0c543 in rxq_burst_mprq_v (rxq=0x18345ef40,
pkts=0x7f76e0ff6d18, pkts_n=46, err=0x7f76e0ff6a28, no_cq=0x7f76e0ff6a27) at
../src/drivers/net/mlx5/mlx5_rxtx_vec.c:456
#7 0x000055b34ea0c82e in mlx5_rx_burst_mprq_vec (dpdk_rxq=0x18345ef40,
pkts=0x7f76e0ff6a88, pkts_n=128) at ../src/drivers/net/mlx5/mlx5_rxtx_vec.c:528
#8 0x000055b34d642e83 in rte_eth_rx_burst (nb_pkts=128,
rx_pkts=0x7f76e0ff6a88, queue_id=<optimized out>, port_id=<optimized out>) at
/opt/dpdk/include/rte_ethdev.h:4889
(application stack follows)
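One possible reading of the faulting address, offered as an assumption rather than a confirmed diagnosis: dst=0x80 equals 128, which is the default RTE_PKTMBUF_HEADROOM, and rte_pktmbuf_mtod() computes buf_addr + data_off. The destination would therefore be 0x80 if the mbuf at pkt=0x18345f0c0 (itself non-NULL) had a NULL buf_addr and the default data_off. The sketch below only reproduces that arithmetic; struct sketch_mbuf, SKETCH_HEADROOM and the sketch_* functions are hypothetical stand-ins, not DPDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128 is the default value of RTE_PKTMBUF_HEADROOM. */
#define SKETCH_HEADROOM 128u

/* Hypothetical stand-in for the two struct rte_mbuf fields involved. */
struct sketch_mbuf {
    void *buf_addr;    /* start of the data buffer, NULL if none attached */
    uint16_t data_off; /* offset of packet data inside the buffer */
};

/*
 * Same arithmetic as rte_pktmbuf_mtod(): buf_addr + data_off.
 * Computed on integers so that a NULL buf_addr is not dereferenced.
 */
uintptr_t sketch_mtod_addr(const struct sketch_mbuf *m)
{
    assert(m != NULL);
    return (uintptr_t)m->buf_addr + m->data_off;
}

/* Address a copy would target if the mbuf had no buffer attached. */
uintptr_t sketch_dst_without_buffer(void)
{
    struct sketch_mbuf m = { .buf_addr = NULL, .data_off = SKETCH_HEADROOM };
    return sketch_mtod_addr(&m);
}
```

Under these assumptions sketch_dst_without_buffer() yields 0x80, the same value gdb shows for dst in frames #1 to #3. This does not explain why the buffer pointer would be NULL; it only narrows down where to look.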
No crash dumps are created by the mlx5 PMD.
Unfortunately, the crash is not reproducible with public tools such as testpmd,
but more information can be provided if needed.
HW: 2 x MT28800 Family [ConnectX-5 Ex] 1019 (2 x 2 x 100G)
FW: 16.30.1004
OFED: MLNX_OFED_LINUX-5.3-1.0.0.1
DPDK: 21.02
EAL options: --in-memory --no-telemetry -a 21:00.0,mprq_en=1,dv_flow_en=0 -a
21:00.1,mprq_en=1,dv_flow_en=0 -a c1:00.0,mprq_en=1,dv_flow_en=0 -a
c1:00.1,mprq_en=1,dv_flow_en=0
The outcome depends on a number of factors, as shown below.
CPU: AMD EPYC 7502 (32 cores)
Distro: Ubuntu 20.04.2 LTS
Kernel: 5.4.0-66-lowlatency
UCTX_EN=0:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - fail
mprq_en=1,dv_flow_en=1 - fail
UCTX_EN=1:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - fail
mprq_en=1,dv_flow_en=1 - ok
CPU: Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz (112 cores, HT)
Distro: Debian 10.8
Kernel: 4.19.0-13
UCTX_EN=0:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - fail
mprq_en=1,dv_flow_en=1 - fail
UCTX_EN=1:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - ok
mprq_en=1,dv_flow_en=1 - ok