[dpdk-dev] [Bug 746] net/mlx5: segfault on MPRQ pool exhaustion

bugzilla at dpdk.org bugzilla at dpdk.org
Wed Jun 23 14:44:54 CEST 2021


https://bugs.dpdk.org/show_bug.cgi?id=746

            Bug ID: 746
           Summary: net/mlx5: segfault on MPRQ pool exhaustion
           Product: DPDK
           Version: unspecified
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev at dpdk.org
          Reporter: dmitry.kozliuk at gmail.com
  Target Milestone: ---

When MPRQ is enabled and packets are held up in the application, as
happens when doing IPv4 reassembly, a segfault occurs in the mlx5 PMD:

(gdb) bt
#0  0x000055b34ea047c6 in _mm256_storeu_si256 (__A=..., __P=0x80) at
/usr/lib/gcc/x86_64-linux-gnu/10/include/avxintrin.h:928
#1  rte_mov32 (src=0x2299c9140 "", dst=0x80 <error: Cannot access memory at
address 0x80>) at ../src/lib/librte_eal/x86/include/rte_memcpy.h:320
#2  rte_memcpy_aligned (n=60, src=0x2299c9140, dst=0x80) at
../src/lib/librte_eal/x86/include/rte_memcpy.h:847
#3  rte_memcpy (n=60, src=0x2299c9140, dst=0x80) at
../src/lib/librte_eal/x86/include/rte_memcpy.h:872
#4  mprq_buf_to_pkt (strd_cnt=1, strd_idx=0, buf=0x2299c8a00, len=60,
pkt=0x18345f0c0, rxq=0x18345ef40) at ../src/drivers/net/mlx5/mlx5_rxtx.h:820
#5  rxq_copy_mprq_mbuf_v (rxq=0x18345ef40, pkts=0x7f76e0ff6d18, pkts_n=5) at
../src/drivers/net/mlx5/mlx5_rxtx_vec.c:233
#6  0x000055b34ea0c543 in rxq_burst_mprq_v (rxq=0x18345ef40,
pkts=0x7f76e0ff6d18, pkts_n=46, err=0x7f76e0ff6a28, no_cq=0x7f76e0ff6a27) at
../src/drivers/net/mlx5/mlx5_rxtx_vec.c:456
#7  0x000055b34ea0c82e in mlx5_rx_burst_mprq_vec (dpdk_rxq=0x18345ef40,
pkts=0x7f76e0ff6a88, pkts_n=128) at ../src/drivers/net/mlx5/mlx5_rxtx_vec.c:528
#8  0x000055b34d642e83 in rte_eth_rx_burst (nb_pkts=128,
rx_pkts=0x7f76e0ff6a88, queue_id=<optimized out>, port_id=<optimized out>) at
/opt/dpdk/include/rte_ethdev.h:4889
(application stack follows)
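
Note: dst=0x80 in frames #0-#4 equals the default RTE_PKTMBUF_HEADROOM
(128 bytes), which would be consistent with rte_memcpy() receiving a data
pointer computed from an mbuf whose buf_addr is NULL. A minimal
illustration of that arithmetic (my assumption for illustration only,
not the PMD code):

#include <stdio.h>
#include <string.h>
#include <rte_mbuf.h>

int main(void)
{
	struct rte_mbuf m;

	/* Assumed state: no data buffer attached, default headroom. */
	memset(&m, 0, sizeof(m));
	m.buf_addr = NULL;
	m.data_off = RTE_PKTMBUF_HEADROOM;	/* 128 == 0x80 by default */

	/* data pointer = buf_addr + data_off = (void *)0x80 */
	printf("dst = %p\n", rte_pktmbuf_mtod(&m, void *));
	return 0;
}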

No crash dumps are created by the mlx5 PMD.

Unfortunately, it is not reproducible with public tools such as testpmd, but
more info can be provided if need be.
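
For reference, here is a hedged sketch of the kind of Rx path that
delays packets (identifiers, table sizing and the forwarding step are
made up; this is not the actual application). With librte_ip_frag,
fragments are parked in the reassembly table until the whole datagram
arrives, so their mbufs (and the MPRQ buffers they reference) stay
allocated across many rte_eth_rx_burst() calls, which is what drains
the pool:

#include <rte_ethdev.h>
#include <rte_ether.h>
#include <rte_ip.h>
#include <rte_ip_frag.h>
#include <rte_cycles.h>
#include <rte_mbuf.h>

#define BURST 32

/* tbl is created elsewhere with rte_ip_frag_table_create(). */
static void
rx_reassembly_loop(uint16_t port, uint16_t queue, struct rte_ip_frag_tbl *tbl)
{
	struct rte_ip_frag_death_row dr = { .cnt = 0 };
	struct rte_mbuf *pkts[BURST];

	for (;;) {
		uint16_t nb = rte_eth_rx_burst(port, queue, pkts, BURST);
		uint64_t tms = rte_rdtsc();

		for (uint16_t i = 0; i < nb; i++) {
			struct rte_mbuf *m = pkts[i];
			struct rte_ipv4_hdr *ip = rte_pktmbuf_mtod_offset(m,
			    struct rte_ipv4_hdr *,
			    sizeof(struct rte_ether_hdr));

			if (rte_ipv4_frag_pkt_is_fragmented(ip)) {
				m->l2_len = sizeof(struct rte_ether_hdr);
				m->l3_len = (ip->version_ihl &
				    RTE_IPV4_HDR_IHL_MASK) *
				    RTE_IPV4_IHL_MULTIPLIER;
				/* Returns NULL until all fragments arrive;
				 * until then the mbuf is held by the table. */
				m = rte_ipv4_frag_reassemble_packet(tbl, &dr,
				    m, tms, ip);
				if (m == NULL)
					continue;
			}
			/* ... process/forward the reassembled packet ... */
			rte_pktmbuf_free(m);
		}
		rte_ip_frag_free_death_row(&dr, 3);
	}
}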

HW: 2 x MT28800 Family [ConnectX-5 Ex] 1019 (2 x 2 x 100G)
FW: 16.30.1004
OFED: MLNX_OFED_LINUX-5.3-1.0.0.1
DPDK: 21.02
EAL options: --in-memory --no-telemetry -a 21:00.0,mprq_en=1,dv_flow_en=0 -a
21:00.1,mprq_en=1,dv_flow_en=0 -a c1:00.0,mprq_en=1,dv_flow_en=0 -a
c1:00.1,mprq_en=1,dv_flow_en=0

Conditions depend on a number of factors, as shown below.

CPU: AMD EPYC 7502 (32 cores)
Distro: Ubuntu 20.04.2 LTS
Kernel: 5.4.0-66-lowlatency

UCTX_EN=0:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - fail
mprq_en=1,dv_flow_en=1 - fail

UCTX_EN=1:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - fail
mprq_en=1,dv_flow_en=1 - ok


CPU: Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz (112 cores, HT)
Distro: Debian 10.8
Kernel: 4.19.0-13

UCTX_EN=0:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - fail
mprq_en=1,dv_flow_en=1 - fail

UCTX_EN=1:
mprq_en=0,dv_flow_en=0 - ok
mprq_en=0,dv_flow_en=1 - ok
mprq_en=1,dv_flow_en=0 - ok
mprq_en=1,dv_flow_en=1 - ok
