[dpdk-dev] [dpdk-users] A question about Mellanox ConnectX-5 and ConnectX-4 Lx nic can't send packets?

wangyunjian wangyunjian at huawei.com
Tue Jan 11 13:29:10 CET 2022


> -----Original Message-----
> From: Dmitry Kozlyuk [mailto:dkozlyuk at nvidia.com]
> Sent: Tuesday, January 11, 2022 7:42 PM
> To: wangyunjian <wangyunjian at huawei.com>; dev at dpdk.org; users at dpdk.org;
> Matan Azrad <matan at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> Cc: Huangshaozhang <huangshaozhang at huawei.com>; dingxiaoxiong
> <dingxiaoxiong at huawei.com>
> Subject: RE: [dpdk-dev] [dpdk-users] A question about Mellanox ConnectX-5 and
> ConnectX-4 Lx nic can't send packets?
> 
> > From: wangyunjian <wangyunjian at huawei.com>
> [...]
> > > From: Dmitry Kozlyuk [mailto:dkozlyuk at nvidia.com]
> [...]
> > > Thanks for attaching all the details.
> > > Can you please reproduce it with --log-level=pmd.common.mlx5:debug
> > > and send the logs?
> > >
> > > > For example, if the environment is configured with 10GB hugepages
> > > > but each hugepage is physically discontinuous, this problem can be
> > > > reproduced.
> >
> > # ./x86_64-native-linuxapp-gcc/app/dpdk-testpmd -c 0xFC0 --iova-mode pa --legacy-mem -a af:00.0 -a af:00.1 --log-level=pmd.common.mlx5:debug -m 0,8192 -- -a -i --forward-mode=fwd --rxq=2 --txq=2 --total-num-mbufs=1000000
> [...]
> > mlx5_common: Collecting chunks of regular mempool mb_pool_0
> > mlx5_common: Created a new MR 0x92827 in PD 0x4864ab0 for address range [0x75cb6c000, 0x780000000] (592003072 bytes) for mempool mb_pool_0
> > mlx5_common: Created a new MR 0x93528 in PD 0x4864ab0 for address range [0x7dcb6c000, 0x800000000] (592003072 bytes) for mempool mb_pool_0
> > mlx5_common: Created a new MR 0x94529 in PD 0x4864ab0 for address range [0x85cb6c000, 0x880000000] (592003072 bytes) for mempool mb_pool_0
> > mlx5_common: Created a new MR 0x9562a in PD 0x4864ab0 for address range [0x8d6cca000, 0x8fa15e000] (592003072 bytes) for mempool mb_pool_0
> 
> Thanks for the logs, IIUC they are from a successful run.
> I have reproduced an equivalent hugepage layout and mempool spread
> between hugepages, but I don't see the error behavior in several tries.
> What are the logs in case of error?

The mlx5_tx_error_cqe_handle function prints the following log (and creates some dump files):
Unexpected CQE error syndrome 0x04 CQN = 32 SQN = 5570 wqe_counter = 0 wq_ci = 1 cq_ci = 0
MLX5 Error CQ: at [0x17dcb64000], len=2048

Unexpected CQE error syndrome 0x04 CQN = 32 SQN = 5570 wqe_counter = 0 wq_ci = 1 cq_ci = 1
MLX5 Error CQ: at [0x17dcb64000], len=2048
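
If I read the mlx5 CQE error syndromes correctly (the MLX5_CQE_SYNDROME_* values in the Linux kernel's include/linux/mlx5/device.h), syndrome 0x04 is a local protection error, i.e. the posted buffer address is not covered by a valid MR. A small stand-alone sketch that maps the syndrome from the log above to a name (the helper below is only for illustration, it is not part of DPDK):

/* Map an mlx5 CQE error syndrome to a human-readable name.
 * Values taken from the Linux kernel's include/linux/mlx5/device.h. */
#include <stdio.h>
#include <stdint.h>

static const char *
cqe_syndrome_name(uint8_t syndrome)
{
	switch (syndrome) {
	case 0x01: return "local length error";
	case 0x02: return "local QP operation error";
	case 0x04: return "local protection error"; /* address not covered by an MR */
	case 0x05: return "WR flushed error";
	default:   return "other/unknown";
	}
}

int
main(void)
{
	/* Syndrome reported by mlx5_tx_error_cqe_handle in the log above. */
	printf("syndrome 0x04 = %s\n", cqe_syndrome_name(0x04));
	return 0;
}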


> Please note that the offending commit you found (fec28ca0e3a9) indeed
> introduced a few issues, but they were fixed later, so I'm testing with 21.11, not
> that commit.
> Unfortunately, none of those issues resembled yours.

I am also testing with 21.11. The '--iova-mode pa --legacy-mem' parameters must be used when starting testpmd.
This patch can be merged:
https://patchwork.dpdk.org/project/dpdk/patch/da0dc3b3ba2695d1ff1798fc6c921da6079f00d3.1640585898.git.wangyunjian@huawei.com/
When 'socket_stats.greatest_free_size' is 1 GB, this issue may be reproduced.
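
For context, 'socket_stats.greatest_free_size' presumably refers to the field filled in by rte_malloc_get_socket_stats(). A minimal sketch of how it can be checked from an application after EAL initialization (socket id 0 is only an example, pick the socket the mempool is allocated on):

#include <stdio.h>
#include <rte_eal.h>
#include <rte_malloc.h>

int
main(int argc, char **argv)
{
	struct rte_malloc_socket_stats socket_stats;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	/* Largest contiguous free element in the heap of socket 0;
	 * the issue above was seen when this is 1 GB. */
	if (rte_malloc_get_socket_stats(0, &socket_stats) == 0)
		printf("greatest_free_size on socket 0: %zu bytes\n",
		       socket_stats.greatest_free_size);

	rte_eal_cleanup();
	return 0;
}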

