Bug 580 - [dpdk-20.11.rc-4] testpmd: Ubuntu 16.04: Failed to start port as mlx5_pci can't allocate hash list.
Status: IN_PROGRESS
Alias: None
Product: DPDK
Classification: Unclassified
Component: testpmd
Version: 20.11
Hardware: x86 Linux
Importance: Normal normal
Target Milestone: 20.11
Assignee: Asaf Penso
URL:
Depends on:
Blocks:
 
Reported: 2020-11-19 18:45 CET by Abhishek
Modified: 2022-08-28 22:11 CEST
CC List: 6 users



Attachments
attachment-11295-0.html (3.12 KB, text/html)
2020-11-25 19:26 CET, Asaf Penso

Description Abhishek 2020-11-19 18:45:04 CET
Hi,


We are seeing issues with the mlx5 driver on Ubuntu 16.04 and Ubuntu 18.04 on the Azure platform.
On Ubuntu 16.04 (4.15.0-1098-azure), testpmd fails to start the port, and mlx5_pci reports: Can't allocate hash list mlx5_1_flow_table entry.

testpmd output given below:
EAL: Probing VFIO support...
EAL: Probe PCI driver: mlx5_pci (15b3:1016) device: 40db:00:02.0 (socket 0)
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
mlx5_pci: Default miss action is not supported.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
EAL: No legacy callbacks, legacy socket not created
Set io packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=180224, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 1 (socket 0)
mlx5_pci: Failed to init cache list NIC_ingress_0_matcher_cache entry (nil).
mlx5_pci: port 0 failed to set defaults flows
Fail to start port 1
Please stop the ports first
Done

On Ubuntu 18.04 (5.4.0-1031-azure), we get similar errors, but the port is not blocked or stopped.

testpmd output given below:
--------------------------mlx5-----------------------------------------
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: Probe PCI driver: mlx5_pci (15b3:1016) device: 0002:00:02.0 (socket 0)
mlx5_pci: Retrying to allocate Rx DevX UAR
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
mlx5_pci: Default miss action is not supported.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
tap_nl_dump_ext_ack(): Cannot delete qdisc with handle of zero
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid
EAL: No legacy callbacks, legacy socket not created
Set txonly packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=163840, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 1 (socket 0)
Port 1: 00:0D:3A:C4:E8:BB
Checking link statuses...
Done
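
Both logs above repeat the same "Can't allocate hash list" message, but only the Ubuntu 16.04 run ends with "Fail to start port". When comparing several such runs, it can help to reduce each log to message counts plus the list of ports that failed to start. The following is a minimal sketch, assuming the testpmd output has been saved as text; `summarize_testpmd_log` is a hypothetical helper (not part of DPDK or testpmd), and the sample lines are copied from the Ubuntu 16.04 log above.

```python
import re
from collections import Counter

MLX5_PREFIX = "mlx5_pci: "
# Matches the testpmd line printed when a port cannot be started.
FAIL_RE = re.compile(r"Fail to start port (\d+)")


def summarize_testpmd_log(text):
    """Hypothetical helper: count mlx5_pci messages and list failed ports."""
    counts = Counter()
    failed_ports = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith(MLX5_PREFIX):
            # Key the counter on the message text without the driver prefix.
            counts[line[len(MLX5_PREFIX):]] += 1
        match = FAIL_RE.match(line)
        if match:
            failed_ports.append(int(match.group(1)))
    return counts, failed_ports


# Sample lines taken from the Ubuntu 16.04 output in the description.
sample = """\
mlx5_pci: Default miss action is not supported.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
mlx5_pci: Can't allocate hash list mlx5_1_flow_table entry.
Configuring Port 1 (socket 0)
mlx5_pci: port 0 failed to set defaults flows
Fail to start port 1
"""

counts, failed_ports = summarize_testpmd_log(sample)
for message, n in counts.most_common():
    print(f"{n:3d} x {message}")
print("ports that failed to start:", failed_ports)
```

Running the same function over the Ubuntu 18.04 log would show the same repeated message but an empty list of failed ports, which is the difference described above.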
Comment 1 Asaf Penso 2020-11-19 18:57:42 CET
Hello Abhishek,
Can you try the latest code? We have had several fixes recently.
Also, if that doesn't solve the issue, can you post the testpmd command-line parameters?
Thanks
Comment 2 Abhishek 2020-11-19 19:07:35 CET
Hi Asaf,

I ran 20.11-rc4, which should be the latest code.
Below is the testpmd command I used:

dpdk-testpmd -l 0-1 -w 0002:00:02.0 --vdev='net_vdev_netvsc0,iface=eth1' -- --port-topology=chained --nb-cores 1 --txq 1 --rxq 1 --mbcache=512 --txd=4096 --rxd=4096 --forward-mode=txonly --stats-period 1 --tx-offloads=0x800e --tx-ip=10.0.1.5,10.0.1.4
Comment 3 Abhishek 2020-11-25 17:52:51 CET
Hi,

We can still reproduce this issue even with 20.11-rc5 on Ubuntu 16.04 (4.15.0-1098-azure) and mlx5.
Is there any update on this issue?

Thanks
Comment 4 Asaf Penso 2020-11-25 19:26:06 CET
Created attachment 136
attachment-11295-0.html

Yes, a fix was integrated after rc5.
Can you take the latest code and try again?
If it still fails, please post the testpmd command again together with the full testpmd output.

Regards,
Asaf Penso

Comment 5 Abhishek 2020-11-26 18:28:28 CET
Hi Asaf,

The error messages are gone now, but the port still fails to start.

Testpmd Command:
dpdk-testpmd -l 0-2 -w 4dc4:00:02.0 --vdev='net_vdev_netvsc0,iface=eth1' -- --port-topology=chained --nb-cores 2 --txq 2 --rxq 2 --mbcache=512 --txd=4096 --rxd=4096 --forward-mode=txonly --stats-period 1 --tx-offloads=0x800e --tx-ip=10.0.1.4,10.0.1.5

Output:

EAL: Probing VFIO support...
EAL: Probe PCI driver: mlx5_pci (15b3:1016) device: 4dc4:00:02.0 (socket 0)
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
mlx5_pci: Default miss action is not supported.
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
EAL: No legacy callbacks, legacy socket not created
Set txonly packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=180224, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 1 (socket 0)
mlx5_pci: Failed to init cache list NIC_ingress_0_matcher_cache entry (nil).
mlx5_pci: port 0 failed to set defaults flows
Fail to start port 1
Please stop the ports first
Done
No commandline core given, start packet forwarding
Not all ports were started
Port statistics ====================================
  ######################## NIC statistics for port 1  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

......


Port statistics ====================================
  ######################## NIC statistics for port 1  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

Signal 15 received, preparing to exit...

Stopping port 1...
Stopping ports...
Done

Shutting down port 1...
Comment 6 Abhishek 2020-12-02 17:43:14 CET
Hi,

Are there any updates related to this issue?

Regards,
Abhishek
Comment 7 Asaf Penso 2020-12-03 10:27:46 CET
Hello Abhishek,
We could not reproduce it locally. To provide you with better and tighter support, please send an email to Mellanox Support Admin:
supportadmin at mellanox.com.

Regards,
Asaf Penso

Comment 8 Suanming Mou 2020-12-31 14:39:27 CET
Hi Abhishek,

When we try to compile DPDK on Ubuntu 16.04, the build fails because /usr/include/infiniband/mlx5dv.h is missing. I guess the stock Ubuntu 16.04 ibverbs-dev package does not provide that header file. Can you please also share where you got that header file?

Thanks,
Suanming Mou
Comment 9 Abhishek 2021-01-13 20:16:19 CET
Hi Suanming,

Please refer to the doc [https://docs.microsoft.com/en-us/azure/virtual-network/setup-dpdk] for the complete setup of DPDK on Linux VMs.

Thanks,
Abhishek
Comment 10 Asaf Penso 2021-01-28 13:23:50 CET
Hi Abhishek,
It looks like a mismatch between rdma-core and the kernel.
Can you please let us know the exact rdma-core version you use?
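
One possible way to gather the requested information is sketched below. It assumes a Debian/Ubuntu system where the RDMA userspace libraries were installed through dpkg; the package names in `PACKAGES` are an assumption and may not match the reporter's setup (for example, a vendor driver bundle or a source build would not appear here). `package_version` and `collect_versions` are hypothetical helper names.

```python
import platform
import subprocess

# Assumed Debian/Ubuntu package names; adjust to the actual installation.
PACKAGES = ("rdma-core", "libibverbs1", "libibverbs-dev", "ibverbs-providers")


def package_version(name):
    """Return the version dpkg-query reports for a package, or None."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Version}", name],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        # dpkg-query is not available (not a dpkg-based system).
        return None
    if result.returncode != 0:
        # Package is unknown to dpkg.
        return None
    # An empty version string means the package is known but not installed.
    return result.stdout.strip() or None


def collect_versions(packages=PACKAGES):
    """Collect the running kernel release and the listed package versions."""
    report = {"kernel": platform.release()}
    for name in packages:
        report[name] = package_version(name)
    return report


for key, value in collect_versions().items():
    print(f"{key}: {value or 'not found via dpkg'}")
```

Posting this output together with the testpmd log would let the kernel release and the userspace library versions be compared side by side.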
Comment 11 Asaf Penso 2022-03-13 12:14:16 CET
Hello Abhishek, 
Can you please comment on the above question?
Comment 12 Asaf Penso 2022-08-28 22:11:23 CEST
Hello Abhishek, 
Can you please comment on the above question?
