Bug 1415 - Calling rte_eth_bond_8023ad_dedicated_queues_enable() leads to exhaustion of the LACP packet pool
Status: UNCONFIRMED
Alias: None
Product: DPDK
Classification: Unclassified
Component: ethdev
Version: 23.11
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assignee: dev
Reported: 2024-04-11 18:50 CEST by Michel Machado
Modified: 2024-04-11 23:59 CEST

Description Michel Machado 2024-04-11 18:50:28 CEST
When dedicated queues are enabled on a bond interface by calling rte_eth_bond_8023ad_dedicated_queues_enable(), DPDK eventually starts repeatedly logging "tx_machine(580) - Failed to allocate LACP packet from pool". According to the code of tx_machine(), this log entry means that the mbuf pool created to send LACP packets (i.e. port->mbuf_pool) is exhausted.

The problem occurs about 10 minutes after my application (i.e. https://github.com/AltraMayor/gatekeeper ) starts, and I can reproduce it with one or two members in the bond interface. The problem does not occur if I remove the call to rte_eth_bond_8023ad_dedicated_queues_enable().
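
The delay before the first failure is what one would expect if buffers slowly stop returning to a fixed-size pool, although this report does not establish why the pool drains. The following is a minimal sketch of that failure mode in plain C, not DPDK code: struct toy_pool, toy_pool_get(), toy_pool_put() and toy_first_failure() are hypothetical names, and the leak rate is an assumed parameter rather than a measured one.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a fixed-size buffer pool. This is NOT DPDK code. */
struct toy_pool {
	uint32_t capacity; /* total number of buffers in the pool */
	uint32_t in_use;   /* buffers currently allocated */
};

/* Take one buffer; returns 0 on success, -1 when the pool is exhausted. */
static int toy_pool_get(struct toy_pool *p)
{
	if (p->in_use == p->capacity)
		return -1;
	p->in_use++;
	return 0;
}

/* Return one buffer to the pool. */
static void toy_pool_put(struct toy_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}

/*
 * Simulate `periods` transmit periods. Each period allocates one buffer.
 * Every `leak_every`-th buffer is never returned (0 means nothing leaks);
 * all other buffers are returned right away.
 * Returns the 1-based period of the first failed allocation, or 0 if
 * every allocation succeeded.
 */
static uint64_t toy_first_failure(uint32_t capacity, uint64_t periods,
				  uint64_t leak_every)
{
	struct toy_pool pool = { .capacity = capacity, .in_use = 0 };

	for (uint64_t t = 1; t <= periods; t++) {
		if (toy_pool_get(&pool) != 0)
			return t;
		if (leak_every == 0 || t % leak_every != 0)
			toy_pool_put(&pool);
	}
	return 0;
}
```

In this model, once the pool is full of leaked buffers every later allocation fails as well, which matches a log entry that repeats nonstop after a quiet start. A slower assumed leak only moves the first failure later; it does not prevent it.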
Comment 1 Michel Machado 2024-04-11 23:53:08 CEST
I was able to reproduce the problem with dpdk-testpmd as well. I started dpdk-testpmd as follows:

sudo dpdk-testpmd -- -i

I executed the following commands on the prompt to set things up:

create bonding device 4 0
add bonding member 1 3
add bonding member 2 3
set bonding lacp dedicated_queues 3 enable
port start 3
set fwd flowgen
set portlist 3

Then, I verified that everything was set up correctly with the commands below:

show port info all
show bonding config 3
show config fwd
show port stats all

The physical ports corresponding to ports 1 and 2 were connected to LACP ports on another machine, which allowed me to check the status of the LACP on this other machine.

Finally, I started the forward process:

start

It took almost an hour, but the log entry "tx_machine(580) - Failed to allocate LACP packet from pool" eventually started appearing nonstop.
Comment 2 Michel Machado 2024-04-11 23:59:27 CEST
Someone else trying to reproduce this problem will likely get "tx_machine(579)" instead of "tx_machine(580)" because I've added the line "#define RTE_LIBRTE_BOND_DEBUG_8023AD 1" at the beginning of the file rte_eth_bond_8023ad.c to have more information in the log.
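
The number in parentheses is the source line of the log call, which is why one extra line near the top of the file shifts it by one. A minimal sketch of that behavior follows. TOY_LOG and toy_tx_machine() are hypothetical stand-ins, and building the prefix from __func__ and __LINE__ is an assumption inferred from the format of the log entry, not a copy of the driver's macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the driver's log macro: it prefixes the
 * message with the calling function's name and the line of the call.
 */
#define TOY_LOG(buf, len, msg) \
	snprintf((buf), (len), "%s(%d) - %s", __func__, __LINE__, (msg))

/* Writes the log line into buf and returns the line number of the call. */
static int toy_tx_machine(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	int line = __LINE__ + 1; /* the TOY_LOG call sits on the next line */
	TOY_LOG(buf, len, "Failed to allocate LACP packet from pool");
	return line;
}
```

Inserting any line above toy_tx_machine() in the same file, such as a #define, raises both the returned line number and the number printed in the message by one.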
