[dpdk-dev] [PATCH 1/3] app/testpmd: fix port status of active slave device
lihuisong (C)
lihuisong at huawei.com
Tue Feb 8 02:19:45 CET 2022
在 2022/2/4 20:07, Ferruh Yigit 写道:
> On 10/25/2021 7:39 AM, Min Hu (Connor) wrote:
>> From: Huisong Li <lihuisong at huawei.com>
>>
>> Stopping a bond device also stops all active slaves under the bond
>> device.
>> If this port is bond device, we need to modify the port status of all
>> slaves from RTE_PORT_STARTED to RTE_PORT_STOPPED.
>>
>> Fixes: 0e545d3047fe ("app/testpmd: check stopping port is not in
>> bonding")
>> Cc: stable at dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong at huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29 at huawei.com>
>> ---
>> app/test-pmd/cmdline.c | 1 +
>> app/test-pmd/testpmd.c | 49 +++++++++++++++++++++++++++++++++++++++---
>> app/test-pmd/testpmd.h | 3 ++-
>> 3 files changed, 49 insertions(+), 4 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>> index 722f4fb9d9..5bfb4b509b 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -6639,6 +6639,7 @@ static void
>> cmd_create_bonded_device_parsed(void *parsed_result,
>> "Failed to enable promiscuous mode for port %u: %s
>> - ignore\n",
>> port_id, rte_strerror(-ret));
>> + ports[port_id].bond_flag = 1;
>> ports[port_id].need_setup = 0;
>> ports[port_id].port_status = RTE_PORT_STOPPED;
>> }
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index af0e79fe6d..d6b9ebc4dd 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -65,6 +65,9 @@
>> #ifdef RTE_EXEC_ENV_WINDOWS
>> #include <process.h>
>> #endif
>> +#ifdef RTE_NET_BOND
>> +#include <rte_eth_bond.h>
>> +#endif
>> #include "testpmd.h"
>> @@ -2986,6 +2989,35 @@ start_port(portid_t pid)
>> return 0;
>> }
>> +#ifdef RTE_NET_BOND
>> +static void
>> +change_bonding_active_slave_port_status(portid_t bond_pid)
>
> The function sets the status explicitly to PORT_STOPPED, but function
> name is more generic, should we update the function name to reflect the
> functionality?
ok
>
>> +{
>> + portid_t slave_pids[RTE_MAX_ETHPORTS];
>> + struct rte_port *port;
>> + int num_active_slaves;
>> + portid_t slave_pid;
>> + int i;
>> +
>> + num_active_slaves = rte_eth_bond_active_slaves_get(bond_pid,
>> slave_pids,
>> + RTE_MAX_ETHPORTS);
>> + if (num_active_slaves < 0) {
>> + fprintf(stderr, "Failed to get slave list for port = %u\n",
>> + bond_pid);
>> + return;
>> + }
>> +
>> + for (i = 0; i < num_active_slaves; i++) {
>> + slave_pid = slave_pids[i];
>> + port = &ports[slave_pid];
>> + if (rte_atomic16_cmpset(&(port->port_status),
>> + RTE_PORT_STARTED, RTE_PORT_STOPPED) == 0)
>> + fprintf(stderr, "Port %u can not be set into stopped\n",
>> + slave_pid);
>> + }
>> +}
>> +#endif
>> +
>> void
>> stop_port(portid_t pid)
>> {
>> @@ -3042,9 +3074,20 @@ stop_port(portid_t pid)
>> if (port->flow_list)
>> port_flow_flush(pi);
>> - if (eth_dev_stop_mp(pi) != 0)
>> - RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
>> - pi);
>
> Can you please remove the 'eth_dev_stop_mp()' function in this patch,
> which is removed in patch 2/3.
ok
>
>> + if (is_proc_primary()) {
>> +#ifdef RTE_NET_BOND
>> + /*
>> + * Stopping a bond device also stops all active slaves
>> + * under the bond device. If this port is bond device,
>> + * we need to modify the port status of all slaves.
>> + */
>> + if (port->bond_flag == 1)
>> + change_bonding_active_slave_port_status(pi);
>> +#endif
>> + if (rte_eth_dev_stop(pi) != 0)
>> + RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port
>> %u\n",
>> + pi);
>
> Should we roll back the slave port status if 'rte_eth_dev_stop(pi)'
> fails?
Yes, it is necessary here for slaves to fail to execute dev_stop() in
bonding driver.
Btw, in thinking about this, I find a behavior that is not very reasonable.
Namely, only active slaves are stopped when a bonding device is stopped.
It can cause confusion in port status. For example, applications have to
only modify
active slaves status to RTE_PORT_STOPPED and non-active slaves status is
still
RTE_PORT_STARTED.
I think the bonding PMD should stop all slaves when a bonding device is
stopped.
I checked the modification history about this in the bonding PMD. This
behavior is
introduced by the following patch.
/*
commit 0911d4ec01839c9149a0df5758d00d9d57a47cea
Author: Radu Nicolau <radu.nicolau at intel.com>
Date: Thu Nov 8 15:26:42 2018 +0000
net/bonding: fix crash when stopping mode 4 port
When stopping a bonded port all slaves are deactivated. Attempting
to deactivate a slave that was never activated will result in a
segfault
when mode 4 is used.
Fixes: 7486331308f6 ("net/bonding: stop and deactivate slaves on stop")
Cc: stable at dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau at intel.com>
Acked-by: Chas Williams <chas3 at att.com>
*/
The root cause of the problem the above patch mentioned is that in mode 4,
the bonding PMD does not allocate rx/tx rings to non-active slave devices.
The call stack is as follows:
#0 0x0000000000b1250c in rte_ring_dequeue_bulk_elem (available=0x0,
n=1, esize=8, obj_table=0xffffffff7c80, r=0x0) at
../dpdk-next-net/lib/ring/rte_ring_elem.h:380
#1 rte_ring_dequeue_elem (esize=8, obj_p=0xffffffff7c80, r=0x0) at
../dpdk-next-net/lib/ring/rte_ring_elem.h:476
#2 rte_ring_dequeue (obj_p=0xffffffff7c80, r=0x0) at
../dpdk-next-net/lib/ring/rte_ring.h:463
#3 bond_mode_8023ad_deactivate_slave (bond_dev=0x4753200
<rte_eth_devices+33024>, slave_id=0) at
../dpdk-next-net/drivers/net/bonding/rte_eth_bond_8023ad.c:1163
#4 0x0000000000b29e10 in deactivate_slave (eth_dev=0x4753200
<rte_eth_devices+33024>, port_id=0) at
../dpdk-next-net/drivers/net/bonding/rte_eth_bond_api.c:117
#5 0x0000000000b44208 in bond_ethdev_stop (eth_dev=0x4753200
<rte_eth_devices+33024>) at
../dpdk-next-net/drivers/net/bonding/rte_eth_bond_pmd.c:2103
#6 0x00000000007966fc in rte_eth_dev_stop (port_id=2) at
../dpdk-next-net/lib/ethdev/rte_ethdev.c:1894
#7 0x000000000055ea60 in eth_dev_stop_mp (port_id=2) at
../dpdk-next-net/app/test-pmd/testpmd.c:613
#8 0x0000000000565230 in stop_port (pid=2) at
../dpdk-next-net/app/test-pmd/testpmd.c:3059
#9 0x00000000004f7614 in cmd_operate_specific_port_parsed
(parsed_result=0xffffffff91b0, cl=0x4829250, data=0x0) at
../dpdk-next-net/app/test-pmd/cmdline.c:1261
#10 0x000000000078be24 in cmdline_parse (cl=0x4829250, buf=0x4829298
"port stop 2\n") at ../dpdk-next-net/lib/cmdline/cmdline_parse.c:290
#11 0x0000000000789c34 in cmdline_valid_buffer (rdl=0x4829260,
buf=0x4829298 "port stop 2\n", size=13) at
../dpdk-next-net/lib/cmdline/cmdline.c:26
#12 0x000000000078f160 in rdline_char_in (rdl=0x4829260, c=10 '\n') at
../dpdk-next-net/lib/cmdline/cmdline_rdline.c:446
#13 0x000000000078a0c8 in cmdline_in (cl=0x4829250, buf=0xfffffffff2e7
"\n", size=1) at ../dpdk-next-net/lib/cmdline/cmdline.c:148
#14 0x000000000078a3b4 in cmdline_interact (cl=0x4829250) at
../dpdk-next-net/lib/cmdline/cmdline.c:222
#15 0x000000000050bf98 in prompt () at
../dpdk-next-net/app/test-pmd/cmdline.c:18001
#16 0x00000000005687c4 in main (argc=4, argv=0xfffffffff510) at
../dpdk-next-net/app/test-pmd/testpmd.c:4268
For the problem Radu encountered, we only need to ensure that
non-active slaves doesn't deactivate.
I plan to add a patch in this patchset to fix this problem.
What do you think, Ferruh?
>
>> + }
>> if (rte_atomic16_cmpset(&(port->port_status),
>> RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index e3995d24ab..ad3b4f875c 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -237,7 +237,8 @@ struct rte_port {
>> struct rte_eth_txconf tx_conf[RTE_MAX_QUEUES_PER_PORT+1]; /**<
>> per queue tx configuration */
>> struct rte_ether_addr *mc_addr_pool; /**< pool of multicast
>> addrs */
>> uint32_t mc_addr_nb; /**< nb. of addr. in
>> mc_addr_pool */
>> - uint8_t slave_flag; /**< bonding slave port */
>> + uint8_t slave_flag : 1, /**< bonding slave port */
>> + bond_flag : 1; /**< port is bond device */
>
> Can't we detect if the port is a bonding port without introducing a new
> variable/state?
The bonding device is also an ethdev. I do not find the external API that
can be used to detect whether a port is a bonding port.
>
>> struct port_flow *flow_list; /**< Associated flows. */
>> struct port_indirect_action *actions_list;
>> /**< Associated indirect actions. */
>
> .
More information about the dev
mailing list