[dpdk-stable] [dpdk-dev] [PATCH] net/failsafe: fix fd leak

Gaëtan Rivet grive at u256.net
Tue May 5 20:35:27 CEST 2020


On 05/05/20 09:14 +0000, Ali Alnubani wrote:
> > -----Original Message-----
> > From: Gaëtan Rivet <grive at u256.net>
> > Sent: Monday, May 4, 2020 7:22 PM
> > To: Ali Alnubani <alialnu at mellanox.com>
> > Cc: Ferruh Yigit <ferruh.yigit at intel.com>; wangyunjian
> > <wangyunjian at huawei.com>; dev at dpdk.org; jerry.lilijun at huawei.com;
> > xudingke at huawei.com; stable at dpdk.org; Raslan Darawsheh
> > <rasland at mellanox.com>
> > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> > 
> > On 03/05/20 11:33 +0000, Ali Alnubani wrote:
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces at dpdk.org> On Behalf Of Ferruh Yigit
> > > > Sent: Monday, April 27, 2020 7:56 PM
> > > > To: Gaëtan Rivet <grive at u256.net>; wangyunjian
> > > > <wangyunjian at huawei.com>
> > > > Cc: dev at dpdk.org; jerry.lilijun at huawei.com; xudingke at huawei.com;
> > > > stable at dpdk.org
> > > > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd
> > > > leak
> > > >
> > > > On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> > > > > On 27/04/20 18:44 +0800, wangyunjian wrote:
> > > > >> From: Yunjian Wang <wangyunjian at huawei.com>
> > > > >>
> > > > > >> Zero is a valid fd. When the fd is zero, it is never closed,
> > > > > >> which leaks it.
> > > > >>
> > > > >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx
> > > > >> interrupts")
> > > > >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt
> > > > >> mode")
> > > > >> Cc: stable at dpdk.org
> > > > >>
> > > > >
> > > > > Hello Yunjian,
> > > > >
> > > > > Nothing prevents a DPDK app from closing 0 and getting it from
> > > > > another call, good catch.
> > > > >
> > > > >> Signed-off-by: Yunjian Wang <wangyunjian at huawei.com>
> > > > >
> > > > > Acked-by: Gaetan Rivet <grive at u256.net>
> > > >
> > > > Applied to dpdk-next-net/master, thanks.
> > >
> > > This patch is causing Testpmd to quit when I issue a "port stop" command.
> > > Testpmd log:
> > >
> > > """
> > > x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i
> > > --forward-mode=mac
> > > EAL: Detected 8 lcore(s)
> > > EAL: Detected 1 NUMA nodes
> > > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > > EAL: Selected IOVA mode 'PA'
> > > EAL: No available hugepages reported in hugepages-1048576kB
> > > EAL: Probing VFIO support...
> > > EAL: PCI device 0002:00:02.0 on NUMA socket 0
> > > EAL:   probe driver: 15b3:1004 net_mlx4
> > > Interactive-mode selected
> > > Set mac packet forwarding mode
> > > Warning: NUMA should be configured manually by using --port-numa-config
> > > and --ring-numa-config parameters along with --numa.
> > > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456,
> > > size=2176, socket=0
> > > testpmd: preferred mempool ops selected: ring_mp_mc
> > >
> > > Warning! port-topology=paired and odd forward ports number, the last port
> > > will pair with itself.
> > >
> > > Configuring Port 1 (socket 0)
> > > Port 1: 00:15:5D:26:2B:00
> > > Checking link statuses...
> > > Done
> > > testpmd> port stop 1
> > > Stopping ports...
> > > Checking link statuses...
> > > Done
> > > testpmd>
> > > Stopping port 1...
> > > Stopping ports...
> > > Done
> > >
> > > Shutting down port 1...
> > > Closing ports...
> > > Done
> > >
> > > Bye...
> > > """
> > >
> > > My terminal gets broken at this point, and I have to reinitialize it with a
> > > "reset".
> > >
> > > - Ali
> > 
> > Hi Ali,
> > 
> > Thanks for the report, I am looking into it.
> > 
> > Are you testing failsafe on Azure?
> 
> This reproduces with Failsafe, but not necessarily on Azure. You can try to reproduce on any platform if you pass something like '-w 00:00.0 --vdev="net_failsafe0,dev(0000:08:00.0)"'.
> 

Hi,

Indeed, I am able to reproduce the issue using this command:
   bash> ./build/app/dpdk-testpmd -n4 -m 4096 --no-huge --vdev='net_failsafe0,dev(net_ring0)' -- -i
(a PCI bus and hugepages are not always needed to validate failsafe).

I was asking about Azure because you did not give the command line
options for fail-safe, so I assumed it had been probed automagically.

I have made a fix and will send it soon.
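
As background for the quoted commit message, here is a minimal sketch in
plain C of why zero has to be treated as a valid fd. It is not failsafe
code and it is not the fix mentioned above. The names FD_UNSET,
close_if_positive, close_if_valid, open_after_closing_stdin and
fd_zero_demo are hypothetical and exist only for this illustration. The
sketch relies on the POSIX rule that open() returns the lowest-numbered
free descriptor, and it assumes /dev/null exists.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical sentinel for "no descriptor"; 0 cannot play this role. */
#define FD_UNSET (-1)

/* Pattern with the leak: 0 is treated as "not open" and is never closed. */
static int close_if_positive(int *fd)
{
	if (*fd > 0) {
		close(*fd);
		*fd = FD_UNSET;
		return 1;
	}
	return 0;
}

/* Pattern without the leak: any non-negative value is an open descriptor. */
static int close_if_valid(int *fd)
{
	if (*fd >= 0) {
		close(*fd);
		*fd = FD_UNSET;
		return 1;
	}
	return 0;
}

/* Close stdin, then open a file: the lowest free number (0) is handed out. */
static int open_after_closing_stdin(void)
{
	close(STDIN_FILENO);
	return open("/dev/null", O_RDONLY);
}

/* Returns 1 when the "> 0" check skips fd 0 and the ">= 0" check closes it. */
static int fd_zero_demo(void)
{
	int fd = open_after_closing_stdin();
	assert(fd == 0);

	int closed_by_positive_check = close_if_positive(&fd); /* skipped */
	int closed_by_valid_check = close_if_valid(&fd);       /* closed */

	return closed_by_positive_check == 0 &&
	       closed_by_valid_check == 1 &&
	       fd == FD_UNSET;
}
```

Calling fd_zero_demo() from a small main() should return 1 on a POSIX
system. Note that the ">= 0" test only works if unopened slots are set
to -1: a slot left zero-initialized would be read as fd 0, which is
usually stdin. Whether that is related to the testpmd exit reported
above is not established in this thread.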

Regards,
-- 
Gaëtan

