[dpdk-dev] app/testpmd: fix failsafe PMD failure on exit

Message ID 20180522183509.66644-1-ferruh.yigit@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Ferruh Yigit May 22, 2018, 6:35 p.m. UTC
  vdevs detach on testpmd exit implemented as workaround to fix
a virtio-user issue. The issue was virtio-user cleanup is not
called and existing socket file not cleaned up which will fail
next run.

The vdev cleanup causing problems in failsafe PMD.

Reduce the cleanup to only virtio-user and add a comment that this
workaround should be converted to a proper cleanup, not something
specific to virtio-user, and not something specific to vdev and
testpmd.

Fixes: fe890955114d ("app/testpmd: fix exit for vdevs")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
Cc: Zhiyong Yang <zhiyong.yang@intel.com>
Cc: Bernard Iremonger <bernard.iremonger@intel.com>
Cc: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/testpmd.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)
  

Comments

Thomas Monjalon May 22, 2018, 7:47 p.m. UTC | #1
22/05/2018 20:35, Ferruh Yigit:
> vdevs detach on testpmd exit implemented as workaround to fix
> a virtio-user issue. The issue was virtio-user cleanup is not
> called and existing socket file not cleaned up which will fail
> next run.
> 
> The vdev cleanup causing problems in failsafe PMD.
> 
> Reduce the cleanup to only virtio-user and add a comment that this
> workaround should be converted to a proper cleanup, not something
> specific to virtio-user, and not something specific to vdev and
> testpmd.
> 
> Fixes: fe890955114d ("app/testpmd: fix exit for vdevs")
> 
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

OK to squash it with above patch.
Thanks for the update.

Note: this patch is not related to failsafe.
There was a deadlock introduced in 18.05-rc1 when detaching failsafe,
but it is fixed by using a recursive lock in vdev.
  
Ferruh Yigit May 22, 2018, 8:49 p.m. UTC | #2
On 5/22/2018 8:47 PM, Thomas Monjalon wrote:
> 22/05/2018 20:35, Ferruh Yigit:
>> vdevs detach on testpmd exit implemented as workaround to fix
>> a virtio-user issue. The issue was virtio-user cleanup is not
>> called and existing socket file not cleaned up which will fail
>> next run.
>>
>> The vdev cleanup causing problems in failsafe PMD.
>>
>> Reduce the cleanup to only virtio-user and add a comment that this
>> workaround should be converted to a proper cleanup, not something
>> specific to virtio-user, and not something specific to vdev and
>> testpmd.
>>
>> Fixes: fe890955114d ("app/testpmd: fix exit for vdevs")
>>
>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> OK to squash it with above patch.

Squashed into relevant commit in next-net, thanks.

> Thanks for the update.
> 
> Note: this patch is not related to failsafe.
> There was a deadlock introduced in 18.05-rc1 when detaching failsafe,
> but it is fixed by using a recursive lock in vdev.

Thanks for the clarification, I mixed the issues.
  
Ferruh Yigit May 22, 2018, 8:50 p.m. UTC | #3
On 5/22/2018 9:49 PM, Ferruh Yigit wrote:
> On 5/22/2018 8:47 PM, Thomas Monjalon wrote:
>> 22/05/2018 20:35, Ferruh Yigit:
>>> vdevs detach on testpmd exit implemented as workaround to fix
>>> a virtio-user issue. The issue was virtio-user cleanup is not
>>> called and existing socket file not cleaned up which will fail
>>> next run.
>>>
>>> The vdev cleanup causing problems in failsafe PMD.
>>>
>>> Reduce the cleanup to only virtio-user and add a comment that this
>>> workaround should be converted to a proper cleanup, not something
>>> specific to virtio-user, and not something specific to vdev and
>>> testpmd.
>>>
>>> Fixes: fe890955114d ("app/testpmd: fix exit for vdevs")
>>>
>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>
>> OK to squash it with above patch.
> 
> Squashed into relevant commit in next-net, thanks.

Commit message updated after squash, @Zhiyong @Thomas please shout if something
is wrong in commit log.
  
Yuanhan Liu May 27, 2018, 4:06 a.m. UTC | #4
On Tue, May 22, 2018 at 07:35:08PM +0100, Ferruh Yigit wrote:
> vdevs detach on testpmd exit implemented as workaround to fix
> a virtio-user issue. The issue was virtio-user cleanup is not
> called and existing socket file not cleaned up which will fail
> next run.
> 
> The vdev cleanup causing problems in failsafe PMD.
> 
> Reduce the cleanup to only virtio-user and add a comment that this
> workaround should be converted to a proper cleanup, not something
> specific to virtio-user, and not something specific to vdev and
> testpmd.
> 
> Fixes: fe890955114d ("app/testpmd: fix exit for vdevs")
> 
...
>  pmd_test_exit(void)
>  {
> -	const struct rte_bus *bus;
>  	struct rte_device *device;
>  	portid_t pt_id;
>  	int ret;
> @@ -2025,13 +2024,21 @@ pmd_test_exit(void)
>  	if (ports != NULL) {
>  		no_link_check = 1;
>  		RTE_ETH_FOREACH_DEV(pt_id) {
> -			device = rte_eth_devices[pt_id].device;
> -			bus = rte_bus_find_by_device(device);
>  			printf("\nShutting down port %d...\n", pt_id);
>  			fflush(stdout);
>  			stop_port(pt_id);
>  			close_port(pt_id);
> -			if (bus && !strcmp(bus->name, "vdev"))
> +
> +			/*
> +			 * This is a workaround to fix a virtio-user issue that
> +			 * requires to call clean-up routine to remove existing
> +			 * socket.

I came across this patch while I was cherry-picking patches to 17.11.4
release. And this patch seems wrong to me.

Any particular reason why the socket removal can not be done in virtio-user
pmd, say at its close method?

	--yliu
> +			 * This workaround valid only for testpmd, needs a fix
> +			 * valid for all applications.
> +			 * TODO: Implement proper resource cleanup
> +			 */
> +			device = rte_eth_devices[pt_id].device;
> +			if (device && !strcmp(device->driver->name, "net_virtio_user"))
>  				detach_port(pt_id);
>  		}
>  	}
> -- 
> 2.14.3
  
Thomas Monjalon May 27, 2018, 12:37 p.m. UTC | #5
27/05/2018 06:06, Yuanhan Liu:
> On Tue, May 22, 2018 at 07:35:08PM +0100, Ferruh Yigit wrote:
> > +			/*
> > +			 * This is a workaround to fix a virtio-user issue that
> > +			 * requires to call clean-up routine to remove existing
> > +			 * socket.
> 
> I came across this patch while I was cherry-picking patches to 17.11.4
> release. And this patch seems wrong to me.

Yes it is far from perfect.

> Any particular reason why the socket removal can not be done in virtio-user
> pmd, say at its close method?

The socket is removed in the remove function of the driver.
The right fix is to call the remove functions of all driver from
the EAL cleanup function.
We have decided of this last minute workaround for testpmd
because we need it for testing convenience, but do not want to
take any risk with a proper fix as it is really late for that.


> > +			 * This workaround valid only for testpmd, needs a fix
> > +			 * valid for all applications.
> > +			 * TODO: Implement proper resource cleanup
> > +			 */
> > +			device = rte_eth_devices[pt_id].device;
> > +			if (device && !strcmp(device->driver->name, "net_virtio_user"))
> >  				detach_port(pt_id);
> >  		}
> >  	}
>
  
Yuanhan Liu June 2, 2018, 5:01 a.m. UTC | #6
On Sun, May 27, 2018 at 02:37:28PM +0200, Thomas Monjalon wrote:
> 27/05/2018 06:06, Yuanhan Liu:
> > On Tue, May 22, 2018 at 07:35:08PM +0100, Ferruh Yigit wrote:
> > > +			/*
> > > +			 * This is a workaround to fix a virtio-user issue that
> > > +			 * requires to call clean-up routine to remove existing
> > > +			 * socket.
> > 
> > I came across this patch while I was cherry-picking patches to 17.11.4
> > release. And this patch seems wrong to me.
> 
> Yes it is far from perfect.
> 
> > Any particular reason why the socket removal can not be done in virtio-user
> > pmd, say at its close method?
> 
> The socket is removed in the remove function of the driver.
> The right fix is to call the remove functions of all driver from
> the EAL cleanup function.
> We have decided of this last minute workaround for testpmd
> because we need it for testing convenience, but do not want to
> take any risk with a proper fix as it is really late for that.

If there must be a workaround, I would do it at virtio-user pmd.

	--yliu
> 
> 
> > > +			 * This workaround valid only for testpmd, needs a fix
> > > +			 * valid for all applications.
> > > +			 * TODO: Implement proper resource cleanup
> > > +			 */
> > > +			device = rte_eth_devices[pt_id].device;
> > > +			if (device && !strcmp(device->driver->name, "net_virtio_user"))
> > >  				detach_port(pt_id);
> > >  		}
> > >  	}
> > 
> 
> 
> 
>
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index cfa6da60c..c3990d693 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2014,7 +2014,6 @@  detach_port(portid_t port_id)
 void
 pmd_test_exit(void)
 {
-	const struct rte_bus *bus;
 	struct rte_device *device;
 	portid_t pt_id;
 	int ret;
@@ -2025,13 +2024,21 @@  pmd_test_exit(void)
 	if (ports != NULL) {
 		no_link_check = 1;
 		RTE_ETH_FOREACH_DEV(pt_id) {
-			device = rte_eth_devices[pt_id].device;
-			bus = rte_bus_find_by_device(device);
 			printf("\nShutting down port %d...\n", pt_id);
 			fflush(stdout);
 			stop_port(pt_id);
 			close_port(pt_id);
-			if (bus && !strcmp(bus->name, "vdev"))
+
+			/*
+			 * This is a workaround to fix a virtio-user issue that
+			 * requires to call clean-up routine to remove existing
+			 * socket.
+			 * This workaround valid only for testpmd, needs a fix
+			 * valid for all applications.
+			 * TODO: Implement proper resource cleanup
+			 */
+			device = rte_eth_devices[pt_id].device;
+			if (device && !strcmp(device->driver->name, "net_virtio_user"))
 				detach_port(pt_id);
 		}
 	}