[dpdk-stable] [dpdk-dev] [PATCH 2/2] app/testpmd: fix invalid port detaching

Matan Azrad matan at mellanox.com
Thu Jan 23 15:05:20 CET 2020


Hi

From: Yigit, Ferruh
> On 11/12/2019 8:47 AM, Matan Azrad wrote:
> > The port was not validated before detaching.
> >
> > Ignore port detach operation when the port is not valid.
> >
> > Fixes: f8e5baa2662d ("app/testpmd: check not detaching device twice")
> > Cc: thomas at monjalon.net
> > Cc: stable at dpdk.org
> >
> > Signed-off-by: Matan Azrad <matan at mellanox.com>
> > ---
> >  app/test-pmd/testpmd.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> > 4444346..370eefe 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -2545,6 +2545,9 @@ struct extmem_param {
> >
> >  	printf("Removing a device...\n");
> >
> > +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> > +		return;
> > +
> >  	dev = rte_eth_devices[port_id].device;
> >  	if (dev == NULL) {
> >  		printf("Device already removed\n");
> >
> 
> The patch is already in 19.11 [1] but it is breaking the testpmd hotplug
> support.
> Before 'detach_port_device()' called, the port has been stopped and closed
> [2], which will make port fail from 'port_id_is_invalid()' check and the device
> removal path never fully called.
> The implication is, since device not detached, vfio request interrupt keeps
> triggered continuously and re-starts the detach path, but because of the half
> cleaned device it fails and app gets stuck with a continuous log [3].
> 
> I wonder if the actual hotplug has been tested with this patch, the commit
> log is not clear about the motivation and implication of the patch, I am not
> clear why this check is added but I am sending a patch soon to remove it
> back.

The motivation of this patch was to prevent double detach on same port, so the user cannot call detach of invalid port.

I agree this patch is not good and we need a fix but I think the bug is conceptual.

Testpmd tries to do detach by port_id which is derived by ethdev port id while detach work with rte_device.

For example:
you can see in the line above after +++: dev = rte_eth_devices[port_id].device,
Testpmd may access invalid  or reallocated ethdev structure to get the device name and may even detach unwanted rte_device.

So, detach is broken with and without this patch.


I think Testpmd should change the concept of rte_device mapping and put attention to next:
1. Don't detach by ethdev port ID.
2. Multiple ethdev port IDs may related to the same rte_device.

The Testpmd user should be sure that all the port IDs of the rte_device are released before the detach call and Testpmd maybe need to validate it.
And like attach, detach should be triggered by PCI address \ rte_device name.


Matan





















 
  


> Regards,
> ferruh
> 
> 
> [1]
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.dp
> dk.org%2Fdpdk%2Fcommit%2F%3Fid%3D43d0e304980a1527bcac92dc679057
> b189e2545a&data=02%7C01%7Cmatan%40mellanox.com%7Cc3f40356d
> d124e20faf708d7a006e68c%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7
> C0%7C637153823809699996&sdata=dBy9m%2BxCA%2Bme1IpX2LqPARa
> 62giznKi8Xbtu220GA%2Bg%3D&reserved=0
> 
> [2]
> rmv_port_callback
>   stop_port(port_id);
>   close_port(port_id);
>   detach_port_device(port_id);
> 
> [3]
> EAL: can not get port by device 0000:00:05.0!
> EAL: can not get port by device 0000:00:05.0!
> EAL: can not get port by device 0000:00:05.0!
> EAL: can not get port by device 0000:00:05.0!
> EAL: can not get port by device 0000:00:05.0!
> EAL: can not get port by device 0000:00:05.0!
> ...


More information about the stable mailing list