[dpdk-dev] igb_uio: remove device reset in open

Message ID 20171020165511.47899-1-ferruh.yigit@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Ferruh Yigit Oct. 20, 2017, 4:55 p.m. UTC
  Remove device reset during application start, the reset for application
exit still there.

Reset in open removed because of following comments:
1- Device reset not completed when VF driver loaded, which cause VF PMD
   initialization error.
   Adding delay can solve the issue but will increase driver load time.

2- Reset will be issues all devices unconditionally, not very efficient
   way.

Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
Cc: Jianfeng Tan <jianfeng.tan@intel.com>
Cc: Jingjing Wu <jingjing.wu@intel.com>
Cc: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Cc: Gregory Etelson <gregory@weka.io>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: George Prekas <george.prekas@epfl.ch>
Cc: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 2 --
 1 file changed, 2 deletions(-)
  

Comments

Ferruh Yigit Oct. 20, 2017, 4:57 p.m. UTC | #1
On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> Remove device reset during application start, the reset for application
> exit still there.
> 
> Reset in open removed because of following comments:
> 1- Device reset not completed when VF driver loaded, which cause VF PMD
>    initialization error.
>    Adding delay can solve the issue but will increase driver load time.
> 
> 2- Reset will be issues all devices unconditionally, not very efficient
>    way.
> 
> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

Hi Jingjing, Shijith, Gregory, Harish,

Can you please test this on top of current master (which has already Jingjin's
fix) ?

Thanks,
ferruh

> ---
> Cc: Jianfeng Tan <jianfeng.tan@intel.com>
> Cc: Jingjing Wu <jingjing.wu@intel.com>
> Cc: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> Cc: Gregory Etelson <gregory@weka.io>
> Cc: Harish Patil <harish.patil@cavium.com>
> Cc: George Prekas <george.prekas@epfl.ch>
> Cc: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> ---
>  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> index f7ef82554..fd320d87d 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> @@ -336,8 +336,6 @@ igbuio_pci_open(struct uio_info *info, struct inode *inode)
>  	struct pci_dev *dev = udev->pdev;
>  	int err;
>  
> -	pci_reset_function(dev);
> -
>  	/* set bus master, which was cleared by the reset function */
>  	pci_set_master(dev);
>  
>
  
Gregory Etelson Oct. 20, 2017, 7:01 p.m. UTC | #2
On Friday, 20 October 2017 19:57:38 IDT Ferruh Yigit wrote:
> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> > Remove device reset during application start, the reset for application
> > exit still there.
> > 
> > Reset in open removed because of following comments:
> > 1- Device reset not completed when VF driver loaded, which cause VF PMD
> > 
> >    initialization error.
> >    Adding delay can solve the issue but will increase driver load time.
> > 
> > 2- Reset will be issues all devices unconditionally, not very efficient
> > 
> >    way.
> > 
> > Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device
> > file") Cc: stable@dpdk.org
> > 
> > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Hi Jingjing, Shijith, Gregory, Harish,
> 
> Can you please test this on top of current master (which has already
> Jingjin's fix) ?
> 
> Thanks,
> ferruh
> 

sure.

> > ---
> > Cc: Jianfeng Tan <jianfeng.tan@intel.com>
> > Cc: Jingjing Wu <jingjing.wu@intel.com>
> > Cc: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> > Cc: Gregory Etelson <gregory@weka.io>
> > Cc: Harish Patil <harish.patil@cavium.com>
> > Cc: George Prekas <george.prekas@epfl.ch>
> > Cc: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> > ---
> > 
> >  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c index f7ef82554..fd320d87d
> > 100644
> > --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > @@ -336,8 +336,6 @@ igbuio_pci_open(struct uio_info *info, struct inode
> > *inode)> 
> >  	struct pci_dev *dev = udev->pdev;
> >  	int err;
> > 
> > -	pci_reset_function(dev);
> > -
> > 
> >  	/* set bus master, which was cleared by the reset function */
> >  	pci_set_master(dev);
  
Patil, Harish Oct. 20, 2017, 10:18 p.m. UTC | #3
-----Original Message-----
From: Ferruh Yigit <ferruh.yigit@intel.com>

Date: Friday, October 20, 2017 at 9:57 AM
To: Jingjing Wu <jingjing.wu@intel.com>, "Thotton, Shijith"
<Shijith.Thotton@cavium.com>, Gregory Etelson <gregory@weka.io>, Harish
Patil <Harish.Patil@cavium.com>
Cc: Thomas Monjalon <thomas@monjalon.net>, "dev@dpdk.org" <dev@dpdk.org>,
"stable@dpdk.org" <stable@dpdk.org>, Jianfeng Tan
<jianfeng.tan@intel.com>, George Prekas <george.prekas@epfl.ch>, Sergio
Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Subject: Re: [PATCH] igb_uio: remove device reset in open

>On 10/20/2017 9:55 AM, Ferruh Yigit wrote:

>> Remove device reset during application start, the reset for application

>> exit still there.

>> 

>> Reset in open removed because of following comments:

>> 1- Device reset not completed when VF driver loaded, which cause VF PMD

>>    initialization error.

>>    Adding delay can solve the issue but will increase driver load time.

>> 

>> 2- Reset will be issues all devices unconditionally, not very efficient

>>    way.

>> 

>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of

>>device file")

>> Cc: stable@dpdk.org

>> 

>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

>

>Hi Jingjing, Shijith, Gregory, Harish,

>

>Can you please test this on top of current master (which has already

>Jingjin's

>fix) ?

>

>Thanks,

>Ferruh



Ferruh,
Sure, will try and get back to you.
Thanks.
>

>> ---

>> Cc: Jianfeng Tan <jianfeng.tan@intel.com>

>> Cc: Jingjing Wu <jingjing.wu@intel.com>

>> Cc: Shijith Thotton <shijith.thotton@caviumnetworks.com>

>> Cc: Gregory Etelson <gregory@weka.io>

>> Cc: Harish Patil <harish.patil@cavium.com>

>> Cc: George Prekas <george.prekas@epfl.ch>

>> Cc: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>

>> ---

>>  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 2 --

>>  1 file changed, 2 deletions(-)

>> 

>> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

>>b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

>> index f7ef82554..fd320d87d 100644

>> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

>> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

>> @@ -336,8 +336,6 @@ igbuio_pci_open(struct uio_info *info, struct inode

>>*inode)

>>  	struct pci_dev *dev = udev->pdev;

>>  	int err;

>>  

>> -	pci_reset_function(dev);

>> -

>>  	/* set bus master, which was cleared by the reset function */

>>  	pci_set_master(dev);

>>  

>> 

>
  
Shijith Thotton Oct. 23, 2017, 12:28 p.m. UTC | #4
On Fri, Oct 20, 2017 at 09:57:38AM -0700, Ferruh Yigit wrote:
> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> > Remove device reset during application start, the reset for application
> > exit still there.
> > 
> > Reset in open removed because of following comments:
> > 1- Device reset not completed when VF driver loaded, which cause VF PMD
> >    initialization error.
> >    Adding delay can solve the issue but will increase driver load time.
> > 
> > 2- Reset will be issues all devices unconditionally, not very efficient
> >    way.
> > 
> > Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Hi Jingjing, Shijith, Gregory, Harish,
> 
> Can you please test this on top of current master (which has already Jingjin's
> fix) ?

Hi Ferruh,

Tested for LiquidIO cards and it works.

Thanks,
Shijith
  
Ferruh Yigit Oct. 23, 2017, 4:36 p.m. UTC | #5
On 10/23/2017 5:28 AM, Shijith Thotton wrote:
> On Fri, Oct 20, 2017 at 09:57:38AM -0700, Ferruh Yigit wrote:
>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
>>> Remove device reset during application start, the reset for application
>>> exit still there.
>>>
>>> Reset in open removed because of following comments:
>>> 1- Device reset not completed when VF driver loaded, which cause VF PMD
>>>    initialization error.
>>>    Adding delay can solve the issue but will increase driver load time.
>>>
>>> 2- Reset will be issues all devices unconditionally, not very efficient
>>>    way.
>>>
>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>
>> Hi Jingjing, Shijith, Gregory, Harish,
>>
>> Can you please test this on top of current master (which has already Jingjin's
>> fix) ?
> 
> Hi Ferruh,
> 
> Tested for LiquidIO cards and it works.

Thanks Shijith,

Is following commit needs to be reverted after pci_reset removed from open():

Commit: 9ed3f38770c6 ("net/liquidio: remove FLR request to PF driver")

Thanks,
ferruh

> 
> Thanks,
> Shijith
>
  
Shijith Thotton Oct. 23, 2017, 7:03 p.m. UTC | #6
On Mon, Oct 23, 2017 at 09:36:38AM -0700, Ferruh Yigit wrote:
> On 10/23/2017 5:28 AM, Shijith Thotton wrote:
> > On Fri, Oct 20, 2017 at 09:57:38AM -0700, Ferruh Yigit wrote:
> >> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> >>> Remove device reset during application start, the reset for application
> >>> exit still there.
> >>>
> >>> Reset in open removed because of following comments:
> >>> 1- Device reset not completed when VF driver loaded, which cause VF PMD
> >>>    initialization error.
> >>>    Adding delay can solve the issue but will increase driver load time.
> >>>
> >>> 2- Reset will be issues all devices unconditionally, not very efficient
> >>>    way.
> >>>
> >>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> >>
> >> Hi Jingjing, Shijith, Gregory, Harish,
> >>
> >> Can you please test this on top of current master (which has already Jingjin's
> >> fix) ?
> > 
> > Hi Ferruh,
> > 
> > Tested for LiquidIO cards and it works.
> 
> Thanks Shijith,
> 
> Is following commit needs to be reverted after pci_reset removed from open():
> 
> Commit: 9ed3f38770c6 ("net/liquidio: remove FLR request to PF driver")

Not required. The commit can stay.

Thanks,
Shijith
  
Jingjing Wu Oct. 24, 2017, 2:45 a.m. UTC | #7
> -----Original Message-----
> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> Sent: Monday, October 23, 2017 8:28 PM
> To: Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: Wu, Jingjing <jingjing.wu@intel.com>; Gregory Etelson <gregory@weka.io>;
> Harish Patil <harish.patil@cavium.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org; stable@dpdk.org; Tan, Jianfeng
> <jianfeng.tan@intel.com>; George Prekas <george.prekas@epfl.ch>; Gonzalez
> Monroy, Sergio <sergio.gonzalez.monroy@intel.com>
> Subject: Re: [PATCH] igb_uio: remove device reset in open
> 
> On Fri, Oct 20, 2017 at 09:57:38AM -0700, Ferruh Yigit wrote:
> > On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> > > Remove device reset during application start, the reset for
> > > application exit still there.
> > >
> > > Reset in open removed because of following comments:
> > > 1- Device reset not completed when VF driver loaded, which cause VF PMD
> > >    initialization error.
> > >    Adding delay can solve the issue but will increase driver load time.
> > >
> > > 2- Reset will be issues all devices unconditionally, not very efficient
> > >    way.
> > >
> > > Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
> > > device file")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> >
> > Hi Jingjing, Shijith, Gregory, Harish,
> >
> > Can you please test this on top of current master (which has already
> > Jingjin's
> > fix) ?
> 
> Hi Ferruh,
> 
> Tested for LiquidIO cards and it works.
> 
> Thanks,
> Shijith

Hi, Ferruh

For i40e, it's OK too.

Thanks
Jingjing
  
Ferruh Yigit Oct. 24, 2017, 9:38 p.m. UTC | #8
On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> Remove device reset during application start, the reset for application
> exit still there.
> 
> Reset in open removed because of following comments:
> 1- Device reset not completed when VF driver loaded, which cause VF PMD
>    initialization error.
>    Adding delay can solve the issue but will increase driver load time.
> 
> 2- Reset will be issues all devices unconditionally, not very efficient
>    way.
> 
> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

Tested-by: Harish Patil <harish.patil@cavium.com>
Tested-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Tested-by: Jingjing Wu <jingjing.wu@intel.com>

Applied to dpdk/master, thanks.
  
Mody, Rasesh Oct. 25, 2017, 11:43 p.m. UTC | #9
Hi Ferruh,

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit

> Sent: Friday, October 20, 2017 9:58 AM

> 

> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:

> > Remove device reset during application start, the reset for

> > application exit still there.

> >

> > Reset in open removed because of following comments:

> > 1- Device reset not completed when VF driver loaded, which cause VF PMD

> >    initialization error.

> >    Adding delay can solve the issue but will increase driver load time.

> >

> > 2- Reset will be issues all devices unconditionally, not very efficient

> >    way.

> >

> > Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of

> > device file")

> > Cc: stable@dpdk.org

> >

> > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

> 

> Hi Jingjing, Shijith, Gregory, Harish,

> 

> Can you please test this on top of current master (which has already Jingjin's

> fix) ?


The original FLR change during igb_uio open()/release() in DPDK17.08 also impacts BNX2X PMD and it exhibits the issues with bare metal testing.
 
Now, we tested this change for BNX2X PMD using latest dpdk, which has this fix where FLR is invoked only in the release(). However, we ran into an issue when trying to reload the testpmd application in quick succession. The pci reset, called during the igb_uio release() operation, is taking longer time and adapter is still doing the FLR when we relaunch the application. We see this behavior with bare metal testing.

Thanks!
-Rasesh

> 

> Thanks,

> ferruh

> 

> > ---

> > Cc: Jianfeng Tan <jianfeng.tan@intel.com>

> > Cc: Jingjing Wu <jingjing.wu@intel.com>

> > Cc: Shijith Thotton <shijith.thotton@caviumnetworks.com>

> > Cc: Gregory Etelson <gregory@weka.io>

> > Cc: Harish Patil <harish.patil@cavium.com>

> > Cc: George Prekas <george.prekas@epfl.ch>

> > Cc: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>

> > ---

> >  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 2 --

> >  1 file changed, 2 deletions(-)

> >

> > diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

> > b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

> > index f7ef82554..fd320d87d 100644

> > --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

> > +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c

> > @@ -336,8 +336,6 @@ igbuio_pci_open(struct uio_info *info, struct inode

> *inode)

> >  	struct pci_dev *dev = udev->pdev;

> >  	int err;

> >

> > -	pci_reset_function(dev);

> > -

> >  	/* set bus master, which was cleared by the reset function */

> >  	pci_set_master(dev);

> >

> >
  
Jianfeng Tan Oct. 26, 2017, 9:28 a.m. UTC | #10
Hi Rasesh,


On 10/26/2017 7:43 AM, Mody, Rasesh wrote:
> Hi Ferruh,
>
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>> Sent: Friday, October 20, 2017 9:58 AM
>>
>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
>>> Remove device reset during application start, the reset for
>>> application exit still there.
>>>
>>> Reset in open removed because of following comments:
>>> 1- Device reset not completed when VF driver loaded, which cause VF PMD
>>>     initialization error.
>>>     Adding delay can solve the issue but will increase driver load time.
>>>
>>> 2- Reset will be issues all devices unconditionally, not very efficient
>>>     way.
>>>
>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
>>> device file")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>> Hi Jingjing, Shijith, Gregory, Harish,
>>
>> Can you please test this on top of current master (which has already Jingjin's
>> fix) ?
> The original FLR change during igb_uio open()/release() in DPDK17.08 also impacts BNX2X PMD and it exhibits the issues with bare metal testing.
>   
> Now, we tested this change for BNX2X PMD using latest dpdk, which has this fix where FLR is invoked only in the release(). However, we ran into an issue when trying to reload the testpmd application in quick succession. The pci reset, called during the igb_uio release() operation, is taking longer time and adapter is still doing the FLR when we relaunch the application. We see this behavior with bare metal testing.

If we don't reset that device, it will continue working which is a more 
serious issue IMO.

How long does it take to reset BTW?

Thanks,
Jianfeng
  
Ferruh Yigit Oct. 27, 2017, 12:49 a.m. UTC | #11
On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:
> Hi Rasesh,
> 
> 
> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:
>> Hi Ferruh,
>>
>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>>> Sent: Friday, October 20, 2017 9:58 AM
>>>
>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
>>>> Remove device reset during application start, the reset for
>>>> application exit still there.
>>>>
>>>> Reset in open removed because of following comments:
>>>> 1- Device reset not completed when VF driver loaded, which cause VF PMD
>>>>     initialization error.
>>>>     Adding delay can solve the issue but will increase driver load time.
>>>>
>>>> 2- Reset will be issues all devices unconditionally, not very efficient
>>>>     way.
>>>>
>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
>>>> device file")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>> Hi Jingjing, Shijith, Gregory, Harish,
>>>
>>> Can you please test this on top of current master (which has already Jingjin's
>>> fix) ?
>> The original FLR change during igb_uio open()/release() in DPDK17.08 also impacts BNX2X PMD and it exhibits the issues with bare metal testing.
>>   
>> Now, we tested this change for BNX2X PMD using latest dpdk, which has this fix where FLR is invoked only in the release(). 

Good to hear this fixed the problem.

>> However, we ran into an issue when trying to reload the testpmd application in quick succession. The pci reset, called during the igb_uio release() operation, is taking longer time and adapter is still doing the FLR when we relaunch the application. We see this behavior with bare metal testing.
> 
> If we don't reset that device, it will continue working which is a more 
> serious issue IMO.

+1

> How long does it take to reset BTW?

I was wondering same thing.

> 
> Thanks,
> Jianfeng
>
  
Mody, Rasesh Nov. 1, 2017, 6:58 a.m. UTC | #12
Hi Jianfeng and Ferruh,

> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> Sent: Thursday, October 26, 2017 5:50 PM

> 

> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:

> > Hi Rasesh,

> >

> >

> > On 10/26/2017 7:43 AM, Mody, Rasesh wrote:

> >> Hi Ferruh,

> >>

> >>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit

> >>> Sent: Friday, October 20, 2017 9:58 AM

> >>>

> >>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:

> >>>> Remove device reset during application start, the reset for

> >>>> application exit still there.

> >>>>

> >>>> Reset in open removed because of following comments:

> >>>> 1- Device reset not completed when VF driver loaded, which cause VF

> PMD

> >>>>     initialization error.

> >>>>     Adding delay can solve the issue but will increase driver load time.

> >>>>

> >>>> 2- Reset will be issues all devices unconditionally, not very efficient

> >>>>     way.

> >>>>

> >>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of

> >>>> device file")

> >>>> Cc: stable@dpdk.org

> >>>>

> >>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

> >>> Hi Jingjing, Shijith, Gregory, Harish,

> >>>

> >>> Can you please test this on top of current master (which has already

> >>> Jingjin's

> >>> fix) ?

> >> The original FLR change during igb_uio open()/release() in DPDK17.08 also

> impacts BNX2X PMD and it exhibits the issues with bare metal testing.

> >>

> >> Now, we tested this change for BNX2X PMD using latest dpdk, which has

> this fix where FLR is invoked only in the release().

> 

> Good to hear this fixed the problem.


Yes, it fixed the issue caused by pci reset during application start.

> 

> >> However, we ran into an issue when trying to reload the testpmd

> application in quick succession. The pci reset, called during the igb_uio

> release() operation, is taking longer time and adapter is still doing the FLR

> when we relaunch the application. We see this behavior with bare metal

> testing.

> >

> > If we don't reset that device, it will continue working which is a

> > more serious issue IMO.

> 

> +1


I think, it would better for the individual PMDs to take care of the reset during the application exit.

> > How long does it take to reset BTW?

> 

> I was wondering same thing.


A five minutes delay was introduced for the reload of the application, however, we continue to see the issue with FLR during the pci release() operation.

Thanks!
-Rasesh

> 

> >

> > Thanks,

> > Jianfeng

> >
  
Stephen Hemminger Nov. 1, 2017, 2:12 p.m. UTC | #13
On Wed, 1 Nov 2017 06:58:53 +0000
"Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:

> Hi Jianfeng and Ferruh,
> 
> > From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
> > Sent: Thursday, October 26, 2017 5:50 PM
> > 
> > On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:  
> > > Hi Rasesh,
> > >
> > >
> > > On 10/26/2017 7:43 AM, Mody, Rasesh wrote:  
> > >> Hi Ferruh,
> > >>  
> > >>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> > >>> Sent: Friday, October 20, 2017 9:58 AM
> > >>>
> > >>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:  
> > >>>> Remove device reset during application start, the reset for
> > >>>> application exit still there.
> > >>>>
> > >>>> Reset in open removed because of following comments:
> > >>>> 1- Device reset not completed when VF driver loaded, which cause VF  
> > PMD  
> > >>>>     initialization error.
> > >>>>     Adding delay can solve the issue but will increase driver load time.
> > >>>>
> > >>>> 2- Reset will be issues all devices unconditionally, not very efficient
> > >>>>     way.
> > >>>>
> > >>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
> > >>>> device file")
> > >>>> Cc: stable@dpdk.org
> > >>>>
> > >>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>  
> > >>> Hi Jingjing, Shijith, Gregory, Harish,
> > >>>
> > >>> Can you please test this on top of current master (which has already
> > >>> Jingjin's
> > >>> fix) ?  
> > >> The original FLR change during igb_uio open()/release() in DPDK17.08 also  
> > impacts BNX2X PMD and it exhibits the issues with bare metal testing.  
> > >>
> > >> Now, we tested this change for BNX2X PMD using latest dpdk, which has  
> > this fix where FLR is invoked only in the release().
> > 
> > Good to hear this fixed the problem.  
> 
> Yes, it fixed the issue caused by pci reset during application start.
> 
> >   
> > >> However, we ran into an issue when trying to reload the testpmd  
> > application in quick succession. The pci reset, called during the igb_uio
> > release() operation, is taking longer time and adapter is still doing the FLR
> > when we relaunch the application. We see this behavior with bare metal
> > testing.  
> > >
> > > If we don't reset that device, it will continue working which is a
> > > more serious issue IMO.  
> > 
> > +1  
> 
> I think, it would better for the individual PMDs to take care of the reset during the application exit.

That will never be possible. Poll Mode Drivers are userspace entities and part of the application. If application crashes, there is no way for PMD to do cleanup, it must
be handled by kernel.
  
Mody, Rasesh Nov. 2, 2017, 8:03 a.m. UTC | #14
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, November 01, 2017 7:12 AM
> 
> On Wed, 1 Nov 2017 06:58:53 +0000
> "Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:
> 
> > Hi Jianfeng and Ferruh,
> >
> > > From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
> > > Sent: Thursday, October 26, 2017 5:50 PM
> > >
> > > On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:
> > > > Hi Rasesh,
> > > >
> > > >
> > > > On 10/26/2017 7:43 AM, Mody, Rasesh wrote:
> > > >> Hi Ferruh,
> > > >>
> > > >>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh
> > > >>> Yigit
> > > >>> Sent: Friday, October 20, 2017 9:58 AM
> > > >>>
> > > >>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
> > > >>>> Remove device reset during application start, the reset for
> > > >>>> application exit still there.
> > > >>>>
> > > >>>> Reset in open removed because of following comments:
> > > >>>> 1- Device reset not completed when VF driver loaded, which
> > > >>>> cause VF
> > > PMD
> > > >>>>     initialization error.
> > > >>>>     Adding delay can solve the issue but will increase driver load time.
> > > >>>>
> > > >>>> 2- Reset will be issues all devices unconditionally, not very efficient
> > > >>>>     way.
> > > >>>>
> > > >>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and
> > > >>>> release of device file")
> > > >>>> Cc: stable@dpdk.org
> > > >>>>
> > > >>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > > >>> Hi Jingjing, Shijith, Gregory, Harish,
> > > >>>
> > > >>> Can you please test this on top of current master (which has
> > > >>> already Jingjin's
> > > >>> fix) ?
> > > >> The original FLR change during igb_uio open()/release() in
> > > >> DPDK17.08 also
> > > impacts BNX2X PMD and it exhibits the issues with bare metal testing.
> > > >>
> > > >> Now, we tested this change for BNX2X PMD using latest dpdk, which
> > > >> has
> > > this fix where FLR is invoked only in the release().
> > >
> > > Good to hear this fixed the problem.
> >
> > Yes, it fixed the issue caused by pci reset during application start.
> >
> > >
> > > >> However, we ran into an issue when trying to reload the testpmd
> > > application in quick succession. The pci reset, called during the
> > > igb_uio
> > > release() operation, is taking longer time and adapter is still
> > > doing the FLR when we relaunch the application. We see this behavior
> > > with bare metal testing.
> > > >
> > > > If we don't reset that device, it will continue working which is a
> > > > more serious issue IMO.
> > >
> > > +1
> >
> > I think, it would better for the individual PMDs to take care of the reset
> during the application exit.
> 
> That will never be possible. Poll Mode Drivers are userspace entities and part
> of the application. If application crashes, there is no way for PMD to do
> cleanup, it must be handled by kernel.

The pci reset in release is breaking the BNX2X PMD. Could we revert this reset and get it included with a solution that works for all in the next release?

Thanks!
-Rasesh
  
Ferruh Yigit Nov. 2, 2017, 8:55 a.m. UTC | #15
On 11/2/2017 1:03 AM, Mody, Rasesh wrote:
>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>> Sent: Wednesday, November 01, 2017 7:12 AM
>>
>> On Wed, 1 Nov 2017 06:58:53 +0000
>> "Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:
>>
>>> Hi Jianfeng and Ferruh,
>>>
>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>> Sent: Thursday, October 26, 2017 5:50 PM
>>>>
>>>> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:
>>>>> Hi Rasesh,
>>>>>
>>>>>
>>>>> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:
>>>>>> Hi Ferruh,
>>>>>>
>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh
>>>>>>> Yigit
>>>>>>> Sent: Friday, October 20, 2017 9:58 AM
>>>>>>>
>>>>>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
>>>>>>>> Remove device reset during application start, the reset for
>>>>>>>> application exit still there.
>>>>>>>>
>>>>>>>> Reset in open removed because of following comments:
>>>>>>>> 1- Device reset not completed when VF driver loaded, which
>>>>>>>> cause VF
>>>> PMD
>>>>>>>>     initialization error.
>>>>>>>>     Adding delay can solve the issue but will increase driver load time.
>>>>>>>>
>>>>>>>> 2- Reset will be issues all devices unconditionally, not very efficient
>>>>>>>>     way.
>>>>>>>>
>>>>>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and
>>>>>>>> release of device file")
>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>
>>>>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>>>>>> Hi Jingjing, Shijith, Gregory, Harish,
>>>>>>>
>>>>>>> Can you please test this on top of current master (which has
>>>>>>> already Jingjin's
>>>>>>> fix) ?
>>>>>> The original FLR change during igb_uio open()/release() in
>>>>>> DPDK17.08 also
>>>> impacts BNX2X PMD and it exhibits the issues with bare metal testing.
>>>>>>
>>>>>> Now, we tested this change for BNX2X PMD using latest dpdk, which
>>>>>> has
>>>> this fix where FLR is invoked only in the release().
>>>>
>>>> Good to hear this fixed the problem.
>>>
>>> Yes, it fixed the issue caused by pci reset during application start.
>>>
>>>>
>>>>>> However, we ran into an issue when trying to reload the testpmd
>>>> application in quick succession. The pci reset, called during the
>>>> igb_uio
>>>> release() operation, is taking longer time and adapter is still
>>>> doing the FLR when we relaunch the application. We see this behavior
>>>> with bare metal testing.
>>>>>
>>>>> If we don't reset that device, it will continue working which is a
>>>>> more serious issue IMO.
>>>>
>>>> +1
>>>
>>> I think, it would better for the individual PMDs to take care of the reset
>> during the application exit.
>>
>> That will never be possible. Poll Mode Drivers are userspace entities and part
>> of the application. If application crashes, there is no way for PMD to do
>> cleanup, it must be handled by kernel.
> 
> The pci reset in release is breaking the BNX2X PMD. Could we revert this reset and get it included with a solution that works for all in the next release?

Hi Rasesh,

I am not sure if there is more to do for solution for next releases, and related
to your case, indeed I wasn't expecting a device reset will take more than five
minutes...

Would you be OK to control the reset via a compile time config option, which is
enabled by default. So you will need to disable it to prevent the reset?

> 
> Thanks!
> -Rasesh
>
  
Mody, Rasesh Nov. 2, 2017, 5:34 p.m. UTC | #16
> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> Sent: Thursday, November 02, 2017 1:55 AM

> 

> On 11/2/2017 1:03 AM, Mody, Rasesh wrote:

> >> From: Stephen Hemminger [mailto:stephen@networkplumber.org]

> >> Sent: Wednesday, November 01, 2017 7:12 AM

> >>

> >> On Wed, 1 Nov 2017 06:58:53 +0000

> >> "Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:

> >>

> >>> Hi Jianfeng and Ferruh,

> >>>

> >>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> >>>> Sent: Thursday, October 26, 2017 5:50 PM

> >>>>

> >>>> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:

> >>>>> Hi Rasesh,

> >>>>>

> >>>>>

> >>>>> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:

> >>>>>> Hi Ferruh,

> >>>>>>

> >>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh

> >>>>>>> Yigit

> >>>>>>> Sent: Friday, October 20, 2017 9:58 AM

> >>>>>>>

> >>>>>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:

> >>>>>>>> Remove device reset during application start, the reset for

> >>>>>>>> application exit still there.

> >>>>>>>>

> >>>>>>>> Reset in open removed because of following comments:

> >>>>>>>> 1- Device reset not completed when VF driver loaded, which

> >>>>>>>> cause VF

> >>>> PMD

> >>>>>>>>     initialization error.

> >>>>>>>>     Adding delay can solve the issue but will increase driver load

> time.

> >>>>>>>>

> >>>>>>>> 2- Reset will be issues all devices unconditionally, not very

> efficient

> >>>>>>>>     way.

> >>>>>>>>

> >>>>>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and

> >>>>>>>> release of device file")

> >>>>>>>> Cc: stable@dpdk.org

> >>>>>>>>

> >>>>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

> >>>>>>> Hi Jingjing, Shijith, Gregory, Harish,

> >>>>>>>

> >>>>>>> Can you please test this on top of current master (which has

> >>>>>>> already Jingjin's

> >>>>>>> fix) ?

> >>>>>> The original FLR change during igb_uio open()/release() in

> >>>>>> DPDK17.08 also

> >>>> impacts BNX2X PMD and it exhibits the issues with bare metal testing.

> >>>>>>

> >>>>>> Now, we tested this change for BNX2X PMD using latest dpdk, which

> >>>>>> has

> >>>> this fix where FLR is invoked only in the release().

> >>>>

> >>>> Good to hear this fixed the problem.

> >>>

> >>> Yes, it fixed the issue caused by pci reset during application start.

> >>>

> >>>>

> >>>>>> However, we ran into an issue when trying to reload the testpmd

> >>>> application in quick succession. The pci reset, called during the

> >>>> igb_uio

> >>>> release() operation, is taking longer time and adapter is still

> >>>> doing the FLR when we relaunch the application. We see this

> >>>> behavior with bare metal testing.

> >>>>>

> >>>>> If we don't reset that device, it will continue working which is a

> >>>>> more serious issue IMO.

> >>>>

> >>>> +1

> >>>

> >>> I think, it would better for the individual PMDs to take care of the

> >>> reset

> >> during the application exit.

> >>

> >> That will never be possible. Poll Mode Drivers are userspace entities

> >> and part of the application. If application crashes, there is no way

> >> for PMD to do cleanup, it must be handled by kernel.

> >

> > The pci reset in release is breaking the BNX2X PMD. Could we revert this

> reset and get it included with a solution that works for all in the next release?

> 

> Hi Rasesh,

> 

> I am not sure if there is more to do for solution for next releases, and related

> to your case, indeed I wasn't expecting a device reset will take more than

> five minutes...

> 

> Would you be OK to control the reset via a compile time config option, which

> is enabled by default. So you will need to disable it to prevent the reset?


Hi Ferruh,

As I understand, we will have a compile time config option, enabled by default, to guard the pci_reset_function() in the igbuio_pci_release(). We will disable this config option to prevent the reset when using BNX2X.
The controlled reset should work for us.

Thanks!
-Rasesh
  
Ferruh Yigit Nov. 2, 2017, 6:09 p.m. UTC | #17
On 11/2/2017 10:34 AM, Mody, Rasesh wrote:
>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>> Sent: Thursday, November 02, 2017 1:55 AM
>>
>> On 11/2/2017 1:03 AM, Mody, Rasesh wrote:
>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>>> Sent: Wednesday, November 01, 2017 7:12 AM
>>>>
>>>> On Wed, 1 Nov 2017 06:58:53 +0000
>>>> "Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:
>>>>
>>>>> Hi Jianfeng and Ferruh,
>>>>>
>>>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>>>> Sent: Thursday, October 26, 2017 5:50 PM
>>>>>>
>>>>>> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:
>>>>>>> Hi Rasesh,
>>>>>>>
>>>>>>>
>>>>>>> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:
>>>>>>>> Hi Ferruh,
>>>>>>>>
>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh
>>>>>>>>> Yigit
>>>>>>>>> Sent: Friday, October 20, 2017 9:58 AM
>>>>>>>>>
>>>>>>>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
>>>>>>>>>> Remove device reset during application start, the reset for
>>>>>>>>>> application exit still there.
>>>>>>>>>>
>>>>>>>>>> Reset in open removed because of following comments:
>>>>>>>>>> 1- Device reset not completed when VF driver loaded, which
>>>>>>>>>> cause VF
>>>>>> PMD
>>>>>>>>>>     initialization error.
>>>>>>>>>>     Adding delay can solve the issue but will increase driver load
>> time.
>>>>>>>>>>
>>>>>>>>>> 2- Reset will be issues all devices unconditionally, not very
>> efficient
>>>>>>>>>>     way.
>>>>>>>>>>
>>>>>>>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and
>>>>>>>>>> release of device file")
>>>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>>>>>>>> Hi Jingjing, Shijith, Gregory, Harish,
>>>>>>>>>
>>>>>>>>> Can you please test this on top of current master (which has
>>>>>>>>> already Jingjin's
>>>>>>>>> fix) ?
>>>>>>>> The original FLR change during igb_uio open()/release() in
>>>>>>>> DPDK17.08 also
>>>>>> impacts BNX2X PMD and it exhibits the issues with bare metal testing.
>>>>>>>>
>>>>>>>> Now, we tested this change for BNX2X PMD using latest dpdk, which
>>>>>>>> has
>>>>>> this fix where FLR is invoked only in the release().
>>>>>>
>>>>>> Good to hear this fixed the problem.
>>>>>
>>>>> Yes, it fixed the issue caused by pci reset during application start.
>>>>>
>>>>>>
>>>>>>>> However, we ran into an issue when trying to reload the testpmd
>>>>>> application in quick succession. The pci reset, called during the
>>>>>> igb_uio
>>>>>> release() operation, is taking longer time and adapter is still
>>>>>> doing the FLR when we relaunch the application. We see this
>>>>>> behavior with bare metal testing.
>>>>>>>
>>>>>>> If we don't reset that device, it will continue working which is a
>>>>>>> more serious issue IMO.
>>>>>>
>>>>>> +1
>>>>>
>>>>> I think, it would better for the individual PMDs to take care of the
>>>>> reset
>>>> during the application exit.
>>>>
>>>> That will never be possible. Poll Mode Drivers are userspace entities
>>>> and part of the application. If application crashes, there is no way
>>>> for PMD to do cleanup, it must be handled by kernel.
>>>
>>> The pci reset in release is breaking the BNX2X PMD. Could we revert this
>> reset and get it included with a solution that works for all in the next release?
>>
>> Hi Rasesh,
>>
>> I am not sure if there is more to do for solution for next releases, and related
>> to your case, indeed I wasn't expecting a device reset will take more than
>> five minutes...
>>
>> Would you be OK to control the reset via a compile time config option, which
>> is enabled by default. So you will need to disable it to prevent the reset?
> 
> Hi Ferruh,
> 
> As I understand, we will have a compile time config option, enabled by default, to guard the pci_reset_function() in the igbuio_pci_release(). We will disable this config option to prevent the reset when using BNX2X.

Yep, this is the idea.

> The controlled reset should work for us.

If there is no objection, I can send a patch for this.

> 
> Thanks!
> -Rasesh
>
  
Mody, Rasesh Nov. 2, 2017, 6:45 p.m. UTC | #18
> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> Sent: Thursday, November 02, 2017 11:10 AM

> 

> On 11/2/2017 10:34 AM, Mody, Rasesh wrote:

> >> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> >> Sent: Thursday, November 02, 2017 1:55 AM

> >>

> >> On 11/2/2017 1:03 AM, Mody, Rasesh wrote:

> >>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]

> >>>> Sent: Wednesday, November 01, 2017 7:12 AM

> >>>>

> >>>> On Wed, 1 Nov 2017 06:58:53 +0000

> >>>> "Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:

> >>>>

> >>>>> Hi Jianfeng and Ferruh,

> >>>>>

> >>>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> >>>>>> Sent: Thursday, October 26, 2017 5:50 PM

> >>>>>>

> >>>>>> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:

> >>>>>>> Hi Rasesh,

> >>>>>>>

> >>>>>>>

> >>>>>>> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:

> >>>>>>>> Hi Ferruh,

> >>>>>>>>

> >>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh

> >>>>>>>>> Yigit

> >>>>>>>>> Sent: Friday, October 20, 2017 9:58 AM

> >>>>>>>>>

> >>>>>>>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:

> >>>>>>>>>> Remove device reset during application start, the reset for

> >>>>>>>>>> application exit still there.

> >>>>>>>>>>

> >>>>>>>>>> Reset in open removed because of following comments:

> >>>>>>>>>> 1- Device reset not completed when VF driver loaded, which

> >>>>>>>>>> cause VF

> >>>>>> PMD

> >>>>>>>>>>     initialization error.

> >>>>>>>>>>     Adding delay can solve the issue but will increase driver

> >>>>>>>>>> load

> >> time.

> >>>>>>>>>>

> >>>>>>>>>> 2- Reset will be issues all devices unconditionally, not very

> >> efficient

> >>>>>>>>>>     way.

> >>>>>>>>>>

> >>>>>>>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and

> >>>>>>>>>> release of device file")

> >>>>>>>>>> Cc: stable@dpdk.org

> >>>>>>>>>>

> >>>>>>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

> >>>>>>>>> Hi Jingjing, Shijith, Gregory, Harish,

> >>>>>>>>>

> >>>>>>>>> Can you please test this on top of current master (which has

> >>>>>>>>> already Jingjin's

> >>>>>>>>> fix) ?

> >>>>>>>> The original FLR change during igb_uio open()/release() in

> >>>>>>>> DPDK17.08 also

> >>>>>> impacts BNX2X PMD and it exhibits the issues with bare metal

> testing.

> >>>>>>>>

> >>>>>>>> Now, we tested this change for BNX2X PMD using latest dpdk,

> >>>>>>>> which has

> >>>>>> this fix where FLR is invoked only in the release().

> >>>>>>

> >>>>>> Good to hear this fixed the problem.

> >>>>>

> >>>>> Yes, it fixed the issue caused by pci reset during application start.

> >>>>>

> >>>>>>

> >>>>>>>> However, we ran into an issue when trying to reload the testpmd

> >>>>>> application in quick succession. The pci reset, called during the

> >>>>>> igb_uio

> >>>>>> release() operation, is taking longer time and adapter is still

> >>>>>> doing the FLR when we relaunch the application. We see this

> >>>>>> behavior with bare metal testing.

> >>>>>>>

> >>>>>>> If we don't reset that device, it will continue working which is

> >>>>>>> a more serious issue IMO.

> >>>>>>

> >>>>>> +1

> >>>>>

> >>>>> I think, it would better for the individual PMDs to take care of

> >>>>> the reset

> >>>> during the application exit.

> >>>>

> >>>> That will never be possible. Poll Mode Drivers are userspace

> >>>> entities and part of the application. If application crashes, there

> >>>> is no way for PMD to do cleanup, it must be handled by kernel.

> >>>

> >>> The pci reset in release is breaking the BNX2X PMD. Could we revert

> >>> this

> >> reset and get it included with a solution that works for all in the next

> release?

> >>

> >> Hi Rasesh,

> >>

> >> I am not sure if there is more to do for solution for next releases,

> >> and related to your case, indeed I wasn't expecting a device reset

> >> will take more than five minutes...

> >>

> >> Would you be OK to control the reset via a compile time config

> >> option, which is enabled by default. So you will need to disable it to

> prevent the reset?

> >

> > Hi Ferruh,

> >

> > As I understand, we will have a compile time config option, enabled by

> default, to guard the pci_reset_function() in the igbuio_pci_release(). We

> will disable this config option to prevent the reset when using BNX2X.

> 

> Yep, this is the idea.

> 

> > The controlled reset should work for us.

> 

> If there is no objection, I can send a patch for this.


We are ok as we have at least some way to disable the reset, please send a patch.

Thanks!
-Rasesh
  
Ferruh Yigit Nov. 3, 2017, 12:31 a.m. UTC | #19
On 11/2/2017 11:45 AM, Mody, Rasesh wrote:
>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>> Sent: Thursday, November 02, 2017 11:10 AM
>>
>> On 11/2/2017 10:34 AM, Mody, Rasesh wrote:
>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>> Sent: Thursday, November 02, 2017 1:55 AM
>>>>
>>>> On 11/2/2017 1:03 AM, Mody, Rasesh wrote:
>>>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>>>>> Sent: Wednesday, November 01, 2017 7:12 AM
>>>>>>
>>>>>> On Wed, 1 Nov 2017 06:58:53 +0000
>>>>>> "Mody, Rasesh" <Rasesh.Mody@cavium.com> wrote:
>>>>>>
>>>>>>> Hi Jianfeng and Ferruh,
>>>>>>>
>>>>>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>>>>>> Sent: Thursday, October 26, 2017 5:50 PM
>>>>>>>>
>>>>>>>> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:
>>>>>>>>> Hi Rasesh,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:
>>>>>>>>>> Hi Ferruh,
>>>>>>>>>>
>>>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh
>>>>>>>>>>> Yigit
>>>>>>>>>>> Sent: Friday, October 20, 2017 9:58 AM
>>>>>>>>>>>
>>>>>>>>>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:
>>>>>>>>>>>> Remove device reset during application start, the reset for
>>>>>>>>>>>> application exit still there.
>>>>>>>>>>>>
>>>>>>>>>>>> Reset in open removed because of following comments:
>>>>>>>>>>>> 1- Device reset not completed when VF driver loaded, which
>>>>>>>>>>>> cause VF
>>>>>>>> PMD
>>>>>>>>>>>>     initialization error.
>>>>>>>>>>>>     Adding delay can solve the issue but will increase driver
>>>>>>>>>>>> load
>>>> time.
>>>>>>>>>>>>
>>>>>>>>>>>> 2- Reset will be issues all devices unconditionally, not very
>>>> efficient
>>>>>>>>>>>>     way.
>>>>>>>>>>>>
>>>>>>>>>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and
>>>>>>>>>>>> release of device file")
>>>>>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>>>>>>>>>> Hi Jingjing, Shijith, Gregory, Harish,
>>>>>>>>>>>
>>>>>>>>>>> Can you please test this on top of current master (which has
>>>>>>>>>>> already Jingjin's
>>>>>>>>>>> fix) ?
>>>>>>>>>> The original FLR change during igb_uio open()/release() in
>>>>>>>>>> DPDK17.08 also
>>>>>>>> impacts BNX2X PMD and it exhibits the issues with bare metal
>> testing.
>>>>>>>>>>
>>>>>>>>>> Now, we tested this change for BNX2X PMD using latest dpdk,
>>>>>>>>>> which has
>>>>>>>> this fix where FLR is invoked only in the release().
>>>>>>>>
>>>>>>>> Good to hear this fixed the problem.
>>>>>>>
>>>>>>> Yes, it fixed the issue caused by pci reset during application start.
>>>>>>>
>>>>>>>>
>>>>>>>>>> However, we ran into an issue when trying to reload the testpmd
>>>>>>>> application in quick succession. The pci reset, called during the
>>>>>>>> igb_uio
>>>>>>>> release() operation, is taking longer time and adapter is still
>>>>>>>> doing the FLR when we relaunch the application. We see this
>>>>>>>> behavior with bare metal testing.
>>>>>>>>>
>>>>>>>>> If we don't reset that device, it will continue working which is
>>>>>>>>> a more serious issue IMO.
>>>>>>>>
>>>>>>>> +1
>>>>>>>
>>>>>>> I think, it would better for the individual PMDs to take care of
>>>>>>> the reset
>>>>>> during the application exit.
>>>>>>
>>>>>> That will never be possible. Poll Mode Drivers are userspace
>>>>>> entities and part of the application. If application crashes, there
>>>>>> is no way for PMD to do cleanup, it must be handled by kernel.
>>>>>
>>>>> The pci reset in release is breaking the BNX2X PMD. Could we revert
>>>>> this
>>>> reset and get it included with a solution that works for all in the next
>> release?
>>>>
>>>> Hi Rasesh,
>>>>
>>>> I am not sure if there is more to do for solution for next releases,
>>>> and related to your case, indeed I wasn't expecting a device reset
>>>> will take more than five minutes...
>>>>
>>>> Would you be OK to control the reset via a compile time config
>>>> option, which is enabled by default. So you will need to disable it to
>> prevent the reset?
>>>
>>> Hi Ferruh,
>>>
>>> As I understand, we will have a compile time config option, enabled by
>> default, to guard the pci_reset_function() in the igbuio_pci_release(). We
>> will disable this config option to prevent the reset when using BNX2X.
>>
>> Yep, this is the idea.
>>
>>> The controlled reset should work for us.
>>
>> If there is no objection, I can send a patch for this.
> 
> We are ok as we have at least some way to disable the reset, please send a patch.

Sent http://dpdk.org/dev/patchwork/patch/31143/, can you please test?

> 
> Thanks!
> -Rasesh
>
  
Mody, Rasesh Nov. 3, 2017, 7:18 p.m. UTC | #20
> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> Sent: Thursday, November 02, 2017 5:32 PM

> 

> On 11/2/2017 11:45 AM, Mody, Rasesh wrote:

> >> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> >> Sent: Thursday, November 02, 2017 11:10 AM

> >>

> >> On 11/2/2017 10:34 AM, Mody, Rasesh wrote:

> >>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> >>>> Sent: Thursday, November 02, 2017 1:55 AM

> >>>>

> >>>> On 11/2/2017 1:03 AM, Mody, Rasesh wrote:

> >>>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]

> >>>>>> Sent: Wednesday, November 01, 2017 7:12 AM

> >>>>>>

> >>>>>> On Wed, 1 Nov 2017 06:58:53 +0000 "Mody, Rasesh"

> >>>>>> <Rasesh.Mody@cavium.com> wrote:

> >>>>>>

> >>>>>>> Hi Jianfeng and Ferruh,

> >>>>>>>

> >>>>>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]

> >>>>>>>> Sent: Thursday, October 26, 2017 5:50 PM

> >>>>>>>>

> >>>>>>>> On 10/26/2017 2:28 AM, Tan, Jianfeng wrote:

> >>>>>>>>> Hi Rasesh,

> >>>>>>>>>

> >>>>>>>>>

> >>>>>>>>> On 10/26/2017 7:43 AM, Mody, Rasesh wrote:

> >>>>>>>>>> Hi Ferruh,

> >>>>>>>>>>

> >>>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of

> Ferruh

> >>>>>>>>>>> Yigit

> >>>>>>>>>>> Sent: Friday, October 20, 2017 9:58 AM

> >>>>>>>>>>>

> >>>>>>>>>>> On 10/20/2017 9:55 AM, Ferruh Yigit wrote:

> >>>>>>>>>>>> Remove device reset during application start, the reset for

> >>>>>>>>>>>> application exit still there.

> >>>>>>>>>>>>

> >>>>>>>>>>>> Reset in open removed because of following comments:

> >>>>>>>>>>>> 1- Device reset not completed when VF driver loaded, which

> >>>>>>>>>>>> cause VF

> >>>>>>>> PMD

> >>>>>>>>>>>>     initialization error.

> >>>>>>>>>>>>     Adding delay can solve the issue but will increase

> >>>>>>>>>>>> driver load

> >>>> time.

> >>>>>>>>>>>>

> >>>>>>>>>>>> 2- Reset will be issues all devices unconditionally, not

> >>>>>>>>>>>> very

> >>>> efficient

> >>>>>>>>>>>>     way.

> >>>>>>>>>>>>

> >>>>>>>>>>>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and

> >>>>>>>>>>>> release of device file")

> >>>>>>>>>>>> Cc: stable@dpdk.org

> >>>>>>>>>>>>

> >>>>>>>>>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

> >>>>>>>>>>> Hi Jingjing, Shijith, Gregory, Harish,

> >>>>>>>>>>>

> >>>>>>>>>>> Can you please test this on top of current master (which has

> >>>>>>>>>>> already Jingjin's

> >>>>>>>>>>> fix) ?

> >>>>>>>>>> The original FLR change during igb_uio open()/release() in

> >>>>>>>>>> DPDK17.08 also

> >>>>>>>> impacts BNX2X PMD and it exhibits the issues with bare metal

> >> testing.

> >>>>>>>>>>

> >>>>>>>>>> Now, we tested this change for BNX2X PMD using latest dpdk,

> >>>>>>>>>> which has

> >>>>>>>> this fix where FLR is invoked only in the release().

> >>>>>>>>

> >>>>>>>> Good to hear this fixed the problem.

> >>>>>>>

> >>>>>>> Yes, it fixed the issue caused by pci reset during application start.

> >>>>>>>

> >>>>>>>>

> >>>>>>>>>> However, we ran into an issue when trying to reload the

> >>>>>>>>>> testpmd

> >>>>>>>> application in quick succession. The pci reset, called during

> >>>>>>>> the igb_uio

> >>>>>>>> release() operation, is taking longer time and adapter is still

> >>>>>>>> doing the FLR when we relaunch the application. We see this

> >>>>>>>> behavior with bare metal testing.

> >>>>>>>>>

> >>>>>>>>> If we don't reset that device, it will continue working which

> >>>>>>>>> is a more serious issue IMO.

> >>>>>>>>

> >>>>>>>> +1

> >>>>>>>

> >>>>>>> I think, it would better for the individual PMDs to take care of

> >>>>>>> the reset

> >>>>>> during the application exit.

> >>>>>>

> >>>>>> That will never be possible. Poll Mode Drivers are userspace

> >>>>>> entities and part of the application. If application crashes,

> >>>>>> there is no way for PMD to do cleanup, it must be handled by kernel.

> >>>>>

> >>>>> The pci reset in release is breaking the BNX2X PMD. Could we

> >>>>> revert this

> >>>> reset and get it included with a solution that works for all in the

> >>>> next

> >> release?

> >>>>

> >>>> Hi Rasesh,

> >>>>

> >>>> I am not sure if there is more to do for solution for next

> >>>> releases, and related to your case, indeed I wasn't expecting a

> >>>> device reset will take more than five minutes...

> >>>>

> >>>> Would you be OK to control the reset via a compile time config

> >>>> option, which is enabled by default. So you will need to disable it

> >>>> to

> >> prevent the reset?

> >>>

> >>> Hi Ferruh,

> >>>

> >>> As I understand, we will have a compile time config option, enabled

> >>> by

> >> default, to guard the pci_reset_function() in the

> >> igbuio_pci_release(). We will disable this config option to prevent the

> reset when using BNX2X.

> >>

> >> Yep, this is the idea.

> >>

> >>> The controlled reset should work for us.

> >>

> >> If there is no objection, I can send a patch for this.

> >

> > We are ok as we have at least some way to disable the reset, please send a

> patch.

> 

> Sent http://dpdk.org/dev/patchwork/patch/31143/, can you please test?


The testing of BNX2X looks OK with this patch. However, the solution has following drawbacks:
 - an application will need to be recompiled to have the igb_uio kernel module rebuilt to support bnx2x devices
 - this will break pre-compiled solutions that are provided with an OS such as RHOSP or as part of a pre-compiled VNF

We can live with this temporary solution for now. In the long term, we may have to revisit this.
We are also looking at why bnx2x FLR is taking this long.

Thanks!
-Rasesh
  
Thomas Monjalon Nov. 3, 2017, 7:24 p.m. UTC | #21
03/11/2017 20:18, Mody, Rasesh:
> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
> > On 11/2/2017 11:45 AM, Mody, Rasesh wrote:
> > > From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
> > >> On 11/2/2017 10:34 AM, Mody, Rasesh wrote:
> > > We are ok as we have at least some way to disable the reset, please send a
> > patch.
> > 
> > Sent http://dpdk.org/dev/patchwork/patch/31143/, can you please test?
> 
> The testing of BNX2X looks OK with this patch. However, the solution has following drawbacks:
>  - an application will need to be recompiled to have the igb_uio kernel module rebuilt to support bnx2x devices
>  - this will break pre-compiled solutions that are provided with an OS such as RHOSP or as part of a pre-compiled VNF
> 
> We can live with this temporary solution for now. In the long term, we may have to revisit this.
> We are also looking at why bnx2x FLR is taking this long.

Yes, I think the long term solution should be to fix your PMD
or your firmware.
Please keep us posted when you find the root cause with your device.
Thanks
  
Ferruh Yigit Nov. 3, 2017, 10:20 p.m. UTC | #22
On 11/3/2017 12:24 PM, Thomas Monjalon wrote:
> 03/11/2017 20:18, Mody, Rasesh:
>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>> On 11/2/2017 11:45 AM, Mody, Rasesh wrote:
>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>>> On 11/2/2017 10:34 AM, Mody, Rasesh wrote:
>>>> We are ok as we have at least some way to disable the reset, please send a
>>> patch.
>>>
>>> Sent http://dpdk.org/dev/patchwork/patch/31143/, can you please test?
>>
>> The testing of BNX2X looks OK with this patch. However, the solution has following drawbacks:
>>  - an application will need to be recompiled to have the igb_uio kernel module rebuilt to support bnx2x devices
>>  - this will break pre-compiled solutions that are provided with an OS such as RHOSP or as part of a pre-compiled VNF
>>
>> We can live with this temporary solution for now. In the long term, we may have to revisit this.
>> We are also looking at why bnx2x FLR is taking this long.
> 
> Yes, I think the long term solution should be to fix your PMD
> or your firmware.
> Please keep us posted when you find the root cause with your device.
> Thanks

Lee Roberts suggested doing device specific action instead of build time option.

I send following:
http://dpdk.org/dev/patchwork/patch/31168/

What do you think about this approach?

Meanwhile I will send a more proper patch to discuss on.


Thanks,
ferruh
  
Ferruh Yigit Nov. 3, 2017, 10:39 p.m. UTC | #23
On 11/3/2017 3:20 PM, Ferruh Yigit wrote:
> On 11/3/2017 12:24 PM, Thomas Monjalon wrote:
>> 03/11/2017 20:18, Mody, Rasesh:
>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>> On 11/2/2017 11:45 AM, Mody, Rasesh wrote:
>>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>>>> On 11/2/2017 10:34 AM, Mody, Rasesh wrote:
>>>>> We are ok as we have at least some way to disable the reset, please send a
>>>> patch.
>>>>
>>>> Sent http://dpdk.org/dev/patchwork/patch/31143/, can you please test?
>>>
>>> The testing of BNX2X looks OK with this patch. However, the solution has following drawbacks:
>>>  - an application will need to be recompiled to have the igb_uio kernel module rebuilt to support bnx2x devices
>>>  - this will break pre-compiled solutions that are provided with an OS such as RHOSP or as part of a pre-compiled VNF
>>>
>>> We can live with this temporary solution for now. In the long term, we may have to revisit this.
>>> We are also looking at why bnx2x FLR is taking this long.
>>
>> Yes, I think the long term solution should be to fix your PMD
>> or your firmware.
>> Please keep us posted when you find the root cause with your device.
>> Thanks
> 
> Lee Roberts suggested doing device specific action instead of build time option.
> 
> I send following:
> http://dpdk.org/dev/patchwork/patch/31168/
> 
> What do you think about this approach?
> 
> Meanwhile I will send a more proper patch to discuss on.

http://dpdk.org/dev/patchwork/patch/31169/
  

Patch

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index f7ef82554..fd320d87d 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -336,8 +336,6 @@  igbuio_pci_open(struct uio_info *info, struct inode *inode)
 	struct pci_dev *dev = udev->pdev;
 	int err;
 
-	pci_reset_function(dev);
-
 	/* set bus master, which was cleared by the reset function */
 	pci_set_master(dev);