[dpdk-dev,v2] igb_uio: issue FLR during open and release of device file

Message ID: 1499426031-2664-1-git-send-email-shijith.thotton@caviumnetworks.com
State: Accepted, archived
Delegated to: Thomas Monjalon
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Shijith Thotton July 7, 2017, 11:13 a.m. UTC
Set the UIO info device file operations open and release. Call the PCI
reset function inside open and release to clear device state at start and
end. This behaviour is copied from the vfio_pci kernel module code. With
this patch, PMDs are no longer required to issue an FLR during init and close.

Bus master enable and disable are added in open and release, respectively,
to take care of device DMA.

Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
---
v2 changes:
 - Replaced pci_try_reset_function with pci_reset_function, as the former
   is not available in older kernel versions.

v1 changes:
 - Added pci set master inside open and clear master inside release.
 - Remove obvious comments.

RFC: http://dpdk.org/ml/archives/dev/2017-May/066917.html

 lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 33 +++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
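
The call ordering described in the commit message can be modelled in user space. The following is a minimal sketch, not the kernel code: struct fake_pci_dev and the fake_* helpers are hypothetical stand-ins for struct pci_dev, pci_reset_function(), pci_set_master() and pci_clear_master(), and the stub reset leaves bus mastering off only because the comment in the patch says the reset clears it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct pci_dev; holds only what the sketch needs. */
struct fake_pci_dev {
	int bus_master;   /* 1 while the device may issue DMA */
	int resets;       /* number of function resets issued */
	char log[128];    /* trace of the calls, in order */
};

/* Append one entry to the call trace, never overflowing the buffer. */
static void trace(struct fake_pci_dev *dev, const char *what)
{
	strncat(dev->log, what, sizeof(dev->log) - strlen(dev->log) - 1);
}

/* Stub modelled on pci_reset_function(): assumed to leave bus mastering off. */
static int fake_reset_function(struct fake_pci_dev *dev)
{
	dev->bus_master = 0;
	dev->resets++;
	trace(dev, "reset;");
	return 0;
}

/* Stub modelled on pci_set_master(). */
static void fake_set_master(struct fake_pci_dev *dev)
{
	dev->bus_master = 1;
	trace(dev, "set_master;");
}

/* Stub modelled on pci_clear_master(). */
static void fake_clear_master(struct fake_pci_dev *dev)
{
	dev->bus_master = 0;
	trace(dev, "clear_master;");
}

/* Same order as igbuio_pci_open() in the patch: reset, then enable DMA. */
static int model_open(struct fake_pci_dev *dev)
{
	fake_reset_function(dev);
	fake_set_master(dev);
	return 0;
}

/* Same order as igbuio_pci_release() in the patch: stop DMA, then reset. */
static int model_release(struct fake_pci_dev *dev)
{
	fake_clear_master(dev);
	fake_reset_function(dev);
	return 0;
}
```

Calling model_open() followed by model_release() on a zero-initialised struct fake_pci_dev leaves bus mastering disabled and records two resets, one on each side of the session.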
  

Comments

Ferruh Yigit July 7, 2017, 3:10 p.m. UTC | #1
On 7/7/2017 12:13 PM, Shijith Thotton wrote:
> Set UIO info device file operations open and release. Call pci reset
> function inside open and release to clear device state at start and end.
> Copied this behaviour from vfio_pci kernel module code. With this patch,
> it is not mandatory to issue FLR by PMD's during init and close.
> 
> Bus master enable and disable are added in open and release respectively
> to take care of device DMA.
> 
> Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>

Gregory,

Would you mind testing this one?

Thanks,
ferruh
  
Gregory Etelson July 10, 2017, 3:07 a.m. UTC | #2
Hello Ferruh,

I could not reproduce the server crash with the patch.
However, some tests report that ixgbe_vf_pmd and i40e_vf_pmd
do not receive or transmit frames after a process restart,
although PMD initialization completed successfully.
Is there a way to collect a PF firmware dump for investigation?

Regards,
Gregory

On Friday, 7 July 2017 18:10:40 IDT Ferruh Yigit wrote:
> On 7/7/2017 12:13 PM, Shijith Thotton wrote:
> > Set UIO info device file operations open and release. Call pci reset
> > function inside open and release to clear device state at start and end.
> > Copied this behaviour from vfio_pci kernel module code. With this patch,
> > it is not mandatory to issue FLR by PMD's during init and close.
> > 
> > Bus master enable and disable are added in open and release respectively
> > to take care of device DMA.
> > 
> > Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> 
> Gregory,
> 
> Would you mind testing this one?
> 
> Thanks,
> ferruh
> 
>
  
Jianfeng Tan July 10, 2017, 3:38 a.m. UTC | #3
Hi Thotton,

> -----Original Message-----
> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> Sent: Friday, July 7, 2017 7:14 PM
> To: dev@dpdk.org
> Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
> Tan, Jianfeng; Lu, Wenzhuo
> Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
> 
> Set UIO info device file operations open and release. Call pci reset
> function inside open and release to clear device state at start and end.
> Copied this behaviour from vfio_pci kernel module code. With this patch,
> it is not mandatory to issue FLR by PMD's during init and close.

I'm afraid this will not work for a restarted DPDK process. In the current probe(), we set up the I/O mem and I/O port, and those sysfs files are used by the EAL IGB_UIO initialization code to map the I/O mem and port. After the reset in release(), we will lose those sysfs files at the next open().

Thanks,
Jianfeng

> 
> Bus master enable and disable are added in open and release respectively
> to take care of device DMA.
> 
> Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> ---
> v2 changes:
>  - Replaced pci_try_reset_function with pci_reset_function as it is not
>    available in older kernel versions.
> 
> v1 changes:
>  - Added pci set master inside open and clear master inside release.
>  - Remove obvious comments.
> 
> RFC: http://dpdk.org/ml/archives/dev/2017-May/066917.html
> 
>  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 33
> +++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> index b9d427c..07a19a3 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> @@ -170,6 +170,37 @@ struct rte_uio_pci_dev {
>  	return IRQ_HANDLED;
>  }
> 
> +/**
> + * This gets called while opening uio device file.
> + */
> +static int
> +igbuio_pci_open(struct uio_info *info, struct inode *inode)
> +{
> +	struct rte_uio_pci_dev *udev = info->priv;
> +	struct pci_dev *dev = udev->pdev;
> +
> +	pci_reset_function(dev);
> +
> +	/* set bus master, which was cleared by the reset function */
> +	pci_set_master(dev);
> +
> +	return 0;
> +}
> +
> +static int
> +igbuio_pci_release(struct uio_info *info, struct inode *inode)
> +{
> +	struct rte_uio_pci_dev *udev = info->priv;
> +	struct pci_dev *dev = udev->pdev;
> +
> +	/* stop the device from further DMA */
> +	pci_clear_master(dev);
> +
> +	pci_reset_function(dev);
> +
> +	return 0;
> +}
> +
>  #ifdef CONFIG_XEN_DOM0
>  static int
>  igbuio_dom0_mmap_phys(struct uio_info *info, struct vm_area_struct
> *vma)
> @@ -372,6 +403,8 @@ struct rte_uio_pci_dev {
>  	udev->info.version = "0.1";
>  	udev->info.handler = igbuio_pci_irqhandler;
>  	udev->info.irqcontrol = igbuio_pci_irqcontrol;
> +	udev->info.open = igbuio_pci_open;
> +	udev->info.release = igbuio_pci_release;
>  #ifdef CONFIG_XEN_DOM0
>  	/* check if the driver run on Xen Dom0 */
>  	if (xen_initial_domain())
> --
> 1.8.3.1
  
Shijith Thotton July 10, 2017, 7:10 a.m. UTC | #4
On Mon, Jul 10, 2017 at 03:38:34AM +0000, Tan, Jianfeng wrote:
> Hi Thotton,
> 
> > -----Original Message-----
> > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > Sent: Friday, July 7, 2017 7:14 PM
> > To: dev@dpdk.org
> > Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
> > Tan, Jianfeng; Lu, Wenzhuo
> > Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
> > 
> > Set UIO info device file operations open and release. Call pci reset
> > function inside open and release to clear device state at start and end.
> > Copied this behaviour from vfio_pci kernel module code. With this patch,
> > it is not mandatory to issue FLR by PMD's during init and close.
> 
> I'm afraid this will not work for restarted DPDK process. In current probe(), we set up the I/O mem and I/O port; and those sys files are used by EAL IGB_UIO initialization code to map I/O mem and port. After reset in release(), we will lose those sys files in next open().
> 
> Thanks,
> Jianfeng
> 
>

Hi Jianfeng,

I have tested the patch with LiquidIO VFs in a VM using testpmd and did not see
any issues over multiple runs.

Thanks,
Shijith
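
For reference, a restart test of this kind only needs to open and close the UIO device node repeatedly, since the UIO core calls the driver's open hook on open() and its release hook when the file is closed. Below is a minimal user-space sketch under stated assumptions: uio_cycle is a hypothetical helper name, and the device node (for example /dev/uio0) depends on the system.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open and close the given device node `rounds` times.
 * With the patch applied, each open()/close() pair on an igb_uio node
 * would run igbuio_pci_open() and igbuio_pci_release() once.
 *
 * Returns the number of completed cycles, or -1 if the very first
 * open() fails (for example because the node does not exist).
 */
static int uio_cycle(const char *path, int rounds)
{
	int done = 0;

	for (int i = 0; i < rounds; i++) {
		int fd = open(path, O_RDWR);

		if (fd < 0)
			return done ? done : -1;
		close(fd);
		done++;
	}
	return done;
}
```

A call such as uio_cycle("/dev/uio0", 10) would exercise ten reset cycles on that device; the path is an assumption and has to match the node bound to igb_uio.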
  
Jianfeng Tan July 10, 2017, 9 a.m. UTC | #5
> -----Original Message-----
> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> Sent: Monday, July 10, 2017 3:11 PM
> To: Tan, Jianfeng
> Cc: dev@dpdk.org; Yigit, Ferruh; Gregory Etelson; Thomas Monjalon;
> Stephen Hemminger; Lu, Wenzhuo
> Subject: Re: [PATCH v2] igb_uio: issue FLR during open and release of device
> file
> 
> On Mon, Jul 10, 2017 at 03:38:34AM +0000, Tan, Jianfeng wrote:
> > Hi Thotton,
> >
> > > -----Original Message-----
> > > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > > Sent: Friday, July 7, 2017 7:14 PM
> > > To: dev@dpdk.org
> > > Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen
> Hemminger;
> > > Tan, Jianfeng; Lu, Wenzhuo
> > > Subject: [PATCH v2] igb_uio: issue FLR during open and release of device
> file
> > >
> > > Set UIO info device file operations open and release. Call pci reset
> > > function inside open and release to clear device state at start and end.
> > > Copied this behaviour from vfio_pci kernel module code. With this patch,
> > > it is not mandatory to issue FLR by PMD's during init and close.
> >
> > I'm afraid this will not work for restarted DPDK process. In current probe(),
> > we set up the I/O mem and I/O port; and those sys files are used by EAL
> > IGB_UIO initialization code to map I/O mem and port. After reset in release(),
> > we will lose those sys files in next open().
> >
> > Thanks,
> > Jianfeng
> >
> > >
> > > Bus master enable and disable are added in open and release
> respectively
> > > to take care of device DMA.
> > >
> > > Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> > > ---
> > > v2 changes:
> > >  - Replaced pci_try_reset_function with pci_reset_function as it is not
> > >    available in older kernel versions.
> > >
> > > v1 changes:
> > >  - Added pci set master inside open and clear master inside release.
> > >  - Remove obvious comments.
> > >
> > > RFC: http://dpdk.org/ml/archives/dev/2017-May/066917.html
> > >
> > >  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 33
> > > +++++++++++++++++++++++++++++++
> > >  1 file changed, 33 insertions(+)
> > >
> > > diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > index b9d427c..07a19a3 100644
> > > --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > @@ -170,6 +170,37 @@ struct rte_uio_pci_dev {
> > >  	return IRQ_HANDLED;
> > >  }
> > >
> > > +/**
> > > + * This gets called while opening uio device file.
> > > + */
> > > +static int
> > > +igbuio_pci_open(struct uio_info *info, struct inode *inode)
> > > +{
> > > +	struct rte_uio_pci_dev *udev = info->priv;
> > > +	struct pci_dev *dev = udev->pdev;
> > > +
> > > +	pci_reset_function(dev);
> > > +
> > > +	/* set bus master, which was cleared by the reset function */
> > > +	pci_set_master(dev);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int
> > > +igbuio_pci_release(struct uio_info *info, struct inode *inode)
> > > +{
> > > +	struct rte_uio_pci_dev *udev = info->priv;
> > > +	struct pci_dev *dev = udev->pdev;
> > > +
> > > +	/* stop the device from further DMA */
> > > +	pci_clear_master(dev);
> > > +
> > > +	pci_reset_function(dev);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  #ifdef CONFIG_XEN_DOM0
> > >  static int
> > >  igbuio_dom0_mmap_phys(struct uio_info *info, struct vm_area_struct
> > > *vma)
> > > @@ -372,6 +403,8 @@ struct rte_uio_pci_dev {
> > >  	udev->info.version = "0.1";
> > >  	udev->info.handler = igbuio_pci_irqhandler;
> > >  	udev->info.irqcontrol = igbuio_pci_irqcontrol;
> > > +	udev->info.open = igbuio_pci_open;
> > > +	udev->info.release = igbuio_pci_release;
> > >  #ifdef CONFIG_XEN_DOM0
> > >  	/* check if the driver run on Xen Dom0 */
> > >  	if (xen_initial_domain())
> > > --
> > > 1.8.3.1
> >
> 
> Hi Jianfeng,
> 
> I have tested the patch with LiquidIO VFs in VM using testpmd and could not
> see
> any issue over multiple runs.

I got that: you are using pci_reset_function() instead of pci_disable_device() (the function I was trying). So only one question is left. According to its comment, pci_reset_function() "saves and restores device state over the reset"; is __pci_reset_function() then more appropriate here?

Thanks,
Jianfeng

> 
> Thanks,
> Shijith
  
Shijith Thotton July 10, 2017, 10:42 a.m. UTC | #6
On Mon, Jul 10, 2017 at 09:00:38AM +0000, Tan, Jianfeng wrote:
> 
> 
> > -----Original Message-----
> > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > Sent: Monday, July 10, 2017 3:11 PM
> > To: Tan, Jianfeng
> > Cc: dev@dpdk.org; Yigit, Ferruh; Gregory Etelson; Thomas Monjalon;
> > Stephen Hemminger; Lu, Wenzhuo
> > Subject: Re: [PATCH v2] igb_uio: issue FLR during open and release of device
> > file
> > 
> > On Mon, Jul 10, 2017 at 03:38:34AM +0000, Tan, Jianfeng wrote:
> > > Hi Thotton,
> > >
> > > > -----Original Message-----
> > > > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > > > Sent: Friday, July 7, 2017 7:14 PM
> > > > To: dev@dpdk.org
> > > > Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen
> > Hemminger;
> > > > Tan, Jianfeng; Lu, Wenzhuo
> > > > Subject: [PATCH v2] igb_uio: issue FLR during open and release of device
> > file
> > > >
> > > > Set UIO info device file operations open and release. Call pci reset
> > > > function inside open and release to clear device state at start and end.
> > > > Copied this behaviour from vfio_pci kernel module code. With this patch,
> > > > it is not mandatory to issue FLR by PMD's during init and close.
> > >
> > > I'm afraid this will not work for restarted DPDK process. In current probe(),
> > > we set up the I/O mem and I/O port; and those sys files are used by EAL
> > > IGB_UIO initialization code to map I/O mem and port. After reset in release(),
> > > we will lose those sys files in next open().
> > >
> > > Thanks,
> > > Jianfeng
> > >
> > > >
> > > > Bus master enable and disable are added in open and release
> > respectively
> > > > to take care of device DMA.
> > > >
> > > > Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> > > > ---
> > > > v2 changes:
> > > >  - Replaced pci_try_reset_function with pci_reset_function as it is not
> > > >    available in older kernel versions.
> > > >
> > > > v1 changes:
> > > >  - Added pci set master inside open and clear master inside release.
> > > >  - Remove obvious comments.
> > > >
> > > > RFC: http://dpdk.org/ml/archives/dev/2017-May/066917.html
> > > >
> > > >  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 33
> > > > +++++++++++++++++++++++++++++++
> > > >  1 file changed, 33 insertions(+)
> > > >
> > > > diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > index b9d427c..07a19a3 100644
> > > > --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > @@ -170,6 +170,37 @@ struct rte_uio_pci_dev {
> > > >  	return IRQ_HANDLED;
> > > >  }
> > > >
> > > > +/**
> > > > + * This gets called while opening uio device file.
> > > > + */
> > > > +static int
> > > > +igbuio_pci_open(struct uio_info *info, struct inode *inode)
> > > > +{
> > > > +	struct rte_uio_pci_dev *udev = info->priv;
> > > > +	struct pci_dev *dev = udev->pdev;
> > > > +
> > > > +	pci_reset_function(dev);
> > > > +
> > > > +	/* set bus master, which was cleared by the reset function */
> > > > +	pci_set_master(dev);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static int
> > > > +igbuio_pci_release(struct uio_info *info, struct inode *inode)
> > > > +{
> > > > +	struct rte_uio_pci_dev *udev = info->priv;
> > > > +	struct pci_dev *dev = udev->pdev;
> > > > +
> > > > +	/* stop the device from further DMA */
> > > > +	pci_clear_master(dev);
> > > > +
> > > > +	pci_reset_function(dev);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  #ifdef CONFIG_XEN_DOM0
> > > >  static int
> > > >  igbuio_dom0_mmap_phys(struct uio_info *info, struct vm_area_struct
> > > > *vma)
> > > > @@ -372,6 +403,8 @@ struct rte_uio_pci_dev {
> > > >  	udev->info.version = "0.1";
> > > >  	udev->info.handler = igbuio_pci_irqhandler;
> > > >  	udev->info.irqcontrol = igbuio_pci_irqcontrol;
> > > > +	udev->info.open = igbuio_pci_open;
> > > > +	udev->info.release = igbuio_pci_release;
> > > >  #ifdef CONFIG_XEN_DOM0
> > > >  	/* check if the driver run on Xen Dom0 */
> > > >  	if (xen_initial_domain())
> > > > --
> > > > 1.8.3.1
> > >
> > 
> > Hi Jianfeng,
> > 
> > I have tested the patch with LiquidIO VFs in VM using testpmd and could not
> > see
> > any issue over multiple runs.
> 
> I got that, you are using pci_reset_function() instead of pci_disable_device (the function I was trying). So only one question left, from the comment of pci_reset_function(), it "saves and restores device state over  the reset", then is __pci_reset_function() is more proper here?

Per the comments on __pci_reset_function:
 * Resetting the device will make the contents of PCI configuration space
 * random, so any caller of this must be prepared to reinitialise the
 * device including MSI, bus mastering, BARs, decoding IO and memory spaces,
 * etc.

So I thought pci_reset_function would be the proper choice, as it saves and
restores state. Please correct me if I assumed wrong.

Shijith
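
The difference between the two reset variants can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct fake_cfg, bare_reset() and reset_with_save_restore() are hypothetical names, and the poison value merely stands in for the "random" contents that the kernel comment warns about. The wrapper mirrors the save-and-restore behaviour that the comment on pci_reset_function() describes.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_WORDS 16   /* first 64 bytes of config space as 32-bit words */
#define BAR0_WORD 4    /* BAR0 lives at byte offset 0x10 */

/* Hypothetical model of a function's PCI configuration header. */
struct fake_cfg {
	uint32_t word[CFG_WORDS];
};

/*
 * Models a bare reset in the spirit of __pci_reset_function():
 * afterwards the contents must be treated as unknown. The sketch
 * writes a poison pattern to make that visible.
 */
static void bare_reset(struct fake_cfg *cfg)
{
	for (int i = 0; i < CFG_WORDS; i++)
		cfg->word[i] = 0xdeadbeefu;
}

/*
 * Models the behaviour described for pci_reset_function():
 * save the state, reset, then restore the saved state.
 */
static void reset_with_save_restore(struct fake_cfg *cfg)
{
	struct fake_cfg saved = *cfg;   /* save step */

	bare_reset(cfg);
	*cfg = saved;                   /* restore step */
}
```

In this model a BAR value written before reset_with_save_restore() is still present afterwards, whereas bare_reset() leaves the caller to reprogram it.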
  
Gregory Etelson July 11, 2017, 5:42 a.m. UTC | #7
Hello Ferruh,

Both patches
[1] http://dpdk.org/dev/patchwork/patch/26633/ and 
[2] http://dpdk.org/dev/patchwork/patch/25061/ have failed the same test.
This is rather strange, because [2] has already passed that test numerous times.
I'll recalibrate my cluster and run the test again.
Apart from that, [1] does the job.

Regards,
Gregory

  
Gregory Etelson July 11, 2017, 11:36 a.m. UTC | #8
Hello Ferruh,

All tests have passed successfully.


Regards,
Gregory


  
Jianfeng Tan July 12, 2017, 3:37 a.m. UTC | #9
> -----Original Message-----
> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> Sent: Monday, July 10, 2017 6:43 PM
> To: Tan, Jianfeng
> Cc: dev@dpdk.org; Yigit, Ferruh; Gregory Etelson; Thomas Monjalon;
> Stephen Hemminger; Lu, Wenzhuo
> Subject: Re: [PATCH v2] igb_uio: issue FLR during open and release of device
> file
> 
> On Mon, Jul 10, 2017 at 09:00:38AM +0000, Tan, Jianfeng wrote:
> >
> >
> > > -----Original Message-----
> > > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > > Sent: Monday, July 10, 2017 3:11 PM
> > > To: Tan, Jianfeng
> > > Cc: dev@dpdk.org; Yigit, Ferruh; Gregory Etelson; Thomas Monjalon;
> > > Stephen Hemminger; Lu, Wenzhuo
> > > Subject: Re: [PATCH v2] igb_uio: issue FLR during open and release of
> device
> > > file
> > >
> > > On Mon, Jul 10, 2017 at 03:38:34AM +0000, Tan, Jianfeng wrote:
> > > > Hi Thotton,
> > > >
> > > > > -----Original Message-----
> > > > > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > > > > Sent: Friday, July 7, 2017 7:14 PM
> > > > > To: dev@dpdk.org
> > > > > Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen
> > > Hemminger;
> > > > > Tan, Jianfeng; Lu, Wenzhuo
> > > > > Subject: [PATCH v2] igb_uio: issue FLR during open and release of
> device
> > > file
> > > > >
> > > > > Set UIO info device file operations open and release. Call pci reset
> > > > > function inside open and release to clear device state at start and end.
> > > > > Copied this behaviour from vfio_pci kernel module code. With this
> patch,
> > > > > it is not mandatory to issue FLR by PMD's during init and close.
> > > >
> > > > I'm afraid this will not work for restarted DPDK process. In current probe(),
> > > > we set up the I/O mem and I/O port; and those sys files are used by EAL
> > > > IGB_UIO initialization code to map I/O mem and port. After reset in release(),
> > > > we will lose those sys files in next open().
> > > >
> > > > Thanks,
> > > > Jianfeng
> > > >
> > > > >
> > > > > Bus master enable and disable are added in open and release
> > > respectively
> > > > > to take care of device DMA.
> > > > >
> > > > > Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> > > > > ---
> > > > > v2 changes:
> > > > >  - Replaced pci_try_reset_function with pci_reset_function as it is not
> > > > >    available in older kernel versions.
> > > > >
> > > > > v1 changes:
> > > > >  - Added pci set master inside open and clear master inside release.
> > > > >  - Remove obvious comments.
> > > > >
> > > > > RFC: http://dpdk.org/ml/archives/dev/2017-May/066917.html
> > > > >
> > > > >  lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 33
> > > > > +++++++++++++++++++++++++++++++
> > > > >  1 file changed, 33 insertions(+)
> > > > >
> > > > > diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > > b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > > index b9d427c..07a19a3 100644
> > > > > --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > > +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> > > > > @@ -170,6 +170,37 @@ struct rte_uio_pci_dev {
> > > > >  	return IRQ_HANDLED;
> > > > >  }
> > > > >
> > > > > +/**
> > > > > + * This gets called while opening uio device file.
> > > > > + */
> > > > > +static int
> > > > > +igbuio_pci_open(struct uio_info *info, struct inode *inode)
> > > > > +{
> > > > > +	struct rte_uio_pci_dev *udev = info->priv;
> > > > > +	struct pci_dev *dev = udev->pdev;
> > > > > +
> > > > > +	pci_reset_function(dev);
> > > > > +
> > > > > +	/* set bus master, which was cleared by the reset function */
> > > > > +	pci_set_master(dev);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int
> > > > > +igbuio_pci_release(struct uio_info *info, struct inode *inode)
> > > > > +{
> > > > > +	struct rte_uio_pci_dev *udev = info->priv;
> > > > > +	struct pci_dev *dev = udev->pdev;
> > > > > +
> > > > > +	/* stop the device from further DMA */
> > > > > +	pci_clear_master(dev);
> > > > > +
> > > > > +	pci_reset_function(dev);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > >  #ifdef CONFIG_XEN_DOM0
> > > > >  static int
> > > > >  igbuio_dom0_mmap_phys(struct uio_info *info, struct
> vm_area_struct
> > > > > *vma)
> > > > > @@ -372,6 +403,8 @@ struct rte_uio_pci_dev {
> > > > >  	udev->info.version = "0.1";
> > > > >  	udev->info.handler = igbuio_pci_irqhandler;
> > > > >  	udev->info.irqcontrol = igbuio_pci_irqcontrol;
> > > > > +	udev->info.open = igbuio_pci_open;
> > > > > +	udev->info.release = igbuio_pci_release;
> > > > >  #ifdef CONFIG_XEN_DOM0
> > > > >  	/* check if the driver run on Xen Dom0 */
> > > > >  	if (xen_initial_domain())
> > > > > --
> > > > > 1.8.3.1
> > > >
> > >
> > > Hi Jianfeng,
> > >
> > > I have tested the patch with LiquidIO VFs in VM using testpmd and could not
> > > see any issue over multiple runs.
> >
> > I got that, you are using pci_reset_function() instead of pci_disable_device
> > (the function I was trying). So only one question left, from the comment of
> > pci_reset_function(), it "saves and restores device state over  the reset",
> > then is __pci_reset_function() is more proper here?
> 
> Per comments of __pci_reset_function:
>  * Resetting the device will make the contents of PCI configuration space
>  * random, so any caller of this must be prepared to reinitialise the
>  * device including MSI, bus mastering, BARs, decoding IO and memory spaces,
>  * etc.
> 
> So thought, pci_reset_function would be proper as it saves and restores
> state.

Makes sense. I was thinking that the device would be re-initialized anyway, so restoring the state would not be necessary. But we cannot leave the BARs random, as the device cannot manage the physical memory layout.

Testing on virtio devices shows that the function works well. And this avoids the compatibility issue (as my patch).

Great work!

Thanks,
Jianfeng
  
Jianfeng Tan July 12, 2017, 3:40 a.m. UTC | #10
> -----Original Message-----
> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> Sent: Friday, July 7, 2017 7:14 PM
> To: dev@dpdk.org
> Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
> Tan, Jianfeng; Lu, Wenzhuo
> Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
> 
> Set UIO info device file operations open and release. Call pci reset
> function inside open and release to clear device state at start and end.
> Copied this behaviour from vfio_pci kernel module code. With this patch,
> it is not mandatory to issue FLR by PMD's during init and close.
> 
> Bus master enable and disable are added in open and release respectively
> to take care of device DMA.
> 
> Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>

Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
  
Gregory Etelson July 16, 2017, 4:22 a.m. UTC | #11
Hello Shijith,

Please add the patch to the uio_pci_generic.c file in the Linux kernel.
We experience similar faults with NVMe devices.

On Wednesday, 12 July 2017 06:40:55 IDT Tan, Jianfeng wrote:
> 
> > -----Original Message-----
> > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > Sent: Friday, July 7, 2017 7:14 PM
> > To: dev@dpdk.org
> > Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
> > Tan, Jianfeng; Lu, Wenzhuo
> > Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
> > 
> > Set UIO info device file operations open and release. Call pci reset
> > function inside open and release to clear device state at start and end.
> > Copied this behaviour from vfio_pci kernel module code. With this patch,
> > it is not mandatory to issue FLR by PMD's during init and close.
> > 
> > Bus master enable and disable are added in open and release respectively
> > to take care of device DMA.
> > 
> > Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> 
> Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
  
Ferruh Yigit July 19, 2017, 1:32 p.m. UTC | #12
On 7/12/2017 4:40 AM, Tan, Jianfeng wrote:
> 
> 
>> -----Original Message-----
>> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
>> Sent: Friday, July 7, 2017 7:14 PM
>> To: dev@dpdk.org
>> Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
>> Tan, Jianfeng; Lu, Wenzhuo
>> Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
>>
>> Set UIO info device file operations open and release. Call pci reset
>> function inside open and release to clear device state at start and end.
>> Copied this behaviour from vfio_pci kernel module code. With this patch,
>> it is not mandatory to issue FLR by PMD's during init and close.
>>
>> Bus master enable and disable are added in open and release respectively
>> to take care of device DMA.
>>
>> Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> 
> Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
  
Gregory Etelson July 19, 2017, 4:19 p.m. UTC | #13
On Wednesday, 19 July 2017 16:32:34 IDT Ferruh Yigit wrote:
> On 7/12/2017 4:40 AM, Tan, Jianfeng wrote:
> > 
> > 
> >> -----Original Message-----
> >> From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> >> Sent: Friday, July 7, 2017 7:14 PM
> >> To: dev@dpdk.org
> >> Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
> >> Tan, Jianfeng; Lu, Wenzhuo
> >> Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
> >>
> >> Set UIO info device file operations open and release. Call pci reset
> >> function inside open and release to clear device state at start and end.
> >> Copied this behaviour from vfio_pci kernel module code. With this patch,
> >> it is not mandatory to issue FLR by PMD's during init and close.
> >>
> >> Bus master enable and disable are added in open and release respectively
> >> to take care of device DMA.
> >>
> >> Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> > 
> > Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 

Acked-by: Gregory Etelson <gregory@weka.io>
  
Thomas Monjalon July 20, 2017, 10:36 p.m. UTC | #14
> > >> Set UIO info device file operations open and release. Call pci reset
> > >> function inside open and release to clear device state at start and end.
> > >> Copied this behaviour from vfio_pci kernel module code. With this patch,
> > >> it is not mandatory to issue FLR by PMD's during init and close.
> > >>
> > >> Bus master enable and disable are added in open and release respectively
> > >> to take care of device DMA.
> > >>
> > >> Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> > > 
> > > Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > 
> > Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Acked-by: Gregory Etelson <gregory@weka.io>

Applied, thanks
  

Patch

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index b9d427c..07a19a3 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -170,6 +170,37 @@  struct rte_uio_pci_dev {
 	return IRQ_HANDLED;
 }
 
+/**
+ * This gets called while opening uio device file.
+ */
+static int
+igbuio_pci_open(struct uio_info *info, struct inode *inode)
+{
+	struct rte_uio_pci_dev *udev = info->priv;
+	struct pci_dev *dev = udev->pdev;
+
+	pci_reset_function(dev);
+
+	/* set bus master, which was cleared by the reset function */
+	pci_set_master(dev);
+
+	return 0;
+}
+
+static int
+igbuio_pci_release(struct uio_info *info, struct inode *inode)
+{
+	struct rte_uio_pci_dev *udev = info->priv;
+	struct pci_dev *dev = udev->pdev;
+
+	/* stop the device from further DMA */
+	pci_clear_master(dev);
+
+	pci_reset_function(dev);
+
+	return 0;
+}
+
 #ifdef CONFIG_XEN_DOM0
 static int
 igbuio_dom0_mmap_phys(struct uio_info *info, struct vm_area_struct *vma)
@@ -372,6 +403,8 @@  struct rte_uio_pci_dev {
 	udev->info.version = "0.1";
 	udev->info.handler = igbuio_pci_irqhandler;
 	udev->info.irqcontrol = igbuio_pci_irqcontrol;
+	udev->info.open = igbuio_pci_open;
+	udev->info.release = igbuio_pci_release;
 #ifdef CONFIG_XEN_DOM0
 	/* check if the driver run on Xen Dom0 */
 	if (xen_initial_domain())