eal: Fixes VFIO/sysfs race condition

Message ID 20200331165657.29368-1-michael.haeuptle@hpe.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Series eal: Fixes VFIO/sysfs race condition

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/travis-robot success Travis build: passed

Commit Message

Michael Haeuptle March 31, 2020, 4:56 p.m. UTC
  This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes, the creation of device files in sysfs is not
instantaneous, causing vfio_open_group_fd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
---
 lib/librte_eal/linux/eal_vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Stojaczyk, Dariusz April 1, 2020, 8:50 a.m. UTC | #1
> From: dev <dev-bounces@dpdk.org> On Behalf Of Michael Haeuptle
> Sent: Tuesday, March 31, 2020 6:57 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] eal: Fixes VFIO/sysfs race condition
> 
> This fix treats a 0 return value from vfio_open_group_fd
> in vfio_get_group_fd as the intended error condition instead
> of putting an incorrect 0 file descriptor in the vfio_group table.
> 
> Sometimes, the creation of device files in sysfs is not
> instantaneous, causing vfio_open_group_fd to return 0.
> This has been observed when hot removing/adding multiple
> NVMe devices (>=4).
> 
> Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> ---

Acked-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
  
Burakov, Anatoly April 2, 2020, 10:10 a.m. UTC | #2
On 31-Mar-20 5:56 PM, Michael Haeuptle wrote:
> This fix treats a 0 return value from vfio_open_group_fd
> in vfio_get_group_fd as the intended error condition instead
> of putting an incorrect 0 file descriptor in the vfio_group table.
> 
> Sometimes, the creation of device files in sysfs is not
> instantaneous, causing vfio_open_group_fd to return 0.
> This has been observed when hot removing/adding multiple
> NVMe devices (>=4).
> 
> Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> ---
>   lib/librte_eal/linux/eal_vfio.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
> index 4502aefed..1979f6fdd 100644
> --- a/lib/librte_eal/linux/eal_vfio.c
> +++ b/lib/librte_eal/linux/eal_vfio.c
> @@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
>   	}
>   
>   	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
> -	if (vfio_group_fd < 0) {
> +	if (vfio_group_fd <= 0) {
>   		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
>   		return -1;
>   	}
> 

If it's returning an invalid value, is that a kernel bug?

I mean, looks fine to me, so

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
David Marchand April 6, 2020, 1:25 p.m. UTC | #3
On Thu, Apr 2, 2020 at 12:11 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 31-Mar-20 5:56 PM, Michael Haeuptle wrote:
> > This fix treats a 0 return value from vfio_open_group_fd
> > in vfio_get_group_fd as the intended error condition instead
> > of putting an incorrect 0 file descriptor in the vfio_group table.
> >
> > Sometimes, the creation of device files in sysfs is not
> > instantaneous, causing vfio_open_group_fd to return 0.
> > This has been observed when hot removing/adding multiple
> > NVMe devices (>=4).
> >
> > Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
> > ---
> >   lib/librte_eal/linux/eal_vfio.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
> > index 4502aefed..1979f6fdd 100644
> > --- a/lib/librte_eal/linux/eal_vfio.c
> > +++ b/lib/librte_eal/linux/eal_vfio.c
> > @@ -379,7 +379,7 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg,
> >       }
> >
> >       vfio_group_fd = vfio_open_group_fd(iommu_group_num);
> > -     if (vfio_group_fd < 0) {
> > +     if (vfio_group_fd <= 0) {
> >               RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
> >               return -1;
> >       }
> >
>
> If it's returning an invalid value, is that a kernel bug?
>
> I mean, looks fine to me, so
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

We are missing a fixes line.
Does this deserve a backport?
  
Michael Haeuptle April 6, 2020, 7:15 p.m. UTC | #4
Hi Dave,

This is my first submission... how do I get the fixes reference?

Backporting it to 19.11 would be great. This issue shows up in SPDK 20.01, which uses 19.11.

-- Michael

  
David Marchand April 6, 2020, 8:08 p.m. UTC | #5
On Mon, Apr 6, 2020 at 9:15 PM Haeuptle, Michael
<michael.haeuptle@hpe.com> wrote:
> This is my first submission... how do I get the fixes reference?

I recommend reading: https://doc.dpdk.org/guides/contributing/patches.html
But let me help you this time.

For the missing Fixes: tag, we want to get the sha1 of the commit that
first exhibited the issue.
For this you must either use git bisect (not always possible) or go
back in the git history and find the culprit commit.
git log -p --follow and git blame can help identify it (just be
cautious and skip purely cosmetic changes if any).

In your case, I *think* the commit is
340b7bb8d583661369a9491ade63fe2407e85267, since it introduced the
vfio_open_group_fd() function and the check on its return value.
I did not see changes between this commit and origin/master with
respect to vfio_open_group_fd() return values.
It might be older than this; the vfio maintainer can help confirm.
Anatoly, can you double check :-) ?


For the formatting of the Fixes: tag, you have a git macro in
https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-body
Which then gives:
$ git fixline 340b7bb8d58
Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")

>
> Back porting it to 19.11 would be great. This issue shows up in SPDK 20.01, which uses 19.11.

For the criteria on what should be backported:
https://doc.dpdk.org/guides/contributing/stable.html#what-changes-should-be-backported.

You can annotate a patch with a version, but it will be informational only.
What is important is to Cc: stable@dpdk.org so that stable maintainers
see this fix and consider it for backport in the currently maintained
branches (atm 18.11 and 19.11).
The Fixes: line already gives an idea of which branches are concerned.
The stable maintainers have scripts to catch fixes of interest for
them (+ those scripts also check if a change is a fix of a previous
fix that got backported itself).

Here, if the above Fixes: is correct, it means this fix is a
candidate for backport to 18.11 and 19.11.
  

Patch

diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..1979f6fdd 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -379,7 +379,7 @@  vfio_get_group_fd(struct vfio_config *vfio_cfg,
 	}
 
 	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
-	if (vfio_group_fd < 0) {
+	if (vfio_group_fd <= 0) {
 		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
 		return -1;
 	}
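
The effect of changing the check from "< 0" to "<= 0" can be illustrated
with a minimal standalone sketch. This is not the DPDK code: the stub
openers (open_missing, open_failed, open_ok) and the two wrapper functions
are hypothetical names made up for illustration, and the sketch assumes,
as the commit message describes, that the opener returns 0 when the group
file is not there yet, a negative value on a hard error, and a positive
descriptor on success.

```c
#include <stdio.h>

/* Hypothetical stand-ins for vfio_open_group_fd(); not DPDK functions. */
static int open_missing(int group) { (void)group; return 0; }  /* group file not present yet */
static int open_failed(int group)  { (void)group; return -1; } /* hard error */
static int open_ok(int group)      { (void)group; return 42; } /* pretend descriptor */

/* Check as it was before the patch: a 0 return slips through to the caller. */
static int get_group_fd_before(int (*open_group)(int), int iommu_group_num)
{
	int fd = open_group(iommu_group_num);

	if (fd < 0) {
		fprintf(stderr, "Failed to open group %d\n", iommu_group_num);
		return -1;
	}
	return fd; /* may be 0, which the caller would store as if it were valid */
}

/* Check as it is after the patch: 0 is reported as a failure too. */
static int get_group_fd_after(int (*open_group)(int), int iommu_group_num)
{
	int fd = open_group(iommu_group_num);

	if (fd <= 0) {
		fprintf(stderr, "Failed to open group %d\n", iommu_group_num);
		return -1;
	}
	return fd;
}
```

With these stubs, get_group_fd_before(open_missing, 7) hands back 0 as if
it were a usable descriptor, while get_group_fd_after(open_missing, 7)
returns -1. Both variants behave identically for the hard-error and
success cases.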