eal: Fixes VFIO/sysfs race condition

Message ID 20200406222323.18609-1-michael.haeuptle@hpe.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Series eal: Fixes VFIO/sysfs race condition

Checks

Context                       Check    Description
ci/checkpatch                 warning  coding style issues
ci/iol-intel-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/iol-testing                success  Testing PASS
ci/travis-robot               success  Travis build: passed
ci/Intel-compilation          success  Compilation OK

Commit Message

Michael Haeuptle April 6, 2020, 10:23 p.m. UTC
This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes the device files in sysfs are not created
instantaneously, which causes vfio_open_group_fd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")
Cc: xiao.w.wang@intel.com
Cc: stable@dpdk.org

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
---
 lib/librte_eal/linux/eal_vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
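
The behavior described in the commit message can be illustrated with a small standalone sketch. It is not the DPDK implementation: `open_group_fd_stub`, `struct group_entry`, `get_group_fd_checked`, and the choice of group 7 as the "not ready yet" case are all hypothetical names and values invented for illustration. The only assumption taken from the commit message is that the opener can return 0 when the group's device file is not yet present, and that this value must not be stored as a file descriptor.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical table slot; stands in for an entry of the vfio_group table. */
struct group_entry {
	int group_num;
	int fd;
};

/*
 * Hypothetical stand-in for vfio_open_group_fd():
 *   < 0  hard error
 *   == 0 group device file not present yet (assumed sysfs race)
 *   > 0  usable file descriptor
 */
static int open_group_fd_stub(int iommu_group_num)
{
	if (iommu_group_num < 0)
		return -1;
	if (iommu_group_num == 7) /* pretend group 7 is still being created */
		return 0;
	return 100 + iommu_group_num;
}

/*
 * Store the fd only when it is strictly positive. With a "< 0" check,
 * the 0 returned for group 7 would be recorded as if it were a valid fd.
 */
static int get_group_fd_checked(struct group_entry *slot, int iommu_group_num)
{
	int fd;

	assert(slot != NULL);

	fd = open_group_fd_stub(iommu_group_num);
	if (fd <= 0) {
		fprintf(stderr, "Failed to open group %d\n", iommu_group_num);
		return -1;
	}

	slot->group_num = iommu_group_num;
	slot->fd = fd;
	return fd;
}
```

Calling `get_group_fd_checked(&slot, 7)` in this sketch returns -1 and leaves `slot` untouched, so the caller sees a failure rather than a table entry holding descriptor 0.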
  

Comments

David Marchand April 21, 2020, 4:19 p.m. UTC | #1
On Tue, Apr 7, 2020 at 12:23 AM Michael Haeuptle
<michael.haeuptle@hpe.com> wrote:
>
> This fix treats a 0 return value from vfio_open_group_fd
> in vfio_get_group_fd as the intended error condition instead
> of putting an incorrect 0 file descriptor in the vfio_group table.
>
> Sometimes the device files in sysfs are not created
> instantaneously, which causes vfio_open_group_fd to return 0.
> This has been observed when hot removing/adding multiple
> NVMe devices (>=4).
>
> Fixes: 340b7bb8d583 ("vfio: extend data structure for multi container")
> Cc: stable@dpdk.org
>
> Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>

Please submit with a revision next time.
Added back acks from first revision of the patch.

Applied, thanks.
  

Patch

diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..1979f6fdd 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -379,7 +379,7 @@  vfio_get_group_fd(struct vfio_config *vfio_cfg,
 	}
 
 	vfio_group_fd = vfio_open_group_fd(iommu_group_num);
-	if (vfio_group_fd < 0) {
+	if (vfio_group_fd <= 0) {
 		RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num);
 		return -1;
 	}