[dpdk-dev] [PATCH] vfio: fix device unplug when several devices per vfio group

Jerin Jacob jerin.jacob at caviumnetworks.com
Tue May 9 06:13:53 CEST 2017


-----Original Message-----
> Date: Mon, 8 May 2017 17:44:37 +0100
> From: Alejandro Lucero <alejandro.lucero at netronome.com>
> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> Cc: Thomas Monjalon <thomas at monjalon.net>, dev <dev at dpdk.org>, "Burakov,
>  Anatoly" <anatoly.burakov at intel.com>
> Subject: Re: [dpdk-dev] [PATCH] vfio: fix device unplug when several
>  devices per vfio group
> 
> Hi Jerin,
> 
> On Mon, May 8, 2017 at 4:20 PM, Jerin Jacob <jerin.jacob at caviumnetworks.com>
> wrote:
> 
> > -----Original Message-----
> > > Date: Sun, 30 Apr 2017 19:29:49 +0200
> > > From: Thomas Monjalon <thomas at monjalon.net>
> > > To: Alejandro Lucero <alejandro.lucero at netronome.com>
> > > Cc: dev at dpdk.org, "Burakov, Anatoly" <anatoly.burakov at intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH] vfio: fix device unplug when several
> > >  devices per vfio group
> > >
> > > 28/04/2017 15:25, Burakov, Anatoly:
> > > > From: Alejandro Lucero [mailto:alejandro.lucero at netronome.com]
> > > > > > VFIO allows a secure way of assigning devices to user space, and
> > > > > > devices which cannot be isolated from other ones are placed in the
> > > > > > same VFIO group. Releasing or unplugging a device should be aware of
> > > > > > remaining devices in the same group, to avoid closing such a group.
> > > > >
> > > > > Fixes: 94c0776b1bad ("vfio: support hotplug")
> > > > >
> > > > > Signed-off-by: Alejandro Lucero <alejandro.lucero at netronome.com>
> > > >
> > > > > I have tested this on my setup on an old kernel with multiple
> > > > > attach/detaches, and it works (whereas it fails without this patch).
> > > >
> > > > Acked-by: Anatoly  Burakov <anatoly.burakov at intel.com>
> > >
> > > Applied, thanks
> >
> > This patch creates an issue when a large number of PCIe devices are
> > connected to the system.
> > Found it through git bisect.
> >
> > The issue is that vfio_group_fd goes beyond 64 (VFIO_MAX_GROUPS), so the
> > following code writes to the wrong memory and subsequently creates
> > issues in vfio mapping or such.
> >
> > vfio_cfg.vfio_groups[vfio_group_fd].devices++;
> >
> > I can increase VFIO_MAX_GROUPS, but I think it is not the correct fix, as
> > vfio_group_fd is generated by the open system call.
> >
> > I added some prints to the code for debugging. Please find the output below.
> > Any thoughts from VFIO experts?
> >
> >
> That is a silly but serious bug. We are using the file descriptor as the
> index for updating devices counter of a vfio group structure internal to
> DPDK VFIO code. We should be using the vfio_group that file descriptor is
> registered with.
> 
> I will send a fix where vfio_group_device_get/put/count functions are
> implemented which take the file descriptor as a parameter and then go
> through the vfio_group array for working with the right one.
> 
> Thomas, is this fix still in time for 17.05? I will send the patch today, but
> I can only test it against a system with the "normal" case for VFIO device
> groups. Maybe Jerin and/or Anatoly can test it against the other case.


Thanks Alejandro for the patch.
I tested your patch on the failing setup, and it works fine with the patch
applied.

IMO, for v17.05, this fix must go in, or we need to revert the original offending
patch (a9c349e3a100 ("vfio: fix device unplug when several devices per
group")), as it breaks DPDK running on a system with several PCIe devices connected.
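
For readers following the thread, here is a minimal sketch of the lookup
Alejandro describes above: the group fd is treated as a key to search for,
never as an array index. This is not the actual patch. The function names
vfio_group_device_get/put/count come from his mail, but their signatures,
the struct layout, and the helpers vfio_groups_init, vfio_group_find and
vfio_group_register are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define VFIO_MAX_GROUPS 64

/* Hypothetical per-group bookkeeping; the real DPDK struct may differ. */
struct vfio_group {
	int fd;      /* fd returned by open(), or -1 when the slot is free */
	int devices; /* number of devices currently using this group */
};

static struct vfio_group vfio_groups[VFIO_MAX_GROUPS];

/* Mark every slot as free. */
static void vfio_groups_init(void)
{
	for (int i = 0; i < VFIO_MAX_GROUPS; i++) {
		vfio_groups[i].fd = -1;
		vfio_groups[i].devices = 0;
	}
}

/* Walk the array and return the slot registered with this fd, if any. */
static struct vfio_group *vfio_group_find(int fd)
{
	if (fd < 0)
		return NULL;
	for (int i = 0; i < VFIO_MAX_GROUPS; i++)
		if (vfio_groups[i].fd == fd)
			return &vfio_groups[i];
	return NULL;
}

/* Store a newly opened group fd in the first free slot. */
static int vfio_group_register(int fd)
{
	if (fd < 0 || vfio_group_find(fd) != NULL)
		return -1;
	for (int i = 0; i < VFIO_MAX_GROUPS; i++) {
		if (vfio_groups[i].fd == -1) {
			vfio_groups[i].fd = fd;
			vfio_groups[i].devices = 0;
			return 0;
		}
	}
	return -1; /* table full */
}

/* Increment the device count; returns the new count or -1 if unknown fd. */
static int vfio_group_device_get(int fd)
{
	struct vfio_group *g = vfio_group_find(fd);

	if (g == NULL)
		return -1;
	return ++g->devices;
}

/* Decrement the device count; returns the new count or -1 on error. */
static int vfio_group_device_put(int fd)
{
	struct vfio_group *g = vfio_group_find(fd);

	if (g == NULL || g->devices == 0)
		return -1;
	return --g->devices;
}

/* Return the current device count, or -1 if the fd is not registered. */
static int vfio_group_device_count(int fd)
{
	struct vfio_group *g = vfio_group_find(fd);

	if (g == NULL)
		return -1;
	assert(g->devices >= 0);
	return g->devices;
}
```

With this shape, an fd value above VFIO_MAX_GROUPS (which is what open()
returns once enough descriptors are in use) still maps to a valid slot,
and an unknown fd is rejected instead of writing past the array.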

