[dpdk-stable] [dpdk-dev] [PATCH] vfio: fix interrupts race condition

Jerin Jacob Kollanukkaran jerinj at marvell.com
Mon Jul 15 07:35:14 CEST 2019


> > >
> > > This is a real bug which should be fixed in this release.
> > > As the patch is quite big and needs a strong validation, I prefer
> > > merging it quickly to give a lot of time before releasing 19.08-rc2.
> > > The maintainers of all concerned PMDs are Cc.
> > > Please make sure the interrupts are still working well with VFIO.
> > >
> > > Applied, thanks
> > >
> >
> > [Apologies in advance if email format gets messed up. Forced to use
> > outlook for the first time..]
> >
> > Hi,
> >
> > This commit breaks MSI-X + rxq interrupts. I think others are seeing
> > the same error?
> >
> > sudo ~/dpdk/examples/l3fwd-power/build/l3fwd-power \ -c 0x1e -n 4 -w
> > 0000:1a:00.0 --log-level=pmd,debug -- -p 0x1 -P --config
> "(0,0,2),(0,1,3),(0,2,4)"
> > [...]
> > EAL: Error enabling MSI-X interrupts for fd 35
> >
> > A rough sequence of events goes like this. The above test is using 3
> > rxqs (3 interrupts).
> >
> > 1. During probe, pci_vfio_setup_interrupts() runs.
> > This now does ioctl(VFIO_DEVICE_SET_IRQS) for the 1st efd
> > (intr_handle->fd).
> >
> > ioctl does:
> > - pci_enable_msix(1 vector) because this is the first time enabling
> >   interrupts.
> > - request_irq(vector 0)
> >
> > 2. App configs
> > The app sets port_conf.intr_conf.rxq=1, configs 3 rxqs, etc.
> >
> > 3. rte_eth_dev_start()
> > PMD calls:
> > - rte_intr_efd_enable()
> >   This creates 3 efds (intr_handle->nb_efd = 3).
> > - rte_intr_enable() => vfio_enable_msix()
> >   This does ioctl(VFIO_DEVICE_SET_IRQS) for the 3 efds.
> >
> > ioctl now needs to request_irq() for vectors 1, 2, 3 for the 3 new
> > efds. It does not do another pci_enable_msix() as it has been done
> > earlier. Before calling request_irq(), it sees that only 1 vector was
> > enabled in earlier pci_enable_msix(), so it fails with EINVAL.
> >
> > We would need pci_enable_msix(4 vectors) for this to work
> > (intr_handle->fd + 3 efds).
> >
> > Prior to this patch, VFIO_DEVICE_SET_IRQS is done only in
> > vfio_enable_msix(). So, ioctl ends up doing pci_enable_msix(4 vectors)
> > and request_irq() for each of the 4 efds, which completes
> > successfully.
> >
> > Not an expert in this area.. Perhaps, defer enabling 1st efd
> > (intr_handle->fd) until the first invocation of vfio_enable_msix(), so
> > it knows the app wants to use 4 vectors in total?
> >
> > Also, vfio_disable_msix() looks a bit wrong.
> >
> >         irq_set.flags = VFIO_IRQ_SET_DATA_NONE |
> VFIO_IRQ_SET_ACTION_TRIGGER;
> >         irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
> >         irq_set.start = RTE_INTR_VEC_RXTX_OFFSET;
> >         irq_set.count = intr_handle->nb_efd;
> >
> > This tells vfio-pci to simulate interrupts by triggering efds? To
> > free_irq() specific efds, I think we need DATA_EVENTFD and set fd =
> > -1.
> >
> > flags = DATA_EVENTFD | ACTION_TRIGGER
> > data = [fd(-1), fd(-1), ...]
> >
> > I have not tested this part myself yet.


We do see the following failure[1] on octeontx2 PMD with this patch.
We will try to find a fix.

        irq_set = (struct vfio_irq_set *)irq_set_buf;
        irq_set->argsz = len;
        irq_set->start = 0;
        irq_set->count = intr_handle->max_intr;
        irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
                        VFIO_IRQ_SET_ACTION_TRIGGER;
        irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;

        fd_ptr = (int32_t *)&irq_set->data[0];
        for (i = 0; i < irq_set->count; i++)
                fd_ptr[i] = -1;

        rc = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
        if (rc)
                otx2_err("Failed to set irqs vector rc=%d", rc);
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^[1]			
			
				


> 
> Thanks for your detailed report Hyong.
> Would you be able to propose a fix?
> 



More information about the stable mailing list