[dpdk-stable] [dpdk-dev] [PATCH] vfio: fix interrupts race condition

Hyong Youb Kim (hyonkim) hyonkim at cisco.com
Sun Jul 14 07:10:49 CEST 2019


> -----Original Message-----
> From: Thomas Monjalon <thomas at monjalon.net>
> Sent: Thursday, July 11, 2019 6:21 AM
[...]
> Subject: Re: [dpdk-dev] [PATCH] vfio: fix interrupts race condition
> 
> 10/07/2019 14:33, David Marchand:
> > Populating the eventfd in rte_intr_enable in each request to vfio
> > triggers a reconfiguration of the interrupt handler on the kernel side.
> > The problem is that rte_intr_enable is often used to re-enable masked
> > interrupts from drivers interrupt handlers.
> >
> > This reconfiguration leaves a window during which a device could send
> > an interrupt and then the kernel logs this (unsolicited from the kernel
> > point of view) interrupt:
> > [158764.159833] do_IRQ: 9.34 No irq handler for vector
> >
> > VFIO api makes it possible to set the fd at setup time.
> > Make use of this and then we only need to ask for masking/unmasking
> > legacy interrupts and we have nothing to do for MSI/MSIX.
> >
> > "rxtx" interrupts are left untouched but are most likely subject to the
> > same issue.
> >
> > Fixes: 5c782b3928b8 ("vfio: interrupts")
> > Cc: stable at dpdk.org
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1654824
> > Signed-off-by: David Marchand <david.marchand at redhat.com>
> > Tested-by: Shahed Shaikh <shshaikh at marvell.com>
> 
> This is a real bug which should be fixed in this release.
> As the patch is quite big and needs a strong validation,
> I prefer merging it quickly to give a lot of time before
> releasing 19.08-rc2.
> The maintainers of all concerned PMDs are Cc.
> Please make sure the interrupts are still working well with VFIO.
> 
> Applied, thanks
> 

[Apologies in advance if the email format gets messed up. I am forced
to use Outlook for the first time.]

Hi,

This commit breaks MSI-X + rxq interrupts. I think others are seeing
the same error?

sudo ~/dpdk/examples/l3fwd-power/build/l3fwd-power \
-c 0x1e -n 4 -w 0000:1a:00.0 --log-level=pmd,debug -- -p 0x1 -P --config "(0,0,2),(0,1,3),(0,2,4)"
[...]
EAL: Error enabling MSI-X interrupts for fd 35

A rough sequence of events goes like this. The above test is using 3
rxqs (3 interrupts).

1. During probe, pci_vfio_setup_interrupts() runs.
This now does ioctl(VFIO_DEVICE_SET_IRQS) for the 1st efd
(intr_handle->fd).

ioctl does:
- pci_enable_msix(1 vector) because this is the first time enabling
  interrupts.
- request_irq(vector 0)

2. App configs
The app sets port_conf.intr_conf.rxq=1, configs 3 rxqs, etc.

3. rte_eth_dev_start()
PMD calls:
- rte_intr_efd_enable()
  This creates 3 efds (intr_handle->nb_efd = 3).
- rte_intr_enable() => vfio_enable_msix()
  This does ioctl(VFIO_DEVICE_SET_IRQS) for the 3 efds.

The ioctl now needs to call request_irq() for vectors 1, 2, 3 for the
3 new efds. It does not call pci_enable_msix() again, because MSI-X was
already enabled in step 1. Before calling request_irq(), it sees that
only 1 vector was enabled by the earlier pci_enable_msix(), so it fails
with EINVAL.

We would need pci_enable_msix(4 vectors) for this to work
(intr_handle->fd + 3 efds).

Prior to this patch, VFIO_DEVICE_SET_IRQS was done only in
vfio_enable_msix(). So the ioctl ended up doing pci_enable_msix(4
vectors) and request_irq() for each of the 4 efds, which completed
successfully.
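
To make the sequence concrete, here is a toy model in C of the behaviour
described above. It is only a sketch built from that description, not
kernel code: struct toy_vfio_dev and toy_set_irqs() are hypothetical
names, and the only rule modelled is "the first SET_IRQS fixes the
number of enabled vectors; later requests must fit within it".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the per-device MSI-X state kept by vfio-pci.
 * Hypothetical; this is not the kernel's data structure. */
struct toy_vfio_dev {
	unsigned int nvec_enabled; /* 0 means MSI-X is not enabled yet */
};

/* Models ioctl(VFIO_DEVICE_SET_IRQS) with DATA_EVENTFD | ACTION_TRIGGER
 * for vectors [start, start + count). Returns 0 or -EINVAL. */
int toy_set_irqs(struct toy_vfio_dev *dev, unsigned int start,
		 unsigned int count)
{
	assert(dev != NULL);

	if (dev->nvec_enabled == 0) {
		/* First request: enable MSI-X with just enough vectors. */
		dev->nvec_enabled = start + count;
		return 0;
	}

	/* Later requests: the vector count is already fixed, so a
	 * range that reaches past it is rejected. */
	if (start + count > dev->nvec_enabled)
		return -EINVAL;

	return 0;
}
```

In this model, toy_set_irqs(dev, 0, 1) followed by toy_set_irqs(dev, 1,
3) fails with -EINVAL (the post-patch sequence), while a single
toy_set_irqs(dev, 0, 4) on a fresh device succeeds (the pre-patch
sequence).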

I am not an expert in this area. Perhaps we could defer enabling the
1st efd (intr_handle->fd) until the first invocation of
vfio_enable_msix(), so that it knows the app wants to use 4 vectors in
total?
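
As a rough illustration of that idea, the sketch below builds a single
VFIO_DEVICE_SET_IRQS request that covers intr_handle->fd as vector 0
plus all rxq efds, so that the kernel sees the full vector count in one
call. build_msix_enable() is a hypothetical helper, not a DPDK function;
only struct vfio_irq_set and the VFIO_* constants come from
<linux/vfio.h>.

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of DPDK): fill buf with one SET_IRQS
 * request that assigns intr_fd to vector 0 and the nb_efd rxq eventfds
 * to vectors 1..nb_efd. buf must be suitably aligned for
 * struct vfio_irq_set. Returns the request, or NULL if buf is too
 * small. */
struct vfio_irq_set *build_msix_enable(void *buf, size_t buflen,
				       int intr_fd, const int *efds,
				       unsigned int nb_efd)
{
	struct vfio_irq_set *irq_set = buf;
	unsigned int count = 1 + nb_efd;
	size_t len = sizeof(*irq_set) + count * sizeof(int);
	int *fds;

	assert(buf != NULL);
	if (buflen < len)
		return NULL;

	memset(buf, 0, len);
	irq_set->argsz = (__u32)len;
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
			 VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	irq_set->start = 0;
	irq_set->count = count;

	/* The eventfd array follows the fixed header. */
	fds = (int *)irq_set->data;
	fds[0] = intr_fd;
	if (nb_efd > 0)
		memcpy(&fds[1], efds, nb_efd * sizeof(int));

	return irq_set;
}
```

The result would then be passed to ioctl(<vfio device fd>,
VFIO_DEVICE_SET_IRQS, irq_set). With 3 rxqs this asks for 4 vectors at
once, as the pre-patch code did.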

Also, vfio_disable_msix() looks a bit wrong.

        irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
        irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
        irq_set.start = RTE_INTR_VEC_RXTX_OFFSET;
        irq_set.count = intr_handle->nb_efd;

Doesn't this tell vfio-pci to simulate interrupts by triggering the
efds? To free_irq() specific efds, I think we need DATA_EVENTFD with
each fd set to -1:

flags = DATA_EVENTFD | ACTION_TRIGGER
data = [fd(-1), fd(-1), ...]

I have not tested this part myself yet.
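
A minimal sketch of such a request in C, assuming only <linux/vfio.h>.
build_msix_deassign() is a hypothetical helper, not a DPDK function, and
this is untested against a real device:

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of DPDK): fill buf with a SET_IRQS
 * request that de-assigns the eventfds of MSI-X vectors
 * [start, start + count) by passing fd = -1 for each of them.
 * buf must be suitably aligned for struct vfio_irq_set.
 * Returns the request, or NULL if buf is too small. */
struct vfio_irq_set *build_msix_deassign(void *buf, size_t buflen,
					 unsigned int start,
					 unsigned int count)
{
	struct vfio_irq_set *irq_set = buf;
	size_t len = sizeof(*irq_set) + count * sizeof(int);
	unsigned int i;
	int *fds;

	assert(buf != NULL);
	if (buflen < len)
		return NULL;

	memset(buf, 0, len);
	irq_set->argsz = (__u32)len;
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
			 VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	irq_set->start = start;
	irq_set->count = count;

	/* One fd slot per vector; -1 means "de-assign". */
	fds = (int *)irq_set->data;
	for (i = 0; i < count; i++)
		fds[i] = -1;

	return irq_set;
}
```

For vfio_disable_msix() the call would be something like
build_msix_deassign(buf, sizeof(buf), RTE_INTR_VEC_RXTX_OFFSET,
intr_handle->nb_efd), followed by the same VFIO_DEVICE_SET_IRQS ioctl.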

Thanks..
-Hyong


