Bug 361 - device reset handling with igb_uio
Status: CONFIRMED
Alias: None
Product: DPDK
Classification: Unclassified
Component: ethdev
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assignee: Ferruh YIGIT
URL:
Depends on:
Blocks:
 
Reported: 2019-11-11 05:58 CET by Santosh
Modified: 2020-08-05 09:47 CEST



Attachments
msi-x vector allocation for igb_uio after device reset (3.81 KB, patch)
2019-11-11 05:58 CET, Santosh

Description Santosh 2019-11-11 05:58:23 CET
Created attachment 73
msi-x vector allocation for igb_uio after device reset

Hi,

I have a question about igb_uio.

From the function call traces below, the vfio-pci module frees and reallocates the MSI-X
vector table as part of interrupt disable/enable, whereas the igb_uio module only
masks and unmasks the MSI-X interrupt.
Does this mean that, when using igb_uio, the device cannot undergo a reset that clears
the MSI-X vector table?
How should a device reset be handled with igb_uio?

igb_uio:
rte_intr_disable->uio_intr_disable->igbuio_pci_irqcontrol->pci_msi_mask_irq
rte_intr_enable->uio_intr_enable->igbuio_pci_irqcontrol->pci_msi_unmask_irq

igbuio_pci_open->igbuio_pci_enable_interrupts->pci_alloc_irq_vectors/request_irq
igbuio_pci_release->igbuio_pci_disable_interrupts->free_irq->pci_free_irq_vectors

vfio-pci:
rte_intr_disable->vfio_disable_msix->vfio_pci_ioctl->vfio_msi_disable->pci_free_irq_vectors
rte_intr_enable->vfio_enable_msix->vfio_pci_ioctl->vfio_msi_enable->pci_alloc_irq_vectors/vfio_msi_set_vector_signal->request_irq
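
For context on the igb_uio path above: from user space, the enable/disable step comes down to writing a 32-bit 1 or 0 to the UIO device file, which the kernel's uio_write then hands to the driver's irqcontrol callback. The following is a minimal sketch of that write, assuming an already opened /dev/uioX descriptor. The helper name uio_irq_set is hypothetical and is not a DPDK or kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the UIO driver to enable (on = 1) or
 * disable (on = 0) interrupts by writing a 32-bit value to an
 * already opened /dev/uioX file descriptor.
 * Returns 0 on success, -1 on error or short write.
 */
static int uio_irq_set(int fd, int32_t on)
{
    ssize_t n;

    assert(fd >= 0);                 /* caller must pass an open descriptor */
    n = write(fd, &on, sizeof(on));  /* UIO expects exactly 4 bytes */
    return n == (ssize_t)sizeof(on) ? 0 : -1;
}
```

With igb_uio, this write only reaches the mask/unmask path shown in the trace. It does not reallocate vectors, which is the difference from vfio-pci that the question is about.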

I am using the attached hack to work around this. What is the correct way to handle it?
It is a hack because I assign "udev->info.irq = -1" when the interrupt is disabled, to prevent uio_write from exiting early.

kernel/linux/igb_uio/igb_uio.c: igbuio_pci_irqcontrol
    igbuio_pci_disable_interrupts(udev);
    udev->info.irq = -1; /* assign -1, not 0 */

Basically, if irq is 0, uio_write returns -EIO before it reaches irqcontrol, so igbuio_pci_enable_interrupts is never called to re-enable the interrupts.

drivers/uio/uio.c: uio_write
        if (!idev->info->irq) {
                retval = -EIO;
                goto out;
        }
        if (!idev->info->irqcontrol) {
                retval = -ENOSYS;
                goto out;
        }
        retval = idev->info->irqcontrol(idev->info, irq_on);
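
To make the effect of that check concrete, here is a small user-space model of the excerpt above. It is not kernel code. The names model_info, model_irqcontrol, model_uio_write, and model_demo are hypothetical stand-ins, and only the order of the two checks is taken from the uio_write excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct uio_info fields that uio_write checks. */
struct model_info {
    long irq;                                           /* 0 fails the first check */
    int (*irqcontrol)(struct model_info *info, int on); /* driver callback */
    int calls;                                          /* model only: callback invocations */
};

/* Model callback: only records that it was reached. */
static int model_irqcontrol(struct model_info *info, int on)
{
    (void)on;
    info->calls++;
    return 0;
}

/* Same order of checks as the uio_write excerpt. */
static int model_uio_write(struct model_info *info, int on)
{
    if (!info->irq)
        return -EIO;
    if (!info->irqcontrol)
        return -ENOSYS;
    return info->irqcontrol(info, on);
}

/* Walks through the three cases; the asserts state the expected results. */
static void model_demo(void)
{
    struct model_info info = { .irq = 0, .irqcontrol = model_irqcontrol, .calls = 0 };

    assert(model_uio_write(&info, 1) == -EIO);    /* irq == 0: callback never reached */
    assert(info.calls == 0);

    info.irq = -1;                                /* nonzero placeholder passes the check */
    assert(model_uio_write(&info, 1) == 0);
    assert(info.calls == 1);

    info.irqcontrol = NULL;                       /* no callback registered */
    assert(model_uio_write(&info, 1) == -ENOSYS);
}
```

Calling model_demo() from a main function exercises all three paths. The middle case is the one the hack relies on: any nonzero irq value lets the write through to the callback.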


Regards
-Santosh
Comment 1 Ajit Khaparde 2019-11-11 18:12:06 CET
Ferruh, Can you take a look? Thanks
Comment 2 Ferruh YIGIT 2020-08-04 19:38:18 CEST
Hi Santosh,

This defect is old; I hope this still finds you well.

Overall, the UIO interface seems designed for enabling and disabling the interrupt. The 'idev->info->irq' check shows the intent that the interrupt is not expected to be freed, which is why you need your hack.

When is the MSI-X vector table cleared? I expect it is not on FLR, since we don't observe the problem you mentioned.
Can you please elaborate on the conditions or hardware properties that cause the problem?

Also, is it an option to use vfio-pci to cover this case, instead of trying to hack the UIO system?

Thanks,
ferruh
Comment 3 Ajit Khaparde 2020-08-04 19:43:06 CEST
Ferruh,
I agree that vfio-pci will cover this case, but customers still tend to use the UIO interface, and we run into this.

I will have someone add more details on the sequence of events happening here.

Thanks
Ajit
Comment 4 Ferruh YIGIT 2020-08-05 09:47:55 CEST
Hi Ajit,

In which cases is igb_uio still needed?
