Message ID | 1462879301-13570-1-git-send-email-zhe.tao@intel.com (mailing list archive) |
---|---|
State | Rejected, archived |
Delegated to: | Ferruh Yigit |
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id CA5909AB3; Tue, 10 May 2016 13:22:39 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 0CAEA9AB2 for <dev@dpdk.org>; Tue, 10 May 2016 13:22:37 +0200 (CEST) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga101.jf.intel.com with ESMTP; 10 May 2016 04:21:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,604,1455004800"; d="scan'208";a="962597991" Received: from shvmail01.sh.intel.com ([10.239.29.42]) by fmsmga001.fm.intel.com with ESMTP; 10 May 2016 04:21:50 -0700 Received: from shecgisg004.sh.intel.com (shecgisg004.sh.intel.com [10.239.29.89]) by shvmail01.sh.intel.com with ESMTP id u4ABLnvN016284; Tue, 10 May 2016 19:21:49 +0800 Received: from shecgisg004.sh.intel.com (localhost [127.0.0.1]) by shecgisg004.sh.intel.com (8.13.6/8.13.6/SuSE Linux 0.8) with ESMTP id u4ABLkFY013604; Tue, 10 May 2016 19:21:48 +0800 Received: (from zhetao@localhost) by shecgisg004.sh.intel.com (8.13.6/8.13.6/Submit) id u4ABLkiI013600; Tue, 10 May 2016 19:21:46 +0800 From: Zhe Tao <zhe.tao@intel.com> To: dev@dpdk.org Cc: zhe.tao@intel.com Date: Tue, 10 May 2016 19:21:41 +0800 Message-Id: <1462879301-13570-1-git-send-email-zhe.tao@intel.com> X-Mailer: git-send-email 1.7.4.1 Subject: [dpdk-dev] [PATCH v1] igu_uio: fix IOMMU domain issue X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK <dev.dpdk.org> List-Unsubscribe: <http://dpdk.org/ml/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://dpdk.org/ml/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <http://dpdk.org/ml/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: 
dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Commit Message
Zhe Tao
May 10, 2016, 11:21 a.m. UTC
Problem:
The following operations cause igb_uio based DPDK
applications to fail:
--Any device assignment through the kvm_assign_device interface,
for example the pci-assign method in QEMU
--A VFIO group attachment operation (attaching the group to a container),
which happens with vfio-pci assignment in QEMU
Root cause:
Both operations above eventually call intel_iommu_attach_device
(e.g. for VFIO: vfio_group_set_container->
vfio_iommu_type1_attach_group->intel_iommu_attach_device).
If iommu=pt is passed on the kernel command line (e.g. via GRUB), the Intel
IOMMU driver creates a static identity (SI) domain for all PCI devices
and sets the translation type to pass-through in the context
entry of every PCI device.
But once the QEMU process is closed, the VFIO framework, for example, invokes
the detach group operation, which finally calls intel_iommu_detach_device,
and that clears the context entry.
(From then on the IOMMU context entry for this device is not available.)
The AMD IOMMU driver handles this detach correctly: it restores the
pt_domain (the equivalent of the Intel static identity domain) to the
corresponding entry.
Solution:
Add a workaround in the igb_uio driver that maps one single page.
All DMA-related alloc and map
actions cause the Intel IOMMU driver to reload the SI domain into the context
entry, which is why kernel drivers never hit this problem.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
---
lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
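The root cause described in the commit message can be pictured as a small state machine over a device's context entry. The following is a user-space toy model only, written as an illustration of that description; every name in it (`toy_dev`, `toy_attach`, `toy_detach`, `toy_dma_map`, and so on) is hypothetical and none of it is kernel code or a kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel data structures. */
enum ctx_state {
	CTX_CLEARED,     /* no valid context entry for the device */
	CTX_PASSTHROUGH, /* static identity (SI) domain, as set up by iommu=pt */
	CTX_VM_DOMAIN    /* domain owned by a VFIO container or KVM guest */
};

enum iommu_vendor { VENDOR_INTEL, VENDOR_AMD };

struct toy_dev {
	enum iommu_vendor vendor;
	enum ctx_state ctx;
};

/* Boot with iommu=pt: every device starts in the pass-through domain. */
void toy_boot_iommu_pt(struct toy_dev *dev, enum iommu_vendor vendor)
{
	assert(dev != NULL);
	dev->vendor = vendor;
	dev->ctx = CTX_PASSTHROUGH;
}

/* Models the attach path: the device moves into the VM's domain. */
void toy_attach(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->ctx = CTX_VM_DOMAIN;
}

/*
 * Models the detach path as described in the commit message:
 * the Intel driver clears the entry, the AMD driver restores pt_domain.
 */
void toy_detach(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->ctx = (dev->vendor == VENDOR_AMD) ? CTX_PASSTHROUGH : CTX_CLEARED;
}

/* Models a DMA alloc/map call: it reloads the SI domain if it is missing. */
void toy_dma_map(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->ctx == CTX_CLEARED)
		dev->ctx = CTX_PASSTHROUGH;
}

/* Host-side DMA with physical addresses only works in pass-through. */
int toy_host_dma_works(const struct toy_dev *dev)
{
	assert(dev != NULL);
	return dev->ctx == CTX_PASSTHROUGH;
}
```

In this model an Intel device that goes through attach and detach ends up with a cleared entry, and a single `toy_dma_map()` call brings it back to pass-through, which is the effect the proposed workaround relies on. An AMD device is already back in pass-through after detach.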
Comments
On Tue, 10 May 2016 19:21:41 +0800
Zhe Tao <zhe.tao@intel.com> wrote:

> Problem:
> The following operations will cause the igb_uio based DPDK
> operation failed.
> --Any device assignment through the kvm_assign_device interface,
> this can be the pci-assign method in QEMU
> --VFIO group attachment operation(attach to the container)
> this can happens in vfio-pci assignment in QEMU

If you have an IOMMU why not use VFIO instead, it is better.
On Tue, May 10, 2016 at 4:59 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Tue, 10 May 2016 19:21:41 +0800
> Zhe Tao <zhe.tao@intel.com> wrote:
>
> > Problem:
> > The following operations will cause the igb_uio based DPDK
> > operation failed.
> > --Any device assignment through the kvm_assign_device interface,
> > this can be the pci-assign method in QEMU
> > --VFIO group attachment operation(attach to the container)
> > this can happens in vfio-pci assignment in QEMU
>
>
> If you have an IOMMU why not use VFIO instead, it is better.
>

It is not about VFIO against UIO but about how IOMMU domains are created
and destroyed by the (old) kernel when iommu=pt is set. So even with VFIO you
can have problems.

We have had problems like this and others because our device (NFP) only
maps up to 40 bits of address space. Old kernels used in LTS
distributions like Ubuntu have buggy IOMMU support, and you need to do things
like this mapping inside the driver to solve the problems. By the way, using
SRIOV just adds more problems. It is not safe to use iommu=pt with 3.13.x
Ubuntu kernels.

It would be good for the original patch to identify the kernels
where the problem was detected. Of course, there could be more kernels with
the same problem, but that is more work to do.
On 5/11/2016 8:35 AM, Alejandro Lucero wrote:
> On Tue, May 10, 2016 at 4:59 PM, Stephen Hemminger <
> stephen@networkplumber.org> wrote:
>
>> On Tue, 10 May 2016 19:21:41 +0800
>> Zhe Tao <zhe.tao@intel.com> wrote:
>>
>>> Problem:
>>> The following operations will cause the igb_uio based DPDK
>>> operation failed.
>>> --Any device assignment through the kvm_assign_device interface,
>>> this can be the pci-assign method in QEMU
>>> --VFIO group attachment operation(attach to the container)
>>> this can happens in vfio-pci assignment in QEMU
>>
>>
>> If you have an IOMMU why not use VFIO instead, it is better.
>>
>
> It is not about VFIO against UIO but about how iommu domains are created
> and destroyed by the (old) kernel when iommu=pt. So even with VFIO you can
> have problems.

The problem is in the IOMMU driver, but we are adding a workaround to igb_uio.
If using VFIO solves the issue, I believe that is a better workaround.

1) Is there any case where the IOMMU is supported but VFIO is not? Does
anything force the use of igb_uio?

2) Does using VFIO solve the issue described in the problem statement?

>
> We have had problems like this and other due to our device (NFP) just
> mapping up to 40 bits of address space. Old kernels used in LTS
> distributions like Ubuntu are iommu buggy and you need to do things like
> this mapping inside the driver for solving problems. By the way, using
> SRIOV just adds more problems. It is not safe to use iommu=pt with 3.13.x
> Ubuntu kernels.
>
> It would be a good thing for the original patch to identify those kernels
> where the problem was detected. Of course, there could be more kernels with
> the same problem but that is more work to do.
>

Thanks,
ferruh
Ping, this patch is stalled.

2016-05-11 18:24, Ferruh Yigit:
> On 5/11/2016 8:35 AM, Alejandro Lucero wrote:
> > On Tue, May 10, 2016 at 4:59 PM, Stephen Hemminger <
> > stephen@networkplumber.org> wrote:
> >
> >> On Tue, 10 May 2016 19:21:41 +0800
> >> Zhe Tao <zhe.tao@intel.com> wrote:
> >>
> >>> Problem:
> >>> The following operations will cause the igb_uio based DPDK
> >>> operation failed.
> >>> --Any device assignment through the kvm_assign_device interface,
> >>> this can be the pci-assign method in QEMU
> >>> --VFIO group attachment operation(attach to the container)
> >>> this can happens in vfio-pci assignment in QEMU
> >>
> >>
> >> If you have an IOMMU why not use VFIO instead, it is better.
> >>
> >
> > It is not about VFIO against UIO but about how iommu domains are created
> > and destroyed by the (old) kernel when iommu=pt. So even with VFIO you can
> > have problems.
>
> Problem is in IOMMU driver but we are adding a workaround to igb_uio, if
> using VFIO solves the issue, I believe that is better workaround.
>
> 1) Is there any case IOMMU supported but VFIO is not supported? Is there
> anything forces to use igb_uio?
>
> 2) Does using VFIO solves the issue defined in problem statement?
>
> >
> > We have had problems like this and other due to our device (NFP) just
> > mapping up to 40 bits of address space. Old kernels used in LTS
> > distributions like Ubuntu are iommu buggy and you need to do things like
> > this mapping inside the driver for solving problems. By the way, using
> > SRIOV just adds more problems. It is not safe to use iommu=pt with 3.13.x
> > Ubuntu kernels.
> >
> > It would be a good thing for the original patch to identify those kernels
> > where the problem was detected. Of course, there could be more kernels with
> > the same problem but that is more work to do.
> >
>
> Thanks,
> ferruh
On 7/8/2016 6:27 PM, Thomas Monjalon wrote:
> 2016-05-11 18:24, Ferruh Yigit:
>> On 5/11/2016 8:35 AM, Alejandro Lucero wrote:
>>> On Tue, May 10, 2016 at 4:59 PM, Stephen Hemminger <
>>> stephen@networkplumber.org> wrote:
>>>
>>>> On Tue, 10 May 2016 19:21:41 +0800
>>>> Zhe Tao <zhe.tao@intel.com> wrote:
>>>>
>>>>> Problem:
>>>>> The following operations will cause the igb_uio based DPDK
>>>>> operation failed.
>>>>> --Any device assignment through the kvm_assign_device interface,
>>>>> this can be the pci-assign method in QEMU
>>>>> --VFIO group attachment operation(attach to the container)
>>>>> this can happens in vfio-pci assignment in QEMU
>>>>
>>>>
>>>> If you have an IOMMU why not use VFIO instead, it is better.
>>>>
>>>
>>> It is not about VFIO against UIO but about how iommu domains are created
>>> and destroyed by the (old) kernel when iommu=pt. So even with VFIO you can
>>> have problems.
>>
>> Problem is in IOMMU driver but we are adding a workaround to igb_uio, if
>> using VFIO solves the issue, I believe that is better workaround.
>>
>> 1) Is there any case IOMMU supported but VFIO is not supported? Is there
>> anything forces to use igb_uio?
>>
>> 2) Does using VFIO solves the issue defined in problem statement?
>>
>>>
>>> We have had problems like this and other due to our device (NFP) just
>>> mapping up to 40 bits of address space. Old kernels used in LTS
>>> distributions like Ubuntu are iommu buggy and you need to do things like
>>> this mapping inside the driver for solving problems. By the way, using
>>> SRIOV just adds more problems. It is not safe to use iommu=pt with 3.13.x
>>> Ubuntu kernels.
>>>
>>> It would be a good thing for the original patch to identify those kernels
>>> where the problem was detected. Of course, there could be more kernels with
>>> the same problem but that is more work to do.
>>>
>
> Ping, this patch is stalled.
>

I am for rejecting this patch.
The patch is useful for testers and developers who use both vfio and igb_uio.
But if an end user's environment supports vfio, she should use vfio instead
of relying on a workaround to use both.

Thanks,
ferruh
diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index 45a5720..3fa88b0 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -327,6 +327,18 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	struct rte_uio_pci_dev *udev;
 	struct msix_entry msix_entry;
 	int err;
+	struct page *page;
+	/*
+	 * workaround for the Intel IOMMU implementation of the SI domain
+	 */
+
+	page = alloc_page(GFP_ATOMIC);
+	if (!page) {
+		dev_err(&dev->dev, "Cannot alloc page\n");
+	} else {
+		dma_map_page(&dev->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
+		__free_page(page);
+	}
 
 	udev = kzalloc(sizeof(struct rte_uio_pci_dev), GFP_KERNEL);
 	if (!udev)
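
The hunk above discards the return value of dma_map_page() and frees the page without unmapping it. A more defensive variant of the same idea might look like the sketch below. It is not part of the submitted patch: the helper name `igbuio_touch_si_domain` is hypothetical, and the sketch assumes, as the commit message states, that the SI domain is reloaded at map time, so unmapping right afterwards does not undo the effect. It only builds inside a kernel module, not in user space.

```c
#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/pci.h>

/*
 * Hypothetical helper, not from the patch: map and then unmap a single
 * page so that the Intel IOMMU driver reloads the SI domain for the device.
 * Assumes the reload is a side effect of the map call itself.
 */
static void igbuio_touch_si_domain(struct pci_dev *dev)
{
	struct page *page;
	dma_addr_t addr;

	/* Probe runs in process context, so a sleeping allocation is fine. */
	page = alloc_page(GFP_KERNEL);
	if (!page) {
		dev_warn(&dev->dev, "Cannot alloc page for IOMMU workaround\n");
		return;
	}

	addr = dma_map_page(&dev->dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
	if (dma_mapping_error(&dev->dev, addr))
		dev_warn(&dev->dev, "IOMMU workaround mapping failed\n");
	else
		dma_unmap_page(&dev->dev, addr, PAGE_SIZE, DMA_FROM_DEVICE);

	/* The page is only freed once it is no longer mapped for DMA. */
	__free_page(page);
}
```

Such a helper would be called from igbuio_pci_probe() at the point where the patch places its inline block. Whether the workaround belongs in igb_uio at all is the question the review thread settles by rejecting the patch.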