[dpdk-dev] vfio: try physical address if virtual address fails

Message ID 20171124235718.6064-1-3chas3@gmail.com (mailing list archive)
State Rejected, archived
Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK

Commit Message

Chas Williams Nov. 24, 2017, 11:57 p.m. UTC
  Some machines appear to have buggy DMAR mappings.  A typical mapping
error looks like:

    DMAR: intel_iommu_map: iommu width (39) is not sufficient for the mapped address (7fc4fa800000)
    DMAR: intel_iommu_map: iommu width (39) is not sufficient for the mapped address (7fc4fa800000)
    DMAR: intel_iommu_map: iommu width (39) is not sufficient for the mapped address (7fc4fa800000)
    DMAR: intel_iommu_map: iommu width (39) is not sufficient for the mapped address (7fc4fa800000)

To work around this, attempt to do a physical address mapping if the
virtual address mapping fails.

Fixes: e85a919286d2 ("vfio: honor IOVA mode before mapping")

Signed-off-by: Chas Williams <chas3@att.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
  

Comments

Qi Zhang Nov. 27, 2017, 9:30 a.m. UTC | #1
Hi William:

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chas Williams
> Sent: Saturday, November 25, 2017 7:57 AM
> To: dev@dpdk.org
> Cc: skhare@vmware.com; Chas Williams <3chas3@gmail.com>; Chas
> Williams <chas3@att.com>
> Subject: [dpdk-dev] [PATCH] vfio: try physical address if virtual address fails
> 
> Some machines appear to have buggy DMAR mappings.  A typical mapping
> error looks like:
> 
>     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> mapped address (7fc4fa800000)
>     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> mapped address (7fc4fa800000)
>     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> mapped address (7fc4fa800000)
>     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> mapped address (7fc4fa800000)
> 
I hit the same issue on some Intel Atom platforms; the root cause is that the IOMMU only supports 39-bit addresses.
I'm not sure that retrying with the physical address is the right fix. rte_eal_iova_mode() is also called in other places, which will still treat the virtual address as the mapped IOVA; does that break something?
For now, a workaround that may work is to use --base-virtaddr to explicitly assign an address that is in range (for example 0x7000000000).
Regards
Qi
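
To make the DMAR error above concrete, here is a minimal sketch of the width check it implies. The helper name iova_fits_width() is hypothetical and is not part of DPDK or the kernel; it only illustrates why 0x7fc4fa800000 is rejected by a 39-bit IOMMU while the suggested 0x7000000000 is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK or kernel API): returns true when
 * iova can be represented in width_bits bits, i.e. no bit at or above
 * position width_bits is set.
 */
static bool
iova_fits_width(uint64_t iova, unsigned int width_bits)
{
	assert(width_bits >= 1 && width_bits <= 64);

	/* Shifting a 64-bit value by 64 is undefined, so handle it first. */
	if (width_bits == 64)
		return true;

	return (iova >> width_bits) == 0;
}
```

With a width of 39, the address from the log needs 47 bits and fails the check, whereas 0x7000000000 is below 2^39 and passes. A real check would also have to cover the end of the mapped range (iova + length - 1), not only its start.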

  
Chas Williams Nov. 27, 2017, 12:58 p.m. UTC | #2
On Mon, Nov 27, 2017 at 4:30 AM, Zhang, Qi Z <qi.z.zhang@intel.com> wrote:

> Hi William:
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chas Williams
> > Sent: Saturday, November 25, 2017 7:57 AM
> > To: dev@dpdk.org
> > Cc: skhare@vmware.com; Chas Williams <3chas3@gmail.com>; Chas
> > Williams <chas3@att.com>
> > Subject: [dpdk-dev] [PATCH] vfio: try physical address if virtual
> address fails
> >
> > Some machines appear to have buggy DMAR mappings.  A typical mapping
> > error looks like:
> >
> >     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> > mapped address (7fc4fa800000)
> >     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> > mapped address (7fc4fa800000)
> >     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> > mapped address (7fc4fa800000)
> >     DMAR: intel_iommu_map: iommu width (39) is not sufficient for the
> > mapped address (7fc4fa800000)
> >
> I hit the same issue on some Intel Atom platforms; the root cause is that
> the IOMMU only supports 39-bit addresses.
> I'm not sure that retrying with the physical address is the right fix.
> rte_eal_iova_mode() is also called in other places, which will still treat
> the virtual address as the mapped IOVA; does that break something?
> For now, a workaround that may work is to use --base-virtaddr to explicitly
> assign an address that is in range (for example 0x7000000000)
> Regards
> Qi
>

I don't think the IOVA usage elsewhere is an issue since I limited my
changes to reworking what was done in commit e85a919286d2, which appears to
be what broke things for me.

It's not clear that passing --base-virtaddr is a workable solution.  First,
it's only a hint to mmap().  The kernel really can do whatever it wants.
And then, what values do I pick?  If one value fails, do I just restart and
keep trying new values until it succeeds?  I just want to fall back to the
previous behavior.  I know the physical mapping will succeed since the DMAR
tables are limited to 36 bits in width.
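
The point about the address being only a hint can be illustrated with plain mmap(). This is a minimal sketch, assuming Linux and an anonymous mapping; map_with_hint() and hint_was_honored() are hypothetical names and do not reflect how EAL itself reserves memory.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory, passing hint
 * as the preferred address. MAP_FIXED is deliberately not used, so the
 * kernel is free to place the mapping somewhere else.
 * Returns the address chosen by the kernel, or MAP_FAILED.
 */
static void *
map_with_hint(void *hint, size_t len)
{
	assert(len > 0);
	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* True only if the mapping succeeded and landed exactly at hint. */
static bool
hint_was_honored(const void *hint, const void *got)
{
	return got != MAP_FAILED && got == hint;
}
```

A caller would compare the returned address with the requested one, for example with a hint of (void *)(uintptr_t)0x7000000000ULL, and must be prepared for the two to differ. That is the difficulty described above: nothing guarantees that the resulting virtual address stays within the IOMMU width.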

And yes, I should update the commit to blame the IOMMU width.
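
The fallback being discussed can be sketched independently of VFIO. Everything below is an illustration under stated assumptions: dma_map_fn stands in for the VFIO_IOMMU_MAP_DMA ioctl, mock_map() imitates an IOMMU with a limited address width, and all names are hypothetical rather than taken from eal_vfio.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VFIO_IOMMU_MAP_DMA ioctl:
 * returns 0 on success, -1 with errno set on failure. */
typedef int (*dma_map_fn)(uint64_t vaddr, uint64_t iova, uint64_t len,
			  void *ctx);

enum iova_choice { IOVA_NONE = 0, IOVA_AS_VA, IOVA_AS_PA };

/* Try IOVA == virtual address first; on failure retry once with
 * IOVA == physical address. Reports which mapping succeeded. */
static enum iova_choice
map_va_then_pa(dma_map_fn map, void *ctx, uint64_t vaddr, uint64_t paddr,
	       uint64_t len)
{
	assert(map != NULL);

	if (map(vaddr, vaddr, len, ctx) == 0)
		return IOVA_AS_VA;
	if (map(vaddr, paddr, len, ctx) == 0)
		return IOVA_AS_PA;
	return IOVA_NONE;
}

/* Mock mapper: ctx points to the IOMMU width in bits; any IOVA that
 * does not fit in that width is rejected. */
static int
mock_map(uint64_t vaddr, uint64_t iova, uint64_t len, void *ctx)
{
	unsigned int width = *(unsigned int *)ctx;

	(void)vaddr;
	(void)len;
	if (width < 64 && (iova >> width) != 0) {
		errno = EFAULT; /* errno value chosen for illustration */
		return -1;
	}
	return 0;
}
```

With a 39-bit mock, a segment at virtual address 0x7fc4fa800000 and physical address 0x100000000 ends up mapped as IOVA_AS_PA. The return value matters: if the rest of the code keeps handing out virtual addresses because the global IOVA mode still reports VA, devices would be given addresses that were never mapped, which is the concern raised earlier in this thread.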


  
Chas Williams Nov. 30, 2017, 2:03 a.m. UTC | #3
OK, I found a machine in our local collection that couldn't get by with
this patch, so I will be submitting a follow-up that forces
rte_eal_iova_mode() into PA mode.

  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 58f0123..6250676 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -35,6 +35,7 @@ 
 #include <fcntl.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
+#include <stdbool.h>
 
 #include <rte_log.h>
 #include <rte_memory.h>
@@ -702,6 +703,7 @@  vfio_type1_dma_map(int vfio_container_fd)
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
 	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
 		struct vfio_iommu_type1_dma_map dma_map;
+		int retried = false;
 
 		if (ms[i].addr == NULL)
 			break;
@@ -716,9 +718,15 @@  vfio_type1_dma_map(int vfio_container_fd)
 			dma_map.iova = ms[i].iova;
 		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
+retry:
 		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 
 		if (ret) {
+			if (!retried && rte_eal_iova_mode() == RTE_IOVA_VA) {
+				dma_map.iova = ms[i].iova;
+				retried = true;
+				goto retry;
+			}
 			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
 					  "error %i (%s)\n", errno,
 					  strerror(errno));