[dpdk-dev] [PATCH dpdk 5/5] RFC: vfio/ppc64/spapr: Use correct bus addresses for DMA map

Alexey Kardashevskiy aik at ozlabs.ru
Fri Apr 21 00:01:13 CEST 2017


On 21/04/17 01:15, Jonas Pfefferle1 wrote:
> Alexey Kardashevskiy <aik at ozlabs.ru> wrote on 20/04/2017 16:22:01:
> 
>> From: Alexey Kardashevskiy <aik at ozlabs.ru>
>> To: Jonas Pfefferle1 <JPF at zurich.ibm.com>
>> Cc: dev at dpdk.org, Gowrishankar Muthukrishnan
>> <gowrishankar.m at in.ibm.com>, Adrian Schuepbach <DRI at zurich.ibm.com>
>> Date: 20/04/2017 16:22
>> Subject: Re: [PATCH dpdk 5/5] RFC: vfio/ppc64/spapr: Use correct bus
>> addresses for DMA map
>>
>> On 20/04/17 23:25, Alexey Kardashevskiy wrote:
>> > On 20/04/17 19:04, Jonas Pfefferle1 wrote:
>> >> Alexey Kardashevskiy <aik at ozlabs.ru> wrote on 20/04/2017 09:24:02:
>> >>
>> >>> From: Alexey Kardashevskiy <aik at ozlabs.ru>
>> >>> To: dev at dpdk.org
>> >>> Cc: Alexey Kardashevskiy <aik at ozlabs.ru>, JPF at zurich.ibm.com,
>> >>> Gowrishankar Muthukrishnan <gowrishankar.m at in.ibm.com>
>> >>> Date: 20/04/2017 09:24
>> >>> Subject: [PATCH dpdk 5/5] RFC: vfio/ppc64/spapr: Use correct bus
>> >>> addresses for DMA map
>> >>>
>> >>> VFIO_IOMMU_SPAPR_TCE_CREATE ioctl() returns the actual bus address for
>> >>> just created DMA window. It happens to start from zero because the default
>> >>> window is removed (leaving no windows) and new window starts from zero.
>> >>> However this is not guaranteed and the new window may start from another
>> >>> address, this adds an error check.
>> >>>
>> >>> Another issue is that IOVA passed to VFIO_IOMMU_MAP_DMA should be a PCI
>> >>> bus address while in this case a physical address of a user page is used.
>> >>> This changes IOVA to start from zero in a hope that the rest of DPDK
>> >>> expects this.
>> >>
>> >> This is not the case. DPDK expects a 1:1 mapping PA==IOVA. It will use the
>> >> phys_addr of the memory segment it got from /proc/self/pagemap cf.
>> >> librte_eal/linuxapp/eal/eal_memory.c. We could try setting it here to the
>> >> actual iova which basically makes the whole virtual to physical mapping
>> >> with pagemap unnecessary which I believe should be the case for VFIO
>> >> anyway. Pagemap should only be needed when using pci_uio.
>> >
>> >
>> > Ah, ok, makes sense now. But it sure needs a big fat comment there as it is
>> > not obvious why host RAM address is used there as DMA window start is not
>> > guaranteed.
>>
>> Well, either way there is some bug - ms[i].phys_addr and ms[i].addr_64 both
>> have exact same value, in my setup it is 3fffb33c0000 which is a userspace
>> address - at least ms[i].phys_addr must be physical address.
> 
> This might be the case if you are not using hugetlbfs i.e. passing
> "--no-huge" cf. eal_memory.c:980
> 
> /* hugetlbfs can be disabled */
> if (internal_config.no_hugetlbfs) {
>         addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
>                         MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
>         if (addr == MAP_FAILED) {
>                 RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
>                                 strerror(errno));
>                 return -1;
>         }
>         mcfg->memseg[0].phys_addr = (phys_addr_t)(uintptr_t)addr;
>         mcfg->memseg[0].addr = addr;
>         mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
>         mcfg->memseg[0].len = internal_config.memory;
>         mcfg->memseg[0].socket_id = 0;
>         return 0;
> }
> 
> If it fails to get the virt2phys mapping it actually assigns iovas starting
> from 0 to the memory segments, cf. set_physaddrs eal_memory.c:263

Right, this is the case here.
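
For illustration only, the sequential assignment discussed above (IOVAs
handed out back to back from the start of the DMA window) could be sketched
roughly as below. This is not the DPDK code: struct seg and assign_iovas()
are hypothetical names made up for this example, and stopping at the first
zero-length segment is an assumption about how unused slots are marked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory segment; not the DPDK rte_memseg. */
struct seg {
	uint64_t vaddr; /* userspace virtual address */
	uint64_t iova;  /* bus address the device will use */
	uint64_t len;   /* length in bytes, 0 marks the end of the list */
};

/*
 * Assign IOVAs back to back, starting at window_start.
 * Stops at the first zero-length segment (assumed end-of-list marker).
 * Returns the number of segments that received an IOVA.
 */
size_t
assign_iovas(struct seg *segs, size_t n, uint64_t window_start)
{
	uint64_t next = window_start;
	size_t i;

	assert(segs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (segs[i].len == 0)
			break;
		segs[i].iova = next;
		next += segs[i].len;
	}
	return i;
}
```

With a window starting at zero the first segment gets IOVA 0 and each
following one starts where the previous one ended, so whatever the rest of
the code later hands to the device has to be that IOVA and not the address
read from pagemap.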


> 
>>
>>
>> >
>> >
>> >>
>> >>>
>> >>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>> >>> ---
>> >>>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 12 ++++++++++--
>> >>>  1 file changed, 10 insertions(+), 2 deletions(-)
>> >>>
>> >>> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>> >>> index 46f951f4d..8b8e75c4f 100644
>> >>> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
>> >>> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>> >>> @@ -658,7 +658,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>  {
>> >>>     const struct rte_memseg *ms = rte_eal_get_physmem_layout();
>> >>>     int i, ret;
>> >>> -
>> >>> +   phys_addr_t io_offset;
>> >>>     struct vfio_iommu_spapr_register_memory reg = {
>> >>>        .argsz = sizeof(reg),
>> >>>        .flags = 0
>> >>> @@ -702,6 +702,13 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>        return -1;
>> >>>     }
>> >>>  
>> >>> +   io_offset = create.start_addr;
>> >>> +   if (io_offset) {
>> >>> +      RTE_LOG(ERR, EAL, "  DMA offsets other than zero is not supported, "
>> >>> +            "new window is created at %lx\n", io_offset);
>> >>> +      return -1;
>> >>> +   }
>> >>> +
>> >>>     /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
>> >>>     for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>> >>>        struct vfio_iommu_type1_dma_map dma_map;
>> >>> @@ -723,7 +730,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>        dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
>> >>>        dma_map.vaddr = ms[i].addr_64;
>> >>>        dma_map.size = ms[i].len;
>> >>> -      dma_map.iova = ms[i].phys_addr;
>> >>> +      dma_map.iova = io_offset;
>> >>>        dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
>> >>>               VFIO_DMA_MAP_FLAG_WRITE;
>> >>>  
>> >>> @@ -735,6 +742,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>           return -1;
>> >>>        }
>> >>>  
>> >>> +      io_offset += dma_map.size;
>> >>>     }
>> >>>  
>> >>>     return 0;
>> >>> --
>> >>> 2.11.0
>> >>>
>> >>
>> >
>> >
>>
>>
>> --
>> Alexey
>>
> 


-- 
Alexey

