[dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK

Burakov, Anatoly anatoly.burakov at intel.com
Fri Mar 9 11:42:03 CET 2018


On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
> On Thu, Mar 08, 2018 at 08:33:21PM +0000, Burakov, Anatoly wrote:
>> On 08-Mar-18 8:11 PM, Burakov, Anatoly wrote:
>>> On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
>>>> On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
>>>>> Hi Anatoly,
>>>>>
>>>>> We are currently facing issues with running testpmd on the thunderx
>>>>> platform.
>>>>> The issue seems to be with vfio:
>>>>>
>>>>> EAL: Detected 24 lcore(s)
>>>>> EAL: Detected 1 NUMA nodes
>>>>> EAL: No free hugepages reported in hugepages-2048kB
>>>>> EAL: Multi-process socket /var/run/.rte_unix
>>>>> EAL: Probing VFIO support...
>>>>> EAL: VFIO support initialized
>>>>> EAL:   VFIO support not initialized
>>>>>
>>>>> <snip>
>>>>>
>>>>> EAL:   probe driver: 177d:a053 octeontx_fpavf
>>>>> EAL: PCI device 0001:01:00.1 on NUMA socket 0
>>>>> EAL:   probe driver: 177d:a034 net_thunderx
>>>>> EAL:   using IOMMU type 1 (Type 1)
>>>>> EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
>>>>> EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
>>>>> EAL: Requested device 0001:01:00.1 cannot be used
>>>>> EAL: PCI device 0001:01:00.2 on NUMA socket 0
>>>>> <snip>
>>>>> testpmd: No probed ethernet devices
>>>>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456,
>>>>> size=2176, socket=0
>>>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>>> EAL:   VFIO support not initialized
>>>>> EAL:   VFIO support not initialized
>>>>> EAL:   VFIO support not initialized
>>>>> Done
>>>>>
>>>>>
>>>>> This is because rte_service_init() calls rte_calloc() before
>>>>> rte_bus_probe() and vfio_dma_mem_map fails because iommu type is
>>>>> not set.
>>>>>
>>>>> Call stack:
>>>>> (gdb) bt
>>>>> #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152,
>>>>> len=536870912, do_map=1) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
>>>>> #1  0x00000000004fd974 in rte_vfio_dma_map
>>>>> (vaddr=281439006359552, iova=11274289152, len=536870912) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
>>>>> #2  0x00000000004fbe78 in vfio_mem_event_callback
>>>>> (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912)
>>>>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
>>>>> #3  0x00000000005070ac in eal_memalloc_notify
>>>>> (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912)
>>>>> at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
>>>>> #4  0x0000000000515c98 in try_expand_heap_primary
>>>>> (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0,
>>>>> flags=0, align=128, bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
>>>>> #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c,
>>>>> pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
>>>>> #6  0x00000000005163a0 in alloc_more_mem_on_socket
>>>>> (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
>>>>> #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90
>>>>> "rte_services", size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
>>>>> #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90
>>>>> "rte_services", size=8192, socket_arg=-1, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
>>>>> #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90
>>>>> "rte_services", size=8192, align=128, socket_arg=-1) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
>>>>> #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90
>>>>> "rte_services", size=8192, align=128, socket=-1) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
>>>>> #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90
>>>>> "rte_services", size=8192, align=128) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
>>>>> #12 0x0000000000513c90 in rte_calloc (type=0x85bf90
>>>>> "rte_services", num=64, size=128, align=128) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
>>>>> #13 0x0000000000518cec in rte_service_init () at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
>>>>> #14 0x00000000004f55f4 in rte_eal_init (argc=3,
>>>>> argv=0xfffffffff488) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
>>>>> #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at
>>>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>>>
>>>>>
>>>>> Also, I have tried running with --legacy-mem but I'm stuck in the
>>>>> `pci_find_max_end_va` loop because `rte_fbarray_find_next_used`
>>>>> always returns 0.
>>>>>
>>>>> HugePages_Total:      15
>>>>> HugePages_Free:       11
>>>>> HugePages_Rsvd:        0
>>>>> HugePages_Surp:        0
>>>>> Hugepagesize:     524288 kB
>>>>>
>>>>> Call Stack:
>>>>> (gdb) bt
>>>>> #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
>>>>> #1  0x00000000005132a8 in rte_fbarray_find_next_used
>>>>> (arr=0xffffb7fb009c, start=0) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
>>>>> #2  0x000000000052d030 in pci_find_max_end_va () at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
>>>>> #3  0x0000000000530ab8 in pci_vfio_map_resource_primary
>>>>> (dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
>>>>> #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700)
>>>>> at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
>>>>> #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
>>>>> #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20
>>>>> <rte_nicvf_pmd>, dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
>>>>> #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700)
>>>>> at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
>>>>> #8  0x0000000000531f68 in rte_pci_probe () at
>>>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
>>>>> #9  0x000000000050a140 in rte_bus_probe () at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
>>>>> #10 0x00000000004f55f4 in rte_eal_init (argc=1,
>>>>> argv=0xfffffffff498) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
>>>>> #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at
>>>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>>>
>>>>> Am I missing something here?
>>>>
>>>> I'll look into those, thanks!
>>>>
>>>> Btw, i've now set up a github repo with the patchset applied:
>>>>
>>>> https://github.com/anatolyburakov/dpdk
>>>>
>>>> I will be pushing quick fixes there before spinning new revisions,
>>>> so we can discover and fix bugs more rapidly. I'll fix compile
>>>> issues reported earlier, then i'll take a look at your issues. The
>>>> latter one seems like a typo, the former is probably a matter of
>>>> moving things around a bit.
>>>>
>>>> (also, pull requests welcome if you find it easier to fix things
>>>> yourself and submit patches against my tree!)
>>>>
>>>> Thanks for testing.
>>>>
>>>
>>> I've looked into the failures.
>>>
>>> The VFIO one is not actually a failure. It only prints out errors
>>> because rte_malloc is called before VFIO is initialized. However, once
>>> VFIO *is* initialized, all of that memory would be added to VFIO, so
>>> these error messages are harmless. Regardless, i've added a check to see
>>> if init is finished before printing out those errors, so they won't be
>>> printed out any more.
>>>
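To illustrate the kind of guard described above, here is a minimal sketch, not the actual change in the github tree. The struct and function names (`sketch_vfio_state`, `sketch_report_map_error`) are hypothetical; the only idea carried over from the thread is "stay quiet about DMA mapping failures until init has finished".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical state holder; the real EAL tracks this elsewhere. */
struct sketch_vfio_state {
	bool init_done;
};

/*
 * Format a DMA mapping error into buf, but only once init has finished.
 * Returns the number of characters written, or 0 when the message is
 * suppressed because the failure happened too early to be meaningful.
 */
static int
sketch_report_map_error(const struct sketch_vfio_state *st, int err,
		char *buf, size_t len)
{
	if (len == 0)
		return 0;
	buf[0] = '\0';
	if (!st->init_done)
		return 0; /* too early: memory gets mapped once VFIO is set up */
	return snprintf(buf, len, "cannot set up DMA remapping, error %d (%s)",
			err, strerror(err));
}
```

With `init_done` false the buffer stays empty; once it is set, the same call produces the usual error text.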
>>> Second one is a typo on my part that got lost in one of the rebases.
>>>
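The actual typo is not shown in this thread, so the following is only a generic sketch of why a "find next used" call that keeps returning the same index makes a loop like this spin forever. It uses a plain bool array and hypothetical helper names, not the rte_fbarray API.

```c
#include <stdbool.h>

/* Return the index of the first used slot at or after start, or -1. */
static int
sketch_find_next_used(const bool *used, int len, int start)
{
	for (int i = start; i < len; i++)
		if (used[i])
			return i;
	return -1;
}

/* Walk all used slots and return the highest used index, or -1 if none. */
static int
sketch_find_max_used(const bool *used, int len)
{
	int max_idx = -1;
	int idx = sketch_find_next_used(used, len, 0);

	while (idx >= 0) {
		max_idx = idx;
		/* Resume after the hit; restarting at idx would return idx forever. */
		idx = sketch_find_next_used(used, len, idx + 1);
	}
	return max_idx;
}
```

If the second call restarted from the same position instead of `idx + 1`, the loop would keep seeing the first used slot, which matches the "always returns 0" symptom.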
>>> I've pushed fixes for both into the github repo.
>>>
>>
>> Although i do wonder where the DMA remapping errors come from. The error
>> message says "invalid argument", so that doesn't come from rte_service or
>> anything to do with rte_malloc - this is us not providing valid arguments to
>> VFIO. I'm not seeing these errors on my system. I'll check on others to be
>> sure.
> 
> I have taken a look at the github tree and the issues with VFIO are gone,
> although compilation issues with dpaa/dpaa2 are still present due to their
> dependency on `rte_eal_get_physmem_layout`.

I've fixed the dpaa compile issue and pushed it to github. I've tried to 
keep the semantics the same as before, but i can't compile-test (let 
alone test-test) them as i don't have access to a system with dpaa bus.

Also, you might want to know that the dpaa bus driver references 
RTE_LIBRTE_DPAA_MAX_CRYPTODEV, which is only found in 
config/common_armv8a_linuxapp and is not present in the base config. Not 
sure if that's an issue.
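
If it does turn out to be an issue, one possible way to make the driver build against a config that lacks the option is a compile-time fallback. This is a sketch only: the default value of 4 and the helper `sketch_dpaa_max_cryptodev()` are assumptions for illustration, not taken from any config file or from the driver.

```c
/*
 * Fallback for builds whose config does not define the option.
 * Assumption: 4 is a placeholder default chosen for this sketch.
 */
#ifndef RTE_LIBRTE_DPAA_MAX_CRYPTODEV
#define RTE_LIBRTE_DPAA_MAX_CRYPTODEV 4
#endif

/* Hypothetical helper showing the macro is usable after the guard. */
static int
sketch_dpaa_max_cryptodev(void)
{
	return RTE_LIBRTE_DPAA_MAX_CRYPTODEV;
}
```

Whether a fallback like this or adding the option to the base config is the right fix is up to the dpaa maintainers.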

> 
>>
>> --
>> Thanks,
>> Anatoly
> 
> Thanks,
> Pavan
> 


-- 
Thanks,
Anatoly

