[dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK

Shreyansh Jain shreyansh.jain at nxp.com
Wed Mar 21 14:45:57 CET 2018


Hello Anatoly,

This is not necessarily the right thread to reply to, but I am reusing
this email for another DPAA2 issue so that all issues are in a single
place.

On Thu, Mar 15, 2018 at 7:31 PM, Shreyansh Jain <shreyansh.jain at nxp.com> wrote:
> Hello Anatoly,
>
> On Tue, Mar 13, 2018 at 10:47 AM, Shreyansh Jain <shreyansh.jain at nxp.com> wrote:
>> Hello Anatoly,
>>
>> On Fri, Mar 9, 2018 at 4:12 PM, Burakov, Anatoly
>> <anatoly.burakov at intel.com> wrote:
>>> On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
>>
>> [...]
>>
>>>>
>>>>
>>>> I have taken a look at the github tree the issues with VFIO are gone,
>>>> Although
>>>> compilation issues with dpaa/dpaa2 are still present due to their
>>>> dependency on
>>>> `rte_eal_get_physmem_layout`.
>>>
>>>
>>> I've fixed the dpaa compile issue and pushed it to github. I've tried to
>>> keep the semantics the same as before, but i can't compile-test (let alone
>>> test-test) them as i don't have access to a system with dpaa bus.
>>
>> Thanks. I will have a look at this.
>
> Just a heads-up, DPAA2 is broken on top-of-tree (github:
> 784e041f6b520) as of now:
>
> --->8---
> root at ls2088ardb:~/shreyansh/07_dpdk_memory#
> ./arm64-dpaa2-linuxapp-gcc/app/testpmd -c 0xE -n 1 --log-level=eal,8
> --log-level=mem,8 -- -i --portmask=0x3
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 1 on socket 0
> EAL: Detected lcore 2 as core 0 on socket 0
> EAL: Detected lcore 3 as core 1 on socket 0
> EAL: Detected lcore 4 as core 0 on socket 0
> EAL: Detected lcore 5 as core 1 on socket 0
> EAL: Detected lcore 6 as core 0 on socket 0
> EAL: Detected lcore 7 as core 1 on socket 0
> EAL: Support maximum 16 logical core(s) by configuration.
> EAL: Detected 8 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: VFIO PCI modules not loaded
> EAL: DPAA Bus not present. Skipping.
> EAL: Container: dprc.2 has VFIO iommu group id = 4
> EAL: fslmc: Bus scan completed
> EAL: Module /sys/module/rte_kni not found! error 2 (No such file or directory)
> EAL: Multi-process socket /var/run/.rte_unix
> EAL: Probing VFIO support...
> EAL:   IOMMU type 1 (Type 1) is supported
> EAL:   IOMMU type 7 (sPAPR) is not supported
> EAL:   IOMMU type 8 (No-IOMMU) is not supported
> EAL: VFIO support initialized
> EAL: Mem event callback 'vfio_mem_event_clb' registered
> EAL: Ask a virtual area of 0x2e000 bytes
> EAL: Virtual area found at 0xffff86cae000 (size = 0x2e000)
> EAL: Setting up physically contiguous memory...
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873f000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xfff780000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873e000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xffef40000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873d000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xffe700000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873c000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xffdec0000000 (size = 0x800000000)
> EAL: TSC frequency is ~25000 KHz
> EAL: Master lcore 1 is ready (tid=88742110;cpuset=[1])
> EAL: lcore 3 is ready (tid=85cab910;cpuset=[3])
> EAL: lcore 2 is ready (tid=864ab910;cpuset=[2])
> EAL: eal_memalloc_alloc_page_bulk(): couldn't find suitable memseg_list
> error allocating rte services array
> EAL: FATAL: rte_service_init() failed
>
> EAL: rte_service_init() failed
>
> PANIC in main():
> Cannot init EAL
> 1: [./arm64-dpaa2-linuxapp-gcc/app/testpmd(rte_dump_stack+0x38) [0x4f37a8]]
> Aborted
> --->8--
>
> Above is an initial output - still investigating. I will keep you posted.
>

While working on the issue reported in [1], I found another issue
for which I might need your help.

[1] http://dpdk.org/ml/archives/dev/2018-March/093202.html

For [1], I worked around the problem for the time being by changing the
mempool_add_elem code: it now allows non-contiguous allocations (where
contiguity was not explicitly demanded) to go through
rte_mempool_populate_iova. With that, I was able to get DPAA2 working.

The problem is:
1. With 1GB pages, I/O works fine.
2. With 2MB pages (1024 of them), initialization fails somewhere after
the VFIO layer.

All with IOVA=VA mode.

Some logs:

This is the output of the virtual memory layout demanded by DPDK:

--->8---
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
--->8---

Then, somehow, the VFIO mapping finds only a single page to map:

--->8---
EAL: Device (dpci.1) abstracted from VFIO
EAL: -->Initial SHM Virtual ADDR FFFBB6400000
EAL: -----> DMA size 0x200000
EAL: Total 1 segments found.
--->8---
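
For scale, here is a rough sanity check of that DMA size against the
hugepage configuration. This is only a sketch: pages_covered is a
hypothetical helper written for this email, not a DPDK API. 0x200000
bytes is exactly one 2MB page, whereas the 1024 configured pages would
span 0x80000000 bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of DPDK): number of hugepages of size
 * page_sz needed to cover len bytes, rounded up.
 */
uint64_t pages_covered(uint64_t len, uint64_t page_sz)
{
    assert(page_sz != 0); /* a zero page size makes no sense here */
    return (len + page_sz - 1) / page_sz;
}
```

Calling pages_covered(0x200000, 0x200000) gives 1, so the logged DMA
size corresponds to a single 2MB page out of the 1024 reserved.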

Then these logs appear, probably when the DPAA2 code requests memory.
I am not sure why the same '...expanded by 10MB' message repeats.

--->8---
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 2MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
LPM or EM none selected, default LPM on
Initializing port 0 ...
--->8---

l3fwd is stuck at this point. What I observe is that the DPAA2 driver has
gone ahead and registered the queues (queue_setup) with the hardware, and
either the memory has been overrun (the mapped size is smaller than
requested) or the addresses are corrupt (that is, not DMA-able). I get
SMMU faults, which indicate one of these cases.

There is a change from you in the fslmc/fslmc_vfio.c file
(rte_fslmc_vfio_dmamap()). Ideally, that code should have walked over
all the available pages for mapping, but that didn't happen and only a
single virtual area got DMA-mapped.

--->8---
EAL: Device (dpci.1) abstracted from VFIO
EAL: -->Initial SHM Virtual ADDR FFFBB6400000
EAL: -----> DMA size 0x200000
EAL: Total 1 segments found.
--->8---
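
To make the expectation concrete, below is a minimal sketch of the walk
I would expect. The names in it (struct seg, map_fn, map_all_segments,
count_bytes) are hypothetical stand-ins written for this email; they are
not the DPDK memseg structures or the actual fslmc code. One assumption
worth checking is whether the walk stops at the first unpopulated slot
instead of skipping it; the sketch skips such slots and keeps going.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one hugepage-backed segment (not rte_memseg). */
struct seg {
    uint64_t va;  /* virtual address; also used as IOVA in IOVA=VA mode */
    uint64_t len; /* length in bytes; 0 marks an unpopulated slot */
};

/* Hypothetical mapping hook; a real driver would do its VFIO DMA map here. */
typedef int (*map_fn)(uint64_t va, uint64_t iova, uint64_t len, void *ctx);

/*
 * Map every populated segment.
 * Returns the number of segments mapped, or -1 on the first mapping failure.
 */
int map_all_segments(const struct seg *segs, size_t n, map_fn map, void *ctx)
{
    int mapped = 0;

    assert(segs != NULL && map != NULL);
    for (size_t i = 0; i < n; i++) {
        if (segs[i].len == 0)
            continue; /* skip empty slots rather than ending the walk */
        /* IOVA=VA: the IOVA passed to the hook is the virtual address */
        if (map(segs[i].va, segs[i].va, segs[i].len, ctx) != 0)
            return -1;
        mapped++;
    }
    return mapped;
}

/* Example hook: accumulates the total number of bytes "mapped" into *ctx. */
int count_bytes(uint64_t va, uint64_t iova, uint64_t len, void *ctx)
{
    (void)va;
    (void)iova;
    *(uint64_t *)ctx += len;
    return 0;
}
```

With a table holding three populated 2MB entries and one empty slot in
between, map_all_segments with count_bytes reports 3 segments and
0x600000 bytes, rather than stopping after the first entry.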

I am looking into this, but if any hint comes to mind, it would
help.

Regards,
Shreyansh

