[dpdk-users] segfault with dpdk 16.07 in rte_mempool_populate_phys

Shreyansh Jain shreyansh.jain at nxp.com
Mon Aug 22 15:58:44 CEST 2016


Hi Martin,

See inline.
(Also, please don't remove the mail thread text in replies, as that loses context.)

> -----Original Message-----
> From: martin_curran-gray at keysight.com [mailto:martin_curran-gray at keysight.com]
> Sent: Friday, August 19, 2016 1:58 PM
> To: Shreyansh Jain <shreyansh.jain at nxp.com>; users at dpdk.org
> Subject: RE: segfault with dpdk 16.07 in rte_mempool_populate_phys
> 
> Hi Shreyansh,
> 
> Thanks for your reply,
> 
> Hmmm, I had wondered if the debug output from 16.7 was reduced compared to
> 2.2.0, but perhaps this is what I should have been concentrating on, rather
> than the core later
> 
> 
> On a vm running our app using 2.2.0 at startup, I see:
> 
> dpdk: In dpdk_init_eal core_mask is  79, master_core_id  is 0
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 0 on socket 0
> EAL: Detected lcore 2 as core 0 on socket 0
> EAL: Detected lcore 3 as core 0 on socket 0
> EAL: Detected lcore 4 as core 0 on socket 0
> EAL: Detected lcore 5 as core 0 on socket 0
> EAL: Detected lcore 6 as core 0 on socket 0
> EAL: Support maximum 32 logical core(s) by configuration.
> EAL: Detected 7 lcore(s)
> EAL: Setting up physically contiguous memory...
> EAL: Ask a virtual area of 0x40000000 bytes
> EAL: Virtual area found at 0x7f2735600000 (size = 0x40000000)
> EAL: Requesting 512 pages of size 2MB from socket 0
> EAL: TSC frequency is ~2094950 KHz
> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
> clock cycles !
> EAL: Master lcore 0 is ready (tid=9a11c720;cpuset=[0])
> EAL: Failed to set thread name for interrupt handling
> EAL: Cannot set name for lcore thread
> EAL: Cannot set name for lcore thread
> EAL: Cannot set name for lcore thread
> EAL: Cannot set name for lcore thread
> EAL: lcore 4 is ready (tid=33ff7700;cpuset=[4])
> EAL: lcore 3 is ready (tid=349f8700;cpuset=[3])
> EAL: lcore 6 is ready (tid=32bf5700;cpuset=[6])
> EAL: lcore 5 is ready (tid=335f6700;cpuset=[5])
> EAL: PCI device 0000:00:07.0 on NUMA socket -1
> EAL:   probe driver: 8086:1521 rte_igb_pmd
> EAL:   Not managed by a supported kernel driver, skipped
> EAL: PCI device 0000:00:08.0 on NUMA socket -1
> EAL:   probe driver: 8086:1572 rte_i40e_pmd
> EAL:   PCI memory mapped at 0x7f27319f5000
> EAL:   PCI memory mapped at 0x7f279a33c000
> PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000224e
> 
> However on my vm running our app but with 16.7 I see much less EAL output,
> the other stuff is printf output I put in the dpdk code to try and figure out
> where it was going wrong
> 
> dpdk: In dpdk_init_eal core_mask is  79, master_core_id  is 0
> EAL: Detected 7 lcore(s)
> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
> clock cycles !
> 
> dpdk_init_memory_pools  position 1
> dpdk_init_memory_pools  position 2
> dpdk_init_memory_pools  position 3
> 
>   about to call ret_mempool_create
> 
>   name               Error Ind Mempool
>   number             8
>   element size       256
>   cache size         4
>   private data size  4
>   mp_init            1158173360
>   mp_init_arg        0
>   obj_init           1158173120
>   obj_init_arg       0
>   socket_id          4294967295
>   flags              0
> 
> 
>  at start of rte_mempool_create
>  at start of rte_mempool_populate_default
>  at start of rte_mempool_populate_phys
> 
> 
> Is this just down to a change in the debug output from within the EAL, or is
> something going fundamentally wrong?

The number of messages (especially the lcore detection, etc.) has definitely been reduced in 16.07.
From what I remember, lcore detection, VFIO support and eventually the application-specific log were what got printed. As soon as I have access to a vanilla 16.07 app, I will post the output (on host only). But it seems fine to me as of now.

> 
> There is no output about the individual detected lcores, and there is no output
> about setting up physically contiguous memory, etc.

Which is OK, I think. Most of the INFO messages have been moved to DEBUG level, which is why you won't see the 2.2.0 messages.

> 
> However, if my call to rte_eal_init hadn't worked, I shouldn't have got as far
> as trying to call rte_mempool_create
> 
> We check for a return of rte_eal_init of < 0 and if so, we rte_exit.
> 
> I'll have a look over the newer documentation for the debug output

For the stack trace that you dumped in your previous email, would it be possible to recompile without the optimization flags and dump it again?
It is possible that the core is hitting some path that prevents a clean exit.

> 
> Thanks
> 
> Martin
> 

-
Shreyansh


