[dpdk-dev] rte_hash_del_key crash in multi-process environment

De Lara Guarch, Pablo pablo.de.lara.guarch at intel.com
Tue Apr 19 09:39:16 CEST 2016


Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of ??
> Sent: Tuesday, April 19, 2016 5:58 AM
> To: De Lara Guarch, Pablo
> Cc: Thomas Monjalon; Gonzalez Monroy, Sergio; dev at dpdk.org; Dhana
> Eadala; Richardson, Bruce; Qiu, Michael
> Subject: [dpdk-dev] rte_hash_del_key crash in multi-process environment
> 
> Hi all,
> 
> 
> In a multi-process environment, I previously hit a bug when calling
> rte_hash_lookup_with_hash. Dhana's patch fixed that problem. Now I
> need to remove a flow in the multi-process environment, and the system
> crashes when calling the rte_hash_del_key function. The gdb trace
> follows. Has anybody met this problem, or does anybody know how to fix it?

First of all, another fix for multi-process support was implemented and merged for the 16.04 release,
so take a look at it if you can.
Regarding rte_hash_del_key(), you should use rte_hash_del_key_with_hash() instead
if you want to delete keys in a multi-process environment (as you did for the lookup function).
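
To illustrate the pattern only, here is a minimal stand-alone sketch. It does not use
DPDK at all: toy_table, toy_hash and the toy_*_with_hash functions are hypothetical names
made up for this example. The idea it models is that the table holds plain data only, and the
caller computes the signature with a hash function linked into its own binary, then passes
that signature to the lookup/delete call (in the same spirit as rte_hash_lookup_with_hash
and rte_hash_del_key_with_hash).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS   8u  /* hypothetical table size (power of two) */
#define TOY_KEY_LEN 4u  /* hypothetical fixed key length in bytes */

/* Plain data only: no function pointers are stored in the table. */
struct toy_table {
    uint8_t  used[TOY_SLOTS];
    uint32_t sig[TOY_SLOTS];
    uint8_t  key[TOY_SLOTS][TOY_KEY_LEN];
};

/* FNV-1a; each process calls its own copy of this function directly. */
static uint32_t toy_hash(const void *key, size_t len)
{
    const uint8_t *p = key;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Return the slot holding key, or -1. Probes every slot starting at sig. */
static int toy_lookup_with_hash(const struct toy_table *t, const void *key,
                                uint32_t sig)
{
    assert(t != NULL && key != NULL);
    for (uint32_t i = 0; i < TOY_SLOTS; i++) {
        uint32_t pos = (sig + i) & (TOY_SLOTS - 1u);

        if (t->used[pos] && t->sig[pos] == sig &&
            memcmp(t->key[pos], key, TOY_KEY_LEN) == 0)
            return (int)pos;
    }
    return -1;
}

/* Insert key under a caller-computed signature; return its slot or -1 if full. */
static int toy_add_with_hash(struct toy_table *t, const void *key, uint32_t sig)
{
    int found = toy_lookup_with_hash(t, key, sig);

    if (found >= 0)
        return found;
    for (uint32_t i = 0; i < TOY_SLOTS; i++) {
        uint32_t pos = (sig + i) & (TOY_SLOTS - 1u);

        if (!t->used[pos]) {
            t->used[pos] = 1;
            t->sig[pos] = sig;
            memcpy(t->key[pos], key, TOY_KEY_LEN);
            return (int)pos;
        }
    }
    return -1;
}

/* Delete key under a caller-computed signature; return its old slot or -1. */
static int toy_del_with_hash(struct toy_table *t, const void *key, uint32_t sig)
{
    int pos = toy_lookup_with_hash(t, key, sig);

    if (pos >= 0)
        t->used[pos] = 0;
    return pos;
}
```

In a primary/secondary setup, each process would compute sig = toy_hash(key, TOY_KEY_LEN)
itself and pass it to the add, lookup and delete calls, so nothing stored in the table
depends on where a function happens to live in one particular binary.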

Thanks,
Pablo

> 
> Program received signal SIGILL, Illegal instruction.
> 0x000000000048a0dd in rte_port_ring_reader_frag_free (port=0x7ffe113d4100)
>    at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_port/rte_port_frag.c:266
> 266            return -1;
> (gdb) bt
> #0  0x000000000048a0dd in rte_port_ring_reader_frag_free (port=0x7ffe113d4100)
>    at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_port/rte_port_frag.c:266
> #1  0x000000000049c537 in rte_hash_del_key (h=0x7ffe113d4100, key=0x7ffe092e1000)
>    at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:917
> #2  0x000000000043716a in onvm_ft_remove_key (table=0x7ffe113c3e80, key=0x7ffe092e1000)
>    at /home/zhangwei1984/onvm-shared-cpu/onvm/shared/onvm_flow_table.c:160
> #3  0x000000000043767e in onvm_flow_dir_del_and_free_key (key=0x7ffe092e1000)
>    at /home/zhangwei1984/onvm-shared-cpu/onvm/shared/onvm_flow_dir.c:144
> #4  0x0000000000437619 in onvm_flow_dir_del_key (key=0x7ffe092e1000)
>    at /home/zhangwei1984/onvm-shared-cpu/onvm/shared/onvm_flow_dir.c:128
> #5  0x0000000000423ded in remove_flow_rule (idx=3)
>    at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:130
> #6  0x0000000000423e44 in clear_stat_remove_flow_rule (nf_info=0x7fff3e652100)
>    at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:145
> #7  0x00000000004247e3 in alloc_nfs_install_flow_rule (services=0xd66e90 <services>, pkt=0x7ffe13f56400)
>    at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:186
> #8  0x0000000000424bdb in packet_handler (pkt=0x7ffe13f56400, meta=0x7ffe13f56440)
>    at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:294
> #9  0x000000000043001d in onvm_nf_run (info=0x7fff3e651b00, handler=0x424b21 <packet_handler>)
>    at /home/zhangwei1984/onvm-shared-cpu/onvm/onvm_nf/onvm_nflib.c:462
> #10 0x0000000000424cc2 in main (argc=3, argv=0x7fffffffe660)
>    at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:323
> 
> At 2016-03-23 03:53:43, "De Lara Guarch, Pablo"
> <pablo.de.lara.guarch at intel.com> wrote:
> >Hi Thomas,
> >
> >> -----Original Message-----
> >> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> >> Sent: Tuesday, March 22, 2016 11:42 AM
> >> To: De Lara Guarch, Pablo; Gonzalez Monroy, Sergio
> >> Cc: dev at dpdk.org; Dhana Eadala; Richardson, Bruce; Qiu, Michael
> >> Subject: Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in
> >> multi-process environment
> >>
> >> Hi,
> >>
> >> Pablo, Sergio, please could you help with this issue?
> >
> >I agree this is not the best way to fix this. I will try to have a fix
> >without having to use ifdefs.
> >
> >Thanks,
> >Pablo
> >>
> >> 2016-03-13 22:16, Dhana Eadala:
> >> > We found a problem when using dpdk-2.2 in a multi-process environment.
> >> > Here is a brief description of how we are using dpdk:
> >> >
> >> > We have two processes, proc1 and proc2, using dpdk. proc1 and proc2 are
> >> > two different compiled binaries.
> >> > proc1 is started as primary process and proc2 as secondary process.
> >> >
> >> > proc1:
> >> > Calls srcHash = rte_hash_create("src_hash_name") to create the rte_hash structure.
> >> > As part of this, the API initializes the rte_hash structure and sets
> >> > srcHash->rte_hash_cmp_eq to the address of memcmp() in the proc1 address space.
> >> >
> >> > proc2:
> >> > calls srcHash =  rte_hash_find_existing("src_hash_name").
> >> > This function call returns the rte_hash created by proc1.
> >> > This srcHash->rte_hash_cmp_eq still points to the address of
> >> > memcmp() from proc1 address space.
> >> > Later proc2  calls
> >> > rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
> >> > rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
> >> > which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
> >> > This leads to a crash, as h->rte_hash_cmp_eq is an address
> >> > from the proc1 address space and is an invalid address in the proc2 address space.
> >> >
> >> > We found, from dpdk documentation, that
> >> >
> >> > "
> >> >  The use of function pointers between multiple processes
> >> >  running based of different compiled
> >> >  binaries is not supported, since the location of a given function
> >> >  in one process may be different to
> >> >  its location in a second. This prevents the librte_hash library
> >> >  from behaving properly as in a  multi-
> >> >  threaded instance, since it uses a pointer to the hash function
> internally.
> >> >
> >> >  To work around this issue, it is recommended that
> >> >  multi-process applications perform the hash
> >> >  calculations by directly calling the hashing function
> >> >  from the code and then using the
> >> >  rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> >> >  instead of the functions which do
> >> >  the hashing internally, such as rte_hash_add()/rte_hash_lookup().
> >> > "
> >> >
> >> > We followed the recommended steps by invoking rte_hash_lookup_with_hash().
> >> > This was not an issue up to and including dpdk-2.0.
> >> > Later releases started crashing because rte_hash_cmp_eq was
> >> > introduced in dpdk-2.1.
> >> >
> >> > We fixed it with the following patch and would like to
> >> > submit the patch to dpdk.org.
> >> > The patch is written so that anyone who wants to use dpdk in a
> >> > multi-process environment where function pointers are not shared needs to
> >> > define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
> >> > Without this flag defined in the Makefile, it works as it does now.
> >>
> >> Introducing #ifdef RTE_LIB_MP_NO_FUNC_PTR is not recommended.
> >

