[dpdk-dev] rte_hash_del_key crash in multi-process environment

张伟 zhangwqh at 126.com
Tue Apr 19 06:58:00 CEST 2016


Hi all, 


I previously hit a bug in a multi-process environment when calling rte_hash_lookup_with_hash, and Dhana's patch fixed that problem. Now I need to remove a flow in the same multi-process environment, and the system crashes when calling rte_hash_del_key. The gdb trace follows. Has anybody else met this problem, or does anybody know how to fix it?




Program received signal SIGILL, Illegal instruction.
0x000000000048a0dd in rte_port_ring_reader_frag_free (port=0x7ffe113d4100) at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_port/rte_port_frag.c:266
266            return -1;
(gdb) bt
#0  0x000000000048a0dd in rte_port_ring_reader_frag_free (port=0x7ffe113d4100) at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_port/rte_port_frag.c:266
#1  0x000000000049c537 in rte_hash_del_key (h=0x7ffe113d4100, key=0x7ffe092e1000) at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:917
#2  0x000000000043716a in onvm_ft_remove_key (table=0x7ffe113c3e80, key=0x7ffe092e1000) at /home/zhangwei1984/onvm-shared-cpu/onvm/shared/onvm_flow_table.c:160
#3  0x000000000043767e in onvm_flow_dir_del_and_free_key (key=0x7ffe092e1000) at /home/zhangwei1984/onvm-shared-cpu/onvm/shared/onvm_flow_dir.c:144
#4  0x0000000000437619 in onvm_flow_dir_del_key (key=0x7ffe092e1000) at /home/zhangwei1984/onvm-shared-cpu/onvm/shared/onvm_flow_dir.c:128
#5  0x0000000000423ded in remove_flow_rule (idx=3) at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:130
#6  0x0000000000423e44 in clear_stat_remove_flow_rule (nf_info=0x7fff3e652100) at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:145
#7  0x00000000004247e3 in alloc_nfs_install_flow_rule (services=0xd66e90 <services>, pkt=0x7ffe13f56400) at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:186
#8  0x0000000000424bdb in packet_handler (pkt=0x7ffe13f56400, meta=0x7ffe13f56440) at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:294
#9  0x000000000043001d in onvm_nf_run (info=0x7fff3e651b00, handler=0x424b21 <packet_handler>) at /home/zhangwei1984/onvm-shared-cpu/onvm/onvm_nf/onvm_nflib.c:462
#10 0x0000000000424cc2 in main (argc=3, argv=0x7fffffffe660) at /home/zhangwei1984/onvm-shared-cpu/examples/flow_dir/flow_dir.c:323
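
One thing I may try, following the workaround in the DPDK documentation quoted further down, is to compute the hash in my own process and pass it to rte_hash_del_key_with_hash() instead of calling rte_hash_del_key(). I have not confirmed that the stored hash function pointer is what crashes here, so this is only a guess. The self-contained sketch below shows the calling pattern on a toy table. Every name in it (toy_table, toy_hash, toy_add_with_hash, toy_del_with_hash) is hypothetical and is not part of librte_hash.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS   8
#define TOY_KEY_LEN 4

/* Hypothetical shared table: plain data only, no function pointers. */
struct toy_table {
    uint8_t  used[TOY_SLOTS];
    uint32_t sig[TOY_SLOTS];
    uint8_t  key[TOY_SLOTS][TOY_KEY_LEN];
};

/* FNV-1a hash. Each process calls its own compiled copy of this function. */
static uint32_t toy_hash(const void *key, size_t len)
{
    const uint8_t *p = key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Insert using a caller-supplied signature (no duplicate check).
 * Returns the slot index, or -1 if the table is full. */
static int toy_add_with_hash(struct toy_table *t, const void *key, uint32_t sig)
{
    assert(t != NULL && key != NULL);
    for (uint32_t n = 0; n < TOY_SLOTS; n++) {
        uint32_t i = (sig + n) % TOY_SLOTS;
        if (!t->used[i]) {
            t->used[i] = 1;
            t->sig[i] = sig;
            memcpy(t->key[i], key, TOY_KEY_LEN);
            return (int)i;
        }
    }
    return -1;
}

/* Delete using a caller-supplied signature; the table never hashes the key.
 * Returns the freed slot index, or -1 if the key is not present. */
static int toy_del_with_hash(struct toy_table *t, const void *key, uint32_t sig)
{
    assert(t != NULL && key != NULL);
    for (uint32_t n = 0; n < TOY_SLOTS; n++) {
        uint32_t i = (sig + n) % TOY_SLOTS;
        if (t->used[i] && t->sig[i] == sig &&
            memcmp(t->key[i], key, TOY_KEY_LEN) == 0) {
            t->used[i] = 0;
            return (int)i;
        }
    }
    return -1;
}
```

In my code the equivalent change would be to hash the key locally in onvm_ft_remove_key and hand the result to rte_hash_del_key_with_hash(). The key comparison inside the library would still rely on Dhana's patch.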








At 2016-03-23 03:53:43, "De Lara Guarch, Pablo" <pablo.de.lara.guarch at intel.com> wrote:
>Hi Thomas,
>
>> -----Original Message-----
>> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
>> Sent: Tuesday, March 22, 2016 11:42 AM
>> To: De Lara Guarch, Pablo; Gonzalez Monroy, Sergio
>> Cc: dev at dpdk.org; Dhana Eadala; Richardson, Bruce; Qiu, Michael
>> Subject: Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
>> 
>> Hi,
>> 
>> Pablo, Sergio, please could you help with this issue?
>
>I agree this is not the best way to fix this. I will try to have a fix without having to use ifdefs.
>
>Thanks,
>Pablo
>> 
>> 2016-03-13 22:16, Dhana Eadala:
>> > We found a problem when using dpdk-2.2 in a multi-process environment.
>> > Here is a brief description of how we are using dpdk:
>> >
>> > We have two processes, proc1 and proc2, using dpdk. proc1 and proc2
>> > are two different compiled binaries.
>> > proc1 is started as the primary process and proc2 as a secondary process.
>> >
>> > proc1:
>> > Calls srcHash = rte_hash_create("src_hash_name") to create the rte_hash
>> > structure.
>> > As part of this, the API initializes the rte_hash structure and sets
>> > srcHash->rte_hash_cmp_eq to the address of memcmp() in the proc1
>> > address space.
>> >
>> > proc2:
>> > calls srcHash =  rte_hash_find_existing("src_hash_name").
>> > This function call returns the rte_hash created by proc1.
>> > This srcHash->rte_hash_cmp_eq still points to the address of
>> > memcmp() from proc1 address space.
>> > Later proc2  calls
>> > rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
>> > rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
>> > which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
>> > This leads to a crash because h->rte_hash_cmp_eq holds an address
>> > from the proc1 address space, which is an invalid address in the
>> > proc2 address space.
>> >
>> > We found, from dpdk documentation, that
>> >
>> > "
>> >  The use of function pointers between multiple processes
>> >  running based of different compiled
>> >  binaries is not supported, since the location of a given function
>> >  in one process may be different to
>> >  its location in a second. This prevents the librte_hash library
>> >  from behaving properly as in a  multi-
>> >  threaded instance, since it uses a pointer to the hash function internally.
>> >
>> >  To work around this issue, it is recommended that
>> >  multi-process applications perform the hash
>> >  calculations by directly calling the hashing function
>> >  from the code and then using the
>> >  rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
>> >  instead of the functions which do
>> >  the hashing internally, such as rte_hash_add()/rte_hash_lookup().
>> > "
>> >
>> > We did follow the recommended steps by invoking
>> > rte_hash_lookup_with_hash().
>> > This caused no issue up to and including dpdk-2.0.
>> > Later releases started crashing because rte_hash_cmp_eq was
>> > introduced in dpdk-2.1.
>> >
>> > We fixed it with the following patch and would like to
>> > submit the patch to dpdk.org.
>> > Patch is created such that, if anyone wanted to use dpdk in
>> > multi-process environment with function pointers not shared, they need to
>> > define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
>> > Without defining this flag in Makefile, it works as it is now.
>> 
>> Introducing #ifdef RTE_LIB_MP_NO_FUNC_PTR is not recommended.
>
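
On avoiding the #ifdef: one possible direction, given only as a sketch and not as the fix Pablo has in mind, is to keep an integer selector in the shared structure and have each process resolve it through its own table of compare functions. The shared memory then holds no code addresses. All names below (cmp_kind, shared_table, cmp_table, table_key_equal) are hypothetical and do not reflect the librte_hash layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Selector stored in shared memory instead of a function pointer. */
enum cmp_kind { CMP_GENERIC = 0, CMP_KEY16, CMP_COUNT };

/* Hypothetical shared structure: plain data, valid in every process. */
struct shared_table {
    enum cmp_kind cmp_id;
    size_t key_len;
};

typedef int (*cmp_fn)(const void *a, const void *b, size_t len);

/* Generic compare: returns 0 when the two keys are equal. */
static int cmp_generic(const void *a, const void *b, size_t len)
{
    return memcmp(a, b, len) != 0;
}

/* Fixed-size compare for 16-byte keys; the len argument is ignored. */
static int cmp_key16(const void *a, const void *b, size_t len)
{
    (void)len;
    return memcmp(a, b, 16) != 0;
}

/* Per-process dispatch table. Each binary fills it with its own addresses,
 * so nothing in it is ever shared between processes. */
static const cmp_fn cmp_table[CMP_COUNT] = { cmp_generic, cmp_key16 };

/* Returns 1 if the keys are equal, 0 otherwise. */
static int table_key_equal(const struct shared_table *t,
                           const void *a, const void *b)
{
    assert((unsigned)t->cmp_id < (unsigned)CMP_COUNT);
    return cmp_table[t->cmp_id](a, b, t->key_len) == 0;
}
```

With this layout the primary only writes cmp_id when creating the table, and a secondary built from a different binary still reaches a valid function through its own cmp_table.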

