Our application uses DPDK with two processes: a primary DPDK process and a secondary DPDK process. The primary process calls all of the DPDK initialization routines (rte_eal_init, dev_configure, rx/tx queue setup and dev_start), while the secondary process calls only rte_eal_init.

In DPDK, rte_eth_dev->data->rx_queues and rte_eth_dev->data->tx_queues are data structures shared between the primary and secondary processes. In the iavf PMD, each rxq (an entry in rx_queues above) and each txq (an entry in tx_queues above) holds a pointer to a function (e.g. rx_queues[index]->ops->release_mbufs) that is invoked during rte_eth_dev_stop. A call to iavf_set_rx_function modifies this function pointer, i.e. release_mbufs.

The function pointer is first set by the primary process, to an address in its own address space: rte_eth_dev_start() -> iavf_init_queues() -> iavf_set_rx_function(). Later the secondary process overwrites it with an address from its own address space: rte_eal_init() -> iavf_dev_init() -> iavf_set_rx_function(). That address is invalid in the primary process's address space.

During application shutdown we call rte_eth_dev_stop from the primary process, which invokes the release_mbufs function. Because the release_mbufs function pointer now holds an address that is invalid in the primary process, the primary process always crashes.

Note: This bug will also be observed in other PMDs, such as ice and ixgbe, which use similar code/design.
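The failure described above comes from storing a process-specific code address in memory that both processes map. Below is a minimal sketch of one way to avoid that, assuming the shared queue stores only a small selector value and each process resolves it through its own table of functions. All names here (shared_rxq, rel_mbufs_type, release_ops, set_rx_function, stop_queue) are hypothetical and chosen for illustration; this is not the iavf driver's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector stored in shared memory instead of a function pointer. */
enum rel_mbufs_type {
    REL_MBUFS_DEFAULT = 0,
    REL_MBUFS_VEC,
    REL_MBUFS_COUNT
};

/* Hypothetical stand-in for a queue that lives in memory mapped by both processes. */
struct shared_rxq {
    enum rel_mbufs_type rel_type; /* plain data: means the same in every process */
    unsigned int nb_mbufs;        /* illustrative count of buffers held by the queue */
};

/* Call counters, used only to make the dispatch observable in this sketch. */
unsigned int default_calls;
unsigned int vec_calls;

void release_mbufs_default(struct shared_rxq *q)
{
    q->nb_mbufs = 0;
    default_calls++;
}

void release_mbufs_vec(struct shared_rxq *q)
{
    q->nb_mbufs = 0;
    vec_calls++;
}

typedef void (*release_fn)(struct shared_rxq *);

/*
 * Process-local table. It is part of each process's own image, so the
 * addresses in it are always valid for the process that reads them.
 */
const release_fn release_ops[REL_MBUFS_COUNT] = {
    [REL_MBUFS_DEFAULT] = release_mbufs_default,
    [REL_MBUFS_VEC]     = release_mbufs_vec,
};

/* Either process may call this: it writes only the selector, never an address. */
void set_rx_function(struct shared_rxq *q, int use_vec)
{
    q->rel_type = use_vec ? REL_MBUFS_VEC : REL_MBUFS_DEFAULT;
}

/* Resolve the selector through the caller's own table; reject bad values. */
int stop_queue(struct shared_rxq *q)
{
    if ((unsigned int)q->rel_type >= REL_MBUFS_COUNT)
        return -1;
    release_ops[q->rel_type](q);
    return 0;
}
```

With this layout, a secondary process calling set_rx_function() can change which release routine is selected, but whichever process later calls stop_queue() looks the routine up in its own release_ops table, so no address from another process is ever dereferenced.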
This issue can be reproduced even with DPDK multi process application.

=======================================================================
[root@dpdk /]# /boot/examples/dpdk-mp_server -l 2-3 -n 4 --allow 0000:00:0f.0 --allow 0000:00:0d.0 --proc-type=primary -- -p 0x3 -n 1
.......
.......
.......
PORTS
-----
Port 0: 'FA:16:42:B2:E4:70'
Port 1: 'FA:16:42:68:9A:7C'

Port 0 - rx: 1500 tx: 650
Port 1 - rx: 1111 tx: 845

CLIENTS
-------
Client 0 - rx: 1495, rx_drop: 1116 tx: 1495, tx_drop: 0
^CSegmentation fault (core dumped)
=======================================================================
[root@dpdk /]# /boot/examples/dpdk-mp_client -l 4-5 -n 4 --allow 0000:00:0f.0 --allow 0000:00:0d.0 --proc-type=secondary -- -n 0

================This is the crash file bt in gdb===================
(gdb) bt
#0 0x0000000000a96477 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) file /home/dpdk/examples/dpdk-mp_server
Reading symbols from /home/dpdk/examples/dpdk-mp_server...done.
(gdb) bt
#0 0x0000000000a96477 in i40e_flow_parse_fdir_filter () at dpdk/drivers/net/i40e/i40e_flow.c:3272
#1 0x0000000000ab983d in iavf_stop_queues () at dpdk/drivers/net/iavf/iavf_rxtx.c:1036
#2 0x0000000000591aa1 in iavf_dev_stop (dev=0x15a0240 <rte_eth_devices>) at dpdk/drivers/net/iavf/iavf_ethdev.c:1019
#3 0x00000000008d7900 in rte_eth_dev_stop () at dpdk/lib/ethdev/rte_ethdev.c:1883
#4 0x00000000006a2a44 in signal_handler (signal=<optimized out>) at dpdk/examples/multi_process/client_server_mp/mp_server/main.c:284
#5 0x00007ffff648cb80 in ?? ()
#6 0x0000000000000007 in ?? ()
#7 0x0000000000000000 in ?? ()
(gdb)
There may be an issue in the example code, but in doubt, assigning to Beilei.