Bug 194

Summary: vhost pmd has become unusable from secondary processes.
Product: DPDK
Reporter: Itsuro Oda (oda)
Component: vhost/virtio
Assignee: Maxime Coquelin (maxime.coquelin)
Status: CONFIRMED ---
Severity: critical
CC: ajit.khaparde, anatoly.burakov, yasufum.o
Priority: Normal    
Version: 18.11   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: GDB log of inspecting rte_eth_devices in primary and secondary

Description Itsuro Oda 2019-01-21 06:45:06 CET
This problem was discovered when trying to migrate the base of SPP (http://git.dpdk.org/apps/spp/) from DPDK v18.08 to v18.11.
In SPP, secondary processes attach (rte_eal_hotplug_add) and use vhost pmd (e.g. devargs: "eth_vhost0,iface=/tmp/sock0,queues=1,client=1").
This worked without problems under DPDK v18.08, but secondary processes crash under v18.11.

Investigation showed that the direct cause of the crash is that no value is set (i.e. a null pointer) for the [rt]x_pkt_burst members of the vhost device's rte_eth_dev.
Indeed, there is no place in the code that sets them.
(Is this comment related to something?
https://github.com/DPDK/dpdk/blob/master/drivers/net/vhost/rte_eth_vhost.c#L1352 )
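To illustrate the failure mode described above, here is a minimal sketch in plain C. It is a toy model, not DPDK code: every `toy_*` type and function is a hypothetical stand-in, assuming only that the burst function pointers live in a per-process structure while the device data is shared. The secondary-side attach hooks up the shared data but leaves the burst pointers null, so calling through them without a check would crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these types mimic the split between per-process and
 * shared ethdev state. They are NOT the real DPDK definitions. */
typedef uint16_t (*toy_burst_fn)(void *queue, void **pkts, uint16_t n);

struct toy_shared_data {        /* would live in shared memory */
    char name[32];
    uint16_t nb_queues;
};

struct toy_eth_dev {            /* per-process: function pointers are local */
    toy_burst_fn rx_pkt_burst;
    toy_burst_fn tx_pkt_burst;
    struct toy_shared_data *data;
};

/* Stand-in burst handlers: pretend every requested packet was handled. */
static uint16_t toy_vhost_rx(void *queue, void **pkts, uint16_t n)
{
    (void)queue; (void)pkts;
    return n;
}

static uint16_t toy_vhost_tx(void *queue, void **pkts, uint16_t n)
{
    (void)queue; (void)pkts;
    return n;
}

/* Primary-side probe: sets the shared data and the local burst pointers. */
void toy_probe_primary(struct toy_eth_dev *dev, struct toy_shared_data *shared)
{
    assert(dev != NULL && shared != NULL);
    dev->data = shared;
    dev->rx_pkt_burst = toy_vhost_rx;
    dev->tx_pkt_burst = toy_vhost_tx;
}

/* Secondary-side attach modelled on the report: only the shared data is
 * hooked up, and the burst pointers stay NULL. */
void toy_attach_secondary_buggy(struct toy_eth_dev *dev,
                                struct toy_shared_data *shared)
{
    assert(dev != NULL && shared != NULL);
    dev->data = shared;
    dev->rx_pkt_burst = NULL;
    dev->tx_pkt_burst = NULL;
}

/* Guarded receive: returns -1 instead of calling through a NULL pointer. */
int toy_rx_burst_checked(struct toy_eth_dev *dev, void **pkts, uint16_t n)
{
    if (dev->rx_pkt_burst == NULL)
        return -1;
    return dev->rx_pkt_burst(NULL, pkts, n);
}
```

In this model, a device set up with toy_probe_primary() receives normally, while one set up with toy_attach_secondary_buggy() sees the same shared data but toy_rx_burst_checked() reports -1, which corresponds to the null [rt]x_pkt_burst observed in the secondary process.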

In addition, even if the values were set, it would not work.
This is because eth_vhost_[rt]x refers to vid, which is the index into vhost_devices
(https://github.com/DPDK/dpdk/blob/master/lib/librte_vhost/vhost.c#L28).
vhost_devices is per-process data (i.e. not shared data). Under v18.11 (unlike v18.08), only the primary process uses vhost_devices; it is not accessed from secondary processes.
Perhaps some fix, such as making vhost_devices shared data, is necessary.
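The vid problem can be sketched with another toy model in plain C. Again, the `toy_*` names and the table size are hypothetical and only assume what is stated above: vid is an index into a per-process table. A vid allocated in one process's table resolves to nothing in another process's table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_VHOST_DEVICE 8  /* arbitrary size chosen for this sketch */

struct toy_virtio_net {
    int vid;
};

/* Each process owns its own table, so it is not shared between them. */
struct toy_process {
    struct toy_virtio_net *vhost_devices[TOY_MAX_VHOST_DEVICE];
};

/* Register a device in this process's table; the free slot index is the vid.
 * Returns -1 if the table is full. */
int toy_vhost_new_device(struct toy_process *proc, struct toy_virtio_net *dev)
{
    assert(proc != NULL && dev != NULL);
    for (int i = 0; i < TOY_MAX_VHOST_DEVICE; i++) {
        if (proc->vhost_devices[i] == NULL) {
            proc->vhost_devices[i] = dev;
            dev->vid = i;
            return i;
        }
    }
    return -1;
}

/* Look up a vid in this process's table; NULL if out of range or unset. */
struct toy_virtio_net *toy_get_device(const struct toy_process *proc, int vid)
{
    if (vid < 0 || vid >= TOY_MAX_VHOST_DEVICE)
        return NULL;
    return proc->vhost_devices[vid];
}
```

If the "primary" toy_process registers a device and obtains vid 0, toy_get_device() on a separate "secondary" toy_process returns NULL for that same vid, so a burst function that starts from vid has nothing to work with there.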
Comment 1 Ajit Khaparde 2019-01-22 21:38:55 CET
Maxime, can you take a look at this? Thanks
Comment 2 Ajit Khaparde 2019-03-16 14:57:33 CET
Maxime, any update?
oda@valinux.co.jp, is this still an issue?
Comment 3 Yasufumi Ogawa 2019-07-09 13:55:11 CEST
Created attachment 49
GDB log of inspecting rte_eth_devices in primary and secondary
Comment 4 Yasufumi Ogawa 2019-07-09 13:55:50 CEST
Hi Maxime and Ajit,

This issue still remains in v19.05. I am attaching the result of inspecting which members of `rte_eth_devices` are shared between primary and secondary. The `rte_eth_devices[4]` entry is a vhost device added with rte_eal_hotplug_add(). Its members appear to be set properly in the secondary, except for `rx_pkt_burst` and `tx_pkt_burst`, both of which are `0x0`. Thanks