Bug 507
Summary: | virtio perf decrease and interrupt abnormal | ||
---|---|---|---|
Product: | DPDK | Reporter: | xiaoqimai (qimaix.xiao) |
Component: | vhost/virtio | Assignee: | Maxime Coquelin (maxime.coquelin) |
Status: | IN_PROGRESS --- | ||
Severity: | normal | CC: | ajit.khaparde, maxime.coquelin |
Priority: | Normal | ||
Version: | 20.08 | ||
Target Milestone: | --- | ||
Hardware: | x86 | ||
OS: | Linux |
Description
xiaoqimai
2020-07-16 09:16:07 CEST
And also, this commit will affect virtio interrupt function, makes lcore can't be wakeup when handling pkts. test steps as below: Test Case 2: wake up vhost-user cores with event idx interrupt mode 16 queues test ================================================================================== 1. Launch l3fwd-power example app with client mode:: ./l3fwd-power -l 1-16 \ -n 4 --socket-mem 1024,1024 --legacy-mem --no-pci\ --log-level=9 \ --vdev 'eth_vhost0,iface=/vhost-net0,queues=16,client=1' \ – -p 0x1 \ --parse-ptype 1 \ --config "(0,0,1),(0,1,2),(0,2,3),(0,3,4),(0,4,5),(0,5,6),(0,6,7),(0,7,8),(0,8,9),(0,9,10),(0,10,11),(0,11,12),(0,12,13),(0,13,14),(0,14,15),(0,15,16)" 2. Launch VM1 with server mode:: taskset c 17-18 qemu-system-x86_64 -name us-vhost-vm1 -cpu host -enable-kvm -m 2048 -object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on -numa node,memdev=mem -mem-prealloc -netdev user,id=yinan,hostfwd=tcp:127.0.0.1:6005:22 -device e1000,netdev=yinan -smp cores=16,sockets=1 -drive file=/home/osimg/ubuntu16.img -chardev socket,server,id=char0,path=/vhost-net0 -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=16 -device virtio-net-pci,mac=52:54:00:00:00:01,netdev=mynet1,csum=on,mq=on,vectors=40 -vnc :10 -daemonize 3. Relauch l3fwd-power sample for port up:: ./l3fwd-power -l 1-16 \ -n 4 --socket-mem 1024,1024 --legacy-mem --no-pci\ --log-level=9 \ --vdev 'eth_vhost0,iface=/vhost-net0,queues=16,client=1' \ – -p 0x1 \ --parse-ptype 1 \ --config "(0,0,1),(0,1,2),(0,2,3),(0,3,4),(0,4,5),(0,5,6),(0,6,7),(0,7,8),(0,8,9),(0,9,10),(0,10,11),(0,11,12),(0,12,13),(0,13,14),(0,14,15),(0,15,16)" 3. Set vitio-net with 16 quques and give vitio-net ip address:: ethtool -L [ens3] combined 16 # [ens3] is the name of virtio-net ifconfig [ens3] 1.1.1.1 4. Send packets with different IPs from virtio-net, notice to bind each vcpu to different send packets process:: taskset -c 0 ping 1.1.1.2 taskset -c 1 ping 1.1.1.3 taskset -c 2 ping 1.1.1.4 taskset -c 3 ping 1.1.1.5 taskset -c 4 ping 1.1.1.6 taskset -c 5 ping 1.1.1.7 taskset -c 6 ping 1.1.1.8 taskset -c 7 ping 1.1.1.9 taskset -c 8 ping 1.1.1.2 taskset -c 9 ping 1.1.1.2 taskset -c 10 ping 1.1.1.2 taskset -c 11 ping 1.1.1.2 taskset -c 12 ping 1.1.1.2 taskset -c 13 ping 1.1.1.2 taskset -c 14 ping 1.1.1.2 taskset -c 15 ping 1.1.1.2 5. Check vhost related cores are waked up with l3fwd-power log, such as following:: L3FWD_POWER: lcore 0 is waked up from rx interrupt on port 0 queue 0 ..... ..... L3FWD_POWER: lcore 15 is waked up from rx interrupt on port 0 queue 15 about performance drop issue, some other virtio path can also be affected by this commit, such as path=./vhost-net,queues=8,packed_vq=1,mrg_rxbuf=0,in_order=0 and path=./vhost-net,queues=8,packed_vq=1,mrg_rxbuf=1,in_order=1 I now understand what happens in Vhost PMD. In the test case I managed to reproduce, rx interrupts are never enabled. The problem is that in new device callback, rte_vhost_enable_guest_notification() calls to disable notifications fail for RX queues because vq->device_event is NULL. The patch Matan prepared only calls rte_vhost_enable_guest_notification when RX interrupts are enabled in the the eth dev, and so is not called in this test case. By removing this limitation, and also with removing the access_lock lock for testing purpose, I get the performance back to prior the readiness series. See below for the debugging patch. I'm going to implement what I suggested in Matan's patch review yesterday, which is making rte_vhost_enable_guest_notification to save the notif enable value in the vq metadata, and apply it when the vq becomes ready. That would solve our issue. I'm not sure yet it will fix the other issues reported, I'll need a bit more time to try and reproduce them. Thanks, Maxime diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index bbf79b2c0..44ad23078 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -851,24 +851,25 @@ vring_conf_update(int vid, struct rte_eth_dev *eth_dev, uint16_t vring_id) */ if (rx_idx >= 0 && rx_idx < eth_dev->data->nb_rx_queues && rte_atomic32_read(&internal->dev_attached) && - rte_atomic32_read(&internal->started) && - dev_conf->intr_conf.rxq) { + rte_atomic32_read(&internal->started)) { vq = eth_dev->data->rx_queues[rx_idx]; - ret = rte_vhost_get_vhost_vring(vid, vring_id, &vring); - if (!ret) { - if (vring.kickfd != - eth_dev->intr_handle->efds[rx_idx]) { - VHOST_LOG(INFO, - "kickfd for rxq-%d was changed.\n", - rx_idx); - eth_dev->intr_handle->efds[rx_idx] = - vring.kickfd; + if (dev_conf->intr_conf.rxq) { + ret = rte_vhost_get_vhost_vring(vid, vring_id, &vring); + if (!ret) { + if (vring.kickfd != + eth_dev->intr_handle->efds[rx_idx]) { + VHOST_LOG(INFO, + "kickfd for rxq-%d was changed.\n", + rx_idx); + eth_dev->intr_handle->efds[rx_idx] = + vring.kickfd; + } } - - rte_vhost_enable_guest_notification(vid, vring_id, - vq->intr_en); - rte_wmb(); } + + rte_vhost_enable_guest_notification(vid, vring_id, + vq->intr_en); + rte_wmb(); } return ret; diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 14b3e253e..3e28dc179 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -1323,14 +1323,12 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) vq = dev->virtqueue[queue_id]; - rte_spinlock_lock(&vq->access_lock); if (vq_is_packed(dev)) ret = vhost_enable_notify_packed(dev, vq, enable); else ret = vhost_enable_notify_split(dev, vq, enable); - rte_spinlock_unlock(&vq->access_lock); return ret; } Maxime, Is this taken care of? Thanks |