[dpdk-dev] [PATCH 0/2] vhost: IOTLB fixes for -rc1

Maxime Coquelin maxime.coquelin at redhat.com
Thu Oct 12 17:38:48 CEST 2017

These two patches fix issues seen when running the VM on a different
NUMA socket than DPDK.

In this case, the numa_realloc() function is called to reallocate
the virtqueue and the virtio-net device structs on the VM's socket.

The problem is that doing so corrupts the IOTLB cache list: the list
head is reallocated, but the first entry in the list is not updated
to point to the new list head. As a result, every new IOTLB entry that
needs to be inserted before the first entry in the list is leaked,
because the new head still points to the entry that was first at the
time of the realloc.

Patch 2 addresses this issue by re-initializing the IOTLB cache
completely. Doing so also re-creates the IOTLB mempool on the new socket.
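
To illustrate the list corruption and why re-initializing the head avoids
it, here is a minimal standalone sketch. It assumes a BSD TAILQ-style list
from <sys/queue.h>; the names fake_vq, entry and insert_after_move are
hypothetical and are not taken from librte_vhost. The struct copy stands in
for numa_realloc(), and the old copy is kept alive so the demo does not
touch freed memory.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the vhost structures; names are illustrative. */
struct entry {
	TAILQ_ENTRY(entry) next;
	int id;
};

TAILQ_HEAD(entry_list, entry);

struct fake_vq {
	struct entry_list list;	/* list head embedded in the struct */
};

/* Returns true if e can be found by walking the list from vq's head. */
static bool reachable(struct fake_vq *vq, const struct entry *e)
{
	struct entry *it;

	TAILQ_FOREACH(it, &vq->list, next)
		if (it == e)
			return true;
	return false;
}

/*
 * Mimic the relocation: copy the struct holding the head byte for byte,
 * then insert a node before the first entry. With reinit set, the head is
 * re-initialized at its new address first. The entry is re-inserted here
 * only to keep both cases comparable; this is a choice of the demo.
 */
static bool insert_after_move(bool reinit)
{
	/* static: the old copy stays valid, so no use-after-free in the demo */
	static struct fake_vq old_vq, new_vq;
	static struct entry first, added;

	TAILQ_INIT(&old_vq.list);
	first.id = 1;
	TAILQ_INSERT_TAIL(&old_vq.list, &first, next);

	/* first.next.tqe_prev still points into old_vq after this copy */
	memcpy(&new_vq, &old_vq, sizeof(new_vq));

	if (reinit) {
		TAILQ_INIT(&new_vq.list);
		TAILQ_INSERT_TAIL(&new_vq.list, &first, next);
	}

	added.id = 0;
	/* Writes through first.next.tqe_prev, i.e. into whichever head it names */
	TAILQ_INSERT_BEFORE(&first, &added, next);

	return reachable(&new_vq, &added);
}
```

Without the re-initialization, the insertion updates the old head, so the
new entry cannot be reached from the relocated head. With it, the entry is
linked from the new head as expected.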

This first issue helped to highlight a deadlock that patch 1 fixes.
As inserting an entry before the first entry in the list resulted in a
leak, it ended up flooding Qemu with IOTLB misses for the same address.

The deadlock happens because an optimization was done to take the iotlb
cache lock once per packet burst instead of once per translation. This
means that when an IOTLB miss is sent, it is sent with the lock held.

The problem is that sending an IOTLB miss can block if the socket buffer
is full, and this buffer is emptied by the same Qemu thread which is
waiting for an IOTLB update to be completed. But it never completes
because DPDK waits for the iotlb lock to insert the update into the
iotlb cache, hence the deadlock.

The fix consists in just unlocking the iotlb lock while sending the
IOTLB miss, which is safe as it does not access the iotlb list the
lock protects.

Maxime Coquelin (2):
  vhost: fix deadlock on IOTLB miss
  vhost: fix IOTLB on NUMA realloc

 lib/librte_vhost/iotlb.c      |  1 -
 lib/librte_vhost/vhost.c      | 12 ++++++++++++
 lib/librte_vhost/vhost_user.c |  3 +++
 3 files changed, 15 insertions(+), 1 deletion(-)

