[dpdk-dev] [PATCH] vhost: fix iotlb pool out-of-memory handling

Tiwei Bie tiwei.bie at intel.com
Fri Jan 26 09:03:16 CET 2018


On Thu, Jan 25, 2018 at 04:06:53PM +0100, Maxime Coquelin wrote:
> In the unlikely case the IOTLB memory pool runs out of memory,
> an issue arises if all entries are used by the IOTLB cache and
> an IOTLB miss happens: if the IOTLB pending list is empty, no
> memory is freed and the allocation fails a second time.
> 
> This patch fixes the issue by randomly evicting an entry from
> the IOTLB cache when the IOTLB pending list is empty, ensuring
> the second allocation attempt will succeed.
> 
> In the same spirit, the opposite is done when inserting an
> IOTLB entry into the IOTLB cache fails due to an out-of-memory
> condition. In this case, the IOTLB pending list is flushed if
> the IOTLB cache is empty, to ensure the new entry can be
> inserted.
> 
> Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
> Fixes: f72c2ad63aeb ("vhost: add pending IOTLB miss request list and helpers")
> 
[...]
> @@ -95,9 +98,11 @@ vhost_user_iotlb_pending_insert(struct vhost_virtqueue *vq,
>  
>  	ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
>  	if (ret) {
> -		RTE_LOG(INFO, VHOST_CONFIG,
> -				"IOTLB pool empty, clear pending misses\n");
> -		vhost_user_iotlb_pending_remove_all(vq);
> +		RTE_LOG(INFO, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
> +		if (!TAILQ_EMPTY(&vq->iotlb_pending_list))
> +			vhost_user_iotlb_pending_remove_all(vq);
> +		else
> +			vhost_user_iotlb_cache_random_evict(vq);
>  		ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
>  		if (ret) {
>  			RTE_LOG(ERR, VHOST_CONFIG, "IOTLB pool still empty, failure\n");
> @@ -186,8 +191,11 @@ vhost_user_iotlb_cache_insert(struct vhost_virtqueue *vq, uint64_t iova,
>  
>  	ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
>  	if (ret) {
> -		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, evict one entry\n");
> -		vhost_user_iotlb_cache_random_evict(vq);
> +		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");

Maybe we should use the same log level for both cases?
Currently, the first one is INFO, and this one is DEBUG.

> +		if (!TAILQ_EMPTY(&vq->iotlb_list))
> +			vhost_user_iotlb_cache_random_evict(vq);
> +		else
> +			vhost_user_iotlb_pending_remove_all(vq);

An IOTLB entry will be invalidated explicitly if it's
unmapped in the frontend, so all the entries in the
IOTLB cache are supposed to be valid. I think it may
therefore be better to always prefer freeing memory
from the IOTLB pending list. Something like:

	if (TAILQ_EMPTY(&vq->iotlb_list))
		vhost_user_iotlb_pending_remove_all(vq);
	else if (TAILQ_EMPTY(&vq->iotlb_pending_list))
		vhost_user_iotlb_cache_random_evict(vq);
	else
		vhost_user_iotlb_pending_random_evict(vq);
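
Note that vhost_user_iotlb_pending_random_evict() doesn't exist
yet, so it would need to be added alongside the other helpers.
A minimal sketch of what I have in mind, assuming it mirrors
vhost_user_iotlb_cache_random_evict(); since the pending list
has no equivalent of iotlb_cache_nr, the sketch counts the
entries first:

static void
vhost_user_iotlb_pending_random_evict(struct vhost_virtqueue *vq)
{
	struct vhost_iotlb_entry *node, *temp_node;
	int nr_entries = 0, entry_idx;

	rte_rwlock_write_lock(&vq->iotlb_pending_lock);

	/* No counter is kept for pending entries, count them first. */
	TAILQ_FOREACH(node, &vq->iotlb_pending_list, next)
		nr_entries++;

	if (nr_entries) {
		entry_idx = rte_rand() % nr_entries;

		TAILQ_FOREACH_SAFE(node, &vq->iotlb_pending_list, next,
				temp_node) {
			if (!entry_idx) {
				TAILQ_REMOVE(&vq->iotlb_pending_list,
						node, next);
				rte_mempool_put(vq->iotlb_pool, node);
				break;
			}
			entry_idx--;
		}
	}

	rte_rwlock_write_unlock(&vq->iotlb_pending_lock);
}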

Besides, in __vhost_iova_to_vva():

uint64_t
__vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
		    uint64_t iova, uint64_t size, uint8_t perm)
{
	uint64_t vva, tmp_size;

	if (unlikely(!size))
		return 0;

	tmp_size = size;

	vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
	if (tmp_size == size)
		return vva;

	if (!vhost_user_iotlb_pending_miss(vq, iova + tmp_size, perm)) {
		/*
		 * iotlb_lock is read-locked for a full burst,
		 * but it only protects the iotlb cache.
		 * In case of IOTLB miss, we might block on the socket,
		 * which could cause a deadlock with QEMU if an IOTLB update
		 * is being handled. We can safely unlock here to avoid it.
		 */
		vhost_user_iotlb_rd_unlock(vq);

		vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
		vhost_user_iotlb_miss(dev, iova + tmp_size, perm);

vhost_user_iotlb_miss() may fail. If it does, the
inserted pending entry should be removed from the
pending list.
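
Something along these lines, reusing the existing
vhost_user_iotlb_pending_remove() helper (the log message is
just illustrative):

	vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
	if (vhost_user_iotlb_miss(dev, iova + tmp_size, perm)) {
		/*
		 * Miss request failed, drop the pending entry we just
		 * inserted so it doesn't linger in the list forever.
		 */
		RTE_LOG(ERR, VHOST_CONFIG,
			"IOTLB miss req failed for IOVA 0x%" PRIx64 "\n",
			iova + tmp_size);
		vhost_user_iotlb_pending_remove(vq, iova + tmp_size, 1, perm);
	}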

Best regards,
Tiwei Bie

