[dpdk-dev] [PATCH] vhost: fix iotlb pool out-of-memory handling

Maxime Coquelin maxime.coquelin at redhat.com
Fri Jan 26 16:23:03 CET 2018


Hi Tiwei,

On 01/26/2018 09:03 AM, Tiwei Bie wrote:
> On Thu, Jan 25, 2018 at 04:06:53PM +0100, Maxime Coquelin wrote:
>> In the unlikely case the IOTLB memory pool runs out of memory,
>> an issue may happen if all entries are used by the IOTLB cache,
>> and an IOTLB miss happen. If the iotlb pending list is empty,
>> then no memory is freed and allocation fails a second time.
>>
>> This patch fixes this by doing an IOTLB cache random evict if
>> the IOTLB pending list is empty, ensuring the second allocation
>> try will succeed.
>>
>> In the same spirit, the opposite is done when inserting an
>> IOTLB entry in the IOTLB cache fails due to out of memory. In
>> this case, the IOTLB pending is flushed if the IOTLB cache is
>> empty to ensure the new entry can be inserted.
>>
>> Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
>> Fixes: f72c2ad63aeb ("vhost: add pending IOTLB miss request list and helpers")
>>
> [...]
>> @@ -95,9 +98,11 @@ vhost_user_iotlb_pending_insert(struct vhost_virtqueue *vq,
>>   
>>   	ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
>>   	if (ret) {
>> -		RTE_LOG(INFO, VHOST_CONFIG,
>> -				"IOTLB pool empty, clear pending misses\n");
>> -		vhost_user_iotlb_pending_remove_all(vq);
>> +		RTE_LOG(INFO, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
>> +		if (!TAILQ_EMPTY(&vq->iotlb_pending_list))
>> +			vhost_user_iotlb_pending_remove_all(vq);
>> +		else
>> +			vhost_user_iotlb_cache_random_evict(vq);
>>   		ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
>>   		if (ret) {
>>   			RTE_LOG(ERR, VHOST_CONFIG, "IOTLB pool still empty, failure\n");
>> @@ -186,8 +191,11 @@ vhost_user_iotlb_cache_insert(struct vhost_virtqueue *vq, uint64_t iova,
>>   
>>   	ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
>>   	if (ret) {
>> -		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, evict one entry\n");
>> -		vhost_user_iotlb_cache_random_evict(vq);
>> +		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
> 
> Maybe we should use the same log level for both cases?
> Currently, the first one is INFO, and this one is DEBUG.

Right, I will set DEBUG log level for both.

> 
>> +		if (!TAILQ_EMPTY(&vq->iotlb_list))
>> +			vhost_user_iotlb_cache_random_evict(vq);
>> +		else
>> +			vhost_user_iotlb_pending_remove_all(vq);
> 
> The IOTLB entry will be invalidated explicitly if it's
> unmapped in the frontend. So all the IOTLB entries in
> IOTLB cache are supposed to be valid. So I think maybe
> it's better to always prefer to free memory from IOTLB
> pending list. Something like:
> 
> 	if (TAILQ_EMPTY(&vq->iotlb_list))
> 		vhost_user_iotlb_pending_remove_all(vq);
> 	else if (TAILQ_EMPTY(&vq->iotlb_pending_list))
> 		vhost_user_iotlb_cache_random_evict(vq);
> 	else
> 		vhost_user_iotlb_pending_random_evict(vq);

I agree that all the IOTLB entries in the cache are supposed to
be valid. But removing in pending list first might not be a good idea,
as if the pending entry is valid, it means an IOTLB update is on its way
to the backend. So it could cause another MISS request to be sent for
the same address if the PMD thread retry the translation afterwards.

> Besides, in __vhost_iova_to_vva():
> 
> uint64_t
> __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
> 		    uint64_t iova, uint64_t size, uint8_t perm)
> {
> 	uint64_t vva, tmp_size;
> 
> 	if (unlikely(!size))
> 		return 0;
> 
> 	tmp_size = size;
> 
> 	vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
> 	if (tmp_size == size)
> 		return vva;
> 
> 	if (!vhost_user_iotlb_pending_miss(vq, iova + tmp_size, perm)) {
> 		/*
> 		 * iotlb_lock is read-locked for a full burst,
> 		 * but it only protects the iotlb cache.
> 		 * In case of IOTLB miss, we might block on the socket,
> 		 * which could cause a deadlock with QEMU if an IOTLB update
> 		 * is being handled. We can safely unlock here to avoid it.
> 		 */
> 		vhost_user_iotlb_rd_unlock(vq);
> 
> 		vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
> 		vhost_user_iotlb_miss(dev, iova + tmp_size, perm);
> 
> The vhost_user_iotlb_miss() may fail. If it fails,
> the inserted pending entry should be removed from
> the pending list.

Right, thanks for spotting this.
I will fix this in a separate patch.

Regards,
Maxime
> Best regards,
> Tiwei Bie
> 


More information about the dev mailing list