[dpdk-dev] [PATCH v2 1/7] vhost: refactor rte_vhost_dequeue_burst

Xie, Huawei huawei.xie at intel.com
Mon Mar 7 03:19:54 CET 2016


On 3/4/2016 10:19 AM, Yuanhan Liu wrote:
> On Thu, Mar 03, 2016 at 04:21:19PM +0000, Xie, Huawei wrote:
>> On 2/18/2016 9:48 PM, Yuanhan Liu wrote:
>>> The current rte_vhost_dequeue_burst() implementation is a bit messy
>>> and logic twisted. And you could see repeat code here and there: it
>>> invokes rte_pktmbuf_alloc() three times at three different places!
>>>
>>> However, rte_vhost_dequeue_burst() acutally does a simple job: copy
>>> the packet data from vring desc to mbuf. What's tricky here is:
>>>
>>> - desc buff could be chained (by desc->next field), so that you need
>>>   fetch next one if current is wholly drained.
>>>
>>> - One mbuf could not be big enough to hold all desc buff, hence you
>>>   need to chain the mbuf as well, by the mbuf->next field.
>>>
>>> Even though, the logic could be simple. Here is the pseudo code.
>>>
>>> 	while (this_desc_is_not_drained_totally || has_next_desc) {
>>> 		if (this_desc_has_drained_totally) {
>>> 			this_desc = next_desc();
>>> 		}
>>>
>>> 		if (mbuf_has_no_room) {
>>> 			mbuf = allocate_a_new_mbuf();
>>> 		}
>>>
>>> 		COPY(mbuf, desc);
>>> 	}
>>>
>>> And this is how I refactored rte_vhost_dequeue_burst.
>>>
>>> Note that the old patch does a special handling for skipping virtio
>>> header. However, that could be simply done by adjusting desc_avail
>>> and desc_offset var:
>>>
>>> 	desc_avail  = desc->len - vq->vhost_hlen;
>>> 	desc_offset = vq->vhost_hlen;
>>>
>>> This refactor makes the code much more readable (IMO), yet it reduces
>>> binary code size (nearly 2K).
>>>
>>> Signed-off-by: Yuanhan Liu <yuanhan.liu at linux.intel.com>
>>> ---
>>>
>>> v2: - fix potential NULL dereference bug of var "prev" and "head"
>>> ---
>>>  lib/librte_vhost/vhost_rxtx.c | 297 +++++++++++++++++-------------------------
>>>  1 file changed, 116 insertions(+), 181 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>>> index 5e7e5b1..d5cd0fa 100644
>>> --- a/lib/librte_vhost/vhost_rxtx.c
>>> +++ b/lib/librte_vhost/vhost_rxtx.c
>>> @@ -702,21 +702,104 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
>>>  	}
>>>  }
>>>  
>>> +static inline struct rte_mbuf *
>>> +copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>> +		  uint16_t desc_idx, struct rte_mempool *mbuf_pool)
>>> +{
>>> +	struct vring_desc *desc;
>>> +	uint64_t desc_addr;
>>> +	uint32_t desc_avail, desc_offset;
>>> +	uint32_t mbuf_avail, mbuf_offset;
>>> +	uint32_t cpy_len;
>>> +	struct rte_mbuf *head = NULL;
>>> +	struct rte_mbuf *cur = NULL, *prev = NULL;
>>> +	struct virtio_net_hdr *hdr;
>>> +
>>> +	desc = &vq->desc[desc_idx];
>>> +	desc_addr = gpa_to_vva(dev, desc->addr);
>>> +	rte_prefetch0((void *)(uintptr_t)desc_addr);
>>> +
>>> +	/* Retrieve virtio net header */
>>> +	hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
>>> +	desc_avail  = desc->len - vq->vhost_hlen;
>> There is a serious bug here, desc->len - vq->vhost_len could overflow.
>> VM could easily create this case. Let us fix it here.
> Nope, this issue has been there since the beginning, and this patch
> is a refactor: we should not bring any functional changes. Therefore,
> we should not fix it here.

No, I don't mean exactly fixing in this patch but in this series.

Besides, from refactoring's perspective, actually we could make things
further much simpler and more readable. Both the desc chains and mbuf
could be converted into iovec, then both dequeue(copy_desc_to_mbuf) and
enqueue(copy_mbuf_to_desc) could use the commonly used iovec copying
algorithms As data path are performance oriented, let us stop here.

>
> And actually, it's been fixed in the 6th patch in this series:

Will check that.

>
>     [PATCH v2 6/7] vhost: do sanity check for desc->len
>
> 	--yliu
>



More information about the dev mailing list