[dpdk-dev] virtio optimization idea
Xie, Huawei
huawei.xie at intel.com
Fri Sep 4 18:50:37 CEST 2015
There was a formatting issue with the ASCII chart of the TX ring. The
chart below has been updated.
Sorry for the trouble.
On 9/4/2015 4:25 PM, Xie, Huawei wrote:
> Hi:
>
> Recently I did a proof of concept for a virtio optimization. The
> optimization has two parts:
> 1) avail ring set with fixed descriptors
> 2) RX vectorization
> With these optimizations, we could get a several-fold performance
> boost for pure vhost-virtio throughput.
>
> Here I will cover only the first part, which is a prerequisite for the
> second part.
> Let us take RX as the first example. Currently, when we fill the avail
> ring with guest mbufs, we need to:
> a) allocate one descriptor (for a non-SG mbuf) from the free descriptors
> b) set the idx of the desc into the entry of the avail ring
> c) set the addr/len fields of the descriptor to point to the data area
> of the blank guest mbuf
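
Steps a) to c) can be sketched roughly as below. This is a simplified
illustration, not the actual DPDK PMD code: the struct layouts follow
the virtio split-ring format, but `struct rx_queue`, `rxq_init` and
`rxq_refill_one` are hypothetical names, the free list is threaded
through the `next` field as an assumption, and memory barriers and
device notification are left out.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256            /* ring size used in this thread's charts */
#define VRING_DESC_F_WRITE 2     /* device may write to this buffer (RX) */

/* Simplified split-ring structures (field names as in the virtio spec). */
struct vring_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;               /* reused as free-list link in this sketch */
};

struct vring_avail {
    uint16_t flags;
    uint16_t idx;
    uint16_t ring[RING_SIZE];
};

/* Hypothetical container for one RX queue. */
struct rx_queue {
    struct vring_desc desc[RING_SIZE];
    struct vring_avail avail;
    uint16_t free_head;          /* first free descriptor */
    uint16_t num_free;           /* number of free descriptors */
};

static void rxq_init(struct rx_queue *q)
{
    for (uint16_t i = 0; i < RING_SIZE; i++) {
        q->desc[i].addr = 0;
        q->desc[i].len = 0;
        q->desc[i].flags = 0;
        q->desc[i].next = (uint16_t)((i + 1) % RING_SIZE);
        q->avail.ring[i] = 0;
    }
    q->avail.flags = 0;
    q->avail.idx = 0;
    q->free_head = 0;
    q->num_free = RING_SIZE;
}

/* Post one empty buffer; returns the descriptor index used, or -1. */
static int rxq_refill_one(struct rx_queue *q, uint64_t buf_addr,
                          uint32_t buf_len)
{
    assert(q != 0);
    if (q->num_free == 0)
        return -1;

    /* a) allocate one descriptor from the free list */
    uint16_t d = q->free_head;
    q->free_head = q->desc[d].next;
    q->num_free--;

    /* c) point the descriptor at the blank buffer */
    q->desc[d].addr = buf_addr;
    q->desc[d].len = buf_len;
    q->desc[d].flags = VRING_DESC_F_WRITE;

    /* b) store the descriptor index in the avail ring; this write is
     * what dirties the avail ring cache line on every refill */
    q->avail.ring[q->avail.idx % RING_SIZE] = d;

    /* real code needs a write barrier before publishing the new idx */
    q->avail.idx++;
    return d;
}
```
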
>
> These operations take time, and step b in particular leaves the cache
> line of the avail ring in the modified (M) state on the virtio
> processing core. When vhost processes the avail ring, transferring that
> cache line from the virtio processing core to the vhost processing core
> costs a considerable number of CPU cycles.
> To solve this problem, the RX ring for the DPDK PMD is arranged as
> follows (for the non-mergeable case).
>
>                   avail
>                    idx
>                     +
>                     |
> +----+----+---+-------------+------+
> | 0  | 1  | 2 | ... |  254  | 255  |  avail ring
> +-+--+-+--+-+-+---------+---+--+---+
>   |    |    |       |   |      |
>   |    |    |       |   |      |
>   v    v    v       |   v      v
> +-+--+-+--+-+-+---------+---+--+---+
> | 0  | 1  | 2 | ... |  254  | 255  |  desc ring
> +----+----+---+-------------+------+
>                     |
>                     |
> +----+----+---+-------------+------+
> | 0  | 1  | 2 |     |  254  | 255  |  used ring
> +----+----+---+-------------+------+
>                     |
>                     +
> The avail ring is initialized with fixed descriptors and never changes,
> i.e., the index value of the nth avail ring entry is always n. This
> means the virtio PMD actually refills only the desc ring, without having
> to change the avail ring.
> When vhost fetches the avail ring, it is always in vhost's first-level
> cache unless it has been evicted.
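
A minimal sketch of the fixed arrangement described above, assuming a
256-entry ring. `struct fixed_rx_queue`, `fixed_rxq_init` and
`fixed_rxq_refill_one` are hypothetical names. Note that avail->idx
still has to advance on each refill; only the ring[] entries stay
untouched. Barriers and the check that vhost has already returned the
slot are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256
#define VRING_DESC_F_WRITE 2     /* device may write to this buffer (RX) */

struct vring_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

struct vring_avail {
    uint16_t flags;
    uint16_t idx;
    uint16_t ring[RING_SIZE];
};

/* Hypothetical container for one RX queue with a fixed avail ring. */
struct fixed_rx_queue {
    struct vring_desc desc[RING_SIZE];
    struct vring_avail avail;
};

/* One-time setup: avail ring entry n always names descriptor n. */
static void fixed_rxq_init(struct fixed_rx_queue *q)
{
    for (uint16_t i = 0; i < RING_SIZE; i++) {
        q->avail.ring[i] = i;
        q->desc[i].addr = 0;
        q->desc[i].len = 0;
        q->desc[i].flags = VRING_DESC_F_WRITE;
        q->desc[i].next = 0;
    }
    q->avail.flags = 0;
    q->avail.idx = 0;
}

/* Refill touches only the descriptor and avail->idx, never ring[].
 * The caller must ensure the slot has already been returned by vhost. */
static uint16_t fixed_rxq_refill_one(struct fixed_rx_queue *q,
                                     uint64_t buf_addr, uint32_t buf_len)
{
    assert(q != 0);
    uint16_t slot = q->avail.idx % RING_SIZE;

    q->desc[slot].addr = buf_addr;
    q->desc[slot].len = buf_len;

    /* real code needs a write barrier before publishing the new idx */
    q->avail.idx++;
    return slot;
}
```

Because the ring[] entries are written once at setup, a batch of
refills reduces to writing consecutive descriptors, which is the kind
of access pattern a vectorized RX path can work with.
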
>
> When RX receives packets from the used ring, we use used->idx as the
> desc idx. This requires that vhost processes descs and returns them
> from the avail ring to the used ring in order, which is true for both
> the current DPDK vhost and the kernel vhost implementations. As far as
> I understand, vhost net has no need to process descriptors out of order
> (OOO). One possible case is zero copy: for example, if one descriptor
> doesn't meet the zero-copy requirement, we could return it to the used
> ring directly, earlier than the descriptors in front of it.
> To enforce this, I want to use a reserved bit to indicate in-order
> processing of descriptors.
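
The in-order assumption can be illustrated with the sketch below. It
assumes the driver keeps a private running counter (named
`last_used_idx` here, a hypothetical name) and derives the descriptor
index from that counter rather than reading the id field of each used
element. `rx_collect_in_order` is likewise a hypothetical helper, and
the read barrier needed after loading used->idx is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256

struct vring_used_elem {
    uint32_t id;                 /* descriptor index written by vhost */
    uint32_t len;                /* bytes written into the buffer */
};

struct vring_used {
    uint16_t flags;
    uint16_t idx;
    struct vring_used_elem ring[RING_SIZE];
};

/* Collect up to max completed buffers. With in-order completion and a
 * fixed avail ring, the used ring slot number IS the descriptor index,
 * so used->ring[slot].id never has to be read.
 * Returns the number of completions written to desc_out/len_out. */
static uint16_t rx_collect_in_order(const struct vring_used *used,
                                    uint16_t *last_used_idx,
                                    uint16_t *desc_out, uint32_t *len_out,
                                    uint16_t max)
{
    assert(used != 0 && last_used_idx != 0);

    /* 16-bit subtraction handles wraparound of the free-running idx */
    uint16_t pending = (uint16_t)(used->idx - *last_used_idx);
    uint16_t n = pending < max ? pending : max;

    for (uint16_t i = 0; i < n; i++) {
        uint16_t slot = (uint16_t)((uint16_t)(*last_used_idx + i) % RING_SIZE);
        desc_out[i] = slot;
        len_out[i] = used->ring[slot].len;
    }
    *last_used_idx = (uint16_t)(*last_used_idx + n);
    return n;
}
```

If vhost ever returned a descriptor out of order, slot and id would
disagree and this shortcut would hand back the wrong buffer, which is
why some negotiated indication of in-order processing is needed.
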
>
> For the TX ring, the arrangement is as shown below. Each transmitted
> mbuf needs a desc for virtio_net_hdr, so we actually have only 128 free
> slots.
>
>
>
>                          ++
>                          ||
>                          ||
> +-----+-----+-----+--------------+------+------+------+
> |  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>    |     |            |  ||  |      |             |
>    v     v            v  ||  v      v             v
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
> | 127 | 128 | ... |  255 || 127  | 128  | ...  | 255  |   desc ring for virtio_net_hdr
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>    |     |            |  ||  |      |             |
>    v     v            v  ||  v      v             v
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
> |  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
> +-----+-----+-----+--------------+------+------+------+
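
One way to read the TX chart is sketched below. Several things here
are assumptions rather than statements from the mail: the header
descriptors are taken to occupy the upper half of the desc ring
(128..255, so that each half holds exactly 128 entries), both halves
of the avail ring are taken to reference those same 128 header
descriptors, and the header length of 10 bytes is that of
virtio_net_hdr without mergeable buffers. `struct fixed_tx_queue`,
`fixed_txq_init` and `fixed_txq_enqueue_one` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256
#define HALF (RING_SIZE / 2)     /* 128 usable TX slots */
#define VRING_DESC_F_NEXT 1      /* descriptor chains to desc[next] */
#define NET_HDR_LEN 10           /* virtio_net_hdr, non-mergeable case */

struct vring_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

struct vring_avail {
    uint16_t flags;
    uint16_t idx;
    uint16_t ring[RING_SIZE];
};

/* Hypothetical container for one TX queue with a fixed layout. */
struct fixed_tx_queue {
    struct vring_desc desc[RING_SIZE];
    struct vring_avail avail;
};

/* One-time setup: every avail entry names a header descriptor, and
 * every header descriptor is pre-chained to its data descriptor.
 * hdr_base is the (assumed) address of an array of HALF headers. */
static void fixed_txq_init(struct fixed_tx_queue *q, uint64_t hdr_base)
{
    for (uint16_t i = 0; i < HALF; i++) {
        uint16_t hdr = (uint16_t)(i + HALF);

        q->avail.ring[i] = hdr;
        q->avail.ring[i + HALF] = hdr;

        q->desc[hdr].addr = hdr_base + (uint64_t)i * NET_HDR_LEN;
        q->desc[hdr].len = NET_HDR_LEN;
        q->desc[hdr].flags = VRING_DESC_F_NEXT;
        q->desc[hdr].next = i;

        q->desc[i].addr = 0;
        q->desc[i].len = 0;
        q->desc[i].flags = 0;    /* end of chain, read-only for device */
        q->desc[i].next = 0;
    }
    q->avail.flags = 0;
    q->avail.idx = 0;
}

/* Transmit touches only the data descriptor and avail->idx.
 * The caller must ensure the slot has already been returned by vhost. */
static uint16_t fixed_txq_enqueue_one(struct fixed_tx_queue *q,
                                      uint64_t pkt_addr, uint32_t pkt_len)
{
    assert(q != 0);
    uint16_t data = q->avail.idx % HALF;

    q->desc[data].addr = pkt_addr;
    q->desc[data].len = pkt_len;

    /* real code needs a write barrier before publishing the new idx */
    q->avail.idx++;
    return data;
}
```
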
>
>
>
> /huawei
>