[dpdk-dev] virtio optimization idea

Xie, Huawei huawei.xie at intel.com
Fri Sep 4 10:25:05 CEST 2015


Hi:

I recently built a proof of concept for a virtio optimization. It has
two parts:
1) an avail ring set up with fixed descriptors
2) RX vectorization
With both optimizations, pure vhost-virtio throughput improves by
several times.

Here I will cover only the first part, which is the prerequisite for the
second.
Take RX first. Currently, when we fill the avail ring with guest mbufs,
we need to:
a) allocate one descriptor (for a non-sg mbuf) from the free descriptors
b) write the idx of that desc into the avail ring entry
c) set the addr/len fields of the descriptor to point to the data area
of the blank guest mbuf

Those operations take time, and step b in particular leaves the cache
line holding the avail ring in the modified (M) state on the virtio
processing core. When vhost processes the avail ring, transferring that
cache line from the virtio processing core to the vhost processing core
costs many CPU cycles.
To solve this problem, the RX ring for the DPDK PMD is arranged as
follows (for the non-mergeable case).
   
                    avail                      
                    idx                        
                    +                          
                    |                          
+----+----+---+-------------+------+           
| 0  | 1  | 2 | ... |  254  | 255  |  avail ring
+-+--+-+--+-+-+---------+---+--+---+           
  |    |    |       |   |      |               
  |    |    |       |   |      |               
  v    v    v       |   v      v               
+-+--+-+--+-+-+---------+---+--+---+           
| 0  | 1  | 2 | ... |  254  | 255  |  desc ring
+----+----+---+-------------+------+           
                    |                          
                    |                          
+----+----+---+-------------+------+           
| 0  | 1  | 2 |     |  254  | 255  |  used ring
+----+----+---+-------------+------+           
                    |                          
                    +    
The avail ring is initialized with fixed descriptors and never changes,
i.e., the index value of the nth avail ring entry is always n. This
means the virtio PMD actually refills only the desc ring, without having
to change the avail ring.
When vhost fetches the avail ring, it is always in its first-level
cache, unless it has been evicted.

When RX receives packets from the used ring, we use used->idx as the
desc idx. This requires vhost to process descs and return them from the
avail ring to the used ring in order, which is true of both the current
DPDK vhost and the kernel vhost implementations. As far as I understand,
vhost-net has no need to process descriptors out of order (OOO). One
possible exception is zero copy: if a descriptor doesn't meet the
zero-copy requirements, we could return it to the used ring directly,
ahead of the descriptors in front of it.
To enforce this, I want to use a reserved bit to indicate in-order
processing of descriptors.
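
To illustrate what in-order completion buys on the RX side, here is a
hedged sketch (struct rx_used_view, the cookie[] array and
rx_harvest_in_order are hypothetical names, not PMD code; barriers are
omitted). The position in the used ring is taken directly as the desc
idx, so the id field only has to be read for the debug check:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u  /* assumed queue size */

/* Same shape as a split-ring used element. */
struct used_elem {
    uint32_t id;
    uint32_t len;
};

/* Hypothetical driver-side view of the used ring. */
struct rx_used_view {
    struct used_elem used_ring[RING_SIZE];
    volatile uint16_t used_idx;  /* advanced by vhost */
    uint16_t last_used;          /* driver-private consumer index */
    void *cookie[RING_SIZE];     /* buffer posted in descriptor n */
};

/* Collect up to max completed buffers; returns how many were taken. */
static uint16_t rx_harvest_in_order(struct rx_used_view *q, void **bufs,
                                    uint32_t *lens, uint16_t max)
{
    uint16_t ready = (uint16_t)(q->used_idx - q->last_used);
    uint16_t n = ready < max ? ready : max;

    for (uint16_t i = 0; i < n; i++) {
        uint16_t slot = (uint16_t)(q->last_used % RING_SIZE);

        /* In-order assumption: used entry n refers to desc n.
         * This check disappears when built with NDEBUG. */
        assert(q->used_ring[slot].id == slot);

        bufs[i] = q->cookie[slot];
        lens[i] = q->used_ring[slot].len;
        q->last_used++;
    }
    return n;
}
```

If vhost returned descriptors out of order, the assert would fire, which
is why some negotiated indication of in-order processing is needed.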

For the TX ring, the arrangement is shown below. Each transmitted mbuf
needs a desc for its virtio_net_hdr, so we actually have only 128 free
slots.

                         ++
                         ||
                         ||
+-----+-----+-----+--------------+------+------+------+
|  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring with fixed descriptor
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 127 | 128 | ... |  255 || 127  | 128  | ...  | 255  |   desc ring for virtio_net_hdr
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
|  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
+-----+-----+-----+--------------+------+------+------+


                     
/huawei

