[dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

Yuanhan Liu yuanhan.liu at linux.intel.com
Mon Oct 10 06:22:09 CEST 2016


On Mon, Oct 10, 2016 at 07:17:06AM +0300, Michael S. Tsirkin wrote:
> On Mon, Oct 10, 2016 at 12:05:31PM +0800, Yuanhan Liu wrote:
> > On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote:
> > > > > And the same is done in DPDK:
> > > > > 
> > > > > static inline int __attribute__((always_inline))
> > > > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
> > > > >           uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx,
> > > > >           struct rte_mempool *mbuf_pool)
> > > > > {
> > > > > ...
> > > > >     /*
> > > > >      * A virtio driver normally uses at least 2 desc buffers
> > > > >      * for Tx: the first for storing the header, and others
> > > > >      * for storing the data.
> > > > >      */
> > > > >     if (likely((desc->len == dev->vhost_hlen) &&
> > > > >            (desc->flags & VRING_DESC_F_NEXT) != 0)) {
> > > > >         desc = &descs[desc->next];
> > > > >         if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
> > > > >             return -1;
> > > > > 
> > > > >         desc_addr = gpa_to_vva(dev, desc->addr);
> > > > >         if (unlikely(!desc_addr))
> > > > >             return -1;
> > > > > 
> > > > >         rte_prefetch0((void *)(uintptr_t)desc_addr);
> > > > > 
> > > > >         desc_offset = 0;
> > > > >         desc_avail  = desc->len;
> > > > >         nr_desc    += 1;
> > > > > 
> > > > >         PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
> > > > >     } else {
> > > > >         desc_avail  = desc->len - dev->vhost_hlen;
> > > > >         desc_offset = dev->vhost_hlen;
> > > > >     }
> > > > 
> > > > Actually, the header is parsed in the DPDK vhost implementation.
> > > > But as the Virtio PMD provides a zeroed header, we could parse
> > > > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated.
> > > 
> > > The host can always skip the header parse if it wants to.
> > > It didn't seem worth it to add branches there, but
> > > if I'm wrong, by all means code it up.
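
Something like the below, perhaps? (Completely untested sketch, inside
copy_desc_to_mbuf(); VIRTIO_NET_F_NO_TX_HEADER is the bit being proposed
in this thread, not an existing one, and parse_tx_header() is just a
placeholder for whatever we end up doing with the header.)

    /*
     * Sketch only: parse the Tx header only when the guest may
     * actually have filled it in; with the proposed
     * VIRTIO_NET_F_NO_TX_HEADER negotiated, the header is known
     * to be zeroed and the parse can be skipped entirely.
     */
    if (!(dev->features & (1ULL << VIRTIO_NET_F_NO_TX_HEADER))) {
        struct virtio_net_hdr *hdr;

        hdr = (struct virtio_net_hdr *)(uintptr_t)desc_addr;
        parse_tx_header(hdr, m);    /* placeholder name */
    }
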
> > 
> > It's added by the following commit, which yields about a 10% performance
> > boost for the PVP case (with 64B packet size).
> > 
> > At that time, a packet always used 2 descs. Since indirect desc is
> > enabled (by default) now, that assumption no longer holds. What's
> > worse, it might even slow things down a bit. That should also be
> > part of the reason why performance is slightly worse than before.
> > 
> > 	--yliu
> 
> I'm not sure I get what you are saying.
> 
> > commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98
> > Author: Yuanhan Liu <yuanhan.liu at linux.intel.com>
> > Date:   Mon May 2 17:46:17 2016 -0700
> > 
> >     vhost: optimize dequeue for small packets
> > 
> >     A virtio driver normally uses at least 2 desc buffers for Tx: the
> >     first for storing the header, and the others for storing the data.
> > 
> >     Therefore, we could fetch the first data desc buf before the main
> >     loop, and do the copy first before the check of "are we done yet?".
> >     This could save one check for small packets that just have one data
> >     desc buffer and need one mbuf to store it.
> > 
> >     Signed-off-by: Yuanhan Liu <yuanhan.liu at linux.intel.com>
> >     Acked-by: Huawei Xie <huawei.xie at intel.com>
> >     Tested-by: Rich Lane <rich.lane at bigswitch.com>
> 
> This fast-paths the 2-descriptor format, but it's not active
> for indirect descriptors. Is this what you mean?

Yes. It's also not active when ANY_LAYOUT is actually turned on.
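
With ANY_LAYOUT the data follows the header inside the same desc, so the
"desc->len == dev->vhost_hlen" check above never matches and we always
take the else branch. A rough, untested sketch of one way to still get
some of the benefit there (reusing the names from the snippet above)
would be to prefetch the data portion as well:

    } else {
        /*
         * ANY_LAYOUT: the data sits right after the header in the
         * same desc, so prefetch it before the copy loop starts.
         * (Sketch only; the two assignments are what the code does
         * today, the prefetch is the added bit.)
         */
        desc_avail  = desc->len - dev->vhost_hlen;
        desc_offset = dev->vhost_hlen;
        rte_prefetch0((void *)(uintptr_t)(desc_addr + desc_offset));
    }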

> Should be a simple matter to apply this optimization for indirect.

Might be.
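
If so, the rough shape (untested; the variable names are illustrative,
roughly what the dequeue loop uses) would be to map the indirect table
once per packet and hand it to copy_desc_to_mbuf(), so that the 2-desc
check above runs on the table's first entry instead of on the single
indirect head:

    struct vring_desc *descs = vq->desc;
    uint16_t sz  = vq->size;
    uint16_t idx = desc_indexes[i];

    if (vq->desc[idx].flags & VRING_DESC_F_INDIRECT) {
        /* Treat the indirect table as the desc array from here on. */
        descs = (struct vring_desc *)(uintptr_t)gpa_to_vva(dev,
                vq->desc[idx].addr);
        if (unlikely(!descs))
            break;

        rte_prefetch0(descs);
        sz  = vq->desc[idx].len / sizeof(*descs);
        idx = 0;
    }

    err = copy_desc_to_mbuf(dev, descs, sz, pkts[i], idx, mbuf_pool);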

	--yliu

