[dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue

Wang, Zhihong zhihong.wang at intel.com
Tue Aug 23 04:15:43 CEST 2016


> Subject: Re: [PATCH v3 0/5] vhost: optimize enqueue
> 
> Hi Zhihong,
> 
[...]
> > The main optimization techniques are:
> >
> >  1. Reorder code to reduce CPU pipeline stall cycles.
> >
> >  2. Batch update the used ring for better efficiency.
> >
> >  3. Prefetch descriptor to hide cache latency.
> >
> >  4. Remove useless volatile attribute to allow compiler optimization.
> 
> Thanks for these details, this is helpful to understand where the perf
> gain comes from.
> I would suggest to add these information as comments in the code
> where/if it makes sense. If more a general comment, at least add it in
> the commit message of the patch introducing it.
> Indeed, adding it to the cover letter is fine, but the information is
> lost as soon as the series is applied.

Hi Maxime,

I did add these info in the later optimization patches to explain each
optimization techniques. The v1 was indeed hard to read.


> 
> You don't mention any figures, so I set up a benchmark on my side to
> evaluate your series. It indeed shows an interesting performance gain.
> 
> My setup consists of one host running a guest.
> The guest generates as much 64bytes packets as possible using
> pktgen-dpdk. The hosts forwards received packets back to the guest
> using testpmd on vhost pmd interface. Guest's vCPUs are pinned to
> physical CPUs.
> 

Thanks for doing the test!

I didn't publish any numbers since the gain varies in different platforms
and test setups.

In my phy to vm test on both IVB and HSW, where testpmd in the host rx from
the nic and enqueue to the guest, the enqueue efficiency (cycles per packet)
is 2.4x and 1.4x as fast as the current code for mergeable on and mergeable
off respectively, for v3 patch.


> I tested it with and without your v1 patch, with and without
> rx-mergeable feature turned ON.
> Results are the average of 8 runs of 60 seconds:
> 
> Rx-Mergeable ON : 7.72Mpps
> Rx-Mergeable ON + "vhost: optimize enqueue" v1: 9.19Mpps
> Rx-Mergeable OFF: 10.52Mpps
> Rx-Mergeable OFF + "vhost: optimize enqueue" v1: 10.60Mpps
> 
> Regards,
> Maxime


More information about the dev mailing list