[dpdk-dev] [PATCH 0/4] DPDK memcpy optimization

Bruce Richardson bruce.richardson at intel.com
Wed Jan 21 12:40:22 CET 2015


On Wed, Jan 21, 2015 at 03:44:23AM +0000, Wang, Zhihong wrote:
 
> Neil, Bruce,
> 
> Some data first.
> 
> Sandy Bridge without AVX2:
> 1. original w/ 10 constant memcpy: 2'25" 
> 2. patch w/ 12 constant memcpy: 2'41" 
> 3. patch w/ 63 constant memcpy: 9'41" 
> 
> Haswell with AVX2:
> 1. original w/ 10 constant memcpy: 1'57" 
> 2. patch w/ 12 constant memcpy: 1'56" 
> 3. patch w/ 63 constant memcpy: 3'16" 
> 
> Also, to address Bruce's question: we have to reduce the test cases to cut down compile time, because we use:
> 1. intrinsics instead of assembly, for better flexibility and so the compiler can apply more optimizations
> 2. a complex function body for better performance
> 3. inlining
> All of this increases compile time.
> But I think it'd be okay to do that as long as we can select a fair set of test points.
> 
> It'd be great if you could give some suggestion, say, 12 points.
> 
> Zhihong (John)
> 
Hi Zhihong,

Just for comparison I've done a clean dpdk compile on my SNB system this morning.
Using parallel make (which is pretty normal I suspect), I get the following
numbers:
 real    0m52.549s
 user    0m36.034s
 sys     0m10.014s

So total compile time is 52 seconds.

Running a make uninstall and then make install again with "-j 1" gives the
following numbers:

 real    0m32.751s
 user    0m16.041s
 sys     0m7.946s

Obviously, caching effects are being completely ignored by this unscientific
study (rerunning the first test gives a 13-second time), but the upshot
is that the compile time for DPDK right now is well under a minute in the normal
case. Adding a new file that, in the best case, takes two minutes to compile
is going to increase our compile time many times over.
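
To put rough numbers on that, here is a minimal Python sketch of the
arithmetic. It assumes the 52.549s "real" figure above as the baseline and
reads the quoted 1'56" (Haswell, patch w/ 12 constant memcpy) as the best-case
compile time of the new file; the function and variable names are made up for
illustration. The two figures come from different machines, so treat the
result as indicative only.

```python
def to_seconds(minutes, seconds):
    """Convert a m'ss" style figure into seconds."""
    return minutes * 60 + seconds


def slowdown_bounds(baseline_s, added_s):
    """Return (lower, upper) wall-clock slowdown factors for a full build.

    lower: the new file compiles fully in parallel with everything else,
           so the build takes at least as long as the slower of the two.
    upper: the new file is fully serialized after the rest of the build.
    """
    lower = max(baseline_s, added_s) / baseline_s
    upper = (baseline_s + added_s) / baseline_s
    return lower, upper


# Assumption: baseline is the clean parallel build timed in this mail.
baseline_s = 52.549
# Assumption: best case is the fastest figure in the quoted data.
best_case_s = to_seconds(1, 56)

lower, upper = slowdown_bounds(baseline_s, best_case_s)
print(f"best case: {lower:.1f}x to {upper:.1f}x the current build time")
```

Under those assumptions the best case alone lands somewhere between about
2.2x and 3.2x the current build time, depending on how much of the new file's
compile overlaps with the rest of the parallel build.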

Regards,
/Bruce

