[dpdk-dev] [PATCH 0/4] DPDK memcpy optimization
Bruce Richardson
bruce.richardson at intel.com
Wed Jan 21 12:40:22 CET 2015
On Wed, Jan 21, 2015 at 03:44:23AM +0000, Wang, Zhihong wrote:
> Neil, Bruce,
>
> Some data first.
>
> Sandy Bridge without AVX2:
> 1. original w/ 10 constant memcpy: 2'25"
> 2. patch w/ 12 constant memcpy: 2'41"
> 3. patch w/ 63 constant memcpy: 9'41"
>
> Haswell with AVX2:
> 1. original w/ 10 constant memcpy: 1'57"
> 2. patch w/ 12 constant memcpy: 1'56"
> 3. patch w/ 63 constant memcpy: 3'16"
>
> Also, to address Bruce's question: we have to reduce the number of test cases to cut down compile time, because we use:
> 1. intrinsics instead of assembly, for better flexibility and so the compiler can apply more optimizations
> 2. a complex function body, for better performance
> 3. inlining
> All of these increase compile time.
> But I think it'd be okay to do that as long as we can select a fair set of test points.
>
> It'd be great if you could give some suggestions, say, 12 points.
>
> Zhihong (John)
>
Hi Zhihong,
Just for comparison I've done a clean dpdk compile on my SNB system this morning.
Using parallel make (which is pretty normal I suspect), I get the following
numbers:
real 0m52.549s
user 0m36.034s
sys 0m10.014s
So total compile time is 52 seconds.
Running a make uninstall and then make install again with "-j 1" provides the
following numbers:
real 0m32.751s
user 0m16.041s
sys 0m7.946s
Obviously, caching effects are being completely ignored by this unscientific
study (rerunning the first test again gives a 13-second time), but the upshot
is that the compile time for DPDK right now is well under a minute in the normal
case. Adding in a new file that, in the best case, takes two minutes to compile
is going to increase our compile time many times over.
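To put rough numbers on "many times over", here is a minimal sketch of the
arithmetic in Python. It assumes the quoted 12-point figures are the compile
time of the one new file, and it brackets the effect between two cases: the
build can finish no sooner than its slowest file (lower bound), and the new
file adds serially to the 52-second build (upper bound). The function names
are illustrative only, and the Haswell figure comes from a different machine
than my SNB build, so treat that row as indicative at best.

```python
def to_seconds(minutes, seconds):
    """Convert a m'ss" figure from the quoted mail into seconds."""
    return 60 * minutes + seconds


# "real" time of the clean parallel build reported above.
BASELINE_S = 52.549

# Quoted compile times for the patch with 12 constant memcpy test points.
NEW_FILE_S = {
    "Sandy Bridge": to_seconds(2, 41),
    "Haswell": to_seconds(1, 56),
}


def slowdown_bounds(baseline_s, new_file_s):
    """Return (lower, upper) build-time multipliers after adding one file."""
    # Lower bound: a build cannot finish before its slowest single file.
    lower = max(baseline_s, new_file_s) / baseline_s
    # Upper bound (assumption): the new file adds serially to the build.
    upper = (baseline_s + new_file_s) / baseline_s
    return lower, upper


for cpu, secs in NEW_FILE_S.items():
    low, high = slowdown_bounds(BASELINE_S, secs)
    print(f"{cpu}: {low:.1f}x to {high:.1f}x the current build time")
```

Under those assumptions the build ends up somewhere between about 2x and 4x
its current length, depending on the CPU and on how well the new file
overlaps with the rest of a parallel build.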
Regards,
/Bruce