[dpdk-dev] [PATCH] Clean up rte_memcpy.h file

Ravi Kerur rkerur at gmail.com
Wed Apr 15 23:04:58 CEST 2015


On Tue, Apr 14, 2015 at 7:53 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Tue, 14 Apr 2015 14:31:53 -0700
> Ravi Kerur <rkerur at gmail.com> wrote:
>
> > +
> > +     for (i = 0; i < 2; i++)
> > +             rte_mov32(dst + i * 32, src + i * 32);
> >  }
> Unless you force the compiler to unroll the loop, it will be slower.
>

I did the following:

1. Used sample code from Intel to make sure the CPU supports those instructions.
2. Checked the generated code with and without the loop using (gcc -O3 -m64 -S);
the gcc version is 4.8.2.
There was no difference in the generated code between "loop" and "no-loop".
I had expected at least some difference in the code.

3. Ran "make test" and compared the "memcpy perf" results.
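
For anyone who wants to repeat the comparison in step 2, here is a minimal
sketch of the kind of test file that can be used. It is not the code from the
patch: mov32() is a hypothetical memcpy-based stand-in for rte_mov32 so that the
file builds without AVX support, and mov64_loop(), mov64_unrolled() and
mov64_forms_agree() are names made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for rte_mov32 (assumption): copies exactly 32 bytes. */
static inline void mov32(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 32);
}

/* Loop form, same shape as the hunk quoted above. */
void mov64_loop(uint8_t *dst, const uint8_t *src)
{
	int i;

	for (i = 0; i < 2; i++)
		mov32(dst + i * 32, src + i * 32);
}

/* Hand-unrolled form, for comparison against the loop form. */
void mov64_unrolled(uint8_t *dst, const uint8_t *src)
{
	mov32(dst + 0 * 32, src + 0 * 32);
	mov32(dst + 1 * 32, src + 1 * 32);
}

/* Returns 1 if both forms copy the same 64 bytes correctly, 0 otherwise. */
int mov64_forms_agree(void)
{
	uint8_t src[64], a[64], b[64];
	int i;

	for (i = 0; i < 64; i++)
		src[i] = (uint8_t)i;
	memset(a, 0x00, sizeof(a));
	memset(b, 0xff, sizeof(b));

	mov64_loop(a, src);
	mov64_unrolled(b, src);

	return memcmp(a, b, 64) == 0 && memcmp(a, src, 64) == 0;
}
```

Compiling this file with the same options as above (gcc -O3 -m64 -S) and
comparing the bodies of mov64_loop and mov64_unrolled in the resulting .s file
shows whether the compiler unrolled the loop on its own. The result may differ
with other compiler versions or optimization levels.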

