[dpdk-dev] some questions about rte_memcpy

Linhaifeng haifeng.lin at huawei.com
Thu Jan 22 13:53:13 CET 2015



On 2015/1/22 19:34, Bruce Richardson wrote:
> On Thu, Jan 22, 2015 at 07:23:49PM +0900, Tetsuya Mukawa wrote:
>> On 2015/01/22 16:35, Matthew Hall wrote:
>>> On Thu, Jan 22, 2015 at 01:32:04PM +0800, Linhaifeng wrote:
>>>> Do you mean that calling rte_memcpy before rte_eal_init() would crash? Why?
>>> No guarantee. But a theory. It might use some things from the EAL init to 
>>> figure out which version of the accelerated algorithm to use.
>>
>> This selection is done at compile-time.
>> And if the size is constant, I guess DPDK assumes memcpy is replaced by
>> inline __builtin_memcpy.
>> I haven't checked the performance of builtin memcpy, but probably much
>> faster.
>>
> 
> Yes, that assumption is correct. A couple of years ago we discovered that for
> constant size values, the compiler would generate much faster code for us
> using a regular memcpy than rte_memcpy, hence the macro.
> 
> /Bruce
> 
>> Tetsuya
>>
>>> Matthew.
>>
>>
> 
> 

Hi Bruce,

I tested it. Most results match what you said (using a constant size may be faster), but sometimes it is not.

linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 16 9999999
rte_memcpy(constant) used:279893712	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277818600
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 16 9999999
rte_memcpy(constant) used:279264328	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277667116
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 16 9999999
rte_memcpy(constant) used:279491832	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277622772
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 32 9999999
rte_memcpy(constant) used:279402156	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277738464
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 32 9999999
rte_memcpy(constant) used:279305172	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277483004
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 32 9999999
rte_memcpy(constant) used:279784124	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277605332
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 48 9999999
rte_memcpy(constant) used:322817260
rte_memcpy(variable) used:350333864
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 48 9999999
rte_memcpy(constant) used:322840748
rte_memcpy(variable) used:350297868
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 48 9999999
rte_memcpy(constant) used:322488240
rte_memcpy(variable) used:350348652
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 64 9999999
rte_memcpy(constant) used:322021428
rte_memcpy(variable) used:350416440
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 64 9999999
rte_memcpy(constant) used:321370900
rte_memcpy(variable) used:350355796
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 64 9999999
rte_memcpy(constant) used:322704552
rte_memcpy(variable) used:349900832
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 128 9999999
rte_memcpy(constant) used:422705828
rte_memcpy(variable) used:425493328
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 128 9999999
rte_memcpy(constant) used:422421840	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:413691412
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 128 9999999
rte_memcpy(constant) used:425233088	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:421136724
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 256 9999999
rte_memcpy(constant) used:901014608	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:900997388
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 256 9999999
rte_memcpy(constant) used:900803308	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:900794076
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 256 9999999
rte_memcpy(constant) used:901842436	@@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:901218984
linux-mnSyvH:/mnt/sdb/linhf/test #



Here is my test code:

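#include <stdio.h>
#include <stdlib.h>     /* atoi */
#include <inttypes.h>   /* uint64_t, PRIu64 */
#include <rte_memcpy.h>
#include <rte_cycles.h>


int main(int narg, char** args)
{
        int i;
        char buf[1024];
        uint64_t start, end;

        if (narg < 3) {
                printf("usage:./rte_memcpy_test size times\n");
                return 0;
        }

        /* Both sizes are read from argv, so both are runtime values.
         * The const qualifier on size_c does not make it a
         * compile-time constant. */
        size_t size_v = atoi(args[1]);
        const size_t size_c = atoi(args[1]);
        int times = atoi(args[2]);

        /* Time the copies that use the const-qualified size. */
        start = rte_rdtsc();
        for (i = 0; i < times; i++) {
                rte_memcpy(buf, buf, size_c);
        }
        end = rte_rdtsc();
        printf("rte_memcpy(constant) used:%" PRIu64 "\n", end - start);

        /* Time the copies that use the plain variable size. */
        start = rte_rdtsc();
        for (i = 0; i < times; i++) {
                rte_memcpy(buf, buf, size_v);
        }
        end = rte_rdtsc();
        printf("rte_memcpy(variable) used:%" PRIu64 "\n", end - start);

        return 0;
}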
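
One possible reason for the mixed numbers, offered as a sketch and not a confirmed diagnosis: if the macro Bruce mentions selects plain memcpy through the GCC/Clang builtin __builtin_constant_p (an assumption here), then size_c above would not qualify, because a const-qualified variable initialised from atoi() is still a runtime value. Both loops would then take the same path. The names COPY_DISPATCH, copy_variable, literal_is_constant and runtime_is_constant below are hypothetical and are not part of DPDK.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an out-of-line, variable-size copy routine. */
static void *copy_variable(void *dst, const void *src, size_t n)
{
        return memcpy(dst, src, n);
}

/*
 * Hypothetical dispatch macro (not DPDK's definition): use plain memcpy
 * when the compiler can prove n is a compile-time constant, otherwise
 * call the variable-size routine.
 */
#define COPY_DISPATCH(dst, src, n)              \
        (__builtin_constant_p(n) ?              \
                memcpy((dst), (src), (n)) :     \
                copy_variable((dst), (src), (n)))

/* A literal size is a compile-time constant, so this returns 1. */
static int literal_is_constant(void)
{
        return __builtin_constant_p(64);
}

/*
 * A const-qualified local initialised at run time is still a runtime
 * value, so this returns 0. The volatile read models a value the
 * compiler cannot know in advance, such as atoi(args[1]).
 */
static int runtime_is_constant(size_t n)
{
        volatile size_t opaque = n;
        const size_t size_c = opaque;
        return __builtin_constant_p(size_c);
}
```

Under that assumption, exercising the constant-size path would need a literal or a preprocessor constant at the call site, for example COPY_DISPATCH(dst, src, 64), with one call site or one build per size under test.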





-- 
Regards,
Haifeng


