[dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over memcpy

Li, Xiaoyun xiaoyun.li at intel.com
Fri Oct 13 09:41:14 CEST 2017



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas at monjalon.net]
> Sent: Friday, October 13, 2017 15:36
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Li, Xiaoyun
> <xiaoyun.li at intel.com>
> Cc: dev at dpdk.org; Richardson, Bruce <bruce.richardson at intel.com>; Lu,
> Wenzhuo <wenzhuo.lu at intel.com>; Zhang, Helin <helin.zhang at intel.com>
> Subject: Re: [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over
> memcpy
> 
> 13/10/2017 09:31, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > > 13/10/2017 03:06, Li, Xiaoyun:
> > > > Hi
> > > > > Sorry for the late reply. I was on annual leave for the last 3 days.
> > > >
> > > > From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > > > > 05/10/2017 14:33, Xiaoyun Li:
> > > > > > +/**
> > > > > > + * Macro for copying unaligned block from one location to
> > > > > > +another with constant load offset,
> > > > > > + * 47 bytes leftover maximum,
> > > > > > + * locations should not overlap.
> > > > > > + * Requirements:
> > > > > > + * - Store is aligned
> > > > > > + * - Load offset is <offset>, which must be immediate value
> > > > > > +within [1, 15]
> > > > > > + * - For <src>, make sure <offset> bit backwards & <16 -
> > > > > > +offset> bit forwards are available for loading
> > > > > > + * - <dst>, <src>, <len> must be variables
> > > > > > + * - __m128i <xmm0> ~ <xmm8> must be pre-defined  */ #define
> > > > > > +MOVEUNALIGNED_LEFT47_IMM(dst, src, len,
> > > > >
> > > > > Naive question:
> > > > > Is there a real benefit of using a macro compared to a static
> > > > > inline function optimized by a modern compiler?
> > > > >
> > > > > The macro is in the existing DPDK code. I didn't touch it. I just
> > > > > changed the file name and the function name to rte_memcpy_internal.
> > > > > So I am not sure whether there is a real benefit.
> > > > > In my opinion, it is the same as a static inline function.
> > > > >
> > > > > Do I need to change them to inline functions?
> > >
> > > In this patch, it appears as a new macro.
> >
> > > Ah no, it has definitely been there before.
> > > All we did here was git mv rte_memcpy.h rte_memcpy_internal.h and then,
> > > in rte_memcpy_internal.h, rename rte_memcpy() to rte_memcpy_internal().
> >
> > > > If you can, an inline function is cleaner for the new one.
> >
> > > I don't think it will be straightforward - one of the parameters is a
> > > constant value.
> > > My preference would be to keep the original rte_memcpy() code intact as
> > > much as we can here (except probably cosmetic changes - indentation,
> > > line length fixing etc.).
> > > After all, that patch only adds architecture function selection at
> > > runtime.
> > > If we would like to improve our rte_memcpy() any further - NP with that,
> > > but let it be a separate patch.
> 
> OK
> 
Then I will just fix the indentation and line length and keep the original macro.
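
For readers of the archive, here is a minimal sketch of what "architecture function selection at runtime" can look like in plain C. It is not the code from this patch: the names copy_generic, copy_avx2, select_copy and dispatch_copy are hypothetical, the AVX2 variant is only a placeholder that calls memcpy(), and the CPU check assumes the GCC/Clang builtin __builtin_cpu_supports() on x86 (other targets fall back to the generic variant).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Common signature shared by every copy variant. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Hypothetical baseline variant, safe on any CPU. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/*
 * Hypothetical AVX2 variant. The body is a placeholder: a real build
 * would put tuned code here, compiled with the matching -m flags.
 */
static void *copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Pick a variant once, based on what the running CPU reports. */
static copy_fn select_copy(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		return copy_avx2;
#endif
	return copy_generic;
}

/* Selected variant; NULL until the first call. */
static copy_fn copy_ptr;

/*
 * Public entry point: resolves the variant lazily, then calls through
 * the pointer. The lazy init is not thread-safe; a constructor or an
 * explicit init call would avoid that.
 */
static void *dispatch_copy(void *dst, const void *src, size_t n)
{
	if (copy_ptr == NULL)
		copy_ptr = select_copy();
	assert(copy_ptr != NULL);
	return copy_ptr(dst, src, n);
}
```

Callers would use dispatch_copy(dst, src, len) exactly like memcpy(). The cost of this approach is one indirect call per copy, which is the usual trade-off against selecting the variant at build time.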

> I am waiting for this patch to close RC1 today.
I will do it ASAP.

Best Regards
Xiaoyun Li




