[dpdk-dev] [PATCH v2] ring: use aligned memzone allocation

Jerin Jacob jerin.jacob at caviumnetworks.com
Fri Jun 9 19:28:55 CEST 2017


-----Original Message-----
> Date: Fri, 9 Jun 2017 10:16:25 -0700
> From: Stephen Hemminger <stephen at networkplumber.org>
> To: Yerden Zhumabekov <e_zhumabekov at sts.kz>
> Cc: "Ananyev, Konstantin" <konstantin.ananyev at intel.com>, "Richardson,
>  Bruce" <bruce.richardson at intel.com>, "Verkamp, Daniel"
>  <daniel.verkamp at intel.com>, "dev at dpdk.org" <dev at dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v2] ring: use aligned memzone allocation
> 
> On Fri, 9 Jun 2017 18:47:43 +0600
> Yerden Zhumabekov <e_zhumabekov at sts.kz> wrote:
> 
> > On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> > >  
> > >>>> Maybe there is some deeper reason for the >= 128-byte alignment logic in rte_ring.h?
> > >>> Might be, would be good to hear the opinion of the author of that change.
> > >> It gives improved performance for core-2-core transfer.  
> > > You mean empty cache-line(s) after prod/cons, correct?
> > > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > > Something like that:
> > > struct rte_ring {
> > >     ...
> > >     struct rte_ring_headtail prod __rte_cache_aligned;
> > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > >     struct rte_ring_headtail cons __rte_cache_aligned;
> > >     EMPTY_CACHE_LINE   __rte_cache_aligned;
> > > };
> > >
> > > Konstantin
> > >  
> > 
> > I'm curious, can anyone explain how it actually affects
> > performance? Maybe we can utilize it in application code?
> 
> I think it is because on Intel CPUs the CPU will speculatively fetch adjacent cache lines.
> If these cache lines change, then it will create false sharing.

I see. I think in such cases it is better to abstract this as conditional
compilation. The above logic has the worst-case cache memory
requirement when the CPU has a 128B cache line and no speculative prefetch.
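
As a rough illustration of the conditional-compilation idea, here is a
minimal sketch. It assumes two build-time knobs, EX_CACHE_LINE_SIZE and
EX_ADJACENT_LINE_PREFETCH, and an ex_ring layout; all of these names are
hypothetical and are not existing DPDK macros, config options, or
structures. The guard line after prod/cons is only paid for when the
platform declares that it prefetches the adjacent line.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas, alignof (C11) */
#include <stddef.h>    /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical build-time knobs; not real DPDK config options. */
#ifndef EX_CACHE_LINE_SIZE
#define EX_CACHE_LINE_SIZE 64
#endif
#ifndef EX_ADJACENT_LINE_PREFETCH
#define EX_ADJACENT_LINE_PREFETCH 1
#endif

#if EX_ADJACENT_LINE_PREFETCH
/* Adjacent line may be pulled in speculatively: leave one empty
 * cache line after each hot line by aligning on two lines. */
#define EX_GUARD_ALIGN (2 * EX_CACHE_LINE_SIZE)
#else
/* No adjacent-line prefetch: plain cache-line alignment is enough. */
#define EX_GUARD_ALIGN EX_CACHE_LINE_SIZE
#endif

/* Hypothetical stand-in for the producer/consumer index pair. */
struct ex_headtail {
    volatile uint32_t head;
    volatile uint32_t tail;
};

/* Hypothetical ring header; prod and cons each start on their own
 * guard-aligned boundary, and the struct size is padded to match. */
struct ex_ring {
    uint32_t size;
    uint32_t mask;
    alignas(EX_GUARD_ALIGN) struct ex_headtail prod;
    alignas(EX_GUARD_ALIGN) struct ex_headtail cons;
};

/* Compile-time checks that the layout matches the intent. */
static_assert(offsetof(struct ex_ring, cons) -
              offsetof(struct ex_ring, prod) >= EX_GUARD_ALIGN,
              "prod and cons must not share a guard region");
static_assert(sizeof(struct ex_ring) % EX_GUARD_ALIGN == 0,
              "ring header must end on a guard boundary");

/* Bytes consumed by the ring header for the selected configuration. */
static inline size_t ex_ring_header_bytes(void)
{
    return sizeof(struct ex_ring);
}
```

With the defaults above (64B line, prefetch enabled) the header takes
384 bytes. Building with -DEX_CACHE_LINE_SIZE=128
-DEX_ADJACENT_LINE_PREFETCH=0 also gives 384 bytes, whereas doubling
the alignment unconditionally on such a CPU would give 768.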
