[dpdk-dev] [PATCH] ring: relax alignment constraint on ring structure

Jerin Jacob jerin.jacob at caviumnetworks.com
Tue Apr 3 18:42:50 CEST 2018


-----Original Message-----
> Date: Tue, 3 Apr 2018 17:56:01 +0200
> From: Olivier Matz <olivier.matz at 6wind.com>
> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> CC: dev at dpdk.org, konstantin.ananyev at intel.com, bruce.richardson at intel.com
> Subject: Re: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring
>  structure
> User-Agent: NeoMutt/20170113 (1.7.2)
> 
> On Tue, Apr 03, 2018 at 09:07:04PM +0530, Jerin Jacob wrote:
> > -----Original Message-----
> > > Date: Tue, 3 Apr 2018 17:25:17 +0200
> > > From: Olivier Matz <olivier.matz at 6wind.com>
> > > To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > > CC: dev at dpdk.org, konstantin.ananyev at intel.com, bruce.richardson at intel.com
> > > Subject: Re: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring
> > >  structure
> > > User-Agent: NeoMutt/20170113 (1.7.2)
> > > 
> > > On Tue, Apr 03, 2018 at 08:37:23PM +0530, Jerin Jacob wrote:
> > > > -----Original Message-----
> > > > > Date: Tue, 3 Apr 2018 15:26:44 +0200
> > > > > From: Olivier Matz <olivier.matz at 6wind.com>
> > > > > To: dev at dpdk.org
> > > > > Subject: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring
> > > > >  structure
> > > > > X-Mailer: git-send-email 2.11.0
> > > > > 
> > > > > The initial objective of
> > > > > commit d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> > > > > > was to add an empty cache line between the producer and consumer
> > > > > > data (on platforms with cache line size = 64B), preventing them
> > > > > > from being on adjacent cache lines.
> > > > > 
> > > > > Following discussion on the mailing list, it appears that this
> > > > > also imposes an alignment constraint that is not required.
> > > > > 
> > > > > This patch removes the extra alignment constraint and adds the
> > > > > empty cache lines using padding fields in the structure. The
> > > > > size of rte_ring structure and the offset of the fields remain
> > > > > the same on platforms with cache line size = 64B:
> > > > > 
> > > > >   rte_ring = 384
> > > > >   rte_ring.name = 0
> > > > >   rte_ring.flags = 32
> > > > >   rte_ring.memzone = 40
> > > > >   rte_ring.size = 48
> > > > >   rte_ring.mask = 52
> > > > >   rte_ring.prod = 128
> > > > >   rte_ring.cons = 256
> > > > > 
> > > > > > But it has an impact on platforms where the cache line size is 128B:
> > > > > 
> > > > >   rte_ring = 384        -> 768
> > > > >   rte_ring.name = 0
> > > > >   rte_ring.flags = 32
> > > > >   rte_ring.memzone = 40
> > > > >   rte_ring.size = 48
> > > > >   rte_ring.mask = 52
> > > > >   rte_ring.prod = 128   -> 256
> > > > >   rte_ring.cons = 256   -> 512
> > > > 
> > > > Are we leaving TWO cache lines to make sure the HW prefetcher doesn't
> > > > load the adjacent cache line (consumer)?
> > > > 
> > > > If so, it will have an impact on those machines where the cache line
> > > > is 128B and the HW prefetcher does not load the next cache line, right?
> > > 
> > > The impact on machines that have a 128B cache line is that an unused
> > > cache line will be added between the producer and consumer data. I
> > > expect that the impact is positive in case there is a hw prefetcher, and
> > > null in case there is no such prefetcher.
> > 
> > It is not null, right? You are losing 256B for each ring.
> 
> Is it really that important?

In pipeline or eventdev SW cases there could be more rings in the system.
I don't see any downside to having a config option which is enabled by
default.

In my view, such config options are good: in embedded use cases, customers
can really fine-tune the target to their needs. In server use cases, leave
the option enabled by default; no harm done.
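
To illustrate the kind of option I mean, here is a rough sketch, not the
actual patch. It assumes a 128B cache line and a made-up build macro,
SKETCH_RING_HW_PREFETCH_PAD, which is not an existing DPDK config symbol.
The struct and field names are simplified stand-ins for rte_ring, and the
padding is expressed with C11 _Alignas rather than DPDK's own macros. The
compile-time checks only restate the sizes and offsets quoted in the patch
description.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: target with a 128B cache line. */
#define SKETCH_CACHE_LINE 128

/* Hypothetical build option, NOT an existing DPDK config symbol.
 * 1 = reserve empty cache lines (default), 0 = compact layout. */
#ifndef SKETCH_RING_HW_PREFETCH_PAD
#define SKETCH_RING_HW_PREFETCH_PAD 1
#endif

/* Simplified stand-in for the producer/consumer head and tail data. */
struct sketch_headtail {
	volatile uint32_t head;
	volatile uint32_t tail;
	uint32_t single;
};

/* Layout with an empty cache line after each cache-line-aligned area. */
struct sketch_ring_padded {
	char name[32];
	int flags;
	const void *memzone;
	uint32_t size;
	uint32_t mask;
	_Alignas(SKETCH_CACHE_LINE) char pad0;  /* empty cache line */
	_Alignas(SKETCH_CACHE_LINE) struct sketch_headtail prod;
	_Alignas(SKETCH_CACHE_LINE) char pad1;  /* empty cache line */
	_Alignas(SKETCH_CACHE_LINE) struct sketch_headtail cons;
	_Alignas(SKETCH_CACHE_LINE) char pad2;  /* empty cache line */
};

/* Layout without the empty cache lines, for targets with no HW prefetcher. */
struct sketch_ring_compact {
	char name[32];
	int flags;
	const void *memzone;
	uint32_t size;
	uint32_t mask;
	_Alignas(SKETCH_CACHE_LINE) struct sketch_headtail prod;
	_Alignas(SKETCH_CACHE_LINE) struct sketch_headtail cons;
};

/* The hypothetical option selects the layout at build time. */
#if SKETCH_RING_HW_PREFETCH_PAD
typedef struct sketch_ring_padded sketch_ring_t;
#else
typedef struct sketch_ring_compact sketch_ring_t;
#endif

/* Same numbers as in the patch description for 128B cache lines. */
_Static_assert(offsetof(struct sketch_ring_padded, prod) == 256, "prod");
_Static_assert(offsetof(struct sketch_ring_padded, cons) == 512, "cons");
_Static_assert(sizeof(struct sketch_ring_padded) == 768, "padded size");
_Static_assert(offsetof(struct sketch_ring_compact, prod) == 128, "prod");
_Static_assert(offsetof(struct sketch_ring_compact, cons) == 256, "cons");
_Static_assert(sizeof(struct sketch_ring_compact) == 384, "compact size");
```

Building with -DSKETCH_RING_HW_PREFETCH_PAD=0 would select the compact
layout; the default keeps the padded one.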

> 
> 
> > > On machines with 64B cache line, this was already the case. It just
> > > reduces the alignment constraint.
> > 
> > Not all the 64B CL machines will have HW prefetch.
> > 
> > I would recommend adding a conditional compilation flag to express whether
> > HW prefetch is enabled or not. Based on that, we can decide whether to
> > reserve the additional space. By default, in the common config, HW prefetch
> > can be enabled so that it works for almost all cases.
> 
> The hw prefetcher can be enabled at runtime, so a compilation flag
> does not seem to be a good idea. Moreover, changing this compilation

On hardware where HW prefetch can be disabled at runtime, the default
config is fine. I was talking about some low-end ARM hardware where a HW
prefetcher is not present at all.

> flag would change the ABI.

The ABI is broken anyway, right? The size of the structure changes.



