[dpdk-dev] [PATCH 1/2] eal: add macro to mark variable mostly read only

Bruce Richardson bruce.richardson at intel.com
Thu Apr 19 14:09:58 CEST 2018


On Thu, Apr 19, 2018 at 02:50:52PM +0530, Pavan Nikhilesh wrote:
> On Wed, Apr 18, 2018 at 07:03:06PM +0100, Ferruh Yigit wrote:
> > On 4/18/2018 6:55 PM, Pavan Nikhilesh wrote:
> > > On Wed, Apr 18, 2018 at 06:43:11PM +0100, Ferruh Yigit wrote:
> > >> On 4/18/2018 4:30 PM, Pavan Nikhilesh wrote:
> > >>> Add macro to mark a variable to be mostly read only and place it in a
> > >>> separate section.
> > >>>
> > >>> Signed-off-by: Pavan Nikhilesh <pbhagavatula at caviumnetworks.com>
> > >>> ---
> > >>>
> > >>>  Group together mostly read only data to avoid cacheline bouncing, also
> > >>>  useful for auditing purposes.
> > >>>
> > >>>  lib/librte_eal/common/include/rte_common.h | 5 +++++
> > >>>  1 file changed, 5 insertions(+)
> > >>>
> > >>> diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
> > >>> index 6c5bc5a76..f2ff2e9e6 100644
> > >>> --- a/lib/librte_eal/common/include/rte_common.h
> > >>> +++ b/lib/librte_eal/common/include/rte_common.h
> > >>> @@ -114,6 +114,11 @@ static void __attribute__((constructor(prio), used)) func(void)
> > >>>   */
> > >>>  #define __rte_noinline  __attribute__((noinline))
> > >>>
> > >>> +/**
> > >>> + * Mark a variable to be mostly read only and place it in a separate section.
> > >>> + */
> > >>> +#define __rte_read_mostly __attribute__((__section__(".read_mostly")))
> > >>
> > >
> > > Hi Ferruh,
> > >
> > >> Hi Pavan,
> > >>
> > >> Is the section ".read_mostly" treated specially [1] or is this just for grouping
> > >> symbols together (to reduce cacheline bouncing)?
> > >
> > > The section .read_mostly is not treated specially; it's just for grouping
> > > symbols.
> >
> > I came across a blog post claiming that this does not work:
> >
> > "
> > The problem with the above approach is that once all the __read_mostly variables
> > are grouped into one section, the remaining "non-read-mostly" variables end-up
> > together too. This increases the chances that two frequently used elements (in
> > the "non-read-mostly" region) will end-up competing for the same position (or
> > cache-line, the basic fixed-sized block for memory<-->cache transfers) in the
> > cache. Thus frequent accesses will cause excessive cache thrashing on that
> > particular cache-line thereby degrading the overall system performance.
> > "
> >
> > https://thecodeartist.blogspot.com/2011/12/why-readmostly-does-not-work-as-it.html
> >
> 
> The author is concerned about processors with low cache set-associativity;
> almost all modern processors have >= 16-way set associativity. Also, the
> issue described above can already happen today when two frequently written
> global variables are placed next to each other.
> 
> Currently, we don't have much control over how the global variables are
> arranged, and a single addition or deletion of a global variable changes the
> alignment and in some cases causes a minor performance regression.
> By tagging them as __read_mostly, we can easily identify alignment changes
> across builds by comparing the global variable sections of the map files.
> 
> I have verified the patch-set on arm64 (16-way set-associative) and didn't
> notice any performance regression.
> Did you have a chance to verify if there is any performance regression?
> 
Is there a performance improvement? It seems a relatively strange change
to me, so I'd like to know that it really improves performance in test
cases.
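
For reference, a minimal sketch of how the macro from the patch might be
applied. The macro definition is taken from the patch; the variable and
function names are hypothetical and not from DPDK, and the sketch assumes
GCC or Clang on an ELF target. It only shows the placement of symbols and
says nothing about whether performance changes.

```c
#include <stdint.h>

/* Macro as proposed in the patch; assumes GCC/Clang and an ELF target. */
#define __rte_read_mostly __attribute__((__section__(".read_mostly")))

/* Hypothetical globals, for illustration only (not from DPDK). */
static uint32_t example_burst_size __rte_read_mostly = 32; /* set at init, then mostly read */
static uint64_t example_rx_count;                          /* written often; default section */

/* Rare configuration-time write to the tagged variable. */
void example_configure(uint32_t burst_size)
{
	example_burst_size = burst_size;
}

/* Fast path: reads the tagged variable, writes the untagged counter. */
uint64_t example_rx_burst(void)
{
	example_rx_count += example_burst_size;
	return example_rx_count;
}
```

Compiling this to an object file and dumping the symbol table (for example
with objdump -t) should show example_burst_size in .read_mostly and
example_rx_count in the default data section, which is the grouping that
the map-file comparison described above would rely on.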

/Bruce

