[dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue

Jianbo Liu Jianbo.Liu at arm.com
Fri Oct 13 09:33:37 CEST 2017


The 10/13/2017 07:19, Jerin Jacob wrote:
> -----Original Message-----
> > Date: Fri, 13 Oct 2017 09:16:31 +0800
> > From: Jia He <hejianet at gmail.com>
> > To: Jerin Jacob <jerin.jacob at caviumnetworks.com>, "Ananyev, Konstantin"
> >  <konstantin.ananyev at intel.com>
> > Cc: Olivier MATZ <olivier.matz at 6wind.com>, "dev at dpdk.org" <dev at dpdk.org>,
> >  "jia.he at hxt-semitech.com" <jia.he at hxt-semitech.com>,
> >  "jie2.liu at hxt-semitech.com" <jie2.liu at hxt-semitech.com>,
> >  "bing.zhao at hxt-semitech.com" <bing.zhao at hxt-semitech.com>
> > Subject: Re: [PATCH] ring: guarantee ordering of cons/prod loading when
> >  doing enqueue/dequeue
> > User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
> >  Thunderbird/52.3.0
> >
> > Hi
> >
> >
> > On 10/13/2017 9:02 AM, Jia He Wrote:
> > > Hi Jerin
> > >
> > >
> > > On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
> > > > -----Original Message-----
> > > > > Date: Thu, 12 Oct 2017 17:05:50 +0000
> > > > >
> > [...]
> > > > On the same lines,
> > > >
> > > > Jia He, jie2.liu, bing.zhao,
> > > >
> > > > Is this patch based on code review, or did you see this issue on any of
> > > > the arm/ppc targets? arm64 will take a performance hit with this change.
> > Sorry, I missed one important piece of information:
> > our platform is an aarch64 server with 46 CPUs.
>
> Is this an OOO (out-of-order execution) aarch64 CPU implementation?
>
> > If we reduce the number of CPUs involved, the bug occurs less frequently.
> >
> > Yes, the mb barrier impacts performance, but correctness is more important,
> > isn't it ;-)
>
> Yes.
>
> > Maybe we can find some other lightweight barrier here?
>
> Yes. Regarding a lightweight barrier, arm64 has native support for acquire and
> release semantics, which gcc exposes through architecture-agnostic
> builtin functions.
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> http://preshing.com/20130922/acquire-and-release-fences/
>
> Good to know:
> 1) How much overhead does this patch add on your platform? Relative
> numbers are enough.
> 2) As a prototype, does changing to acquire and release semantics
> reduce the overhead on your platform?
>

+1, can you try what ODP does in the link mentioned below?

> For reference, a FreeBSD ring/DPDK-style ring implementation using acquire
> and release semantics:
> https://github.com/Linaro/odp/blob/master/platform/linux-generic/pktio/ring.c
>
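
To make the acquire/release suggestion concrete, here is a minimal sketch of
how the head/tail handling could look with the gcc __atomic builtins. It is
not the code from this patch, from DPDK, or from ODP: the names sketch_ring,
sketch_enqueue and sketch_dequeue are hypothetical, the ring holds one object
per call, and the memory orders are an assumption about what a reasonable
choice would be, not a measured or reviewed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* hypothetical size, power of two */
#define RING_MASK (RING_SIZE - 1u)

struct headtail { uint32_t head; uint32_t tail; };

struct sketch_ring {
    struct headtail prod;
    struct headtail cons;
    void *slots[RING_SIZE];
};

static void sketch_ring_init(struct sketch_ring *r)
{
    assert(r != NULL);
    r->prod.head = r->prod.tail = 0;
    r->cons.head = r->cons.tail = 0;
}

/* Multi-producer enqueue of one object: 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t old_head, new_head, cons_tail;

    old_head = __atomic_load_n(&r->prod.head, __ATOMIC_RELAXED);
    do {
        /* Keep the load of prod.head ordered before the load of cons.tail. */
        __atomic_thread_fence(__ATOMIC_ACQUIRE);
        /* Pairs with the release store of cons.tail in sketch_dequeue(). */
        cons_tail = __atomic_load_n(&r->cons.tail, __ATOMIC_ACQUIRE);
        if (old_head - cons_tail == RING_SIZE)
            return -1;
        new_head = old_head + 1;
        /* On failure, old_head is refreshed and the loop re-reads cons.tail. */
    } while (!__atomic_compare_exchange_n(&r->prod.head, &old_head, new_head,
                                          0, __ATOMIC_RELAXED, __ATOMIC_RELAXED));

    r->slots[old_head & RING_MASK] = obj;

    /* Wait for producers that reserved earlier slots to publish first. */
    while (__atomic_load_n(&r->prod.tail, __ATOMIC_ACQUIRE) != old_head)
        ;
    /* Publish the slot: the store above is visible before the new tail. */
    __atomic_store_n(&r->prod.tail, new_head, __ATOMIC_RELEASE);
    return 0;
}

/* Multi-consumer dequeue of one object: 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t old_head, new_head, prod_tail;

    old_head = __atomic_load_n(&r->cons.head, __ATOMIC_RELAXED);
    do {
        /* Keep the load of cons.head ordered before the load of prod.tail. */
        __atomic_thread_fence(__ATOMIC_ACQUIRE);
        /* Pairs with the release store of prod.tail in sketch_enqueue(). */
        prod_tail = __atomic_load_n(&r->prod.tail, __ATOMIC_ACQUIRE);
        if (prod_tail == old_head)
            return -1;
        new_head = old_head + 1;
    } while (!__atomic_compare_exchange_n(&r->cons.head, &old_head, new_head,
                                          0, __ATOMIC_RELAXED, __ATOMIC_RELAXED));

    *obj = r->slots[old_head & RING_MASK];

    /* Wait for consumers that reserved earlier slots to finish first. */
    while (__atomic_load_n(&r->cons.tail, __ATOMIC_ACQUIRE) != old_head)
        ;
    /* Release the slot only after the read above has completed. */
    __atomic_store_n(&r->cons.tail, new_head, __ATOMIC_RELEASE);
    return 0;
}
```

The point of the sketch is only the placement of the acquire loads and release
stores relative to the two index loads; whether this is cheaper than the full
barrier on a given arm64 part would still need to be measured.
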
> I will also spend some cycles on this.
>
>
> >
> > Cheers,
> > Jia
> > > With mbuf_autotest, rte_panic is triggered within seconds.
> > >
> > > PANIC in test_refcnt_iter():
> > > (lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
> > > 1: [./test(rte_dump_stack+0x38) [0x58d868]]
> > > Aborted (core dumped)
> > >
> > > Cheers,
> > > Jia
> > > >
> > > >
> > > > > Konstantin
> > >
> >
