[dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Oct 19 12:02:32 CEST 2017


Hi Jia,

> 
> Hi
> 
> 
> On 10/13/2017 9:02 AM, Jia He Wrote:
> > Hi Jerin
> >
> >
> > On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
> >> -----Original Message-----
> >>> Date: Thu, 12 Oct 2017 17:05:50 +0000
> >>>
> [...]
> >> On the same lines,
> >>
> >> Jia He, jie2.liu, bing.zhao,
> >>
> >> Is this patch based on code review, or did you see this issue on any
> >> of the arm/ppc targets? arm64 will have a performance impact with
> >> this change.
> Sorry, I missed one important piece of information:
> our platform is an aarch64 server with 46 CPUs.
> If we reduce the number of CPUs involved, the bug occurs less frequently.
> 
> Yes, the memory barrier impacts performance, but correctness is more
> important, isn't it? ;-)
> Maybe we can find a more lightweight barrier here?
> 
> Cheers,
> Jia
> > With mbuf_autotest, the rte_panic is triggered within seconds.
> >
> > PANIC in test_refcnt_iter():
> > (lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
> > 1: [./test(rte_dump_stack+0x38) [0x58d868]]
> > Aborted (core dumped)
> >

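About a lighter barrier: one direction worth exploring might be a load-acquire
of the opposite tail index instead of a full rte_smp_rmb(). Below is just a
rough, untested sketch of the idea, with a simplified stand-in structure (not
the real rte_ring layout) and the GCC __atomic builtins; on arm64 that load
should become an ldar rather than a dmb.

#include <stdint.h>

/*
 * Simplified stand-in for the ring indices; names and layout are only
 * illustrative here, not the actual struct rte_ring.
 */
struct headtail {
	volatile uint32_t head;
	volatile uint32_t tail;
};

struct toy_ring {
	struct headtail prod;
	struct headtail cons;
};

/*
 * Consumer side: number of entries available to dequeue.
 * The acquire on prod.tail keeps the later reads of the ring slots
 * ordered after this load, so no separate full barrier should be
 * needed between loading cons.head and prod.tail.
 */
static inline uint32_t
toy_ring_count(struct toy_ring *r, uint32_t cons_head)
{
	uint32_t prod_tail = __atomic_load_n(&r->prod.tail, __ATOMIC_ACQUIRE);

	return prod_tail - cons_head;
}
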
So is it only reproducible with the mbuf refcnt test?
Could it be reproduced with some 'pure' ring test
(no mempools/mbufs refcnt, etc.)?
The reason I am asking is that this test also exercises mbuf refcnt updates
(that is what the test was created for), and we do some optimizations there
too to avoid excessive atomic updates.
BTW, if the problem is not reproducible without mbuf refcnt,
may I suggest extending the test with:
  - a check that the enqueue() operation was successful
  - a walk through the pool that checks/prints the refcnt of each mbuf.
Hopefully that would give us some extra information about what is going wrong here.
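Something like the untested sketch below, just to illustrate what I mean.
The helper names (dump_mbuf_refcnt, enqueue_checked, dump_pool_refcnt) and
the ring/pool parameters are placeholders for whatever the test actually
uses; rte_mempool_obj_iter() and rte_mbuf_refcnt_read() are the existing
helpers I have in mind.

#include <stdio.h>
#include <rte_common.h>
#include <rte_debug.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>
#include <rte_ring.h>

/* rte_mempool_obj_iter() callback: print the refcnt of one mbuf. */
static void
dump_mbuf_refcnt(struct rte_mempool *mp, void *opaque, void *obj,
		 unsigned int obj_idx)
{
	struct rte_mbuf *m = obj;

	RTE_SET_USED(mp);
	RTE_SET_USED(opaque);
	printf("mbuf %u: refcnt=%u\n", obj_idx, rte_mbuf_refcnt_read(m));
}

/*
 * Inside the test loop: panic if the enqueue itself fails, so a lost
 * mbuf can be told apart from a silently dropped enqueue.
 */
static inline void
enqueue_checked(struct rte_ring *ring, struct rte_mbuf *m)
{
	if (rte_ring_enqueue(ring, m) != 0)
		rte_panic("enqueue of mbuf %p failed\n", (void *)m);
}

/* After the iteration: walk the whole pool and print every refcnt. */
static void
dump_pool_refcnt(struct rte_mempool *pool)
{
	rte_mempool_obj_iter(pool, dump_mbuf_refcnt, NULL);
}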
Konstantin
  

