[dpdk-dev] [PATCH v3 00/12] generic rte atomic APIs deprecate proposal

Van Haaren, Harry harry.van.haaren at intel.com
Fri Mar 27 15:47:51 CET 2020


> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Sent: Friday, March 20, 2020 6:32 PM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>; Phil Yang
> <Phil.Yang at arm.com>; thomas at monjalon.net; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; stephen at networkplumber.org;
> maxime.coquelin at redhat.com; dev at dpdk.org
> Cc: david.marchand at redhat.com; jerinj at marvell.com; hemant.agrawal at nxp.com;
> Gavin Hu <Gavin.Hu at arm.com>; Ruifeng Wang <Ruifeng.Wang at arm.com>; Joyce Kong
> <Joyce.Kong at arm.com>; Carrillo, Erik G <erik.g.carrillo at intel.com>; nd
> <nd at arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>; nd
> <nd at arm.com>
> Subject: RE: [PATCH v3 00/12] generic rte atomic APIs deprecate proposal
> 
> + Erik as there are similar changes to timer library
> 
> <snip>
> 
> >
> > > > Subject: [PATCH v3 00/12] generic rte atomic APIs deprecate proposal
> > > >
> > > > DPDK provides generic rte_atomic APIs to do several atomic operations.
> > > > These APIs are using the deprecated __sync built-ins and enforce
> > > > full memory barriers on aarch64. However, full barriers are not
> > > > necessary in many use cases. In order to address such use cases, C
> > > > language offers
> > > > C11 atomic APIs. The C11 atomic APIs provide finer memory barrier
> > > > control by making use of the memory ordering parameter provided by
> > > > the
> > > user.
> > > > Various patches submitted in the past [2] and the patches in this
> > > > series indicate significant performance gains on multiple aarch64
> > > > CPUs and no performance loss on x86.
> > > >
> > > > But the existing rte_atomic API implementations cannot be changed as
> > > > the APIs do not take the memory ordering parameter. The only choice
> > > > available is replacing the usage of the rte_atomic APIs with C11
> > > > atomic APIs. In order to make this change, the following steps are
> > proposed:
> > > >
> > > > [1] deprecate rte_atomic APIs so that future patches do not use
> > > > rte_atomic APIs (a script is added to flag the usages).
> > > > [2] refactor the code that uses rte_atomic APIs to use c11 atomic
> APIs.
> > >
> > > On [1] above, I feel deprecating DPDKs atomic functions and failing
> > > checkpatch is a bit sudden. Perhaps noting that in a future release
> > > (20.11?) DPDK will move to a
> > > C11 based atomics model is a more gradual step to achieving the goal,
> > > and at that point add a checkpatch warning for additions of rte_atomic*?
> > We have been working on changing existing usages of rte_atomic APIs in
> DPDK
> > to use C11 atomics. Usually, the x.11 releases have significant amount of
> > changes (not sure how many would use rte_atomic APIs). I would prefer that
> > in 20.11 no additional code is added using rte_atomics APIs. However, I am
> > open to suggestions on the exact time frame.
> > Once we decide on the release, I think it makes sense to add a 'warning'
> in the
> > checkpatch to indicate the deprecation timeline and add an 'error' after
> the
> > release.

The above sounds reasonable - mainly let's not block any code that exists
or is being developed today using rte_atomic_* APIs from making a release.


> > > More on [2] in context below.
> > >
> > > The above is my point-of-view, of course I'd like more people from the
> > > DPDK community to provide their input too.
> > >
> > >
> > > > This patchset contains:
> > > > 1) the checkpatch script changes to flag rte_atomic API usage in
> patches.
> > > > 2) changes to programmer guide describing writing efficient code for
> > > aarch64.
> > > > 3) changes to various libraries to make use of c11 atomic APIs.
> > > >
> > > > We are planning to replicate this idea across all the other
> > > > libraries, drivers, examples, test applications. In the next phase,
> > > > we will add changes to the mbuf, the EAL interrupts and the event
> > > > timer adapter
> > > libraries.
> > >
> > > About ~6/12 patches of this C11 set are targeting the Service Cores
> > > area of DPDK. I have some concerns over increased complexity of C11
> > > implementation vs the (already complex) rte_atomic implementation today.
> > I agree that it C11 changes are complex, especially if one is starting out
> to
> > understand what these APIs provide. From my experience, once few
> > underlying concepts are understood, reviewing or making changes do not
> take
> > too much time.
> >
> > > I see other patchsets enabling C11 across other DPDK components, so
> > > maybe we should also discuss C11 enabling in a wider context that just
> > service cores?
> > Yes, agree. We are in the process of making changes to other areas as
> well.
> >
> > >
> > > I don't think it fair to expect all developers to be well versed in
> > > C11 atomic semantics like understanding the complex interactions
> > > between the various
> > > C11 RELEASE, AQUIRE barriers requires.
> > C11 has been around from sometime now. To my surprise, OVS already uses
> > C11 APIs extensively. VPP has been accepting C11 related changes from past
> > couple of years. Having said that, I agree in general that not everyone is
> > well versed.

Fair point - like so many things, once familiar with it, it becomes easy :)


> > > As maintainer of Service Cores I'm hesitant to accept the large-scale
> > > refactor
> > Right now, the patches are split into multiple commits. If required I can
> host a
> > call to go over simple C11 API usages (sufficient to cover the usage in
> service
> > core) and the changes in this patch. If you find that particular areas
> need more
> > understanding I can work on providing additional information such as
> memory
> > order ladder diagrams. Please let me know what you think.

Thanks for the offer - I will need to do my due diligence on reiview before
taking up any of your or other C11 folks time.

> When I started working with C11 APIs, I had referred to the following blogs.
> https://preshing.com/20120913/acquire-and-release-semantics/
> https://preshing.com/20130702/the-happens-before-relation/
> https://preshing.com/20130823/the-synchronizes-with-relation/
> 
> These will be helpful to understand the changes.

Thanks, indeed good articles. I found the following slide deck particularly
informative due to the fantastic diagrams (eg, slide 23):
https://mariadb.org/wp-content/uploads/2017/11/2017-11-Memory-barriers.pdf

That said, finding a nice diagram and understanding the implications of
actually using it is different! I hope to properly review the
service-cores patches next week.

> > > of atomic-implementation, as it could lead to racey bugs that are
> > > likely extremely difficult to track down. (The recent race-on-exit has
> > > proven the difficulty in reproducing, and that's with an atomics model
> > > I'm quite familiar with).
> > >
> > > Let me be very clear: I don't wish to block a C11 atomic
> > > implementation, but I'd like to discuss how we (DPDK community) can
> > > best port-to and maintain a complex multi-threaded service library
> > > with best-in-class performance for the workload.
> > >
> > > To put some discussions/solutions on the table:
> > > - Shared Maintainership of a component?
> > >      Split in functionality and C11 atomics implementation
> > >      Obviously there would be collaboration required in such a case.
> > > - Maybe shared maintainership is too much?
> > >      A gentlemans/womans agreement of "helping out" with C11 atomics
> > > debug is enough?
> > I think shared maintainer ship could be too much as there are many
> changes.
> > But, I and other engineers from Arm (I would include Arm ecosystem as
> well)
> > can definitely help out on debug and reviews involving C11 APIs. (I see
> > Thomas's reply on this topic).

Thanks for the offer - as above, ball on my side, I'll go review.

<snip>


More information about the dev mailing list