[dpdk-stable] [PATCH v2 1/6] service: fix race condition for MT unsafe service

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Fri May 1 16:56:12 CEST 2020


<snip>
> >
> > > > Subject: [PATCH v2 1/6] service: fix race condition for MT unsafe
> > > > service
> > > >
> > > > From: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > > >
> > > > The MT unsafe service might get configured to run on another core
> > > > while the service is running currently. This might result in the
> > > > MT unsafe service running on multiple cores simultaneously. Use
> > > > 'execute_lock' always when the service is MT unsafe.
> > > >
> > > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
> > > > Cc: stable at dpdk.org
> > > >
> > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > > > Reviewed-by: Phil Yang <phil.yang at arm.com>
> > > > ---
> > >
> > > Thanks for spinning a new revision - based on ML discussion
> > > previously, it seems like the "use service-run-count" to avoid this
> > > race would be a complex solution.
> > >
> > > Suggesting the following;
> > > 1) Take the approach as per this patch, to always take the atomic,
> > > fixing the race condition.
> > Ok
> 
> I've micro-benchmarked this code change inside the service cores autotest,
> and it introduces around 35 cycles of overhead per service call.  This is not
> ideal, but given it's a bugfix, and by far the simplest method to fix this race-
> condition. Having discussed and investigated multiple other solutions, I
> believe this is the right solution.
> Thanks Honnappa and Phil for identifying and driving a solution.
You are welcome. Thank you for your timely responses.

> 
> I suggest to post the benchmarking unit-test addition patch, and integrate
> that
> *before* the series under review here gets merged? This makes
> benchmarking of the "before bugfix" performance in future easier should it be
> required.
I do not see any issues, would be happy to review. I think we still have time to catch up with RC2 (May 8th).
You had also mentioned about calling out that, the control plane APIs are not MT safe. Should I add that to this patch?

> 
> 
> > > 2) Add an API to service-cores, which allows "committing" of mappings.
> > > Committing the mapping would imply that the mappings will not be
> > > changed in future. With runtime-remapping being removed from the
> > > equation, the existing branch-over-atomic optimization is valid again.
> > Ok. Just to make sure I understand this:
> > a) on the data plane, if commit API is called (probably a new state
> > variable) and num_mapped_cores is set to 1, there is no need to take the
> lock.
> > b) possible implementation of the commit API would check if
> > num_mapped_cores for the service is set to 1 and set a variable to
> > indicate that the lock is not required.
> >
> > What do you think about asking the application to set  the service
> > capability to MT_SAFE if it knows that the service will run on a
> > single core? This would require us to change the documentation and does
> not require additional code.
> 
> That's a nice idea - I like that if applications want to micro-optimize around
> the atomic, that they have a workaround/solution to do so, particularly that it
> doesn't require code-changes and backporting.
> 
> Will send review and send feedback on the patches themselves.
> Regards, -Harry
> 
> > > So this would offer applications two situations
> > > A) No application change: possible performance regression due to
> > > atomic always taken.
> > > B) Call "commit" API, and regain the performance as per previous
> > > DPDK versions.
> > >
> > > Thoughts/opinions on the above?  I've flagged the rest of the
> > > patchset for review ASAP. Regards, -Harry
> > >
> > > >  lib/librte_eal/common/rte_service.c | 11 +++++------
> > > >  1 file changed, 5 insertions(+), 6 deletions(-)
> <snip patch changes>



More information about the stable mailing list