[dpdk-dev] [PATCH] service: fix race in service on app lcore function

Bruce Richardson bruce.richardson at intel.com
Wed Nov 1 18:09:09 CET 2017


On Tue, Oct 31, 2017 at 11:49:02AM +0000, Harry van Haaren wrote:
> This commit fixes a possible race condition if an application
> uses the service-cores infrastructure and the function to run
> a service on an application lcore at the same time.
> 
> The fix is to change the num_mapped_cores variable to be an
> atomic variable. This causes concurrent accesses by multiple
> threads to a service using rte_service_run_iter_on_app_lcore()
> to detect if another core is currently mapped to the service,
> and refuses to run if it is not multi-thread safe.
> 
> No performance impact is expected as the mappings for the
> service-cores changes at control-path frequency, hence the
> change from an ordinary write to an atomic write will not
> have any significant impact.
> 
> Two unit tests were added to verify the behaviour of the
> function to run a service on an application core, testing both
> a multi-thread safe service, and a multi-thread unsafe service.
> 
> The doxygen API documentation for the function has been updated
> to reflect the current and correct behaviour.
> 
> Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
> 
> Signed-off-by: Harry van Haaren <harry.van.haaren at intel.com>
> 
> ---
> 
<snip>
@@ -381,8 +381,28 @@ service_run(uint32_t i, struct core_state *cs, uint64_t service_mask)
>  int32_t rte_service_run_iter_on_app_lcore(uint32_t id)
>  {
>  	/* run service on calling core, using all-ones as the service mask */
> +	if (!service_valid(id))
> +		return -EINVAL;
> +
>  	struct core_state *cs = &lcore_states[rte_lcore_id()];
> -	return service_run(id, cs, UINT64_MAX);
> +	struct rte_service_spec_impl *s = &rte_services[id];
> +
> +	/* Atomically add this core to the mapped cores first, then examine if
> +	 * we can run the service. This avoids a race condition between
> +	 * checking the value, and atomically adding to the mapped count.
> +	 */
> +	rte_atomic32_inc(&s->num_mapped_cores);
> +
> +	if (service_mt_safe(s) == 0 &&
> +			rte_atomic32_read(&s->num_mapped_cores) > 1) {
> +		rte_atomic32_dec(&s->num_mapped_cores);
> +		return -EBUSY;
> +	}
> +
> +	int ret = service_run(id, cs, UINT64_MAX);
> +	rte_atomic32_dec(&s->num_mapped_cores);
> +
> +	return ret;
>  }

Do we really need to do an atomic inc and dec in this function? If we
check that there are no service cores mapped, would that not be enough
for safety? If an app core is calling a service, the control plane is
unlikely to decide to start spawning off a service core for that
service simultaneously. Manipulating the count is safer, yes, but unlike
the other changes in this patch, this one will affect performance, so I
think we can go without. Similarly, for multiple data plane threads
calling the same service simultaneously: everything else dataplane is
done without locks so I think this should be too.

/Bruce


More information about the dev mailing list