[dpdk-dev] Random failure in service_autotest

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Tue Jul 21 17:09:28 CEST 2020


<snip>

> <more snip>
> 
> Thanks Lukasz & Phil for v2 reviews & Acks.
> 
> > > >>           return 0;
> > > > Changing the implementation loop counting is one option - but
> > > > changing the implementation as a workaround for a unit test seems
> > > > like the wrong
> > > way around.
> > > I agree. We should fix the test not the service implementation. Of
> > > course it would be nice to do so without inserting sleeps as it's a
> > > workaround for true synchronization.
> > I think the service shutdown sequence in the test case is not correct.
> > We cannot call 'rte_service_lcore_stop' without first shutting down
> > the service using 'rte_service_runstate_set' and
> > 'rte_service_component_runstate_set'. The 'rte_service_lcore_stop' has
> > checks to validate that the service is not running currently (among
> > other things). In fact, the call to 'rte_service_lcore_stop' API is
> > returning -EBUSY currently in the test case. We are not checking the return
> status.
> >
> > If I understand the intent of the test case correctly, the sequence of
> > the calls needs to be:
> > rte_service_runstate_set(id, 0)
> > rte_service_component_runstate_set(id, 0); rte_service_may_be_active -
> > loop till the service is not active rte_service_lcore_stop(slcore_id);
> 
> No need to change service runstates, unmapping the service lcore to the
> service allows service_lcore_stop() to work as expected, and not return -
> EBUSY. This change to add an unmap() is integrated in the test case in the v2
> patch.
Ok, understood.
Looking at the patch, why not use the 'rte_service_lcore_stop' to provide the status of the lcore?
For ex: the 'thread_active' can be used in 'rte_service_lcore_stop' to indicate that the lcore is still busy?

> 
> 
> > > > Honnappa's suggestion in the other reply seems to not work here,
> > > > as the
> > > "service_may_be_active"
> > > > won't get updated if the service core is stopped after running its
> > > > final
> > > iteration..?
> > This also is a problem.
> > This needs to be fixed by setting 'service_active_on_lcore'
> > immediately after the service completes running in 'service_run'
> > (rather than waiting for the next iteration of service_run).
> 
> This is a different bug - not directly related to the service_autotest issue
> reported here. I'll keep focus & resolve that issue first as it has higher priorty
> due to CI failures.
Sure.

> 
> <snip previous discsussion>


More information about the dev mailing list