[dpdk-stable] [PATCH] test/service: fix race in attr check

David Marchand david.marchand at redhat.com
Tue Oct 12 20:49:20 CEST 2021


On Mon, Oct 11, 2021 at 4:54 PM David Marchand
<david.marchand at redhat.com> wrote:
>
> The CI reported rare (and cryptic) failures like:
>
> RTE>>service_autotest
>  + ------------------------------------------------------- +
>  + Test Suite : service core test suite
>  + ------------------------------------------------------- +
>  + TestCase [ 0] : unregister_all succeeded
>  + TestCase [ 1] : service_name succeeded
>  + TestCase [ 2] : service_get_by_name succeeded
> Service dummy_service Summary
>   dummy_service: stats 1        calls 0 cycles 0        avg: 0
> Service dummy_service Summary
>   dummy_service: stats 0        calls 0 cycles 0        avg: 0
>  + TestCase [ 3] : service_dump succeeded
>  + TestCase [ 4] : service_attr_get failed
>  + TestCase [ 5] : service_lcore_attr_get succeeded
>  + TestCase [ 6] : service_probe_capability succeeded
>  + TestCase [ 7] : service_start_stop succeeded
>  + TestCase [ 8] : service_lcore_add_del succeeded
>  + TestCase [ 9] : service_lcore_start_stop succeeded
>  + TestCase [10] : service_lcore_en_dis_able succeeded
>  + TestCase [11] : service_mt_unsafe_poll succeeded
>  + TestCase [12] : service_mt_safe_poll succeeded
> perf test for MT Safe: 42.7 cycles per call
>  + TestCase [13] : service_app_lcore_mt_safe succeeded
> perf test for MT Unsafe: 73.3 cycles per call
>  + TestCase [14] : service_app_lcore_mt_unsafe succeeded
>  + TestCase [15] : service_may_be_active succeeded
>  + TestCase [16] : service_active_two_cores succeeded
>  + ------------------------------------------------------- +
>  + Test Suite Summary : service core test suite
>  + ------------------------------------------------------- +
>  + Tests Total :       17
>  + Tests Skipped :      0
>  + Tests Executed :    17
>  + Tests Unsupported:   0
>  + Tests Passed :      16
>  + Tests Failed :       1
>  + ------------------------------------------------------- +
> Test Failed
> RTE>>
> stderr:
> EAL: Detected CPU lcores: 16
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/service_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available 1048576 kB hugepages reported
> EAL: VFIO support initialized
> EAL: Device 0000:03:00.0 is not NUMA-aware, defaulting socket to 0
> APP: HPET is not enabled, using TSC as default timer
> EAL: Test assert service_attr_get line 340 failed: attr_get() call didn't
>  get call count (zero)
>
> According to the API, a service lcore cannot be stopped while it is the
> only lcore mapped to a service.
> Attempting this makes rte_service_lcore_stop() return -EBUSY, which the
> service_attr_get subtest did not check.
> As a result, the service lcore kept running and raced with the main
> lcore reading the service attributes, which triggered this CI
> failure.
>
> To fix this, dissociate the service lcore from the current service.
>
> With this first issue fixed, a race still exists because the
> wait_slcore_inactive helper added in a previous fix was not
> paired with a check that the service lcore _did_ stop.
>
> Add the missing check on rte_service_lcore_may_be_active().
>
> Fixes: 4d55194d76a4 ("service: add attribute get function")
> Fixes: 52bb6be259ff ("test/service: fix race condition on stopping lcore")
> Cc: stable at dpdk.org
>
> Signed-off-by: David Marchand <david.marchand at redhat.com>
Acked-by: Aaron Conole <aconole at redhat.com>
Acked-by: Harry van Haaren <harry.van.haaren at intel.com>

Applied, thanks.
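
For readers of the archive, the two problems described in the commit
message can be illustrated with a small toy model. This is only a sketch,
not the patch itself and not DPDK code: every model_* name below is a
hypothetical stand-in. model_lcore_stop() plays the role of
rte_service_lcore_stop(), model_unmap() plays the role of dissociating
the lcore from the service (presumably done with
rte_service_map_lcore_set() in the real test), and model_may_be_active()
plays the role of rte_service_lcore_may_be_active(). The model is
synchronous, so it needs no equivalent of wait_slcore_inactive.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in DPDK. */
struct model_lcore {
	bool running;           /* the lcore is still polling services */
	bool sole_mapped_lcore; /* it is the only lcore mapped to a service */
};

/* Stand-in for the stop call: refuses with -EBUSY while the lcore is
 * the only one mapped to a service, as the commit message describes. */
static int model_lcore_stop(struct model_lcore *lc)
{
	assert(lc != NULL);
	if (lc->sole_mapped_lcore)
		return -EBUSY;
	lc->running = false;
	return 0;
}

/* Stand-in for dissociating the lcore from the service. */
static void model_unmap(struct model_lcore *lc)
{
	lc->sole_mapped_lcore = false;
}

/* Stand-in for the "may be active" query. */
static bool model_may_be_active(const struct model_lcore *lc)
{
	return lc->running;
}

/* Old pattern: the return code is dropped, so a refused stop goes
 * unnoticed. Returns true only if the lcore really stopped. */
static bool stop_unchecked(struct model_lcore *lc)
{
	(void)model_lcore_stop(lc);
	return !model_may_be_active(lc);
}

/* Fixed pattern: unmap first, check the stop result, then confirm the
 * lcore is inactive before anything reads the service attributes. */
static int stop_checked(struct model_lcore *lc)
{
	int ret;

	model_unmap(lc);
	ret = model_lcore_stop(lc);
	if (ret != 0)
		return ret;
	if (model_may_be_active(lc))
		return -EAGAIN;
	return 0;
}
```

With an lcore that starts as running and sole_mapped_lcore, the old
pattern leaves it running (the situation that let the attribute check
race), while the fixed pattern returns 0 and leaves it inactive.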


-- 
David Marchand


