[dpdk-stable] [PATCH] test/service: fix race in attr check
David Marchand
david.marchand at redhat.com
Tue Oct 12 20:49:20 CEST 2021
On Mon, Oct 11, 2021 at 4:54 PM David Marchand
<david.marchand at redhat.com> wrote:
>
> The CI reported rare (and cryptic) failures like:
>
> RTE>>service_autotest
> + ------------------------------------------------------- +
> + Test Suite : service core test suite
> + ------------------------------------------------------- +
> + TestCase [ 0] : unregister_all succeeded
> + TestCase [ 1] : service_name succeeded
> + TestCase [ 2] : service_get_by_name succeeded
> Service dummy_service Summary
> dummy_service: stats 1 calls 0 cycles 0 avg: 0
> Service dummy_service Summary
> dummy_service: stats 0 calls 0 cycles 0 avg: 0
> + TestCase [ 3] : service_dump succeeded
> + TestCase [ 4] : service_attr_get failed
> + TestCase [ 5] : service_lcore_attr_get succeeded
> + TestCase [ 6] : service_probe_capability succeeded
> + TestCase [ 7] : service_start_stop succeeded
> + TestCase [ 8] : service_lcore_add_del succeeded
> + TestCase [ 9] : service_lcore_start_stop succeeded
> + TestCase [10] : service_lcore_en_dis_able succeeded
> + TestCase [11] : service_mt_unsafe_poll succeeded
> + TestCase [12] : service_mt_safe_poll succeeded
> perf test for MT Safe: 42.7 cycles per call
> + TestCase [13] : service_app_lcore_mt_safe succeeded
> perf test for MT Unsafe: 73.3 cycles per call
> + TestCase [14] : service_app_lcore_mt_unsafe succeeded
> + TestCase [15] : service_may_be_active succeeded
> + TestCase [16] : service_active_two_cores succeeded
> + ------------------------------------------------------- +
> + Test Suite Summary : service core test suite
> + ------------------------------------------------------- +
> + Tests Total : 17
> + Tests Skipped : 0
> + Tests Executed : 17
> + Tests Unsupported: 0
> + Tests Passed : 16
> + Tests Failed : 1
> + ------------------------------------------------------- +
> Test Failed
> RTE>>
> stderr:
> EAL: Detected CPU lcores: 16
> EAL: Detected NUMA nodes: 2
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/service_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available 1048576 kB hugepages reported
> EAL: VFIO support initialized
> EAL: Device 0000:03:00.0 is not NUMA-aware, defaulting socket to 0
> APP: HPET is not enabled, using TSC as default timer
> EAL: Test assert service_attr_get line 340 failed: attr_get() call didn't
> get call count (zero)
>
> According to the API, a service lcore cannot be stopped while it is the
> only lcore associated with a service.
> Attempting this results in a -EBUSY return code from
> rte_service_lcore_stop(), which the service_attr_get subtest was not
> checking.
> This left the service lcore running, so it raced with the main lcore
> while the latter checked the service attributes, which triggered this
> CI failure.
>
> To fix this, dissociate the service lcore from the current service.
>
> With this first issue fixed, a race still exists, because the
> wait_slcore_inactive helper added in a previous fix was not
> paired with a check that the service lcore _did_ stop.
>
> Add the missing check on rte_service_lcore_may_be_active().
>
> Fixes: 4d55194d76a4 ("service: add attribute get function")
> Fixes: 52bb6be259ff ("test/service: fix race condition on stopping lcore")
> Cc: stable at dpdk.org
>
> Signed-off-by: David Marchand <david.marchand at redhat.com>
Acked-by: Aaron Conole <aconole at redhat.com>
Acked-by: Harry van Haaren <harry.van.haaren at intel.com>
Applied, thanks.
--
David Marchand
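
The two problems described in the quoted commit message can be sketched
with a small stand-alone model. This is not the DPDK test code or the
applied patch: every mock_* name and both run_* helpers are hypothetical
stand-ins, and the only behaviour taken from the message is that stopping
the sole lcore associated with a service returns -EBUSY, and that a stop
must be followed by a check that the lcore is really inactive.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one service lcore (not a DPDK structure). */
struct mock_slcore {
	bool running;      /* lcore is still executing its service loop */
	bool sole_mapping; /* lcore is the only one associated to a service */
};

/* Stand-in for the stop call: refuses with -EBUSY while the lcore is
 * the only one associated to a service, as the commit message states. */
static int mock_lcore_stop(struct mock_slcore *lc)
{
	assert(lc != NULL);
	if (lc->sole_mapping)
		return -EBUSY;
	lc->running = false;
	return 0;
}

/* Stand-in for dissociating the lcore from the service. */
static void mock_unmap(struct mock_slcore *lc)
{
	lc->sole_mapping = false;
}

/* Stand-in for the "may be active" query: 1 if possibly running. */
static int mock_may_be_active(const struct mock_slcore *lc)
{
	return lc->running ? 1 : 0;
}

/* Old pattern: the stop result is ignored, so the lcore keeps running
 * and can race with whoever reads the service attributes next.
 * Returns 1 if the lcore may still be active afterwards. */
static int run_unchecked_teardown(void)
{
	struct mock_slcore lc = { .running = true, .sole_mapping = true };

	(void)mock_lcore_stop(&lc); /* -EBUSY silently dropped */
	return mock_may_be_active(&lc);
}

/* Fixed pattern: dissociate first, check the stop result, then confirm
 * the lcore did stop. Returns 0 on success, a negative errno otherwise. */
static int run_checked_teardown(void)
{
	struct mock_slcore lc = { .running = true, .sole_mapping = true };
	int ret;

	mock_unmap(&lc);
	ret = mock_lcore_stop(&lc);
	if (ret != 0)
		return ret;
	if (mock_may_be_active(&lc) != 0)
		return -EAGAIN;
	return 0;
}
```

In this model run_unchecked_teardown() reports the lcore as still
possibly active, while run_checked_teardown() returns 0. In a real test
the corresponding calls would presumably be rte_service_lcore_stop() and
rte_service_lcore_may_be_active(), as named in the message above.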