[dpdk-dev,2/2] service: fix service core launch
Commit Message
This patch fixes a potential bug that did not show up consistently
in the unit tests. The service lcore being started was not in the
"WAIT" state, and hence EAL would return -EBUSY instead of
launching the lcore.
To ensure a core is in a launch-ready state, the application
must call rte_eal_wait_lcore(), which ensures that the core has
completed its previous task and that EAL is ready to re-launch it.
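
The launch-readiness rule above can be illustrated with a small
stand-alone model. This is only a sketch, not DPDK code: every model_*
name below is hypothetical, and it assumes the three-state lcore life
cycle (WAIT, RUNNING, FINISHED) in which only a core in WAIT can be
launched. The real rte_eal_wait_lcore() blocks while the core is still
running; the model cannot block, so it returns -EAGAIN in that case.

```c
#include <assert.h>
#include <errno.h>

/* Toy model only: hypothetical names, NOT the DPDK API. */
enum model_state { MODEL_WAIT, MODEL_RUNNING, MODEL_FINISHED };

struct model_lcore {
	enum model_state state;
	int ret; /* return value of the last function run on the core */
};

/* Launch check: only a core in the WAIT state may be launched. */
int model_remote_launch(struct model_lcore *lc)
{
	if (lc->state != MODEL_WAIT)
		return -EBUSY;
	lc->state = MODEL_RUNNING;
	return 0;
}

/* Stand-in for the worker returning from its function. */
void model_worker_return(struct model_lcore *lc, int ret)
{
	lc->ret = ret;
	lc->state = MODEL_FINISHED;
}

/*
 * Wait step: collect the result and move FINISHED -> WAIT.
 * Model-only convention: report -EAGAIN instead of blocking
 * when the core is still running.
 */
int model_wait_lcore(struct model_lcore *lc)
{
	if (lc->state == MODEL_WAIT)
		return 0;
	if (lc->state == MODEL_RUNNING)
		return -EAGAIN;
	lc->state = MODEL_WAIT;
	return lc->ret;
}

/* Walk through the sequence described in the commit message. */
int model_demo(void)
{
	struct model_lcore lc = { MODEL_WAIT, 0 };

	assert(model_remote_launch(&lc) == 0);      /* first launch works */
	model_worker_return(&lc, 0);                /* core is now FINISHED */
	assert(model_remote_launch(&lc) == -EBUSY); /* not in WAIT yet */
	assert(model_wait_lcore(&lc) == 0);         /* FINISHED -> WAIT */
	assert(model_remote_launch(&lc) == 0);      /* re-launch succeeds */
	return 0;
}
```

In the model, a core that has returned from its function stays
un-launchable until the wait step has been called on it, which mirrors
the reason the unit tests gain the extra rte_eal_wait_lcore() calls.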
The call to rte_eal_wait_lcore() is explicitly not in the
service core function, to make it visible to the application.
Requiring an explicit function call ensures the developer sees
that an lcore could block in the rte_eal_wait_lcore() function
if the core hasn't returned from its previous function.
From a usability perspective, hiding the wait_lcore() inside
service cores would cause confusion.
This patch adds rte_eal_wait_lcore() calls to the unit tests,
to ensure that the lcores for testing functionality are ready
to run the test.
Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
---
@Stable maintainers; this is an EXPERIMENTAL tagged API, so
I'm not sure what the expectation is in terms of backporting.
---
lib/librte_eal/common/include/rte_service.h | 4 +++-
test/test/test_service_cores.c | 6 ++++++
2 files changed, 9 insertions(+), 1 deletion(-)
Comments
Hi Harry,
On Wed, Dec 20, 2017 at 11:21:47AM +0000, Harry van Haaren wrote:
> diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c
> index 311c704..43f2318 100644
> --- a/test/test/test_service_cores.c
> +++ b/test/test/test_service_cores.c
> @@ -348,6 +348,7 @@ service_lcore_en_dis_able(void)
>
> /* call remote_launch to verify that app can launch ex-service lcore */
> service_remote_launch_flag = 0;
> + rte_eal_wait_lcore(slcore_id);
> int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
> slcore_id);
> TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
> @@ -505,6 +506,10 @@ service_threaded_test(int mt_safe)
> if (!mt_safe)
> test_params[1] = 1;
>
> + /* wait for lcores before start() */
> + rte_eal_wait_lcore(slcore_1);
> + rte_eal_wait_lcore(slcore_2);
> +
> rte_service_lcore_start(slcore_1);
> rte_service_lcore_start(slcore_2);
As you are touching this file, can you change the following things:
The delay needs to be increased to a value similar to the other test cases.
service_lcore_running_check(void)
{
uint64_t tick = service_tick;
- rte_delay_ms(SERVICE_DELAY * 10);
+ rte_delay_ms(100);
/* if (tick != service_tick) we know the lcore as polled the service */
return tick != service_tick;
}
As service_mt_unsafe_poll and service_mt_safe_poll use the same function body
and are called one after the other, we need to wait for the cores to complete
before proceeding to the next test case, i.e. service_mt_unsafe_poll -> wait
for the cores to complete -> service_mt_safe_poll. Otherwise it will lead to
unintended side effects.
@@ -523,6 +523,8 @@ service_threaded_test(int mt_safe)
TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0),
"Failed to stop MT Safe service");
+ rte_eal_wait_lcore(slcore_1);
+ rte_eal_wait_lcore(slcore_2);
unregister_all();
/* return the value of the callback pass_test variable to caller */
Cheers,
Pavan.
>
> @@ -611,6 +616,7 @@ service_app_lcore_poll_impl(const int mt_safe)
> rte_service_runstate_set(id, 1);
>
> uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
> + rte_eal_wait_lcore(app_core2);
> int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
> &id, app_core2);
>
> --
> 2.7.4
>
@@ -274,7 +274,9 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
* Start a service core.
*
* Starting a core makes the core begin polling. Any services assigned to it
- * will be run as fast as possible.
+ * will be run as fast as possible. The application must ensure that the lcore
+ * is in a launchable state: e.g. call *rte_eal_wait_lcore* on the lcore_id
+ * before calling this function.
*
* @retval 0 Success
* @retval -EINVAL Failed to start core. The *lcore_id* passed in is not
@@ -348,6 +348,7 @@ service_lcore_en_dis_able(void)
/* call remote_launch to verify that app can launch ex-service lcore */
service_remote_launch_flag = 0;
+ rte_eal_wait_lcore(slcore_id);
int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
slcore_id);
TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
@@ -505,6 +506,10 @@ service_threaded_test(int mt_safe)
if (!mt_safe)
test_params[1] = 1;
+ /* wait for lcores before start() */
+ rte_eal_wait_lcore(slcore_1);
+ rte_eal_wait_lcore(slcore_2);
+
rte_service_lcore_start(slcore_1);
rte_service_lcore_start(slcore_2);
@@ -611,6 +616,7 @@ service_app_lcore_poll_impl(const int mt_safe)
rte_service_runstate_set(id, 1);
uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
+ rte_eal_wait_lcore(app_core2);
int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
&id, app_core2);