[dpdk-dev] [RFC 3/5] eal: lcore state FINISHED is not required

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Fri Feb 26 00:33:21 CET 2021


+Thomas, David, Konstantin for input

<snip>

> > Subject: [RFC 3/5] eal: lcore state FINISHED is not required
> >
> > FINISHED state seems to be used to indicate that the worker's update
> > of the 'state' is not visible to other threads. There seems to be no
> > requirement to have such a state.
> 
> I am not sure "FINISHED" is necessary to be removed, and I propose some of my
> profiles for discussion.
>  There are three states for lcore now:
> "WAIT": indicate lcore can start working
> "RUNNING": indicate lcore is working
> "FINISHED": indicate lcore has finished its working and wait to be reset
If you look at the definitions of "WAIT" and "FINISHED" states, they look similar, except for "wait to be reset" in "FINISHED" state . The code really does not do anything to reset the lcore. It just changes the state to "WAIT".

> 
> From the description above, we can find "FINISHED" is different from "WAIT", it
> can shows that lcore has done the work and finished it. Thus, if we remove
> "FINISHED", maybe we will not know whether the lcore finishes its work or just
> doesn't start, because this two state has the same tag "WAIT".
Looking at "eal_thread_loop", the worker thread sets the state to "RUNNING" before sending the ack back to main core. After that it is guaranteed that the worker will run the assigned function. Only case where it will not run the assigned function is when the 'write' syscall fails, in which case it results in a panic.

> 
> Furthermore, consider such a scenario:
> Core 1 need to monitor Core 2 state, if Core 2 finishes one task, Core 1 can start
> its working.
> However, if there is only  one tag "WAIT", Core 1 maybe  start its work at the
> wrong time, when Core 2 still does not start its task at state "WAIT".
> This is just my guess, and at present, there is no similar application scenario in
> dpdk.
To be able to do this effectively, core 1 needs to observe the state change from WAIT->RUNNING->FINISHED. This requires that core 1 should be calling rte_eal_remote_launch and rte_eal_wait_lcore functions. It is not possible to observe this state transition from a 3rd core (for ex: a worker might go from RUNNING->FINISHED->WAIT->RUNNING which a 3rd core might not be able to observe).

> 
> On the other hand, if we decide to remove "FINISHED", please consider the
> following files:
> 1. lib/librte_eal/linux/eal_thread.c: line 31
>     lib/librte_eal/windows/eal_thread.c: line 22
>     lib/librte_eal/freebsd/eal_thread.c: line 31
I have looked at these lines, they do not capture "why" FINISHED state is required.

 2.
> lib/librte_eal/include/rte_launch.h: line 24, 44, 121, 123, 131 3. examples/l2fwd-
> keepalive/main.c: line 510
> rte_eal_wait_lcore(id_core) can be removed. Because the core state has been
> checked as "WAIT", this is a redundant operation
> 
> Best Regards
> Feifei
> 
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljedahl at arm.com>
> > ---
> >  drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 2 +-
> > drivers/event/octeontx/ssovf_evdev_selftest.c | 2 +-
> >  drivers/event/sw/sw_evdev_selftest.c          | 4 ++--
> >  examples/l2fwd-keepalive/main.c               | 2 +-
> >  lib/librte_eal/common/eal_common_launch.c     | 7 ++-----
> >  lib/librte_eal/freebsd/eal_thread.c           | 2 +-
> >  lib/librte_eal/linux/eal_thread.c             | 8 +-------
> >  lib/librte_eal/windows/eal_thread.c           | 8 +-------
> >  8 files changed, 10 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > index cd7311a94..bbbd20951 100644
> > --- a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > +++ b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
> > @@ -468,7 +468,7 @@ wait_workers_to_join(int lcore, const
> > rte_atomic32_t
> > *count)  RTE_SET_USED(count);
> >
> >  print_cycles = cycles = rte_get_timer_cycles(); -while
> > (rte_eal_get_lcore_state(lcore) != FINISHED) {
> > +while (rte_eal_get_lcore_state(lcore) != WAIT) {
> >  uint64_t new_cycles = rte_get_timer_cycles();
> >
> >  if (new_cycles - print_cycles > rte_get_timer_hz()) { diff --git
> > a/drivers/event/octeontx/ssovf_evdev_selftest.c
> > b/drivers/event/octeontx/ssovf_evdev_selftest.c
> > index 528f99dd8..d7b0d2211 100644
> > --- a/drivers/event/octeontx/ssovf_evdev_selftest.c
> > +++ b/drivers/event/octeontx/ssovf_evdev_selftest.c
> > @@ -579,7 +579,7 @@ wait_workers_to_join(int lcore, const
> > rte_atomic32_t
> > *count)  RTE_SET_USED(count);
> >
> >  print_cycles = cycles = rte_get_timer_cycles(); -while
> > (rte_eal_get_lcore_state(lcore) != FINISHED) {
> > +while (rte_eal_get_lcore_state(lcore) != WAIT) {
> >  uint64_t new_cycles = rte_get_timer_cycles();
> >
> >  if (new_cycles - print_cycles > rte_get_timer_hz()) { diff --git
> > a/drivers/event/sw/sw_evdev_selftest.c
> > b/drivers/event/sw/sw_evdev_selftest.c
> > index e4bfb3a0f..7847a8645 100644
> > --- a/drivers/event/sw/sw_evdev_selftest.c
> > +++ b/drivers/event/sw/sw_evdev_selftest.c
> > @@ -3138,8 +3138,8 @@ worker_loopback(struct test *t, uint8_t
> > disable_implicit_release)
> > rte_eal_remote_launch(worker_loopback_worker_fn, t, w_lcore);
> >
> >  print_cycles = cycles = rte_get_timer_cycles(); -while
> > (rte_eal_get_lcore_state(p_lcore) != FINISHED ||
> > -rte_eal_get_lcore_state(w_lcore) != FINISHED) {
> > +while (rte_eal_get_lcore_state(p_lcore) != WAIT ||
> > +rte_eal_get_lcore_state(w_lcore) != WAIT) {
> >
> >  rte_service_run_iter_on_app_lcore(t->service_id, 1);
> >
> > diff --git a/examples/l2fwd-keepalive/main.c b/examples/l2fwd-
> > keepalive/main.c index e4c2b2793..dd777c46a 100644
> > --- a/examples/l2fwd-keepalive/main.c
> > +++ b/examples/l2fwd-keepalive/main.c
> > @@ -506,7 +506,7 @@ dead_core(__rte_unused void *ptr_data, const int
> > id_core)  if (terminate_signal_received)  return;  printf("Dead core
> > %i - restarting..\n", id_core); -if (rte_eal_get_lcore_state(id_core)
> > == FINISHED) {
> > +if (rte_eal_get_lcore_state(id_core) == WAIT) {
> >  rte_eal_wait_lcore(id_core);
> >  rte_eal_remote_launch(l2fwd_launch_one_lcore, NULL, id_core);  } else
> > { diff --git a/lib/librte_eal/common/eal_common_launch.c
> > b/lib/librte_eal/common/eal_common_launch.c
> > index 34f854ad8..78fd94026 100644
> > --- a/lib/librte_eal/common/eal_common_launch.c
> > +++ b/lib/librte_eal/common/eal_common_launch.c
> > @@ -26,14 +26,11 @@ rte_eal_wait_lcore(unsigned worker_id)  if
> > (lcore_config[worker_id].state == WAIT)  return 0;
> >
> > -while (lcore_config[worker_id].state != WAIT &&
> > -       lcore_config[worker_id].state != FINISHED)
> > +while (lcore_config[worker_id].state != WAIT)
> >  rte_pause();
> >
> >  rte_rmb();
> >
> > -/* we are in finished state, go to wait state */ -
> > lcore_config[worker_id].state = WAIT;  return
> > lcore_config[worker_id].ret;  }
> >
> > @@ -62,7 +59,7 @@ rte_eal_mp_remote_launch(int (*f)(void *), void
> > *arg,
> >
> >  if (call_main == CALL_MAIN) {
> >  lcore_config[main_lcore].ret = f(arg);
> > -lcore_config[main_lcore].state = FINISHED;
> > +lcore_config[main_lcore].state = WAIT;
> >  }
> >
> >  return 0;
> > diff --git a/lib/librte_eal/freebsd/eal_thread.c
> > b/lib/librte_eal/freebsd/eal_thread.c
> > index 17b8f3996..6d6f1e2fd 100644
> > --- a/lib/librte_eal/freebsd/eal_thread.c
> > +++ b/lib/librte_eal/freebsd/eal_thread.c
> > @@ -140,7 +140,7 @@ eal_thread_loop(__rte_unused void *arg)
> > lcore_config[lcore_id].f = NULL;  lcore_config[lcore_id].arg = NULL;
> > rte_wmb(); -lcore_config[lcore_id].state = FINISHED;
> > +lcore_config[lcore_id].state = WAIT;
> >  }
> >
> >  /* never reached */
> > diff --git a/lib/librte_eal/linux/eal_thread.c
> > b/lib/librte_eal/linux/eal_thread.c
> > index a0a009104..7b9463df3 100644
> > --- a/lib/librte_eal/linux/eal_thread.c
> > +++ b/lib/librte_eal/linux/eal_thread.c
> > @@ -141,13 +141,7 @@ eal_thread_loop(__rte_unused void *arg)
> > lcore_config[lcore_id].arg = NULL;  rte_wmb();
> >
> > -/* when a service core returns, it should go directly to WAIT
> > - * state, because the application will not lcore_wait() for it.
> > - */
> > -if (lcore_config[lcore_id].core_role == ROLE_SERVICE) -
> > lcore_config[lcore_id].state = WAIT; -else
> > -lcore_config[lcore_id].state = FINISHED;
> > +lcore_config[lcore_id].state = WAIT;
> >  }
> >
> >  /* never reached */
> > diff --git a/lib/librte_eal/windows/eal_thread.c
> > b/lib/librte_eal/windows/eal_thread.c
> > index 7a9277c51..35d059a30 100644
> > --- a/lib/librte_eal/windows/eal_thread.c
> > +++ b/lib/librte_eal/windows/eal_thread.c
> > @@ -125,13 +125,7 @@ eal_thread_loop(void *arg __rte_unused)
> > lcore_config[lcore_id].arg = NULL;  rte_wmb();
> >
> > -/* when a service core returns, it should go directly to WAIT
> > - * state, because the application will not lcore_wait() for it.
> > - */
> > -if (lcore_config[lcore_id].core_role == ROLE_SERVICE) -
> > lcore_config[lcore_id].state = WAIT; -else
> > -lcore_config[lcore_id].state = FINISHED;
> > +lcore_config[lcore_id].state = WAIT;
> >  }
> >  }
> >
> > --
> > 2.17.1
> >



More information about the dev mailing list