[dpdk-dev] [PATCH v2 3/7] eal: add power monitor for multiple events
Ananyev, Konstantin
konstantin.ananyev at intel.com
Mon Jun 28 14:37:17 CEST 2021
> Use RTM and WAITPKG instructions to perform a wait-for-writes similar to
> what UMWAIT does, but without the limitation of having to listen for
> just one event. This works because the optimized power state used by the
> TPAUSE instruction will cause a wake up on RTM transaction abort, so if
> we add the addresses we're interested in to the read-set, any write to
> those addresses will wake us up.
>
> Signed-off-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> ---
>
> Notes:
> v2:
> - Adapt to callback mechanism
>
> doc/guides/rel_notes/release_21_08.rst | 2 +
> lib/eal/arm/rte_power_intrinsics.c | 11 +++
> lib/eal/include/generic/rte_cpuflags.h | 2 +
> .../include/generic/rte_power_intrinsics.h | 35 ++++++++++
> lib/eal/ppc/rte_power_intrinsics.c | 11 +++
> lib/eal/version.map | 3 +
> lib/eal/x86/rte_cpuflags.c | 2 +
> lib/eal/x86/rte_power_intrinsics.c | 69 +++++++++++++++++++
> 8 files changed, 135 insertions(+)
>
...
> diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> index 3c5c9ce7ad..3fc6f62ef5 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -4,6 +4,7 @@
>
> #include <rte_common.h>
> #include <rte_lcore.h>
> +#include <rte_rtm.h>
> #include <rte_spinlock.h>
>
> #include "rte_power_intrinsics.h"
> @@ -28,6 +29,7 @@ __umwait_wakeup(volatile void *addr)
> }
>
> static bool wait_supported;
> +static bool wait_multi_supported;
>
> static inline uint64_t
> __get_umwait_val(const volatile void *p, const uint8_t sz)
> @@ -164,6 +166,8 @@ RTE_INIT(rte_power_intrinsics_init) {
>
> if (i.power_monitor && i.power_pause)
> wait_supported = 1;
> + if (i.power_monitor_multi)
> + wait_multi_supported = 1;
> }
>
> int
> @@ -202,6 +206,9 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
> * In this case, since we've already woken up, the "wakeup" was
> * unneeded, and since T1 is still waiting on T2 releasing the lock, the
> * wakeup address is still valid so it's perfectly safe to write it.
> + *
> + * For multi-monitor case, the act of locking will in itself trigger the
> + * wakeup, so no additional writes necessary.
> */
> rte_spinlock_lock(&s->lock);
> if (s->monitor_addr != NULL)
> @@ -210,3 +217,65 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
>
> return 0;
> }
> +
> +int
> +rte_power_monitor_multi(const struct rte_power_monitor_cond pmc[],
> + const uint32_t num, const uint64_t tsc_timestamp)
> +{
> + const unsigned int lcore_id = rte_lcore_id();
> + struct power_wait_status *s = &wait_status[lcore_id];
> + uint32_t i, rc;
> +
> + /* check if supported */
> + if (!wait_multi_supported)
> + return -ENOTSUP;
> +
> + if (pmc == NULL || num == 0)
> + return -EINVAL;
> +
> + /* we are already inside transaction region, return */
> + if (rte_xtest() != 0)
> + return 0;
> +
> + /* start new transaction region */
> + rc = rte_xbegin();
> +
> + /* transaction abort, possible write to one of wait addresses */
> + if (rc != RTE_XBEGIN_STARTED)
> + return 0;
> +
> + /*
> + * the mere act of reading the lock status here adds the lock to
> + * the read set. This means that when we trigger a wakeup from another
> + * thread, even if we don't have a defined wakeup address and thus don't
> + * actually cause any writes, the act of locking our lock will itself
> + * trigger the wakeup and abort the transaction.
> + */
> + rte_spinlock_is_locked(&s->lock);
> +
> + /*
> + * add all addresses to wait on into transaction read-set and check if
> + * any of wakeup conditions are already met.
> + */
> + for (i = 0; i < num; i++) {
> + const struct rte_power_monitor_cond *c = &pmc[i];
> +
> + if (pmc->fn == NULL)
Should be c->fn, I believe.
> + continue;
Actually that way, if c->fn == NULL, we'll never add our c->addr to monitored addresses.
Is that what we really want?
My thought was, that if callback is not set, we'll just go to power-save state without extra checking, no?
Something like that:
const struct rte_power_monitor_cond *c = &pmc[i];
const uint64_t val = __get_umwait_val(c->addr, c->size);
if (c->fn && c->fn(val, c->opaque) != 0)
break;
Same thought for rte_power_monitor().
> + const uint64_t val = __get_umwait_val(pmc->addr, pmc->size);
Same thing: s/pmc->/c->/
> +
> + /* abort if callback indicates that we need to stop */
> + if (c->fn(val, c->opaque) != 0)
> + break;
> + }
> +
> + /* none of the conditions were met, sleep until timeout */
> + if (i == num)
> + rte_power_pause(tsc_timestamp);
> +
> + /* end transaction region */
> + rte_xend();
> +
> + return 0;
> +}
> --
> 2.25.1
More information about the dev
mailing list