[dpdk-dev] [21.08 PATCH v1 1/2] power: invert the monitor check

McDaniel, Timothy timothy.mcdaniel at intel.com
Tue May 11 18:28:02 CEST 2021


> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov at intel.com>
> Sent: Tuesday, May 11, 2021 10:32 AM
> To: dev at dpdk.org; McDaniel, Timothy <timothy.mcdaniel at intel.com>; Xing,
> Beilei <beilei.xing at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>; Yang,
> Qiming <qiming.yang at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>; Wang,
> Haiyue <haiyue.wang at intel.com>; Matan Azrad <matan at nvidia.com>; Shahaf
> Shuler <shahafs at nvidia.com>; Viacheslav Ovsiienko <viacheslavo at nvidia.com>;
> Richardson, Bruce <bruce.richardson at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>
> Cc: Loftus, Ciara <ciara.loftus at intel.com>
> Subject: [21.08 PATCH v1 1/2] power: invert the monitor check
> 
> Previously, the semantics of power monitor were such that we were
> checking current value against the expected value, and if they matched,
> then the sleep was aborted. This is somewhat inflexible, because it only
> allowed us to check for a specific value.
> 
> We can reverse the check, and instead have monitor sleep to be aborted
> if the expected value *doesn't* match what's in memory. This allows us
> to both implement all currently implemented driver code, as well as
> support more use cases which don't easily map to previous semantics
> (such as waiting on writes to AF_XDP counter value).
> 
> This commit also adjusts all current driver implementations to match the
> new semantics.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> ---
>  drivers/event/dlb2/dlb2.c                      | 2 +-
>  drivers/net/i40e/i40e_rxtx.c                   | 2 +-
>  drivers/net/iavf/iavf_rxtx.c                   | 2 +-
>  drivers/net/ice/ice_rxtx.c                     | 2 +-
>  drivers/net/ixgbe/ixgbe_rxtx.c                 | 2 +-
>  drivers/net/mlx5/mlx5_rx.c                     | 2 +-
>  lib/eal/include/generic/rte_power_intrinsics.h | 8 ++++----
>  lib/eal/x86/rte_power_intrinsics.c             | 4 ++--
>  8 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> index 3570678b9e..5701bbb8ab 100644
> --- a/drivers/event/dlb2/dlb2.c
> +++ b/drivers/event/dlb2/dlb2.c
> @@ -3188,7 +3188,7 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
>  			&cq_base[qm_port->cq_idx];
>  		monitor_addr++; /* cq_gen bit is in second 64bit location */
> 
> -		if (qm_port->gen_bit)
> +		if (!qm_port->gen_bit)
>  			expected_value = qe_mask.raw_qe[1];
>  		else
>  			expected_value = 0;
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index 02cf5e787c..4617ae914a 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -88,7 +88,7 @@ i40e_get_monitor_addr(void *rx_queue, struct
> rte_power_monitor_cond *pmc)
>  	 * we expect the DD bit to be set to 1 if this descriptor was already
>  	 * written to.
>  	 */
> -	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> +	pmc->val = 0;
>  	pmc->mask = rte_cpu_to_le_64(1 <<
> I40E_RX_DESC_STATUS_DD_SHIFT);
> 
>  	/* registers are 64-bit */
> diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
> index 87f7eebc65..d8d9cc860c 100644
> --- a/drivers/net/iavf/iavf_rxtx.c
> +++ b/drivers/net/iavf/iavf_rxtx.c
> @@ -73,7 +73,7 @@ iavf_get_monitor_addr(void *rx_queue, struct
> rte_power_monitor_cond *pmc)
>  	 * we expect the DD bit to be set to 1 if this descriptor was already
>  	 * written to.
>  	 */
> -	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
> +	pmc->val = 0;
>  	pmc->mask = rte_cpu_to_le_64(1 <<
> IAVF_RX_DESC_STATUS_DD_SHIFT);
> 
>  	/* registers are 64-bit */
> diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
> index 92fbbc18da..4e349bfa3f 100644
> --- a/drivers/net/ice/ice_rxtx.c
> +++ b/drivers/net/ice/ice_rxtx.c
> @@ -43,7 +43,7 @@ ice_get_monitor_addr(void *rx_queue, struct
> rte_power_monitor_cond *pmc)
>  	 * we expect the DD bit to be set to 1 if this descriptor was already
>  	 * written to.
>  	 */
> -	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
> +	pmc->val = 0;
>  	pmc->mask = rte_cpu_to_le_16(1 <<
> ICE_RX_FLEX_DESC_STATUS0_DD_S);
> 
>  	/* register is 16-bit */
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index d69f36e977..2793718171 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1385,7 +1385,7 @@ ixgbe_get_monitor_addr(void *rx_queue, struct
> rte_power_monitor_cond *pmc)
>  	 * we expect the DD bit to be set to 1 if this descriptor was already
>  	 * written to.
>  	 */
> -	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
> +	pmc->val = 0;
>  	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
> 
>  	/* the registers are 32-bit */
> diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
> index 6cd71a44eb..3cbbe5bf59 100644
> --- a/drivers/net/mlx5/mlx5_rx.c
> +++ b/drivers/net/mlx5/mlx5_rx.c
> @@ -282,7 +282,7 @@ int mlx5_get_monitor_addr(void *rx_queue, struct
> rte_power_monitor_cond *pmc)
>  		return -rte_errno;
>  	}
>  	pmc->addr = &cqe->op_own;
> -	pmc->val =  !!idx;
> +	pmc->val =  !idx;
>  	pmc->mask = MLX5_CQE_OWNER_MASK;
>  	pmc->size = sizeof(uint8_t);
>  	return 0;
> diff --git a/lib/eal/include/generic/rte_power_intrinsics.h
> b/lib/eal/include/generic/rte_power_intrinsics.h
> index dddca3d41c..28c481a8d2 100644
> --- a/lib/eal/include/generic/rte_power_intrinsics.h
> +++ b/lib/eal/include/generic/rte_power_intrinsics.h
> @@ -45,10 +45,10 @@ struct rte_power_monitor_cond {
>   * Additionally, an expected value (`pmc->val`), mask (`pmc->mask`), and data
>   * size (`pmc->size`) are provided in the `pmc` power monitoring condition. If
>   * the mask is non-zero, the current value pointed to by the `pmc->addr` pointer
> - * will be read and compared against the expected value, and if they match, the
> - * entering of optimized power state will be aborted. This is intended to
> - * prevent the CPU from entering optimized power state and waiting on a write
> - * that has already happened by the time this API is called.
> + * will be read and compared against the expected value, and if they do not
> + * match, the entering of optimized power state will be aborted. This is
> + * intended to prevent the CPU from entering optimized power state and
> waiting
> + * on a write that has already happened by the time this API is called.
>   *
>   * @warning It is responsibility of the user to check if this function is
>   *   supported at runtime using `rte_cpu_get_intrinsics_support()` API call.
> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> b/lib/eal/x86/rte_power_intrinsics.c
> index 39ea9fdecd..7f0588d70e 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -116,8 +116,8 @@ rte_power_monitor(const struct
> rte_power_monitor_cond *pmc,
>  				pmc->addr, pmc->size);
>  		const uint64_t masked = cur_value & pmc->mask;
> 
> -		/* if the masked value is already matching, abort */
> -		if (masked == pmc->val)
> +		/* if the masked value is not matching, abort */
> +		if (masked != pmc->val)
>  			goto end;
>  	}
> 
> --
> 2.25.1

Acked-by: Timothy McDaniel  <timothy.mcdaniel at intel.com>


More information about the dev mailing list