[dpdk-dev] net/mlx4: fix drop flow resources not freed

Message ID: 1517327640-182072-1-git-send-email-motih@mellanox.com (mailing list archive)
State: Rejected, archived
Delegated to: Shahaf Shuler
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Moti Haimovsky Jan. 30, 2018, 3:54 p.m. UTC
This patch fixes the drop-flow resources not being freed when the device
is closed.
The issue can be observed when running testpmd and adding the following
rule more than once:
"flow create 0 ingress pattern eth / end actions drop / end"
then either exiting testpmd with the "quit" command or running the
command "port stop all".

Fixes: d3a7e09234e4 ("net/mlx4: allocate drop flow resources on demand")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
 drivers/net/mlx4/mlx4_flow.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)
  

Comments

Adrien Mazarguil Jan. 30, 2018, 4:41 p.m. UTC | #1
Hi Moti,

On Tue, Jan 30, 2018 at 05:54:00PM +0200, Moti Haimovsky wrote:
> This patch fixes the drop-flow resources not being freed when the device
> is closed.
> Issue can be observed when running testpmd and adding the following rule
> more than once:
> "flow create 0 ingress pattern eth / end actions drop / end"
> then either exiting testpmd using the "quit" command or by running the
> command: "port stop all"
> 
> Fixes: d3a7e09234e4 ("net/mlx4: allocate drop flow resources on demand")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Moti Haimovsky <motih@mellanox.com>

Thanks for investigating this problem. However, I do not think the proposed
patch takes the right approach to address it; more below.

> ---
>  drivers/net/mlx4/mlx4_flow.c | 33 +++++++++++++++++++++++++++++----
>  1 file changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c
> index fb84060..9e6d8dc 100644
> --- a/drivers/net/mlx4/mlx4_flow.c
> +++ b/drivers/net/mlx4/mlx4_flow.c
> @@ -895,6 +895,30 @@ struct mlx4_drop {
>  }
>  
>  /**
> + * Return the number of active drop flow rules currently present
> + * in the list of flows.
> + * Active flow is defined as a flow associated with an ibv_flow.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   Number of active drop-flows.
> + */
> +static int
> +drop_refcnt(struct priv *priv)
> +{
> +	struct rte_flow *flow;
> +	int count = 0;
> +
> +	LIST_FOREACH(flow, &priv->flows, next) {
> +		if (flow->drop && flow->ibv_flow)
> +			count++;
> +	}
> +	return count;
> +}
> +
> +/**
>   * Get a drop flow rule resources instance.
>   *
>   * @param priv
> @@ -910,9 +934,8 @@ struct mlx4_drop {
>  	struct mlx4_drop *drop = priv->drop;
>  
>  	if (drop) {
> -		assert(drop->refcnt);
> +		assert(drop_refcnt(priv));
>  		assert(drop->priv == priv);
> -		++drop->refcnt;
>  		return drop;
>  	}
>  	drop = rte_malloc(__func__, sizeof(*drop), 0);
> @@ -955,8 +978,10 @@ struct mlx4_drop {
>  static void
>  mlx4_drop_put(struct mlx4_drop *drop)
>  {
> -	assert(drop->refcnt);
> -	if (--drop->refcnt)
> +	int refcnt = drop_refcnt(drop->priv);
> +
> +	assert(refcnt >= 0);
> +	if (refcnt)
>  		return;
>  	drop->priv->drop = NULL;
>  	claim_zero(ibv_destroy_qp(drop->qp));

It looks like brute force to me, as in "if the counter doesn't have the
right value at this point, decrement it until it does, then assert() will
finally shut up". Getting rid of the refcount altogether would have also
worked.

We need to find out why we do not end up with a number of mlx4_drop_put()
calls matching that of mlx4_drop_get(). One is likely missing somewhere.
I'll have a look as well.
  
Adrien Mazarguil Jan. 30, 2018, 5:24 p.m. UTC | #2
On Tue, Jan 30, 2018 at 05:41:07PM +0100, Adrien Mazarguil wrote:
> Hi Moti,
> 
> On Tue, Jan 30, 2018 at 05:54:00PM +0200, Moti Haimovsky wrote:
> > This patch fixes the drop-flow resources not being freed when the device
> > is closed.
> > Issue can be observed when running testpmd and adding the following rule
> > more than once:
> > "flow create 0 ingress pattern eth / end actions drop / end"
> > then either exiting testpmd using the "quit" command or by running the
> > command: "port stop all"
> > 
> > Fixes: d3a7e09234e4 ("net/mlx4: allocate drop flow resources on demand")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Moti Haimovsky <motih@mellanox.com>
> 
> Thanks for investigating this problem, however I do not think the proposed
> patch uses the right approach to address it, more below.
<snip>
> We need to find out why we do not end up with a number of mlx4_drop_put()
> calls matching that of mlx4_drop_get(). One is likely missing somewhere.
> I'll have a look as well.

After investigation, the following change in mlx4_flow_toggle() should
do the trick:

         if (flow->drop) {
 +               if (flow->ibv_flow)
 +                       return 0;
                 mlx4_drop_get(priv);

Without this, an already-enabled drop flow rule takes another reference when
re-enabled, hence the issue. I can send a fix tomorrow.
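
To illustrate the imbalance described above, here is a minimal standalone
sketch. It is not the driver code: struct drop_res, struct flow, drop_get(),
drop_put(), flow_toggle() and leaked_refs() are hypothetical, simplified
stand-ins, and the "guard" parameter exists only so both behaviours can be
compared side by side. It assumes that disabling a rule releases exactly one
reference, as the mlx4_drop_put() hunk in the patch suggests.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared drop resources (struct mlx4_drop). */
struct drop_res {
	unsigned int refcnt;
};

/* Hypothetical stand-in for a flow rule. */
struct flow {
	int drop;     /* rule uses the drop action */
	int ibv_flow; /* nonzero while the rule is applied */
};

static struct drop_res drop;

static void
drop_get(void)
{
	++drop.refcnt;
}

static void
drop_put(void)
{
	assert(drop.refcnt);
	--drop.refcnt;
}

/*
 * Enable or disable a rule. "guard" is an illustration-only switch that
 * selects whether the early return for already-enabled drop rules is present.
 */
static int
flow_toggle(struct flow *flow, int enable, int guard)
{
	if (!enable) {
		if (!flow->ibv_flow)
			return 0;
		flow->ibv_flow = 0;
		if (flow->drop)
			drop_put();
		return 0;
	}
	if (flow->drop) {
		if (guard && flow->ibv_flow)
			return 0; /* already enabled: keep the single reference */
		drop_get();
	}
	flow->ibv_flow = 1;
	return 0;
}

/* References left over after: enable, re-enable, disable. */
unsigned int
leaked_refs(int guard)
{
	struct flow f = { .drop = 1, .ibv_flow = 0 };

	drop.refcnt = 0;
	flow_toggle(&f, 1, guard);
	flow_toggle(&f, 1, guard); /* re-enabling an already-active rule */
	flow_toggle(&f, 0, guard);
	return drop.refcnt;
}
```

In this model, leaked_refs(0) leaves one reference behind because the second
enable takes a reference that the single disable never returns, while
leaked_refs(1) returns to zero so the shared resources could be released.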
  

Patch

diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c
index fb84060..9e6d8dc 100644
--- a/drivers/net/mlx4/mlx4_flow.c
+++ b/drivers/net/mlx4/mlx4_flow.c
@@ -895,6 +895,30 @@  struct mlx4_drop {
 }
 
 /**
+ * Return the number of active drop flow rules currently present
+ * in the list of flows.
+ * Active flow is defined as a flow associated with an ibv_flow.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ *
+ * @return
+ *   Number of active drop-flows.
+ */
+static int
+drop_refcnt(struct priv *priv)
+{
+	struct rte_flow *flow;
+	int count = 0;
+
+	LIST_FOREACH(flow, &priv->flows, next) {
+		if (flow->drop && flow->ibv_flow)
+			count++;
+	}
+	return count;
+}
+
+/**
  * Get a drop flow rule resources instance.
  *
  * @param priv
@@ -910,9 +934,8 @@  struct mlx4_drop {
 	struct mlx4_drop *drop = priv->drop;
 
 	if (drop) {
-		assert(drop->refcnt);
+		assert(drop_refcnt(priv));
 		assert(drop->priv == priv);
-		++drop->refcnt;
 		return drop;
 	}
 	drop = rte_malloc(__func__, sizeof(*drop), 0);
@@ -955,8 +978,10 @@  struct mlx4_drop {
 static void
 mlx4_drop_put(struct mlx4_drop *drop)
 {
-	assert(drop->refcnt);
-	if (--drop->refcnt)
+	int refcnt = drop_refcnt(drop->priv);
+
+	assert(refcnt >= 0);
+	if (refcnt)
 		return;
 	drop->priv->drop = NULL;
 	claim_zero(ibv_destroy_qp(drop->qp));