[dpdk-dev] [PATCH v2 01/15] net/mlx5: support 16 hardware priorities

Nélio Laranjeiro nelio.laranjeiro at 6wind.com
Thu Apr 12 16:02:56 CEST 2018


On Thu, Apr 12, 2018 at 01:43:04PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Nélio Laranjeiro <nelio.laranjeiro at 6wind.com>
> > Sent: Thursday, April 12, 2018 5:09 PM
> > To: Xueming(Steven) Li <xuemingl at mellanox.com>
> > Cc: Shahaf Shuler <shahafs at mellanox.com>; dev at dpdk.org
> > Subject: Re: [PATCH v2 01/15] net/mlx5: support 16 hardware priorities
> > 
> > On Tue, Apr 10, 2018 at 03:22:46PM +0000, Xueming(Steven) Li wrote:
> > > Hi Nelio,
> > >
> > > > -----Original Message-----
> > > > From: Nélio Laranjeiro <nelio.laranjeiro at 6wind.com>
> > > > Sent: Tuesday, April 10, 2018 10:42 PM
> > > > To: Xueming(Steven) Li <xuemingl at mellanox.com>
> > > > Cc: Shahaf Shuler <shahafs at mellanox.com>; dev at dpdk.org
> > > > Subject: Re: [PATCH v2 01/15] net/mlx5: support 16 hardware
> > > > priorities
> > > >
> > > > On Tue, Apr 10, 2018 at 09:34:01PM +0800, Xueming Li wrote:
> > > > > Adjust flow priority mapping to adapt new hardware 16 verb flow
> > > > > priorites support:
> > > > > 0-3: RTE FLOW tunnel rule
> > > > > 4-7: RTE FLOW non-tunnel rule
> > > > > 8-15: PMD control flow
> > > >
> > > > This commit log is inducing people in error, this amount of priority
> > > > depends on the Mellanox OFED installed, it is not available on
> > > > upstream Linux kernel yet nor in the current Mellanox OFED GA.
> > > >
> > > > What happens when those amount of priority are not available, is it
> > > > removing a functionality?  Will it collide with other flows?
> > >
> > > If 16  priorities not available, simply behavior as 8 priorities.
> > 
> > It is not described in the commit log, please add it.
> > 
> > > > > Signed-off-by: Xueming Li <xuemingl at mellanox.com>
> > <snip/>
> > > > >  	},
> > > > >  	[HASH_RXQ_ETH] = {
> > > > >  		.hash_fields = 0,
> > > > >  		.dpdk_rss_hf = 0,
> > > > > -		.flow_priority = 3,
> > > > > +		.flow_priority = 2,
> > > > >  	},
> > > > >  };
> > > >
> > > > If the amount of priorities remains 8, you are removing the priority
> > > > for the tunnel flows introduced by commit 749365717f5c ("net/mlx5:
> > > > change tunnel flow priority")
> > > >
> > > > Please keep this functionality when this patch fails to get the
> > > > expected
> > > > 16 Verbs priorities.
> > >
> > > These priority shift are different in 16 priorities scenario, I
> > > changed it to calculation. In function mlx5_flow_priorities_detect(),
> > > priority shift will be 1 if 8 priorities, 4 in case of 16 priorities.
> > > Please refer to changes in function mlx5_flow_update_priority() as well.
> > 
> > Please light my lamp, I don't see it...
> 
> Sorry, please refer to priv->config.flow_priority_shift.
> 
> > 
> > <snip/>
> > > > >  static void
> > > > > -mlx5_flow_update_priority(struct mlx5_flow_parse *parser,
> > > > > +mlx5_flow_update_priority(struct rte_eth_dev *dev,
> > > > > +			  struct mlx5_flow_parse *parser,
> > > > >  			  const struct rte_flow_attr *attr)  {
> > > > > +	struct priv *priv = dev->data->dev_private;
> > > > >  	unsigned int i;
> > > > > +	uint16_t priority;
> > > > >
> > > > > +	if (priv->config.flow_priority_shift == 1)
> > > > > +		priority = attr->priority * MLX5_VERBS_FLOW_PRIO_4;
> > > > > +	else
> > > > > +		priority = attr->priority * MLX5_VERBS_FLOW_PRIO_8;
> > > > > +	if (!parser->inner)
> > > > > +		priority += priv->config.flow_priority_shift;
> 
> Here, if non-tunnel flow, lower(increase) 1 for 8 priorities, lower 4 otherwise.
> I'll append a comment here.

Thanks, I totally missed this one.

<snip/>
> > > > > diff --git a/drivers/net/mlx5/mlx5_trigger.c
> > > > > b/drivers/net/mlx5/mlx5_trigger.c index 6bb4ffb14..d80a2e688
> > > > > 100644
> > > > > --- a/drivers/net/mlx5/mlx5_trigger.c
> > > > > +++ b/drivers/net/mlx5/mlx5_trigger.c
> > > > > @@ -148,12 +148,6 @@ mlx5_dev_start(struct rte_eth_dev *dev)
> > > > >  	int ret;
> > > > >
> > > > >  	dev->data->dev_started = 1;
> > > > > -	ret = mlx5_flow_create_drop_queue(dev);
> > > > > -	if (ret) {
> > > > > -		DRV_LOG(ERR, "port %u drop queue allocation failed: %s",
> > > > > -			dev->data->port_id, strerror(rte_errno));
> > > > > -		goto error;
> > > > > -	}
> > > > >  	DRV_LOG(DEBUG, "port %u allocating and configuring hash Rx
> > queues",
> > > > >  		dev->data->port_id);
> > > > >  	rte_mempool_walk(mlx5_mp2mr_iter, priv); @@ -202,7 +196,6 @@
> > > > > mlx5_dev_start(struct rte_eth_dev *dev)
> > > > >  	mlx5_traffic_disable(dev);
> > > > >  	mlx5_txq_stop(dev);
> > > > >  	mlx5_rxq_stop(dev);
> > > > > -	mlx5_flow_delete_drop_queue(dev);
> > > > >  	rte_errno = ret; /* Restore rte_errno. */
> > > > >  	return -rte_errno;
> > > > >  }
> > > > > @@ -237,7 +230,6 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
> > > > >  	mlx5_rxq_stop(dev);
> > > > >  	for (mr = LIST_FIRST(&priv->mr); mr; mr = LIST_FIRST(&priv-
> > >mr))
> > > > >  		mlx5_mr_release(mr);
> > > > > -	mlx5_flow_delete_drop_queue(dev);
> > > > >  }
> > > > >
> > > > >  /**
> > > > > --
> > > > > 2.13.3
> > > >
> > > > I have few concerns on this, mlx5_pci_probe() will also probe any
> > > > under layer verbs device, and in a near future the representors
> > > > associated to a VF.
> > > > Making such detection should only be done once by the PF, I also
> > > > wander if it is possible to make such drop action in a representor
> > > > directly using Verbs.
> > >
> > > Then there should be some work to disable flows in representors? that
> > > supposed to cover this.
> > 
> > The code raising another Verbs device is already present and since the
> > first entrance of this PMD in the DPDK tree, you must respect the code
> > already present.
> > This request is not related directly to a new feature but to an existing
> > one, the representors being just an example.
> > 
> > This detection should be only done once and not for each of them.
> 
> Could you please elaborate on "under layer verbs device" and how to judge
> dependency to PF, is there a probe order between them?

The place where you are adding the mlx5_flow_create_drop_queue() in
mlx5_pci_probe() is probing any device returned by the verbs
ibv_get_device_list().

This code is also present in mlx4 where for a single PCI id there are 2
physical ports.

If the NIC handle several ibvdev (Verbs devices) it will create a new
rte_eth_dev for each one.

> BTW, VF representor code seems exists in 17.11, not upstream.
> 
> > 
> > > > Another concern is, this patch will be reverted in some time when
> > > > those
> > > > 16 priority will be always available.  It will be easier to remove
> > > > this detection function than searching for all those modifications.
> > > >
> > > > I would suggest to have a standalone mlx5_flow_priorities_detect()
> > > > which creates and deletes all resources needed for this detection.
> > >
> > > There is an upcoming new feature to support priorities more than 16,
> > > auto detection will be kept IMHO.
> > 
> > Until the final values of priorities will be backported to all kernels we
> > support.  You don't see far enough in the future.
> > 
> > > Besides, there will be a bundle of resource creation and removal in
> > > this standalone function, I'm not sure it valuable to duplicate them,
> > > please refer to function mlx5_flow_create_drop_queue().
> > 
> > You misunderstood, I am not asking you to not use the default drop queues
> > but instead of making an rte_flow attributes, items and actions to make
> > directly the Verbs specification on the stack.  It will be faster than
> > making a bunch of conversions (relying on malloc) from rte_flow to Verbs
> > whereas you know exactly what it needs i.e. 1 spec.
> 
> Sorry, still confused, mlx5_flow_priorities_detect() invokes ibv_destroy_flow(),
> not rte_flow stuff, no malloc at all. BTW, mlx5 flow api bypass verb flow in 
> offline mode, we can't use it to create flows at such stage.

Sorry I was the one confused. Priority detection is Ok.

After reading this, I'll suggest to use a boolean in the
mlx5_pci_probe() to only make this detection once, the underlying verbs
devices will inherit from such knowledge and adjust their own shift
accordingly.

What do you think?

-- 
Nélio Laranjeiro
6WIND


More information about the dev mailing list