net/mlx5: check Tx queue size overflow

Message ID 20190430190426.44018-1-yskoh@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Shahaf Shuler
Headers
Series net/mlx5: check Tx queue size overflow |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/intel-Performance-Testing fail Performance Testing issues

Commit Message

Yongseok Koh April 30, 2019, 7:04 p.m. UTC
  If Tx packet inlining is enabled, rdma-core library should allocate large
Tx WQ enough to support it. It is better for PMD to calculate the size of
WQ based on the parameters and return error with appropriate message if it
exceeds the device capability.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5_txq.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)
  

Comments

Stephen Hemminger April 30, 2019, 8:46 p.m. UTC | #1
On Tue, 30 Apr 2019 12:04:26 -0700
Yongseok Koh <yskoh@mellanox.com> wrote:

> +	    priv->sh->device_attr.orig_attr.max_qp_wr) {
> +		DRV_LOG(DEBUG,
> +			"port %u Tx WQEBB count exceeds the limit (%d),"
> +			" try smaller queue size again",
> +			dev->data->port_id,

The patch looks good, but it could be improved to make life easier
for the users.

This is an error, why not print it at NOTICE level since DEBUG messages
are usually suppressed.

Please don't break long lines in log messages. The latter part of the message
is obvious, why not skip it.

Also since max_qp_wr is __u32, the print format should be %u

Instead:
		DRV_LOG(NOTICE,
			"port %u Tx WQEBB count (%u) exceeds the limit (%u)",
			dev->data->port_id,
			txq_calc_wqebb_cnt(tmpl),
			priv->sh->device_attr.orig_attr.max_qp_wr);

Also, should it have a Fixes: tag to backport to stable?
  
Yongseok Koh May 1, 2019, 12:43 a.m. UTC | #2
> On Apr 30, 2019, at 1:46 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> On Tue, 30 Apr 2019 12:04:26 -0700
> Yongseok Koh <yskoh@mellanox.com> wrote:
> 
>> +	    priv->sh->device_attr.orig_attr.max_qp_wr) {
>> +		DRV_LOG(DEBUG,
>> +			"port %u Tx WQEBB count exceeds the limit (%d),"
>> +			" try smaller queue size again",
>> +			dev->data->port_id,
> 
> The patch looks good, but it could be improved to make life easier
> for the users.
> 
> This is an error, why not print it at NOTICE level since DEBUG messages
> are usually suppressed.

Okay, better to set ERROR.

> Please don't break long lines in log messages.

That doesn't add new-line and mlx5 specific coding style.

> The latter part of the message
> is obvious, why not skip it.

Because not all users know what WQEBB means, I wanted to make user's life easier
by informing what to do.

> Also since max_qp_wr is __u32, the print format should be %u

It was copied from the removing line but agree to change.

> Instead:
> 		DRV_LOG(NOTICE,
> 			"port %u Tx WQEBB count (%u) exceeds the limit (%u)",
> 			dev->data->port_id,
> 			txq_calc_wqebb_cnt(tmpl),
> 			priv->sh->device_attr.orig_attr.max_qp_wr);
> 
> Also, should it have a Fixes: tag to backport to stable?

Okay, no harm to give more information.


Thanks for the review.
Yongseok
  
Yongseok Koh May 1, 2019, 12:56 a.m. UTC | #3
> On Apr 30, 2019, at 5:43 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
>> 
>> On Apr 30, 2019, at 1:46 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
>> 
>> On Tue, 30 Apr 2019 12:04:26 -0700
>> Yongseok Koh <yskoh@mellanox.com> wrote:
>> 
>>> +	    priv->sh->device_attr.orig_attr.max_qp_wr) {
>>> +		DRV_LOG(DEBUG,
>>> +			"port %u Tx WQEBB count exceeds the limit (%d),"
>>> +			" try smaller queue size again",
>>> +			dev->data->port_id,
>> 
>> The patch looks good, but it could be improved to make life easier
>> for the users.
>> 
>> This is an error, why not print it at NOTICE level since DEBUG messages
>> are usually suppressed.
> 
> Okay, better to set ERROR.
> 
>> Please don't break long lines in log messages.
> 
> That doesn't add new-line and mlx5 specific coding style.
> 
>> The latter part of the message
>> is obvious, why not skip it.
> 
> Because not all users know what WQEBB means, I wanted to make user's life easier
> by informing what to do.
> 
>> Also since max_qp_wr is __u32, the print format should be %u
> 
> It was copied from the removing line but agree to change.

No, it is 'int'. Please check 'struct ibv_device_attr'.

> 
>> Instead:
>> 		DRV_LOG(NOTICE,
>> 			"port %u Tx WQEBB count (%u) exceeds the limit (%u)",
>> 			dev->data->port_id,
>> 			txq_calc_wqebb_cnt(tmpl),
>> 			priv->sh->device_attr.orig_attr.max_qp_wr);
>> 
>> Also, should it have a Fixes: tag to backport to stable?
> 
> Okay, no harm to give more information.
> 
> 
> Thanks for the review.
> Yongseok
  

Patch

diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 4d55fd413c..dfc9afbe75 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -679,6 +679,27 @@  mlx5_txq_ibv_verify(struct rte_eth_dev *dev)
 }
 
 /**
+ * Calcuate the total number of WQEBB for Tx queue.
+ *
+ * Simplified version of calc_sq_size() in rdma-core.
+ *
+ * @param txq_ctrl
+ *   Pointer to Tx queue control structure.
+ *
+ * @return
+ *   The number of WQEBB.
+ */
+static int
+txq_calc_wqebb_cnt(struct mlx5_txq_ctrl *txq_ctrl)
+{
+	unsigned int wqe_size;
+	const unsigned int desc = 1 << txq_ctrl->txq.elts_n;
+
+	wqe_size = MLX5_WQE_SIZE + txq_ctrl->max_inline_data;
+	return rte_align32pow2(wqe_size * desc) / MLX5_WQE_SIZE;
+}
+
+/**
  * Set Tx queue parameters from device configuration.
  *
  * @param txq_ctrl
@@ -824,10 +845,16 @@  mlx5_txq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	tmpl->txq.port_id = dev->data->port_id;
 	tmpl->txq.idx = idx;
 	txq_set_params(tmpl);
-	DRV_LOG(DEBUG, "port %u device_attr.max_qp_wr is %d",
-		dev->data->port_id, priv->sh->device_attr.orig_attr.max_qp_wr);
-	DRV_LOG(DEBUG, "port %u device_attr.max_sge is %d",
-		dev->data->port_id, priv->sh->device_attr.orig_attr.max_sge);
+	if (txq_calc_wqebb_cnt(tmpl) >
+	    priv->sh->device_attr.orig_attr.max_qp_wr) {
+		DRV_LOG(DEBUG,
+			"port %u Tx WQEBB count exceeds the limit (%d),"
+			" try smaller queue size again",
+			dev->data->port_id,
+			priv->sh->device_attr.orig_attr.max_qp_wr);
+		rte_errno = ENOMEM;
+		goto error;
+	}
 	tmpl->txq.elts =
 		(struct rte_mbuf *(*)[1 << tmpl->txq.elts_n])(tmpl + 1);
 	rte_atomic32_inc(&tmpl->refcnt);