[dpdk-dev,v3,1/7] net/mlx4: remove error flows from Tx fast path

Message ID 1509358049-18854-2-git-send-email-matan@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Matan Azrad Oct. 30, 2017, 10:07 a.m. UTC
  Move unnecessary error flows to DEBUG mode.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)
  

Comments

Adrien Mazarguil Oct. 30, 2017, 2:23 p.m. UTC | #1
On Mon, Oct 30, 2017 at 10:07:23AM +0000, Matan Azrad wrote:
> Move unnecessary error flows to DEBUG mode.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>

I missed a couple of details while reviewing the original version, the first
one being mlx4_post_send()'s return value is still documented as updating
rte_errno in case of error, it's not the case anymore after this patch.

Please see below for the other one:

> ---
>  drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
<snip>
>  /**
> @@ -510,8 +508,6 @@ struct pv {
>  	assert(max <= elts_n);
>  	/* Always leave one free entry in the ring. */
>  	--max;
> -	if (max == 0)
> -		return 0;
>  	if (max > pkts_n)
>  		max = pkts_n;
>  	for (i = 0; (i != max); ++i) {

While minor, this change has nothing to do with this patch, right?

I think it can slightly degrade an application performance as it removes the
guarantee that subsequent code only needs to be run if there is at least one
packet to process in case the TX ring is constantly full (SW faster than
HW).

Can you remove it?
  
Matan Azrad Oct. 30, 2017, 6:11 p.m. UTC | #2
Hi Adrien

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Monday, October 30, 2017 4:23 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Ophir Munk <ophirmu@mellanox.com>
> Subject: Re: [PATCH v3 1/7] net/mlx4: remove error flows from Tx fast path
> 
> On Mon, Oct 30, 2017 at 10:07:23AM +0000, Matan Azrad wrote:
> > Move unnecessary error flows to DEBUG mode.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> 
> I missed a couple of details while reviewing the original version, the first one
> being mlx4_post_send()'s return value is still documented as updating
> rte_errno in case of error, it's not the case anymore after this patch.
> 
Good attention, Will be fixed in next version.

> Please see below for the other one:
> 
> > ---
> >  drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
> >  1 file changed, 6 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/mlx4/mlx4_rxtx.c
> > b/drivers/net/mlx4/mlx4_rxtx.c
> <snip>
> >  /**
> > @@ -510,8 +508,6 @@ struct pv {
> >  	assert(max <= elts_n);
> >  	/* Always leave one free entry in the ring. */
> >  	--max;
> > -	if (max == 0)
> > -		return 0;
> >  	if (max > pkts_n)
> >  		max = pkts_n;
> >  	for (i = 0; (i != max); ++i) {
> 
> While minor, this change has nothing to do with this patch, right?
> 
Yes you right, maybe it can be merged in patch 4/7.
 
> I think it can slightly degrade an application performance as it removes the
> guarantee that subsequent code only needs to be run if there is at least one
> packet to process in case the TX ring is constantly full (SW faster than HW).
>

In case the TX ring is full, the loop condition should fail in the start and then return with 0  because the packet counter is 0.(more 2 checks)
Since this case are less common (in my opinion) than at least 1 free space in ring, we can prevent this unnecessary check for all these common cases.    
  
Are you sure the 2 extra check important for performance in this empty case? Doesn't the application will call us again? 
 
> Can you remove it?
> 
> --
> Adrien Mazarguil
> 6WIND
  
Adrien Mazarguil Oct. 31, 2017, 10:16 a.m. UTC | #3
Hi Matan,

On Mon, Oct 30, 2017 at 06:11:31PM +0000, Matan Azrad wrote:
> Hi Adrien
> 
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Monday, October 30, 2017 4:23 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: dev@dpdk.org; Ophir Munk <ophirmu@mellanox.com>
> > Subject: Re: [PATCH v3 1/7] net/mlx4: remove error flows from Tx fast path
> > 
> > On Mon, Oct 30, 2017 at 10:07:23AM +0000, Matan Azrad wrote:
> > > Move unnecessary error flows to DEBUG mode.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > 
> > I missed a couple of details while reviewing the original version, the first one
> > being mlx4_post_send()'s return value is still documented as updating
> > rte_errno in case of error, it's not the case anymore after this patch.
> > 
> Good attention, Will be fixed in next version.
> 
> > Please see below for the other one:
> > 
> > > ---
> > >  drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
> > >  1 file changed, 6 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx4/mlx4_rxtx.c
> > > b/drivers/net/mlx4/mlx4_rxtx.c
> > <snip>
> > >  /**
> > > @@ -510,8 +508,6 @@ struct pv {
> > >  	assert(max <= elts_n);
> > >  	/* Always leave one free entry in the ring. */
> > >  	--max;
> > > -	if (max == 0)
> > > -		return 0;
> > >  	if (max > pkts_n)
> > >  		max = pkts_n;
> > >  	for (i = 0; (i != max); ++i) {
> > 
> > While minor, this change has nothing to do with this patch, right?
> > 
> Yes you right, maybe it can be merged in patch 4/7.
>  
> > I think it can slightly degrade an application performance as it removes the
> > guarantee that subsequent code only needs to be run if there is at least one
> > packet to process in case the TX ring is constantly full (SW faster than HW).
> >
> 
> In case the TX ring is full, the loop condition should fail in the start and then return with 0  because the packet counter is 0.(more 2 checks)
> Since this case are less common (in my opinion) than at least 1 free space in ring, we can prevent this unnecessary check for all these common cases.    
>   
> Are you sure the 2 extra check important for performance in this empty case? Doesn't the application will call us again? 

No, I don't think they're important to performance, like the changes from
patch 4/7, I'm not certain they actually make any difference. My suggestion
was mainly to leave it alone because of that. It's OK if you want to keep
and move it to 4/7.
  

Patch

diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
index 67dc712..4f899ff 100644
--- a/drivers/net/mlx4/mlx4_rxtx.c
+++ b/drivers/net/mlx4/mlx4_rxtx.c
@@ -169,6 +169,7 @@  struct pv {
 		 * Make sure we read the CQE after we read the ownership bit.
 		 */
 		rte_rmb();
+#ifndef NDEBUG
 		if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
 			     MLX4_CQE_OPCODE_ERROR)) {
 			struct mlx4_err_cqe *cqe_err =
@@ -178,6 +179,7 @@  struct pv {
 			      (void *)txq, cqe_err->vendor_err,
 			      cqe_err->syndrome);
 		}
+#endif /* NDEBUG */
 		/* Get WQE index reported in the CQE. */
 		new_index =
 			rte_be_to_cpu_16(cqe->wqe_index) & sq->txbb_cnt_mask;
@@ -322,7 +324,6 @@  struct pv {
 	uint32_t byte_count;
 	int wqe_real_size;
 	int nr_txbbs;
-	int rc;
 	struct pv *pv = (struct pv *)txq->bounce_buf;
 	int pv_counter = 0;
 
@@ -337,8 +338,7 @@  struct pv {
 	if (((sq->head - sq->tail) + nr_txbbs +
 	     sq->headroom_txbbs) >= sq->txbb_cnt ||
 	    nr_txbbs > MLX4_MAX_WQE_TXBBS) {
-		rc = ENOSPC;
-		goto err;
+		return -ENOSPC;
 	}
 	/* Get the control and data entries of the WQE. */
 	ctrl = (struct mlx4_wqe_ctrl_seg *)mlx4_get_send_wqe(sq, head_idx);
@@ -354,6 +354,7 @@  struct pv {
 		dseg->addr = rte_cpu_to_be_64(addr);
 		/* Memory region key for this memory pool. */
 		lkey = mlx4_txq_mp2mr(txq, mlx4_txq_mb2mp(buf));
+#ifndef NDEBUG
 		if (unlikely(lkey == (uint32_t)-1)) {
 			/* MR does not exist. */
 			DEBUG("%p: unable to get MP <-> MR association",
@@ -366,9 +367,9 @@  struct pv {
 			ctrl->fence_size = (wqe_real_size >> 4) & 0x3f;
 			mlx4_txq_stamp_freed_wqe(sq, head_idx,
 				     (sq->head & sq->txbb_cnt) ? 0 : 1);
-			rc = EFAULT;
-			goto err;
+			return -EFAULT;
 		}
+#endif /* NDEBUG */
 		dseg->lkey = rte_cpu_to_be_32(lkey);
 		if (likely(buf->data_len)) {
 			byte_count = rte_cpu_to_be_32(buf->data_len);
@@ -471,9 +472,6 @@  struct pv {
 					       MLX4_BIT_WQE_OWN : 0));
 	sq->head += nr_txbbs;
 	return 0;
-err:
-	rte_errno = rc;
-	return -rc;
 }
 
 /**
@@ -510,8 +508,6 @@  struct pv {
 	assert(max <= elts_n);
 	/* Always leave one free entry in the ring. */
 	--max;
-	if (max == 0)
-		return 0;
 	if (max > pkts_n)
 		max = pkts_n;
 	for (i = 0; (i != max); ++i) {