[dpdk-dev] [PATCH v2 5/5] net/mlx5: add vectorized Rx/Tx burst for SSE4.1

Yongseok Koh yskoh at mellanox.com
Wed Jul 5 19:41:32 CEST 2017


On Wed, Jul 05, 2017 at 10:21:26AM +0200, Nélio Laranjeiro wrote:
> On Tue, Jul 04, 2017 at 05:38:44PM -0700, Yongseok Koh wrote:
> > On Tue, Jul 04, 2017 at 10:58:52AM +0200, Nélio Laranjeiro wrote:
> > > Yongseok, some comments on this huge and great work,
> > > 
> > > On Fri, Jun 30, 2017 at 12:23:33PM -0700, Yongseok Koh wrote:
> > > > The vectorized burst routines require an x86_64 architecture that
> > > > supports at least SSE4.1. If all the conditions are met, the vectorized
> > > > burst functions are enabled automatically. The decision is made
> > > > separately for Rx and Tx. There is no PMD option to select them.
> > > > 
> > > > Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
> > > > ---
> > > >  drivers/net/mlx5/Makefile            |   10 +
> > > >  drivers/net/mlx5/mlx5_defs.h         |   18 +
> > > >  drivers/net/mlx5/mlx5_ethdev.c       |   28 +-
> > > >  drivers/net/mlx5/mlx5_rxq.c          |   55 +-
> > > >  drivers/net/mlx5/mlx5_rxtx.c         |  339 ++------
> > > >  drivers/net/mlx5/mlx5_rxtx.h         |  283 ++++++-
> > > >  drivers/net/mlx5/mlx5_rxtx_vec_sse.c | 1451 ++++++++++++++++++++++++++++++++++
> > > >  drivers/net/mlx5/mlx5_txq.c          |    2 +-
> > > >  8 files changed, 1909 insertions(+), 277 deletions(-)
> > > >  create mode 100644 drivers/net/mlx5/mlx5_rxtx_vec_sse.c
> > > > 
> > > > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > > > index 51e258a15..2d0894fcd 100644
> > > > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > > > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > [...]
> > > > +	txq_complete(txq);
> > > > +	/* A CQE slot must always be available. */
> > > > +	assert((1u << txq->cqe_n) - (txq->cq_pi - txq->cq_ci));
> > > 
> > > This assert should be moved to the txq_complete(), or it should not be
> > > an assert.
> > txq_complete() is a common function, so it can't guarantee that at least one
> > slot is left free in the completion queue. The assert is there to make sure
> > enough CQE slots are allocated by an accurate calculation, since completions
> > are suppressed by MLX5_TX_COMP_THRESH. If the CQ size is well defined (e.g.
> > size of Tx ring / MLX5_TX_COMP_THRESH), there is no need to check for a
> > shortage of CQ slots; checking the slots in the Tx ring (max_elts) is
> > sufficient. If you are okay with this, please let me know and I'll send out
> > v3.
> 
> Just using your comment...
>  /* A CQE slot must always be available. */
> 
> This must always be true in any Tx function: relying on the assumption
> avoids testing for a free slot before posting a completion request, and
> thus avoids wasting cycles, independently of how the CQ ring size is
> computed.
> 
> As it is an assert, it is only present to help a developer find errors in
> the code he wrote (or to help a user point out an issue in the code), so
> it can be anywhere in the Tx data path. It becomes useful for any Tx
> function using txq_complete().
> 
> If this assert is only related to your code, it may mean it should be
> an if that avoids posting a completion request when no slots are
> available.
I think I was wrong about the assert. If the Tx ring is full (max_elts == 0),
then there won't be any empty slot left in CQ. I'll remove the two asserts.
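
For illustration, the slot accounting discussed in this thread can be sketched
as below. This is a simplified model under stated assumptions, not the driver
code: struct txq_model and the helpers cq_free_slots(), cq_slots_needed() and
tx_can_post() are hypothetical names. Only the field names cqe_n, cq_pi, cq_ci
and max_elts are taken from the discussion, and the CQ sizing rule (Tx ring
size / completion threshold) is the example given above, rounded up here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the Tx queue fields named above. */
struct txq_model {
	uint16_t cqe_n;    /* log2 of the number of CQE slots */
	uint16_t cq_pi;    /* completion requests posted so far */
	uint16_t cq_ci;    /* completions consumed so far */
	uint16_t max_elts; /* free entries left in the Tx ring */
};

/*
 * Number of free CQE slots, i.e. the quantity the removed assert tested.
 * The uint16_t cast keeps the difference correct when cq_pi wraps around.
 */
static inline unsigned int
cq_free_slots(const struct txq_model *txq)
{
	uint16_t pending = (uint16_t)(txq->cq_pi - txq->cq_ci);

	return (1u << txq->cqe_n) - pending;
}

/*
 * Assumption: one completion is requested per comp_thresh ring entries,
 * so this many CQE slots cover a completely full Tx ring.
 */
static inline unsigned int
cq_slots_needed(unsigned int elts_n, unsigned int comp_thresh)
{
	return (elts_n + comp_thresh - 1) / comp_thresh;
}

/*
 * With a CQ sized as above, checking the Tx ring alone is sufficient:
 * nothing is posted when max_elts == 0, so no CQE slot is consumed.
 */
static inline int
tx_can_post(const struct txq_model *txq)
{
	return txq->max_elts != 0;
}
```

In this model, a burst function that returns early when tx_can_post() is 0
never needs a separate check on cq_free_slots(), provided the CQ was created
with at least cq_slots_needed() entries.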

Thanks
Yongseok

