[dpdk-dev] [PATCH v2 5/5] net/mlx5: add vectorized Rx/Tx burst for SSE4.1

Yongseok Koh yskoh at mellanox.com
Wed Jul 5 02:38:44 CEST 2017


On Tue, Jul 04, 2017 at 10:58:52AM +0200, Nélio Laranjeiro wrote:
> Yongseok, some comments in this huge and great work,
> 
> On Fri, Jun 30, 2017 at 12:23:33PM -0700, Yongseok Koh wrote:
> > To make vectorized burst routines enabled, it is required to run on x86_64
> > architecture which can support at least SSE4.1. If all the conditions are
> > met, the vectorized burst functions are enabled automatically. The decision
> > is made individually on RX and TX. There's no PMD option to make a
> > selection.
> > 
> > Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
> > ---
> >  drivers/net/mlx5/Makefile            |   10 +
> >  drivers/net/mlx5/mlx5_defs.h         |   18 +
> >  drivers/net/mlx5/mlx5_ethdev.c       |   28 +-
> >  drivers/net/mlx5/mlx5_rxq.c          |   55 +-
> >  drivers/net/mlx5/mlx5_rxtx.c         |  339 ++------
> >  drivers/net/mlx5/mlx5_rxtx.h         |  283 ++++++-
> >  drivers/net/mlx5/mlx5_rxtx_vec_sse.c | 1451 ++++++++++++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5_txq.c          |    2 +-
> >  8 files changed, 1909 insertions(+), 277 deletions(-)
> >  create mode 100644 drivers/net/mlx5/mlx5_rxtx_vec_sse.c
> > 
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > index 51e258a15..2d0894fcd 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
[...]
> > +	txq_complete(txq);
> > +	/* A CQE slot must always be available. */
> > +	assert((1u << txq->cqe_n) - (txq->cq_pi - txq->cq_ci));
> 
> This assert should be moved to the txq_complete(), or it should not be
> an assert.
txq_complete() is a common function, so this can't force to spare at least one
slot in a completion queue. This assert is to force to allocate enough CQE slots
by accurate calculation as completion is suppressed by MLX5_TX_COMP_THRESH. If
the CQ size is well defined (e.g. size of Tx ring / MLX5_TX_COMP_THRESH), it
doesn't need to check deficiency of CQ slot but checking slots in Tx ring
(max_elts) is sufficient. If you are okay with this, please let me know, then
I'll send out v3.

And I agree on all other comments you made. I'll make changes accordingly in v3.

Thanks for your quick review!
Yongseok


More information about the dev mailing list