[dpdk-dev] [RFC PATCH 0/1] net/mlx5: add vectorized Rx/Tx burst for ARM

Yongseok Koh yskoh at mellanox.com
Fri Aug 25 20:40:22 CEST 2017


The SSE (x86) Rx/Tx burst functions added in v17.08 will be ported to ARM NEON
in v17.11. Although this is still an ongoing effort (more implementation and
further optimization remain), this interim patch can be applied on top of
v17.08 and forwards packets.

One topic to discuss is that I used inline assembly for performance-critical
code blocks because I don't think the NEON intrinsics are well optimized yet,
especially vqtbl2q_u8()/vqtbl3q_u8()/vqtbl4q_u8() and gcc's register
optimization. Older gcc doesn't even have vld1q_u8_x4(). I used inline
assembly to get rid of the hotspots shown in the profiling results. I'm not
sure whether inline assembly is allowed in the DPDK community, but I believe
there's no reason to prohibit it.

In my patch, some functions are commented out as I haven't finished migrating
them yet. But the patch is functional (Rx/Tx). For Tx, "--txqflags=0xf01" is
needed because I haven't ported txq_scatter_v() yet.
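
As a rough illustration of what 0xf01 asks for, the sketch below decodes the
value. The bit values are local copies of the ETH_TXQ_FLAGS_* definitions as
I recall them from rte_ethdev.h in v17.08, so treat them as an assumption and
check the header. The EX_* names and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local copies of Tx queue flag bits (assumed to match ETH_TXQ_FLAGS_*
 * in v17.08's rte_ethdev.h; verify before relying on them).
 */
#define EX_TXQ_NOMULTSEGS 0x0001u /* no multi-segment packets */
#define EX_TXQ_NOVLANOFFL 0x0100u /* no VLAN insertion offload */
#define EX_TXQ_NOXSUMSCTP 0x0200u /* no SCTP checksum offload */
#define EX_TXQ_NOXSUMUDP  0x0400u /* no UDP checksum offload */
#define EX_TXQ_NOXSUMTCP  0x0800u /* no TCP checksum offload */

/* Hypothetical helper: non-zero if only single-segment packets are sent. */
static int
ex_txq_single_seg_only(uint32_t txq_flags)
{
	return (txq_flags & EX_TXQ_NOMULTSEGS) != 0;
}

/* Hypothetical helper: non-zero if VLAN and checksum offloads are all off. */
static int
ex_txq_no_offloads(uint32_t txq_flags)
{
	const uint32_t mask = EX_TXQ_NOVLANOFFL | EX_TXQ_NOXSUMSCTP |
			      EX_TXQ_NOXSUMUDP | EX_TXQ_NOXSUMTCP;

	return (txq_flags & mask) == mask;
}
```

Under these assumed values, 0xf01 sets the no-multi-segment bit plus the VLAN
and checksum "no offload" bits, which would explain why the scatter path is
not exercised with this setting.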

Yongseok Koh (1):
  net/mlx5: add vectorized Rx/Tx burst for ARM

 drivers/net/mlx5/Makefile             |    2 +
 drivers/net/mlx5/mlx5_ethdev.c        |    4 +-
 drivers/net/mlx5/mlx5_prm.h           |   15 +
 drivers/net/mlx5/mlx5_rxq.c           |   61 ++
 drivers/net/mlx5/mlx5_rxtx.h          |    3 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.c | 1464 +++++++++++++++++++++++++++++++++
 6 files changed, 1546 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_rxtx_vec_neon.c

-- 
2.11.0


