[dpdk-dev] [PATCH 2/3] arm64: acl: add neon based acl implementation
Jerin Jacob
jerin.jacob at caviumnetworks.com
Mon Nov 2 17:19:54 CET 2015
On Mon, Nov 02, 2015 at 04:39:37PM +0100, Jan Viktorin wrote:
> On Mon, 2 Nov 2015 19:48:40 +0530
> Jerin Jacob <jerin.jacob at caviumnetworks.com> wrote:
>
> > Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > ---
> > app/test-acl/main.c | 4 +
> > lib/librte_acl/Makefile | 5 +
> > lib/librte_acl/acl.h | 4 +
> > lib/librte_acl/acl_run_neon.c | 46 +++++++
> > lib/librte_acl/acl_run_neon.h | 290 ++++++++++++++++++++++++++++++++++++++++++
> > lib/librte_acl/rte_acl.c | 25 ++++
> > lib/librte_acl/rte_acl.h | 1 +
> > 7 files changed, 375 insertions(+)
> > create mode 100644 lib/librte_acl/acl_run_neon.c
> > create mode 100644 lib/librte_acl/acl_run_neon.h
> >
> > diff --git a/app/test-acl/main.c b/app/test-acl/main.c
> > index 72ce83c..0b0c093 100644
> > --- a/app/test-acl/main.c
> > +++ b/app/test-acl/main.c
> > @@ -101,6 +101,10 @@ static const struct acl_alg acl_alg[] = {
> > .name = "avx2",
> > .alg = RTE_ACL_CLASSIFY_AVX2,
> > },
> > + {
> > + .name = "neon",
> > + .alg = RTE_ACL_CLASSIFY_NEON,
> > + },
> > };
> >
> > static struct {
> > diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
> > index 7a1cf8a..27f91d5 100644
> > --- a/lib/librte_acl/Makefile
> > +++ b/lib/librte_acl/Makefile
> > @@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
> > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
> > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
> > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
> > +ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
> > +SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_neon.c
>
> Are the NEON intrinsics used for ACL ARMv8-specific? If so, the file should be named
> something like acl_run_neonv8.c...
>
Yes, it is somewhat ARMv8-specific: the vqtbl1q_u8 NEON intrinsic appears
to be defined only for ARMv8. I could rename the file to acl_run_neonv8.c,
but if it stays acl_run_neon.c it may be extended to ARMv7 in the future.
I am open to either decision; let me know your views.
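
To illustrate the portability point, here is a minimal sketch of a 16-byte
table lookup built on vqtbl1q_u8. It is not code from the patch: the helper
name tbl16_lookup is hypothetical, and the scalar branch exists only so the
same semantics (an index of 16 or more yields 0) can be checked on a
non-AArch64 host.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/*
 * Hypothetical helper, not part of the patch:
 * out[k] = table[idx[k]] when idx[k] < 16, otherwise 0.
 */
static void
tbl16_lookup(const uint8_t table[16], const uint8_t idx[16], uint8_t out[16])
{
#if defined(__aarch64__)
	/* AArch64: a single TBL on a full 128-bit table register. */
	uint8x16_t t = vld1q_u8(table);
	uint8x16_t i = vld1q_u8(idx);

	vst1q_u8(out, vqtbl1q_u8(t, i));
#else
	/* Scalar fallback with the same out-of-range behaviour. */
	for (int k = 0; k < 16; k++)
		out[k] = (idx[k] < 16) ? table[idx[k]] : 0;
#endif
}
```

ARMv7 NEON offers only the 64-bit vtbl variants, so an ARMv7 port would
presumably need something like two vtbl2_u8 calls per 16-byte lookup in
place of the single vqtbl1q_u8.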
> > +else
> > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> > +endif
> >
> > CFLAGS_acl_run_sse.o += -msse4.1
> > +CFLAGS_acl_run_neon.o += -flax-vector-conversions -Wno-maybe-uninitialized
>
> From man gcc:
>
> -flax-vector-conversions
> Allow implicit conversions between vectors with differing numbers of elements and/or
> incompatible element types. This option should not be used for new code.
>
> I've already pointed to this in the Dave's ARMv8 patchset. They dropped it silently.
> What is the purpose? Is it necessary?
Yes. The same tr hi value is treated as either unsigned or signed,
depending on whether it is DFA or QRANGE.
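
For reference, a minimal sketch of how one register can be viewed as both
unsigned and signed with explicit casts, which do not need
-flax-vector-conversions. This uses the generic GCC/Clang vector extension
rather than the patch's code; the type and function names are hypothetical.
With NEON types the counterpart of the cast would be an intrinsic such as
vreinterpretq_s32_u32.

```c
#include <stdint.h>

/* Hypothetical 4 x 32-bit vector types (GCC/Clang vector extension). */
typedef uint32_t u32x4 __attribute__((vector_size(16)));
typedef int32_t s32x4 __attribute__((vector_size(16)));

/* Count lanes that are negative when the bits are read as signed. */
static int
count_negative_lanes(u32x4 v)
{
	s32x4 sv = (s32x4)v; /* explicit, bit-preserving reinterpretation */
	int n = 0;

	for (int i = 0; i < 4; i++)
		n += (sv[i] < 0);
	return n;
}

/* Count lanes greater than limit when the same bits are read as unsigned. */
static int
count_lanes_above(u32x4 v, uint32_t limit)
{
	int n = 0;

	for (int i = 0; i < 4; i++)
		n += (v[i] > limit);
	return n;
}
```

The explicit cast documents each signed/unsigned switch at the point where
it happens, at the cost of more verbose code than the implicit conversions
the flag allows.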
>
> Jan
>
> >
> > #
> > # If the compiler supports AVX2 instructions,
> > diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
> > index eb4930c..09d6784 100644
> > --- a/lib/librte_acl/acl.h
> > +++ b/lib/librte_acl/acl.h
> > @@ -230,6 +230,10 @@ int
> > rte_acl_classify_avx2(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > uint32_t *results, uint32_t num, uint32_t categories);
> >
> --snip--
>
> --
> Jan Viktorin E-mail: Viktorin at RehiveTech.com
> System Architect Web: www.RehiveTech.com
> RehiveTech
> Brno, Czech Republic