[dpdk-dev] eal/armv7: emulate vaddvq u16 variant

Message ID: 20170707162654.4638-1-jerin.jacob@caviumnetworks.com (mailing list archive)
State: Accepted, archived
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Jerin Jacob July 7, 2017, 4:26 p.m. UTC
vaddvq_u16() is not available for armv7.
Emulate the vaddvq_u16() using armv7 NEON intrinsics.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 11 +++++++++++
 1 file changed, 11 insertions(+)
  

Comments

Thomas Monjalon July 8, 2017, 5:08 p.m. UTC | #1
07/07/2017 18:26, Jerin Jacob:
> vaddvq_u16() is not available for armv7.
> Emulate the vaddvq_u16() using armv7 NEON intrinsics.

After implementing this function, another missing function appears:

	lib/librte_sched/rte_sched.c:1747:7: error:
	implicit declaration of function ‘vminvq_u32’
  
Jianbo Liu July 10, 2017, 3:34 a.m. UTC | #2
On 8 July 2017 at 00:26, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> vaddvq_u16() is not available for armv7.
> Emulate the vaddvq_u16() using armv7 NEON intrinsics.
>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> index 0670ca2ee..69fd428f3 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> @@ -77,6 +77,17 @@ vqtbl1q_u8(uint8x16_t a, uint8x16_t b)
>
>         return vld1q_u8(rte_ret.u8);
>  }
> +
> +static inline uint16_t
> +vaddvq_u16(uint16x8_t a)
> +{
> +       uint32x4_t m = vpaddlq_u16(a);
> +       uint64x2_t n = vpaddlq_u32(m);
> +       uint64x1_t o = vget_low_u64(n) + vget_high_u64(n);
> +
> +       return vget_lane_u32((uint32x2_t)o, 0);
> +}
> +
>  #endif
>
>  #if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 70000)
> --
> 2.13.2
>

Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
  
Jianbo Liu July 10, 2017, 3:51 a.m. UTC | #3
On 9 July 2017 at 01:08, Thomas Monjalon <thomas@monjalon.net> wrote:
> 07/07/2017 18:26, Jerin Jacob:
>> vaddvq_u16() is not available for armv7.
>> Emulate the vaddvq_u16() using armv7 NEON intrinsics.
>
> After implementing this function, another missing function appears:
>
>         lib/librte_sched/rte_sched.c:1747:7: error:
>         implicit declaration of function ‘vminvq_u32’

But sched_vector is disabled in defconfig_arm-armv7a-linuxapp-gcc:
    CONFIG_RTE_SCHED_VECTOR=n
  
Thomas Monjalon July 10, 2017, 7:28 a.m. UTC | #4
10/07/2017 05:51, Jianbo Liu:
> On 9 July 2017 at 01:08, Thomas Monjalon <thomas@monjalon.net> wrote:
> > 07/07/2017 18:26, Jerin Jacob:
> >> vaddvq_u16() is not available for armv7.
> >> Emulate the vaddvq_u16() using armv7 NEON intrinsics.
> >
> > After implementing this function, another missing function appears:
> >
> >         lib/librte_sched/rte_sched.c:1747:7: error:
> >         implicit declaration of function ‘vminvq_u32’
> 
> But sched_vector is disabled in defconfig_arm-armv7a-linuxapp-gcc:
>     CONFIG_RTE_SCHED_VECTOR=n

Yes, I really need to fix test-build.sh which is enabling SCHED_VECTOR.

So with this patch, the error remains:
examples/l3fwd/l3fwd_neon.h:113:6: error:
implicit declaration of function ‘vaddvq_u16’
  v = vaddvq_u16(dp1);
      ^~~~~~~~~~

We need to include rte_vect.h.
  
Thomas Monjalon July 10, 2017, 7:32 a.m. UTC | #5
10/07/2017 09:28, Thomas Monjalon:
> 10/07/2017 05:51, Jianbo Liu:
> > On 9 July 2017 at 01:08, Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 07/07/2017 18:26, Jerin Jacob:
> > >> vaddvq_u16() is not available for armv7.
> > >> Emulate the vaddvq_u16() using armv7 NEON intrinsics.
> > >
> > > After implementing this function, another missing function appears:
> > >
> > >         lib/librte_sched/rte_sched.c:1747:7: error:
> > >         implicit declaration of function ‘vminvq_u32’
> > 
> > But sched_vector is disabled in defconfig_arm-armv7a-linuxapp-gcc:
> >     CONFIG_RTE_SCHED_VECTOR=n
> 
> Yes, I really need to fix test-build.sh which is enabling SCHED_VECTOR.
> 
> So with this patch, the error remains:
> examples/l3fwd/l3fwd_neon.h:113:6: error:
> implicit declaration of function ‘vaddvq_u16’
>   v = vaddvq_u16(dp1);
>       ^~~~~~~~~~
> 
> We need to include rte_vect.h.

Forget it, I mixed up my branches :/

So the patch is OK, compilation is fixed.
  
Thomas Monjalon July 10, 2017, 7:35 a.m. UTC | #6
10/07/2017 05:34, Jianbo Liu:
> On 8 July 2017 at 00:26, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> > vaddvq_u16() is not available for armv7.
> > Emulate the vaddvq_u16() using armv7 NEON intrinsics.
> >
> > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> 
> Acked-by: Jianbo Liu <jianbo.liu@linaro.org>

Applied, thanks
  

Patch

diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
index 0670ca2ee..69fd428f3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -77,6 +77,17 @@  vqtbl1q_u8(uint8x16_t a, uint8x16_t b)
 
 	return vld1q_u8(rte_ret.u8);
 }
+
+static inline uint16_t
+vaddvq_u16(uint16x8_t a)
+{
+	uint32x4_t m = vpaddlq_u16(a);
+	uint64x2_t n = vpaddlq_u32(m);
+	uint64x1_t o = vget_low_u64(n) + vget_high_u64(n);
+
+	return vget_lane_u32((uint32x2_t)o, 0);
+}
+
 #endif
 
 #if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 70000)
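
For readers without an ARM toolchain at hand, the reduction in the patch can be followed with a plain-C model. This is only a sketch: the model_* names below are hypothetical and are not part of DPDK or of arm_neon.h, and the arrays stand in for the NEON vector registers. It assumes the usual behaviour of the intrinsics used in the patch, where vpaddlq_u16() and vpaddlq_u32() add adjacent lanes into lanes of twice the width, and it assumes the final result wraps modulo 2^16 because the function returns uint16_t.

```c
#include <stdint.h>

/* Scalar stand-in for vpaddlq_u16(): add adjacent 16-bit lanes
 * into four 32-bit lanes (hypothetical helper, for illustration). */
static void
model_vpaddlq_u16(const uint16_t in[8], uint32_t out[4])
{
	int i;

	for (i = 0; i < 4; i++)
		out[i] = (uint32_t)in[2 * i] + in[2 * i + 1];
}

/* Scalar stand-in for vpaddlq_u32(): add adjacent 32-bit lanes
 * into two 64-bit lanes (hypothetical helper, for illustration). */
static void
model_vpaddlq_u32(const uint32_t in[4], uint64_t out[2])
{
	int i;

	for (i = 0; i < 2; i++)
		out[i] = (uint64_t)in[2 * i] + in[2 * i + 1];
}

/* Mirrors the three steps of the patch: two widening pairwise adds,
 * then low half + high half, then truncation to the return type. */
static uint16_t
model_vaddvq_u16(const uint16_t a[8])
{
	uint32_t m[4];
	uint64_t n[2];
	uint64_t o;

	model_vpaddlq_u16(a, m);
	model_vpaddlq_u32(m, n);
	o = n[0] + n[1];

	/* The uint16_t return type keeps only the low 16 bits. */
	return (uint16_t)o;
}
```

Calling model_vaddvq_u16() on the lanes {1, 2, 3, 4, 5, 6, 7, 8} gives 36, and a sum that exceeds 16 bits wraps around, as the uint16_t return type of the patched function implies.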