[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Dec 3 13:42:13 CET 2015



> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Thursday, December 03, 2015 12:17 PM
> To: Ananyev, Konstantin
> Cc: Thomas Monjalon; dev at dpdk.org; viktorin at rehivetech.com
> Subject: Re: [dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs
> 
> On Thu, Dec 03, 2015 at 11:02:07AM +0000, Ananyev, Konstantin wrote:
> 
> Hi Konstantin,
> 
> > Hi Jerin,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Thursday, December 03, 2015 9:34 AM
> > > To: Thomas Monjalon
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs
> > >
> > > On Wed, Dec 02, 2015 at 05:57:10PM +0100, Thomas Monjalon wrote:
> > > > 2015-12-02 22:23, Jerin Jacob:
> > > > > On Wed, Dec 02, 2015 at 05:40:13PM +0100, Thomas Monjalon wrote:
> > > > > > 2015-12-02 20:04, Jerin Jacob:
> > > > > > > On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> > > > > > > > On 2 December 2015 at 18:39, Jerin Jacob <jerin.jacob at caviumnetworks.com> wrote:
> > > > > > > > > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > > > > > > > > that lead to multiple definition and its not good.
> > > > > > > > >
> > > > > > > > But you will have similar issue since "typedef int32x4_t __m128i"
> > > > > > > > appears in both your patch and this header file.
> > > > > > >
> > > > > > > I just tested it, it won't break, back to back "typedef int32x4_t __m128i"
> > > > > > > is fine(unlike inline function).
> > > > > > >
> > > > > > > my intention to keep __m128i "as is"  because changing the __m128i to rte_???
> > > > > > > something would break the ABI.
> > > > > >
> > > > > > Isn't it already broken in 2.2?
> > > > >
> > > > > Does it mean, You would like to have rte_128i(or similar) kind of
> > > > > abstraction to represent 128bit SIMD variable in DPDK?
> > > >
> > > > If you are convinced that it is the best way to write a generic code, yes.
> > >
> > > I grep-ed through DPDK API list to see the dependency with SIMD in API
> > > definition.I see only rte_lpm_lookupx4 API has SIMD dependency in API
> > > definition.
> > >
> > > I believe that's the root cause of the problem. IMO, The
> > > better way to fix this would be to remove __m128i from API and have more
> > > general representation to remove the architecture dependency from API
> > >
> > > something like this,
> > >
> > > rte_lpm_lookupx4(const struct rte_lpm *lpm, uint32_t *ip, uint16_t
> > > hop[4], uint16_t defv)
> > >
> > > instead of
> > >
> > > rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t
> > > hop[4],  uint16_t defv)
> >
> > The idea for that function was that rte_lpm_lookupx4() accepts 4 IPv4 addresses that are:
> > 1. already in 128bit register
> > 2. 'prepared' - byte swap is already done for them if needed.
> >
> > About ways to fix  __m128i dependency: as I can see x86 and arm DPDK code
> > already has xmm_t typedef:
> >
> > $ find lib -type f | xargs grep xmm_t | grep typedef
> > lib/librte_eal/common/include/arch/x86/rte_vect.h:typedef __m128i xmm_t;
> > lib/librte_eal/common/include/arch/arm/rte_vect.h:typedef int32x4_t xmm_t;
> >
> > Why not to  change rte_lpm_lookupx4() to accept xmm_t as input parameter.
> > As I understand it would solve the problem, and wouldn't introduce any API/ABI breakage, right?
> 
> Yes, If we have API/ABI breakage concerns.
> 
> IMO, Now this would call for some kind of rte_vect_* abstraction load,
> store, set kind of SIMD operation on xmm_t in common test code to
> aviod #ifdef's in app/test/test_lpm.c

Yes, seems so.

> 
> I guess we may not need those abstractions in
> lib/librte_eal/common/include/arch/ directory.
> keeping in app/test/xmmt_ops.h should be enough, right?

That sounds ok to me.
At least for now.
For future the more generic question - do we like to have some
generic layer abstraction for similar vector instrincts across different archs?
>From one side it might help people writing/using vector implementation of some stuff,
from other side - there would be extra hassle creating/supporting it.    

Konstantin

> 
> 
> >
> > Konstantin
> >
> > >
> > > Now I am not sure why this API was created like this, from l3fwd.c
> > > example, it looks to accommodate the IPV4 byte swap[1]. If it's true,
> > > maybe we can have eal byte swap abstraction for optimized byte swap on
> > > memory for 4 IP address in one shot
> > >
> > > or
> > >
> > > Have rte_lpm_lookupx4 take an argument for byte swap or not ?
> > >
> > > or
> > >
> > > something similar?
> > >
> > > Thoughts ?
> > >
> > > [1]
> > > const  __m128i bswap_mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
> > >                                                 4, 5, 6, 7, 0, 1, 2, 3);
> > > /* Byte swap 4 IPV4 addresses. */
> > > dip = _mm_shuffle_epi8(dip, bswap_mask);
> > >
> > > Jerin
> > >
> > > > I think the most important question is to know what is the best solution
> > > > for performance and maintainability. The API/ABI questions will be considered
> > > > after.
> > > >
> > > > Thanks for your involvement guys.


More information about the dev mailing list