[dpdk-dev] [PATCH v2 1/2] Add RIB library

Bruce Richardson bruce.richardson at intel.com
Thu Mar 29 22:41:49 CEST 2018
Previous message: [dpdk-dev] [PATCH v2 1/2] Add RIB library
Next message: [dpdk-dev] [PATCH v3 1/2] Add RIB library
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Thu, Mar 29, 2018 at 11:11:20PM +0300, Vladimir Medvedkin wrote:
> 2018-03-29 13:27 GMT+03:00 Bruce Richardson <bruce.richardson at intel.com>:
> 
> > On Wed, Feb 21, 2018 at 09:44:54PM +0000, Medvedkin Vladimir wrote:
> > > RIB is an alternative to current LPM library.
<snip>
> > > +#define LOOKUP_FUNC(suffix, type, bulk_prefetch)                     \
> > > +int rte_dir24_8_lookup_bulk_##suffix(void *fib_p, const uint32_t
> > *ips,       \
> > > +     uint64_t *next_hops, const unsigned n)                          \
> > > +{                                                                    \
> > > +     struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;  \
> > > +     uint64_t tmp;                                                   \
> > > +     uint32_t i;                                                     \
> > > +     uint32_t prefetch_offset = RTE_MIN((unsigned)bulk_prefetch, n); \
> > > +                                                                     \
> > > +     RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ips == NULL) ||       \
> > > +             (next_hops == NULL)), -EINVAL);                         \
> > > +                                                                     \
> > > +     for (i = 0; i < prefetch_offset; i++)                           \
> > > +             rte_prefetch0(get_tbl24_p(fib, ips[i]));                \
> > > +     for (i = 0; i < (n - prefetch_offset); i++) {                   \
> > > +             rte_prefetch0(get_tbl24_p(fib, ips[i + prefetch_offset]));
> > \
> > > +             tmp = ((type *)fib->tbl24)[ips[i] >> 8];                \
> > > +             if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) ==       \
> > > +                     RTE_DIR24_8_VALID_EXT_ENT)) {                   \
> > > +                     tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] +     \
> > > +                             ((tmp >> 1) *
> > RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> > > +             }                                                       \
> > > +             next_hops[i] = tmp >> 1;                                \
> > > +     }                                                               \
> > > +     for (; i < n; i++) {                                            \
> > > +             tmp = ((type *)fib->tbl24)[ips[i] >> 8];                \
> > > +             if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) ==       \
> > > +                     RTE_DIR24_8_VALID_EXT_ENT)) {                   \
> > > +                     tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] +     \
> > > +                             ((tmp >> 1) *
> > RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> > > +             }                                                       \
> > > +             next_hops[i] = tmp >> 1;                                \
> > > +     }                                                               \
> > > +     return 0;                                                       \
> > > +}                                                                    \
> >
> > What is the advantage of doing this as a macro? Unless I'm missing
> > something "suffix" is never actually used in the function at all, and you
> > reference the size of the data from fix->nh_sz. Therefore there can be no
> > performance benefit from having such a lookup function, that I can see.
> >
> suffix is to create 4 different function names, look at the end of
> rte_dir24_8.c, there are
> LOOKUP_FUNC(1b, uint8_t, 5)
> LOOKUP_FUNC(2b, uint16_t, 6)
> LOOKUP_FUNC(4b, uint32_t, 15)
> LOOKUP_FUNC(8b, uint64_t, 12)
> 
> Now I made single lookup function that references the size of the data from
> fib->nh_sz instead of static casting to passed type in macro.
> was:
> BULK RIB Lookup: 24.2 cycles (fails = 0.0%)
> become:
> BULK RIB Lookup: 26.1 cycles (fails = 0.0%)
> proc E3-1230v1 at 3.6Ghz
> 
> 
So you are saying that it turned out to be faster to do a lookup of the
size rather than hardcoding it. Seems strange, but ok.
I'm still confused why the four functions with four different names. What
is different between the four implementations. Just the amount of
prefetching done? It could still be done by a single call with a
compile-time constant parameter. It's whats used a lot in the ring library
and works well there.

> >
> > Therefore, if performance is ok, I suggest just making a single lookup_bulk
> > function that works with all sizes - as the inlined lookup function does in
> > the header.
> >
> > Alternatively, if you do want specific functions for each
> > entry size, you still don't need macros. Write a single function that takes
> > as a final parameter the entry-size and use that in calculations rather
> > than nh_sz.  Then wrap that function in the set of public ones, passing in
> > the final size parameter explicitly as "1", "2", "4" or "8". The compiler
> > will then know that as a compile-time constant and generate the correct
> > code for each size. However, for this path I suggest you check for any
> > resulting performance improvement, e.g. with l3fwd, as I think it's not
> > likely to be significant.
> >
> > > +
<snip> 
> >
> > > +{
> > > +     uint64_t res;
> > > +     struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> > > +
> > > +     RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ip == NULL) ||
> > > +             (next_hop == NULL)), -EINVAL);
> > > +
> > > +     res = RTE_DIR24_8_GET_TBL24(fib, ip);
> > > +     if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
> > > +             RTE_DIR24_8_VALID_EXT_ENT)) {
> > > +             res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
> > > +     }
> > > +     *next_hop = res >> 1;
> > > +     return 0;
> > > +}
> >
> > Do we need this static inline function? Can the bulk functions do on their
> > own? If we can remove this, we can move the most of the header file
> > contents, especially the structures, out of the public header. That would
> > greatly improve the ease with which ABI can be maintained.
> >
> It was done in some manner to LPM. There was separate single lookup and
> bulk versions.
> Of course it is possible to remove this function at all and use bulk
> version to lookup single packet. But I thought maybe somebody could use it.
> 

Yes, I understand that. However, if it's unlikely to be used, I would
prioritize having ABI-friendliness over having it.

/Bruce
Previous message: [dpdk-dev] [PATCH v2 1/2] Add RIB library
Next message: [dpdk-dev] [PATCH v3 1/2] Add RIB library
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list