[dpdk-dev] [PATCH v2 1/2] Add RIB library

Vladimir Medvedkin medvedkinv at gmail.com
Thu Mar 29 22:11:20 CEST 2018


2018-03-29 13:27 GMT+03:00 Bruce Richardson <bruce.richardson at intel.com>:

> On Wed, Feb 21, 2018 at 09:44:54PM +0000, Medvedkin Vladimir wrote:
> > RIB is an alternative to current LPM library.
> > It solves the following problems
> >  - Increases the speed of control plane operations against lpm such as
> >    adding/deleting routes
> >  - Adds abstraction from dataplane algorithms, so it is possible to add
> >    different ip route lookup algorythms such as DXR/poptrie/lpc-trie/etc
> >    in addition to current dir24_8
> >  - It is possible to keep user defined application specific additional
> >    information in struct rte_rib_node which represents route entry.
> >    It can be next hop/set of next hops (i.e. active and feasible),
> >    pointers to link rte_rib_node based on some criteria (i.e. next_hop),
> >    plenty of additional control plane information.
> >  - For dir24_8 implementation it is possible to remove
> rte_lpm_tbl_entry.depth
> >    field that helps to save 6 bits.
> >  - Also new dir24_8 implementation supports different next_hop sizes
> >    (1/2/4/8 bytes per next hop)
> >  - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eleminate ternary
> operator.
> >    Instead it returns special default value if there is no route.
> >
> > Signed-off-by: Medvedkin Vladimir <medvedkinv at gmail.com>
> > ---
>
> Hi again,
>
> some initial comments on the dir24_8 files below.
>
> /Bruce
>
> >  config/common_base                 |   6 +
> >  doc/api/doxy-api.conf              |   1 +
> >  lib/Makefile                       |   2 +
> >  lib/librte_rib/Makefile            |  22 ++
> >  lib/librte_rib/rte_dir24_8.c       | 482 ++++++++++++++++++++++++++++++
> +++
> >  lib/librte_rib/rte_dir24_8.h       | 116 ++++++++
> >  lib/librte_rib/rte_rib.c           | 526 ++++++++++++++++++++++++++++++
> +++++++
> >  lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
> >  lib/librte_rib/rte_rib_version.map |  18 ++
> >  mk/rte.app.mk                      |   1 +
> >  10 files changed, 1496 insertions(+)
> >  create mode 100644 lib/librte_rib/Makefile
> >  create mode 100644 lib/librte_rib/rte_dir24_8.c
> >  create mode 100644 lib/librte_rib/rte_dir24_8.h
> >  create mode 100644 lib/librte_rib/rte_rib.c
> >  create mode 100644 lib/librte_rib/rte_rib.h
> >  create mode 100644 lib/librte_rib/rte_rib_version.map
> >
>
> <snip>
>
> > diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c
> > new file mode 100644
> > index 0000000..a12f882
> > --- /dev/null
> > +++ b/lib/librte_rib/rte_dir24_8.c
> > @@ -0,0 +1,482 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv at gmail.com>
> > + */
> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <rte_debug.h>
> > +#include <rte_malloc.h>
> > +#include <rte_prefetch.h>
> > +#include <rte_errno.h>
> > +
> > +#include <inttypes.h>
> > +
> > +#include <rte_memory.h>
> > +#include <rte_branch_prediction.h>
> > +
> > +#include <rte_rib.h>
> > +#include <rte_dir24_8.h>
> > +
> > +#define BITMAP_SLAB_BIT_SIZE_LOG2    6
> > +#define BITMAP_SLAB_BIT_SIZE         (1 << BITMAP_SLAB_BIT_SIZE_LOG2)
> > +#define BITMAP_SLAB_BITMASK          (BITMAP_SLAB_BIT_SIZE - 1)
> > +
> > +#define ROUNDUP(x, y)         RTE_ALIGN_CEIL(x, (1 << (32 - y)))
> > +
> > +static __rte_always_inline __attribute__((pure)) void *
> > +get_tbl24_p(struct rte_dir24_8_tbl *fib, uint32_t ip)
> > +{
> > +     return (void *)&((uint8_t *)fib->tbl24)[(ip &
> > +             RTE_DIR24_8_TBL24_MASK) >> (8 - fib->nh_sz)];
> > +}
> > +
> > +#define LOOKUP_FUNC(suffix, type, bulk_prefetch)                     \
> > +int rte_dir24_8_lookup_bulk_##suffix(void *fib_p, const uint32_t
> *ips,       \
> > +     uint64_t *next_hops, const unsigned n)                          \
> > +{                                                                    \
> > +     struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;  \
> > +     uint64_t tmp;                                                   \
> > +     uint32_t i;                                                     \
> > +     uint32_t prefetch_offset = RTE_MIN((unsigned)bulk_prefetch, n); \
> > +                                                                     \
> > +     RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ips == NULL) ||       \
> > +             (next_hops == NULL)), -EINVAL);                         \
> > +                                                                     \
> > +     for (i = 0; i < prefetch_offset; i++)                           \
> > +             rte_prefetch0(get_tbl24_p(fib, ips[i]));                \
> > +     for (i = 0; i < (n - prefetch_offset); i++) {                   \
> > +             rte_prefetch0(get_tbl24_p(fib, ips[i + prefetch_offset]));
> \
> > +             tmp = ((type *)fib->tbl24)[ips[i] >> 8];                \
> > +             if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) ==       \
> > +                     RTE_DIR24_8_VALID_EXT_ENT)) {                   \
> > +                     tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] +     \
> > +                             ((tmp >> 1) *
> RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> > +             }                                                       \
> > +             next_hops[i] = tmp >> 1;                                \
> > +     }                                                               \
> > +     for (; i < n; i++) {                                            \
> > +             tmp = ((type *)fib->tbl24)[ips[i] >> 8];                \
> > +             if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) ==       \
> > +                     RTE_DIR24_8_VALID_EXT_ENT)) {                   \
> > +                     tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] +     \
> > +                             ((tmp >> 1) *
> RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
> > +             }                                                       \
> > +             next_hops[i] = tmp >> 1;                                \
> > +     }                                                               \
> > +     return 0;                                                       \
> > +}                                                                    \
>
> What is the advantage of doing this as a macro? Unless I'm missing
> something "suffix" is never actually used in the function at all, and you
> reference the size of the data from fix->nh_sz. Therefore there can be no
> performance benefit from having such a lookup function, that I can see.
>
suffix is to create 4 different function names, look at the end of
rte_dir24_8.c, there are
LOOKUP_FUNC(1b, uint8_t, 5)
LOOKUP_FUNC(2b, uint16_t, 6)
LOOKUP_FUNC(4b, uint32_t, 15)
LOOKUP_FUNC(8b, uint64_t, 12)

Now I made single lookup function that references the size of the data from
fib->nh_sz instead of static casting to passed type in macro.
was:
BULK RIB Lookup: 24.2 cycles (fails = 0.0%)
become:
BULK RIB Lookup: 26.1 cycles (fails = 0.0%)
proc E3-1230v1 at 3.6Ghz


>
> Therefore, if performance is ok, I suggest just making a single lookup_bulk
> function that works with all sizes - as the inlined lookup function does in
> the header.
>
> Alternatively, if you do want specific functions for each
> entry size, you still don't need macros. Write a single function that takes
> as a final parameter the entry-size and use that in calculations rather
> than nh_sz.  Then wrap that function in the set of public ones, passing in
> the final size parameter explicitly as "1", "2", "4" or "8". The compiler
> will then know that as a compile-time constant and generate the correct
> code for each size. However, for this path I suggest you check for any
> resulting performance improvement, e.g. with l3fwd, as I think it's not
> likely to be significant.
>
> > +
> > +static void
> > +write_to_fib(void *ptr, uint64_t val, enum rte_dir24_8_nh_sz size, int
> n)
> > +{
> > +     int i;
> > +     uint8_t *ptr8 = (uint8_t *)ptr;
> > +     uint16_t *ptr16 = (uint16_t *)ptr;
> > +     uint32_t *ptr32 = (uint32_t *)ptr;
> > +     uint64_t *ptr64 = (uint64_t *)ptr;
> > +
> > +     switch (size) {
> > +     case RTE_DIR24_8_1B:
> > +             for (i = 0; i < n; i++)
> > +                     ptr8[i] = (uint8_t)val;
> > +             break;
> > +     case RTE_DIR24_8_2B:
> > +             for (i = 0; i < n; i++)
> > +                     ptr16[i] = (uint16_t)val;
> > +             break;
> > +     case RTE_DIR24_8_4B:
> > +             for (i = 0; i < n; i++)
> > +                     ptr32[i] = (uint32_t)val;
> > +             break;
> > +     case RTE_DIR24_8_8B:
> > +             for (i = 0; i < n; i++)
> > +                     ptr64[i] = (uint64_t)val;
> > +             break;
> > +     }
> > +}
> > +
> > +static int
> > +tbl8_get_idx(struct rte_dir24_8_tbl *fib)
> > +{
> > +     uint32_t i;
> > +     int bit_idx;
> > +
> > +     for (i = 0; (i < (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2))
> &&
> > +             (fib->tbl8_idxes[i] == UINT64_MAX); i++)
> > +             ;
> > +     if (i <= (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) {
> > +             bit_idx = __builtin_ctzll(~fib->tbl8_idxes[i]);
> > +             fib->tbl8_idxes[i] |= (1ULL << bit_idx);
> > +             return (i << BITMAP_SLAB_BIT_SIZE_LOG2) + bit_idx;
> > +     }
> > +     return -ENOSPC;
> > +}
> > +
> > +static inline void
> > +tbl8_free_idx(struct rte_dir24_8_tbl *fib, int idx)
> > +{
> > +     fib->tbl8_idxes[idx >> BITMAP_SLAB_BIT_SIZE_LOG2] &=
> > +             ~(1ULL << (idx & BITMAP_SLAB_BITMASK));
> > +}
> > +
> > +static int
> > +tbl8_alloc(struct rte_dir24_8_tbl *fib, uint64_t nh)
> > +{
> > +     int     tbl8_idx;
> > +     uint8_t *tbl8_ptr;
> > +
> > +     tbl8_idx = tbl8_get_idx(fib);
> > +     if (tbl8_idx < 0)
> > +             return tbl8_idx;
> > +     tbl8_ptr = (uint8_t *)fib->tbl8 +
> > +             ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
> > +             fib->nh_sz);
> > +     /*Init tbl8 entries with nexthop from tbl24*/
> > +     write_to_fib((void *)tbl8_ptr, nh|
> > +             RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz,
> > +             RTE_DIR24_8_TBL8_GRP_NUM_ENT);
> > +     return tbl8_idx;
> > +}
> > +
> > +static void
> > +tbl8_recycle(struct rte_dir24_8_tbl *fib, uint32_t ip, uint64_t
> tbl8_idx)
> > +{
> > +     int i;
> > +     uint64_t nh;
> > +     uint8_t *ptr8;
> > +     uint16_t *ptr16;
> > +     uint32_t *ptr32;
> > +     uint64_t *ptr64;
> > +
> > +     switch (fib->nh_sz) {
> > +     case RTE_DIR24_8_1B:
> > +             ptr8 = &((uint8_t *)fib->tbl8)[tbl8_idx *
> > +                             RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > +             nh = *ptr8;
> > +             for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > +                     if (nh != ptr8[i])
> > +                             return;
> > +             }
> > +             ((uint8_t *)fib->tbl24)[ip >> 8] =
> > +                     nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > +             for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > +                     ptr8[i] = 0;
> > +             break;
> > +     case RTE_DIR24_8_2B:
> > +             ptr16 = &((uint16_t *)fib->tbl8)[tbl8_idx *
> > +                             RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > +             nh = *ptr16;
> > +             for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > +                     if (nh != ptr16[i])
> > +                             return;
> > +             }
> > +             ((uint16_t *)fib->tbl24)[ip >> 8] =
> > +                     nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > +             for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > +                     ptr16[i] = 0;
> > +             break;
> > +     case RTE_DIR24_8_4B:
> > +             ptr32 = &((uint32_t *)fib->tbl8)[tbl8_idx *
> > +                             RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > +             nh = *ptr32;
> > +             for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > +                     if (nh != ptr32[i])
> > +                             return;
> > +             }
> > +             ((uint32_t *)fib->tbl24)[ip >> 8] =
> > +                     nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > +             for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > +                     ptr32[i] = 0;
> > +             break;
> > +     case RTE_DIR24_8_8B:
> > +             ptr64 = &((uint64_t *)fib->tbl8)[tbl8_idx *
> > +                             RTE_DIR24_8_TBL8_GRP_NUM_ENT];
> > +             nh = *ptr64;
> > +             for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
> > +                     if (nh != ptr64[i])
> > +                             return;
> > +             }
> > +             ((uint64_t *)fib->tbl24)[ip >> 8] =
> > +                     nh & ~RTE_DIR24_8_VALID_EXT_ENT;
> > +             for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
> > +                     ptr64[i] = 0;
> > +             break;
> > +     }
> > +     tbl8_free_idx(fib, tbl8_idx);
> > +}
> > +
> > +static int
> > +install_to_fib(struct rte_dir24_8_tbl *fib, uint32_t ledge, uint32_t
> redge,
> > +     uint64_t next_hop)
> > +{
> > +     uint64_t        tbl24_tmp;
> > +     int     tbl8_idx;
> > +     int tmp_tbl8_idx;
> > +     uint8_t *tbl8_ptr;
> > +
> > +     /*case for 0.0.0.0/0*/
> > +     if (unlikely((ledge == 0) && (redge == 0))) {
> > +             write_to_fib(fib->tbl24, next_hop << 1, fib->nh_sz, 1 <<
> 24);
> > +             return 0;
> > +     }
> > +     if (ROUNDUP(ledge, 24) <= redge) {
> > +             if (ledge < ROUNDUP(ledge, 24)) {
> > +                     tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
> > +                     if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
> > +                             RTE_DIR24_8_VALID_EXT_ENT) {
> > +                             tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
> > +                             tmp_tbl8_idx = tbl8_get_idx(fib);
> > +                             if ((tbl8_idx < 0) || (tmp_tbl8_idx < 0))
> > +                                     return -ENOSPC;
> > +                             tbl8_free_idx(fib, tmp_tbl8_idx);
> > +                             /*update dir24 entry with tbl8 index*/
> > +                             write_to_fib(get_tbl24_p(fib, ledge),
> > +                                     (tbl8_idx << 1)|
> > +                                     RTE_DIR24_8_VALID_EXT_ENT,
> > +                                     fib->nh_sz, 1);
> > +                     } else
> > +                             tbl8_idx = tbl24_tmp >> 1;
> > +                     tbl8_ptr = (uint8_t *)fib->tbl8 +
> > +                             (((tbl8_idx *
> RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
> > +                             (ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
> > +                             fib->nh_sz);
> > +                     /*update tbl8 with new next hop*/
> > +                     write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
> > +                             RTE_DIR24_8_VALID_EXT_ENT,
> > +                             fib->nh_sz, ROUNDUP(ledge, 24) - ledge);
> > +                     tbl8_recycle(fib, ledge, tbl8_idx);
> > +             }
> > +             if (ROUNDUP(ledge, 24) < (redge & RTE_DIR24_8_TBL24_MASK))
> {
> > +                     write_to_fib(get_tbl24_p(fib, ROUNDUP(ledge, 24)),
> > +                             next_hop << 1, fib->nh_sz,
> > +                             ((redge & RTE_DIR24_8_TBL24_MASK) -
> > +                             ROUNDUP(ledge, 24)) >> 8);
> > +             }
> > +             if (redge & ~RTE_DIR24_8_TBL24_MASK) {
> > +                     tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, redge);
> > +                     if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
> > +                                     RTE_DIR24_8_VALID_EXT_ENT) {
> > +                             tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
> > +                             if (tbl8_idx < 0)
> > +                                     return -ENOSPC;
> > +                             /*update dir24 entry with tbl8 index*/
> > +                             write_to_fib(get_tbl24_p(fib, redge),
> > +                                     (tbl8_idx << 1)|
> > +                                     RTE_DIR24_8_VALID_EXT_ENT,
> > +                                     fib->nh_sz, 1);
> > +                     } else
> > +                             tbl8_idx = tbl24_tmp >> 1;
> > +                     tbl8_ptr = (uint8_t *)fib->tbl8 +
> > +                             ((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT)
> <<
> > +                             fib->nh_sz);
> > +                     /*update tbl8 with new next hop*/
> > +                     write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
> > +                             RTE_DIR24_8_VALID_EXT_ENT,
> > +                             fib->nh_sz, redge &
> ~RTE_DIR24_8_TBL24_MASK);
> > +                     tbl8_recycle(fib, redge, tbl8_idx);
> > +             }
> > +     } else {
> > +             tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
> > +             if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
> > +                     RTE_DIR24_8_VALID_EXT_ENT) {
> > +                     tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
> > +                     if (tbl8_idx < 0)
> > +                             return -ENOSPC;
> > +                     /*update dir24 entry with tbl8 index*/
> > +                     write_to_fib(get_tbl24_p(fib, ledge),
> > +                             (tbl8_idx << 1)|
> > +                             RTE_DIR24_8_VALID_EXT_ENT,
> > +                             fib->nh_sz, 1);
> > +             } else
> > +                     tbl8_idx = tbl24_tmp >> 1;
> > +             tbl8_ptr = (uint8_t *)fib->tbl8 +
> > +                     (((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
> > +                     (ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
> > +                     fib->nh_sz);
> > +             /*update tbl8 with new next hop*/
> > +             write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
> > +                     RTE_DIR24_8_VALID_EXT_ENT,
> > +                     fib->nh_sz, redge - ledge);
> > +             tbl8_recycle(fib, ledge, tbl8_idx);
> > +     }
> > +     return 0;
> > +}
> > +
> > +static int
> > +modify_fib(struct rte_rib *rib, uint32_t ip, uint8_t depth,
> > +     uint64_t next_hop)
> > +{
> > +     struct rte_rib_node *tmp = NULL;
> > +     struct rte_dir24_8_tbl *fib;
> > +     uint32_t ledge, redge;
> > +     int ret;
> > +
> > +     fib = rib->fib;
> > +
> > +     if (next_hop > DIR24_8_MAX_NH(fib))
> > +             return -EINVAL;
> > +
> > +     ledge = ip;
> > +     do {
> > +             tmp = rte_rib_tree_get_nxt(rib, ip, depth, tmp,
> > +                     RTE_RIB_GET_NXT_COVER);
> > +             if (tmp != NULL) {
> > +                     if (tmp->depth == depth)
> > +                             continue;
> > +                     redge = tmp->key;
> > +                     if (ledge == redge) {
> > +                             ledge = redge +
> > +                                     (uint32_t)(1ULL << (32 -
> tmp->depth));
> > +                             continue;
> > +                     }
> > +                     ret = install_to_fib(fib, ledge, redge,
> > +                             next_hop);
> > +                     if (ret != 0)
> > +                             return ret;
> > +                     ledge = redge +
> > +                             (uint32_t)(1ULL << (32 - tmp->depth));
> > +             } else {
> > +                     redge = ip + (uint32_t)(1ULL << (32 - depth));
> > +                     ret = install_to_fib(fib, ledge, redge,
> > +                             next_hop);
> > +                     if (ret != 0)
> > +                             return ret;
> > +             }
> > +     } while (tmp);
> > +
> > +     return 0;
> > +}
> > +
> > +int
> > +rte_dir24_8_modify(struct rte_rib *rib, uint32_t ip, uint8_t depth,
> > +     uint64_t next_hop, enum rte_rib_op op)
> > +{
> > +     struct rte_dir24_8_tbl *fib;
> > +     struct rte_rib_node *tmp = NULL;
> > +     struct rte_rib_node *node;
> > +     struct rte_rib_node *parent;
> > +     int ret = 0;
> > +
> > +     if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
> > +             return -EINVAL;
> > +
> > +     fib = rib->fib;
> > +     RTE_ASSERT(fib);
> > +
> > +     ip &= (uint32_t)(UINT64_MAX << (32 - depth));
> > +
> > +     node = rte_rib_tree_lookup_exact(rib, ip, depth);
> > +     switch (op) {
> > +     case RTE_RIB_ADD:
> > +             if (node != NULL) {
> > +                     if (node->nh == next_hop)
> > +                             return 0;
> > +                     ret = modify_fib(rib, ip, depth, next_hop);
> > +                     if (ret == 0)
> > +                             node->nh = next_hop;
> > +                     return 0;
> > +             }
> > +             if (depth > 24) {
> > +                     tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL,
> > +                             RTE_RIB_GET_NXT_COVER);
> > +                     if ((tmp == NULL) &&
> > +                             (fib->cur_tbl8s >= fib->number_tbl8s))
> > +                             return -ENOSPC;
> > +
> > +             }
> > +             node = rte_rib_tree_insert(rib, ip, depth);
> > +             if (node == NULL)
> > +                     return -rte_errno;
> > +             node->nh = next_hop;
> > +             parent = rte_rib_tree_lookup_parent(node);
> > +             if ((parent != NULL) && (parent->nh == next_hop))
> > +                     return 0;
> > +             ret = modify_fib(rib, ip, depth, next_hop);
> > +             if (ret) {
> > +                     rte_rib_tree_remove(rib, ip, depth);
> > +                     return ret;
> > +             }
> > +             if ((depth > 24) && (tmp == NULL))
> > +                     fib->cur_tbl8s++;
> > +             return 0;
> > +     case RTE_RIB_DEL:
> > +             if (node == NULL)
> > +                     return -ENOENT;
> > +
> > +             parent = rte_rib_tree_lookup_parent(node);
> > +             if (parent != NULL) {
> > +                     if (parent->nh != node->nh)
> > +                             ret = modify_fib(rib, ip, depth,
> parent->nh);
> > +             } else
> > +                     ret = modify_fib(rib, ip, depth, fib->def_nh);
> > +             if (ret == 0) {
> > +                     rte_rib_tree_remove(rib, ip, depth);
> > +                     if (depth > 24) {
> > +                             tmp = rte_rib_tree_get_nxt(rib, ip, 24,
> NULL,
> > +                                     RTE_RIB_GET_NXT_COVER);
> > +                             if (tmp == NULL)
> > +                                     fib->cur_tbl8s--;
> > +                     }
> > +             }
> > +             return ret;
> > +     default:
> > +             break;
> > +     }
> > +     return -EINVAL;
> > +}
> > +
> > +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int
> socket_id,
> > +     enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh)
> > +{
> > +     char mem_name[RTE_RIB_NAMESIZE];
> > +     struct rte_dir24_8_tbl *fib;
> > +
> > +     snprintf(mem_name, sizeof(mem_name), "FIB_%s", name);
> > +     fib = rte_zmalloc_socket(name, sizeof(struct rte_dir24_8_tbl) +
> > +             RTE_DIR24_8_TBL24_NUM_ENT * (1 << nh_sz),
> RTE_CACHE_LINE_SIZE,
> > +             socket_id);
> > +     if (fib == NULL)
> > +             return fib;
> > +
> > +     snprintf(mem_name, sizeof(mem_name), "TBL8_%s", name);
> > +     fib->tbl8 = rte_zmalloc_socket(mem_name,
> RTE_DIR24_8_TBL8_GRP_NUM_ENT *
> > +                     (1 << nh_sz) * RTE_DIR24_8_TBL8_NUM_GROUPS,
> > +                     RTE_CACHE_LINE_SIZE, socket_id);
> > +     if (fib->tbl8 == NULL) {
> > +             rte_free(fib);
> > +             return NULL;
> > +     }
> > +     fib->def_nh = def_nh;
> > +     fib->nh_sz = nh_sz;
> > +     fib->number_tbl8s = RTE_MIN((uint32_t)RTE_DIR24_8_TBL8_NUM_GROUPS,
> > +                             DIR24_8_MAX_NH(fib));
> > +
> > +     snprintf(mem_name, sizeof(mem_name), "TBL8_idxes_%s", name);
> > +     fib->tbl8_idxes = rte_zmalloc_socket(mem_name,
> > +                     RTE_ALIGN_CEIL(fib->number_tbl8s, 64) >> 3,
> > +                     RTE_CACHE_LINE_SIZE, socket_id);
> > +     if (fib->tbl8_idxes == NULL) {
> > +             rte_free(fib->tbl8);
> > +             rte_free(fib);
> > +             return NULL;
> > +     }
> > +
> > +     return fib;
> > +}
> > +
> > +void
> > +rte_dir24_8_free(void *fib_p)
> > +{
> > +     struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> > +
> > +     rte_free(fib->tbl8_idxes);
> > +     rte_free(fib->tbl8);
> > +     rte_free(fib);
> > +}
> > +
> > +LOOKUP_FUNC(1b, uint8_t, 5)
> > +LOOKUP_FUNC(2b, uint16_t, 6)
> > +LOOKUP_FUNC(4b, uint32_t, 15)
> > +LOOKUP_FUNC(8b, uint64_t, 12)
> > diff --git a/lib/librte_rib/rte_dir24_8.h b/lib/librte_rib/rte_dir24_8.h
> > new file mode 100644
> > index 0000000..f779409
> > --- /dev/null
> > +++ b/lib/librte_rib/rte_dir24_8.h
> > @@ -0,0 +1,116 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv at gmail.com>
> > + */
> > +
> > +#ifndef _RTE_DIR24_8_H_
> > +#define _RTE_DIR24_8_H_
> > +
> > +/**
> > + * @file
> > + * RTE Longest Prefix Match (LPM)
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/** @internal Total number of tbl24 entries. */
> > +#define RTE_DIR24_8_TBL24_NUM_ENT    (1 << 24)
> > +
> > +/** Maximum depth value possible for IPv4 LPM. */
> > +#define RTE_DIR24_8_MAX_DEPTH                32
> > +
> > +/** @internal Number of entries in a tbl8 group. */
> > +#define RTE_DIR24_8_TBL8_GRP_NUM_ENT 256
> > +
> > +/** @internal Total number of tbl8 groups in the tbl8. */
> > +#define RTE_DIR24_8_TBL8_NUM_GROUPS  65536
> > +
> > +/** @internal bitmask with valid and valid_group fields set */
> > +#define RTE_DIR24_8_VALID_EXT_ENT    0x01
> > +
> > +#define RTE_DIR24_8_TBL24_MASK               0xffffff00
> > +
> > +/** Size of nexthop (1 << nh_sz) bits */
> > +enum rte_dir24_8_nh_sz {
> > +     RTE_DIR24_8_1B,
> > +     RTE_DIR24_8_2B,
> > +     RTE_DIR24_8_4B,
> > +     RTE_DIR24_8_8B
> > +};
> > +
> > +
> > +#define DIR24_8_BITS_IN_NH(fib)              (8 * (1 << fib->nh_sz))
> > +#define DIR24_8_MAX_NH(fib)  ((1ULL << (DIR24_8_BITS_IN_NH(fib) - 1)) -
> 1)
> > +
> > +#define DIR24_8_TBL_IDX(a, fib)              ((a) >> (3 - fib->nh_sz))
> > +#define DIR24_8_PSD_IDX(a, fib)              ((a) & ((1 << (3 -
> fib->nh_sz)) - 1))
> > +
> > +#define DIR24_8_TBL24_VAL(ip)        (ip >> 8)
> > +#define DIR24_8_TBL8_VAL(res, ip)                                    \
> > +     ((res >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT + (uint8_t)ip)       \
> > +
> > +#define DIR24_8_LOOKUP_MSK                                           \
> > +     (((1ULL << ((1 << (fib->nh_sz + 3)) - 1)) << 1) - 1)            \
> > +
> > +#define RTE_DIR24_8_GET_TBL24(fib, ip)
>      \
> > +     ((fib->tbl24[DIR24_8_TBL_IDX(DIR24_8_TBL24_VAL(ip), fib)] >>    \
> > +     (DIR24_8_PSD_IDX(DIR24_8_TBL24_VAL(ip), fib) *                  \
> > +     DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK)                 \
> > +
> > +#define RTE_DIR24_8_GET_TBL8(fib, res, ip)                           \
> > +     ((fib->tbl8[DIR24_8_TBL_IDX(DIR24_8_TBL8_VAL(res, ip), fib)] >> \
> > +     (DIR24_8_PSD_IDX(DIR24_8_TBL8_VAL(res, ip), fib) *              \
> > +     DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK)                 \
> >
> I would strongly suggest making each of the above macros into inline
> functions instead. It would allow easier readability since you have
> parameter types and can split things across lines easier.
> Also, some comments might be good too.
>
> +
> > +
> > +struct rte_dir24_8_tbl {
> > +     uint32_t        number_tbl8s;   /**< Total number of tbl8s. */
> > +     uint32_t        cur_tbl8s;      /**< Current cumber of tbl8s. */
> > +     uint64_t        def_nh;
> > +     enum rte_dir24_8_nh_sz  nh_sz;  /**< Size of nexthop entry */
> > +     uint64_t        *tbl8;          /**< LPM tbl8 table. */
> > +     uint64_t        *tbl8_idxes;
> > +     uint64_t        tbl24[0] __rte_cache_aligned; /**< LPM tbl24
> table. */
> > +};
> > +
> > +struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int
> socket_id,
> > +     enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh);
> > +void rte_dir24_8_free(void *fib_p);
> > +int rte_dir24_8_modify(struct rte_rib *rib, uint32_t key,
> > +     uint8_t depth, uint64_t next_hop, enum rte_rib_op op);
> > +int rte_dir24_8_lookup_bulk_1b(void *fib_p, const uint32_t *ips,
> > +     uint64_t *next_hops, const unsigned n);
> > +int rte_dir24_8_lookup_bulk_2b(void *fib_p, const uint32_t *ips,
> > +     uint64_t *next_hops, const unsigned n);
> > +int rte_dir24_8_lookup_bulk_4b(void *fib_p, const uint32_t *ips,
> > +     uint64_t *next_hops, const unsigned n);
> > +int rte_dir24_8_lookup_bulk_8b(void *fib_p, const uint32_t *ips,
> > +     uint64_t *next_hops, const unsigned n);
> > +
> > +
> > +static inline int
> > +rte_dir24_8_lookup(void *fib_p, uint32_t ip, uint64_t *next_hop)
>
> Why use void * as parameter, since the proper type is defined just above?

agree, will change to struct rte_dir24_8_tbl *


>
> > +{
> > +     uint64_t res;
> > +     struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
> > +
> > +     RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ip == NULL) ||
> > +             (next_hop == NULL)), -EINVAL);
> > +
> > +     res = RTE_DIR24_8_GET_TBL24(fib, ip);
> > +     if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
> > +             RTE_DIR24_8_VALID_EXT_ENT)) {
> > +             res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
> > +     }
> > +     *next_hop = res >> 1;
> > +     return 0;
> > +}
>
> Do we need this static inline function? Can the bulk functions do on their
> own? If we can remove this, we can move the most of the header file
> contents, especially the structures, out of the public header. That would
> greatly improve the ease with which ABI can be maintained.
>
It was done in some manner to LPM. There was separate single lookup and
bulk versions.
Of course it is possible to remove this function at all and use bulk
version to lookup single packet. But I thought maybe somebody could use it.


>
>
> +
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_DIR24_8_H_ */
> > +
>



-- 
Regards,
Vladimir


More information about the dev mailing list