[dpdk-dev] [RFC] hash: introduce resizable hash list

Bing Zhao bingz at mellanox.com
Tue Jun 13 08:34:17 CEST 2023

Previous message (by thread): [dpdk-dev] [RFC] hash: introduce resizable hash list
Next message (by thread): [dpdk-dev] [EXT] [PATCH 1/5] examples/l2fwd-event: free resources in case of error
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen at networkplumber.org>
> Sent: Tuesday, June 13, 2023 12:44 AM
> To: Bing Zhao <bingz at mellanox.com>
> Cc: yipeng1.wang at intel.com; sameh.gobriel at intel.com;
> bruce.richardson at intel.com; pablo.de.lara.guarch at intel.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC] hash: introduce resizable hash list
> 
> On Wed, 28 Aug 2019 14:51:48 +0800
> Bing Zhao <bingz at mellanox.com> wrote:
> 
> > In the current hash library, there are already two hash tables and
> > several hash calculation functions.
> >
> > FBK hash table is very lightweight, but the key length is limited to
> > 4 bytes and the size of the table is fixed during startup.
> >
> > Cuckoo hash table is a quite great idea and nice implementation in the
> > library, it is efficient and flexible. After some study of the code
> > and information from internet, I find that the table extension is some
> > pain point (correct me if anything wrong). It means that we need to
> > allocate the tables in advance by defining a maximum size.
> > So there is a chance that we waste some unused memory. Right now there
> > is an extendable buckets table, but it seems that the number is also
> > fixed and the same as the primary table.
> > Take the flows information for example, we may support as many as
> > several millions of flows. In some case, only several hundreds of
> > flows will be created but in the other case, millions of flows are
> > needed. So this means that we need to create the hash table to support
> > millions of elements in advance during the startup time. If one key of
> > flow information will consume 64B and 1M flows, so it will occupy more
> > than one hundred MB memory (the table fields included). What is worse
> > if there is only 2M + several elements, it needs to create a table of
> > 4M (or more: depends on the hash collision rate) and it is some
> > wasting of the memory.
> >
> > In order to handle this, an resizable hash list is introduced.
> > The table could start with a small number of the elements and be
> > allocated dynamically during the runtime. In the meanwhile, an
> > on-demand list splitting mechanism is used in order not to make a
> > significant performance degradation. Then there is no need to re-hash
> > and relocate all the existing elements in the table when the table is
> > extended.
> >
> > The brief design is as below:
> > 1. The table is consisted of LIST header array and the LIST entries.
> >    In each entry, a pre-calculated hash signature is stored and is
> >    used to decide which header should it be linked to, by using
> >    "AND" with the mask to select the LSBs of the signature.
> > 2. The header array size could start with a very small number, and a
> >    specified max number of each list is used to check if a table
> >    extension is needed.
> > 3. When the max number on any of list is reached, the header array
> >    size will be doubled. Then each entries linked to this list will
> >    be split into two lists with the new mask (one more bit 1 in
> >    the mask, e.g. 0xfff -> 0x1fff). And a global shift number and
> >    local shift number of each list is used for the further checking.
> > 4. When some other list is being accessed, a comparison for the shift
> >    numbers is used to check if the splitting of this list is needed.
> >    If so, then there will be two conditions:
> >    a. If the local shift number is only 1 less than global or
> >       non-zero, then this list is the one that needs to be split.
> >    b. If more than 1, then it means that the table is extended more
> >       than once. And If the local shift is zero, a mechanism is used
> >       to find the last unsplit list.
> >    And then the list will be split into 2/4/8... lists depends on
> >    the gap. All the entries then will linked to the proper header.
> > So, each time when the hlist APIs are called, only one list will be
> > traversed but not all the lists. And since there is parameter to set a
> > max number of entries in a list. The traversal time is predictable and
> > these will not cause a significant performance degradation.
> >
> > BR. Bing
> >
> >
> > Bing Zhao (1):
> >   rte_hash: introduce hash list into hash lib
> >
> >  lib/librte_hash/Makefile             |   2 +
> >  lib/librte_hash/rte_hash_version.map |  15 +
> >  lib/librte_hash/rte_hlist.c          | 687
> +++++++++++++++++++++++++++++++++++
> >  lib/librte_hash/rte_hlist.h          | 563 ++++++++++++++++++++++++++++
> >  4 files changed, 1267 insertions(+)
> >  create mode 100644 lib/librte_hash/rte_hlist.c  create mode 100644
> > lib/librte_hash/rte_hlist.h
> >
> 
> A resizeable hash table (with RCU?) would be good to have.
> See old article about this https://lwn.net/Articles/573431/ But this patch got
> review feedback and no follow up.
> Marking it as Changes Requested, maybe someone can resuscitate it later.

To my understanding, resizable hash may be needed by the application in the real life. Sometimes the maximal entries may not be known during startup, and it is some waste to allocate a huge amount of memory. (Probably there would be some failure to insert, even)
The extendable hash list would be the simplest way to go. While I am not a researcher, maybe there is some other advanced way to go in the papers and we need to translate it into code. Any ideas?

BR. Bing

Previous message (by thread): [dpdk-dev] [RFC] hash: introduce resizable hash list
Next message (by thread): [dpdk-dev] [EXT] [PATCH 1/5] examples/l2fwd-event: free resources in case of error
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the dev mailing list