[dpdk-dev] [RFC 0/3] tqs: add thread quiescent state library

Ilya Maximets i.maximets at samsung.com
Thu Nov 22 08:31:08 CET 2018


Hi.
Is the any differentiation points with liburcu [1] ?
Is there any profit having own implementation inside DPDK ?

[1] http://liburcu.org/
    https://lwn.net/Articles/573424/

Best regards, Ilya Maximets.

> Lock-less data structures provide scalability and determinism.
> They enable use cases where locking may not be allowed
> (for ex: real-time applications).
> 
> In the following paras, the term 'memory' refers to memory allocated
> by typical APIs like malloc or anything that is representative of
> memory, for ex: an index of a free element array.
> 
> Since these data structures are lock less, the writers and readers
> are accessing the data structures simultaneously. Hence, while removing
> an element from a data structure, the writers cannot return the memory
> to the allocator, without knowing that the readers are not
> referencing that element/memory anymore. Hence, it is required to
> separate the operation of removing an element into 2 steps:
> 
> Delete: in this step, the writer removes the element from the
> data structure but does not return the associated memory to the allocator.
> This will ensure that new readers will not get a reference to the removed
> element. Removing the reference is an atomic operation.
> 
> Free: in this step, the writer returns the memory to the
> memory allocator, only after knowing that all the readers have stopped
> referencing the removed element.
> 
> This library helps the writer determine when it is safe to free the
> memory.
> 
> This library makes use of Thread Quiescent State (TQS). TQS can be
> defined as 'any point in the thread execution where the thread does
> not hold a reference to shared memory'. It is upto the application to
> determine its quiescent state. Let us consider the following diagram:
> 
>     Time -------------------------------------------------->
> 
>                                 |     |
>   RT1   $++++****D1****+++***D2*|**+++|+++**D3*****++++$
>                                 |     |
>   RT2          $++++****D1****++|+**D2|***++++++**D3*****++++$
>                                 |     |
>   RT3      $++++****D1****+++***|D2***|++++++**D2*****++++$
>                                 |     |
>                                 |<--->|
>                                Del | Free
>                                    |
>                               Cannot free memory
>                               during this period
> 
> RTx - Reader thread
> < and > - Start and end of while(1) loop
> ***Dx*** - Reader thread is accessing the shared data structure Dx.
>            i.e. critical section.
> +++ - Reader thread is not accessing any shared data structure.
>       i.e. non critical section or quiescent state.
> Del - Point in time when the reference to the entry is removed using
>       atomic operation.
> Free - Point in time when the writer can free the entry.
> 
> As shown thread RT1 acesses data structures D1, D2 and D3. When it is
> accessing D2, if the writer has to remove an element from D2, the
> writer cannot return the memory associated with that element to the
> allocator. The writer can return the memory to the allocator only after
> the reader stops referencng D2. In other words, reader thread RT1
> has to enter a quiescent state.
> 
> Similarly, since thread RT3 is also accessing D2, writer has to wait till
> RT3 enters quiescent state as well.
> 
> However, the writer does not need to wait for RT2 to enter quiescent state.
> Thread RT2 was not accessing D2 when the delete operation happened.
> So, RT2 will not get a reference to the deleted entry.
> 
> It can be noted that, the critical sections for D2 and D3 are quiescent states
> for D1. i.e. for a given data structure Dx, any point in the thread execution
> that does not reference Dx is a quiescent state.
> 
> For DPDK applications, the start and end of while(1) loop (where no shared
> data structures are getting accessed) act as perfect quiescent states. This
> will combine all the shared data structure accesses into a single critical
> section and keeps the over head introduced by this library to the minimum.
> 
> However, the length of the critical section and the number of reader threads
> is proportional to the time taken to identify the end of critical section.
> So, if the application desires, it should be possible to identify the end
> of critical section for each data structure.
> 
> To provide the required flexibility, this library has a concept of TQS
> variable. The application can create one or more TQS variables to help it
> track the end of one or more critical sections.
> 
> The application can create a TQS variable using the API rte_tqs_alloc.
> It takes a mask of lcore IDs that will report their quiescent states
> using this variable. This mask can be empty to start with.
> 
> rte_tqs_register_lcore API will register a reader thread to report its
> quiescent state. This can be called from any control plane thread or from
> the reader thread. The application can create a TQS variable with no reader
> threads and add the threads dynamically using this API.
> 
> The application can trigger the reader threads to report their quiescent
> state status by calling the API rte_tqs_start. It is possible for multiple
> writer threads to query the quiescent state status simultaneously. Hence,
> rte_tqs_start returns a token to each caller.
> 
> The application has to call rte_tqs_check API with the token to get the
> current status. Option to block till all the threads enter the quiescent
> state is provided. If this API indicates that all the threads have entered
> the quiescent state, the application can free the deleted entry.
> 
> The separation of triggering the reporting from querying the status provides
> the writer threads flexibility to do useful work instead of waiting for the
> reader threads to enter the quiescent state.
> 
> rte_tqs_unregister_lcore API will remove a reader thread from reporting its
> quiescent state using a TQS variable. The rte_tqs_check API will not wait
> for this reader thread to report the quiescent state status anymore.
> 
> Finally, a TQS variable can be deleted by calling rte_tqs_free API.
> Application must make sure that the reader threads are not referencing the
> TQS variable anymore before deleting it.
> 
> The reader threads should call rte_tqs_update API to indicate that they
> entered a quiescent state. This API checks if a writer has triggered a
> quiescent state query and update the state accordingly.
> 
> Next Steps:
> 1) Add more test cases
> 2) Convert to patch
> 3) Incorporate feedback from community
> 4) Add documentation
> 
> Dharmik Thakkar (1):
>   test/tqs: Add API and functional tests
> 
> Honnappa Nagarahalli (2):
>   log: add TQS log type
>   tqs: add thread quiescent state library
> 
>  config/common_base                      |   6 +
>  lib/Makefile                            |   2 +
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_tqs/Makefile                 |  23 +
>  lib/librte_tqs/meson.build              |   5 +
>  lib/librte_tqs/rte_tqs.c                | 249 +++++++++++
>  lib/librte_tqs/rte_tqs.h                | 352 +++++++++++++++
>  lib/librte_tqs/rte_tqs_version.map      |  16 +
>  lib/meson.build                         |   2 +-
>  mk/rte.app.mk                           |   1 +
>  test/test/Makefile                      |   2 +
>  test/test/autotest_data.py              |   6 +
>  test/test/meson.build                   |   5 +-
>  test/test/test_tqs.c                    | 540 ++++++++++++++++++++++++
>  14 files changed, 1208 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_tqs/Makefile
>  create mode 100644 lib/librte_tqs/meson.build
>  create mode 100644 lib/librte_tqs/rte_tqs.c
>  create mode 100644 lib/librte_tqs/rte_tqs.h
>  create mode 100644 lib/librte_tqs/rte_tqs_version.map
>  create mode 100644 test/test/test_tqs.c
> 
> -- 
> 2.17.1



More information about the dev mailing list