[dpdk-dev] [RFC 0/3] tqs: add thread quiescent state library
Ilya Maximets
i.maximets at samsung.com
Thu Nov 22 08:31:08 CET 2018
Hi.
Are there any points of differentiation from liburcu [1]?
Is there any benefit to having our own implementation inside DPDK?
[1] http://liburcu.org/
https://lwn.net/Articles/573424/
Best regards, Ilya Maximets.
> Lock-less data structures provide scalability and determinism.
> They enable use cases where locking may not be allowed
> (for ex: real-time applications).
>
> In the following paras, the term 'memory' refers to memory allocated
> by typical APIs like malloc or anything that is representative of
> memory, for ex: an index of a free element array.
>
> Since these data structures are lock-less, the writers and readers
> are accessing the data structures simultaneously. Hence, while removing
> an element from a data structure, the writers cannot return the memory
> to the allocator, without knowing that the readers are not
> referencing that element/memory anymore. Hence, it is required to
> separate the operation of removing an element into 2 steps:
>
> Delete: in this step, the writer removes the element from the
> data structure but does not return the associated memory to the allocator.
> This will ensure that new readers will not get a reference to the removed
> element. Removing the reference is an atomic operation.
>
> Free: in this step, the writer returns the memory to the
> memory allocator, only after knowing that all the readers have stopped
> referencing the removed element.
>
> This library helps the writer determine when it is safe to free the
> memory.
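The two-step removal described above can be sketched roughly as follows. This is only an illustration using C11 atomics: `slot`, `entry_delete` and `entry_free` are hypothetical names, not part of the proposed rte_tqs API, and the `readers_quiescent` flag stands in for whatever answer the library would give the writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct entry {
    int value;
};

/* Shared slot that readers dereference; a hypothetical stand-in for one
 * element of a lock-less data structure. Static storage starts as NULL. */
static _Atomic(struct entry *) slot;

/* Delete step: atomically unpublish the entry so that new readers can no
 * longer obtain a reference. The memory is NOT returned to the allocator. */
static struct entry *entry_delete(void)
{
    return atomic_exchange(&slot, NULL);
}

/* Free step: return the memory only once the caller knows that every
 * reader has passed through a quiescent state since the delete. */
static int entry_free(struct entry *e, int readers_quiescent)
{
    if (!readers_quiescent)
        return -1; /* too early: a reader may still hold a reference */
    free(e);
    return 0;
}
```

Between the two calls the entry is unreachable for new readers but still valid for readers that fetched the pointer earlier, which is exactly the window the library has to track.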
>
> This library makes use of Thread Quiescent State (TQS). TQS can be
> defined as 'any point in the thread execution where the thread does
> not hold a reference to shared memory'. It is up to the application to
> determine its quiescent state. Let us consider the following diagram:
>
> Time -------------------------------------------------->
>
> | |
> RT1 $++++****D1****+++***D2*|**+++|+++**D3*****++++$
> | |
> RT2 $++++****D1****++|+**D2|***++++++**D3*****++++$
> | |
> RT3 $++++****D1****+++***|D2***|++++++**D2*****++++$
> | |
> |<--->|
> Del | Free
> |
> Cannot free memory
> during this period
>
> RTx - Reader thread
> $ - Start and end of while(1) loop
> ***Dx*** - Reader thread is accessing the shared data structure Dx.
> i.e. critical section.
> +++ - Reader thread is not accessing any shared data structure.
> i.e. non critical section or quiescent state.
> Del - Point in time when the reference to the entry is removed using
> atomic operation.
> Free - Point in time when the writer can free the entry.
>
> As shown, thread RT1 accesses data structures D1, D2 and D3. When it is
> accessing D2, if the writer has to remove an element from D2, the
> writer cannot return the memory associated with that element to the
> allocator. The writer can return the memory to the allocator only after
> the reader stops referencing D2. In other words, reader thread RT1
> has to enter a quiescent state.
>
> Similarly, since thread RT3 is also accessing D2, the writer has to wait
> until RT3 enters a quiescent state as well.
>
> However, the writer does not need to wait for RT2 to enter a quiescent state.
> Thread RT2 was not accessing D2 when the delete operation happened.
> So, RT2 will not get a reference to the deleted entry.
>
> Note that the critical sections for D2 and D3 are quiescent states for D1,
> i.e. for a given data structure Dx, any point in the thread execution
> that does not reference Dx is a quiescent state.
>
> For DPDK applications, the start and end of the while(1) loop (where no shared
> data structures are being accessed) act as perfect quiescent states. This
> combines all the shared data structure accesses into a single critical
> section and keeps the overhead introduced by this library to a minimum.
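A reader loop that reports once per iteration might look like the sketch below. Everything here is a stand-in: `report_quiescent`, `lookup_d1` and `lookup_d2` are hypothetical helpers, and the loop is bounded only so that the example terminates; a real poll-mode loop would be while(1).

```c
#include <assert.h>
#include <stdatomic.h>

/* Counts quiescent-state reports; a stand-in for a real reporting call. */
static atomic_uint qs_reports;

static void report_quiescent(void)
{
    atomic_fetch_add(&qs_reports, 1);
}

/* Hypothetical lookups into two shared data structures D1 and D2. */
static int lookup_d1(int key) { return key + 1; }
static int lookup_d2(int key) { return key * 2; }

/* Bounded stand-in for the application's while(1) loop. */
static long reader_loop(unsigned int iterations)
{
    long sum = 0;

    for (unsigned int i = 0; i < iterations; i++) {
        /* Single critical section covering all shared accesses. */
        sum += lookup_d1((int)i);
        sum += lookup_d2((int)i);

        /* No references are held here, so report once per iteration. */
        report_quiescent();
    }
    return sum;
}
```

The cost is one report per iteration regardless of how many data structures are touched inside the loop body.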
>
> However, the time taken to identify the end of the critical section is
> proportional to the length of the critical section and the number of reader
> threads. So, if the application desires, it should be possible to identify
> the end of the critical section for each data structure.
>
> To provide the required flexibility, this library has the concept of a TQS
> variable. The application can create one or more TQS variables to help it
> track the end of one or more critical sections.
>
> The application can create a TQS variable using the API rte_tqs_alloc.
> It takes a mask of lcore IDs that will report their quiescent states
> using this variable. This mask can be empty to start with.
>
> The rte_tqs_register_lcore API will register a reader thread to report its
> quiescent state. This can be called from any control plane thread or from
> the reader thread. The application can create a TQS variable with no reader
> threads and add the threads dynamically using this API.
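One way to picture a variable that starts with an empty mask and gains readers later is a 64-bit bitmask updated atomically. This is a sketch under assumptions: the 64-lcore limit and the names `qs_mask`, `qs_mask_init` and `qs_mask_register` are mine, not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define QS_MAX_LCORE 64u /* assumption: one bit per lcore in a 64-bit mask */

struct qs_mask {
    _Atomic uint64_t readers; /* bit i set: lcore i reports on this variable */
};

/* Create the variable with an initial (possibly empty) set of readers. */
static void qs_mask_init(struct qs_mask *m, uint64_t initial_mask)
{
    atomic_init(&m->readers, initial_mask);
}

/* Add a reader dynamically; callable from a control thread or the reader. */
static int qs_mask_register(struct qs_mask *m, unsigned int lcore_id)
{
    if (lcore_id >= QS_MAX_LCORE)
        return -1; /* lcore ID does not fit in the mask */
    atomic_fetch_or(&m->readers, UINT64_C(1) << lcore_id);
    return 0;
}
```

Because the update is a single atomic OR, a control plane thread and a reader thread can register concurrently without extra locking.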
>
> The application can trigger the reader threads to report their quiescent
> state status by calling the API rte_tqs_start. It is possible for multiple
> writer threads to query the quiescent state status simultaneously. Hence,
> rte_tqs_start returns a token to each caller.
>
> The application has to call the rte_tqs_check API with the token to get the
> current status. An option to block until all the threads enter the quiescent
> state is provided. If this API indicates that all the threads have entered
> the quiescent state, the application can free the deleted entry.
>
> The separation of triggering the reporting from querying the status provides
> the writer threads flexibility to do useful work instead of waiting for the
> reader threads to enter the quiescent state.
>
> rte_tqs_unregister_lcore API will remove a reader thread from reporting its
> quiescent state using a TQS variable. The rte_tqs_check API will not wait
> for this reader thread to report the quiescent state status anymore.
>
> Finally, a TQS variable can be deleted by calling rte_tqs_free API.
> Application must make sure that the reader threads are not referencing the
> TQS variable anymore before deleting it.
>
> The reader threads should call the rte_tqs_update API to indicate that they
> have entered a quiescent state. This API checks if a writer has triggered a
> quiescent state query and updates the state accordingly.
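The start / update / check interplay can be modelled with a shared counter, roughly as below. This is a simplified sketch and not the code from the patch: the names `qs_var`, `qs_start`, `qs_update` and `qs_check` are hypothetical, the reader count is fixed, counter wrap-around is ignored, and only the non-blocking form of the check is shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QS_MAX_READERS 4u /* assumption: small fixed reader count */

struct qs_var {
    _Atomic uint32_t token;                /* bumped by every writer query */
    _Atomic uint32_t seen[QS_MAX_READERS]; /* last token each reader saw */
};

static void qs_init(struct qs_var *v)
{
    atomic_init(&v->token, 0);
    for (unsigned int i = 0; i < QS_MAX_READERS; i++)
        atomic_init(&v->seen[i], 0);
}

/* Writer: trigger a query; each caller gets its own token. */
static uint32_t qs_start(struct qs_var *v)
{
    return atomic_fetch_add(&v->token, 1) + 1;
}

/* Reader: called from a quiescent state. Records the newest token only
 * if a writer has triggered a query since the last report. */
static void qs_update(struct qs_var *v, unsigned int reader)
{
    uint32_t t = atomic_load(&v->token);

    if (atomic_load(&v->seen[reader]) != t)
        atomic_store(&v->seen[reader], t);
}

/* Writer: non-blocking check. True once every reader has reported a
 * quiescent state at or after the query identified by 'token'. */
static bool qs_check(struct qs_var *v, uint32_t token)
{
    for (unsigned int i = 0; i < QS_MAX_READERS; i++)
        if (atomic_load(&v->seen[i]) < token)
            return false;
    return true;
}
```

A writer would delete an entry, call `qs_start`, carry on with other work, and free the entry only after `qs_check` returns true for its token. Several writers can hold different tokens at once, since a later report also satisfies every earlier token.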
>
> Next Steps:
> 1) Add more test cases
> 2) Convert to patch
> 3) Incorporate feedback from community
> 4) Add documentation
>
> Dharmik Thakkar (1):
> test/tqs: Add API and functional tests
>
> Honnappa Nagarahalli (2):
> log: add TQS log type
> tqs: add thread quiescent state library
>
> config/common_base | 6 +
> lib/Makefile | 2 +
> lib/librte_eal/common/include/rte_log.h | 1 +
> lib/librte_tqs/Makefile | 23 +
> lib/librte_tqs/meson.build | 5 +
> lib/librte_tqs/rte_tqs.c | 249 +++++++++++
> lib/librte_tqs/rte_tqs.h | 352 +++++++++++++++
> lib/librte_tqs/rte_tqs_version.map | 16 +
> lib/meson.build | 2 +-
> mk/rte.app.mk | 1 +
> test/test/Makefile | 2 +
> test/test/autotest_data.py | 6 +
> test/test/meson.build | 5 +-
> test/test/test_tqs.c | 540 ++++++++++++++++++++++++
> 14 files changed, 1208 insertions(+), 2 deletions(-)
> create mode 100644 lib/librte_tqs/Makefile
> create mode 100644 lib/librte_tqs/meson.build
> create mode 100644 lib/librte_tqs/rte_tqs.c
> create mode 100644 lib/librte_tqs/rte_tqs.h
> create mode 100644 lib/librte_tqs/rte_tqs_version.map
> create mode 100644 test/test/test_tqs.c
>
> --
> 2.17.1