[dpdk-dev] [PATCH v5 0/9] New sync modes for ring

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Sun Apr 19 04:32:59 CEST 2020


Hi Konstantin,
	Changes look good overall; I have integrated the RCU defer APIs patch as well. Please consider adding the following (in another patch?):

1) Release notes
2) Updates to programmer guide for RTS and HTS modes

Thank you,
Honnappa

> -----Original Message-----
> From: Konstantin Ananyev <konstantin.ananyev at intel.com>
> Sent: Saturday, April 18, 2020 11:32 AM
> To: dev at dpdk.org
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>;
> david.marchand at redhat.com; jielong.zjl at antfin.com; Konstantin Ananyev
> <konstantin.ananyev at intel.com>
> Subject: [PATCH v5 0/9] New sync modes for ring
> 
> V4 - V5:
> 1. fix i686 clang build problem
> 2. fix formal API comments
> 
> V3 - V4 changes:
> Address comments from Honnappa:
> 1. for new sync modes make legacy API wrappers around _elem_ calls
> 2. remove rte_ring_(hts|rts)_generic.h
> 3. few changes in C11 version
> 4. peek API - add missing functions for _elem_
> 5. remove _IS_SP/_IS_MP, etc. internal macros
> 6. fix param types (obj_table) for _elem_ functions
> 7. fix formal API comments
> 8. deduplicate code for test_ring_stress
> 9. added functional tests for new sync modes
> 
> V2 - V3 changes:
> 1. few more compilation fixes (for gcc 4.8.X)
> 2. extra update devtools/libabigail.abignore (workaround)
> 
> V1 - V2 changes:
> 1. fix compilation issues
> 2. add C11 atomics support
> 3. updates devtools/libabigail.abignore (workaround)
> 
> RFC - V1 changes:
> 1. remove ABI breakage (at least I hope I did)
> 2. add support for ring_elem
> 3. rework peek related API a bit
> 4. rework test to make it less verbose and unite all test-cases
>    in one command
> 5. add new test-case for MT peek API
> 
> TODO list:
> 1. Update docs
> 
> These days more and more customers use (or try to use) DPDK based apps
> within overcommitted systems (multiple active threads over the same
> physical cores): VM, container deployments, etc.
> One quite common problem they hit is
> Lock-Holder-Preemption/Lock-Waiter-Preemption with rte_ring.
> LHP is quite a common problem for spin-based sync primitives (spin-locks,
> etc.) on overcommitted systems.
> The situation gets much worse when some sort of fair-locking technique is
> used (ticket-lock, etc.), as now not only the lock-owner's but also the
> lock-waiters' scheduling order matters a lot (LWP).
> These two problems are well known for kernels running within VMs:
> http://www-archive.xenproject.org/files/xensummitboston08/LHP.pdf
> https://www.cs.hs-rm.de/~kaiser/events/wamos2017/Slides/selcuk.pdf
> The problem with rte_ring is that while head acquisition is a sort of
> unfair locking, waiting on the tail is very similar to a ticket-lock
> scheme - the tail has to be updated in a particular order.
> That makes the current rte_ring implementation perform really poorly in
> some overcommitted scenarios.
> It is probably not possible to completely resolve the LHP problem in
> userspace only (without some kernel communication/intervention).
> But removing fairness at the tail update helps to avoid LWP and can
> mitigate the situation significantly.
> This patch proposes two new optional ring synchronization modes:
> 1) Head/Tail Sync (HTS) mode
> In this mode an enqueue/dequeue operation is fully serialized:
>     only one thread at a time is allowed to perform a given op.
>     As another enhancement, it provides the ability to split an
>     enqueue/dequeue operation into two phases:
>       - enqueue/dequeue start
>       - enqueue/dequeue finish
>     That allows the user to inspect objects in the ring without removing
>     them from it (aka MT-safe peek); see the usage sketch after this list.
> 2) Relaxed Tail Sync (RTS)
> The main difference from the original MP/MC algorithm is that the tail
> value is increased not by every thread that finishes an enqueue/dequeue,
> but only by the last one.
> That allows threads to avoid spinning on the ring tail value, leaving the
> actual tail value change to the last thread in the update queue.
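> 
> To illustrate the split-phase (peek) usage on the consumer side, here is a
> rough sketch - illustrative only and not tested; it assumes the
> rte_ring_dequeue_bulk_start()/rte_ring_dequeue_finish() names from the peek
> patch in this series and a ring created with the HTS dequeue flag:
> 
>   #include <rte_ring.h>
>   #include <rte_ring_peek.h>
> 
>   #define PEEK_BURST 32
> 
>   /* 'r' is assumed to have been created with RING_F_MC_HTS_DEQ (or to be
>    * single-consumer); otherwise the peek API is not usable here. */
>   static unsigned int
>   inspect_and_drain(struct rte_ring *r)
>   {
>       void *objs[PEEK_BURST];
>       unsigned int n;
> 
>       /* phase 1: read out PEEK_BURST objects (bulk: all or nothing);
>        * the consumer tail is not moved yet, so the objects are still
>        * logically inside the ring. */
>       n = rte_ring_dequeue_bulk_start(r, objs, PEEK_BURST, NULL);
>       if (n == 0)
>           return 0;
> 
>       /* ... inspect objs[0..n-1] here ... */
> 
>       /* phase 2: commit - the objects are now removed from the ring
>        * and other consumers may proceed. */
>       rte_ring_dequeue_finish(r, n);
>       return n;
>   }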
> 
> Note that these new sync modes are optional.
> For current rte_ring users nothing should change (both in terms of API/ABI
> and performance).
> Existing sync modes MP/MC and SP/SC are kept untouched, set up in the same
> way (via flags and _init_), and MP/MC remains the default one.
> The only thing that has changed is that the format of prod/cons can now
> differ depending on the mode selected at _init_.
> So the user has to stick with one sync model through the whole ring lifetime.
> In other words, a user can't create a ring for, let's say, SP mode and then
> in the middle of the data-path change their mind and start using MP_RTS mode.
> For the existing modes (SP/MP, SC/MC) the format remains the same and the
> user can still use them interchangeably, though of course it is an
> error-prone practice.
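> 
> For illustration, a minimal creation sketch - illustrative only, using the
> flag names (RING_F_MP_RTS_ENQ/RING_F_MC_RTS_DEQ and
> RING_F_MP_HTS_ENQ/RING_F_MC_HTS_DEQ) added by this series; as with the
> existing flags, the choice is fixed at creation time:
> 
>   #include <rte_ring.h>
> 
>   static struct rte_ring *
>   create_rts_ring(void)
>   {
>       /* MP/MC ring with Relaxed Tail Sync on both enqueue and dequeue */
>       return rte_ring_create("rts_ring", 1024, SOCKET_ID_ANY,
>               RING_F_MP_RTS_ENQ | RING_F_MC_RTS_DEQ);
>   }
> 
>   static struct rte_ring *
>   create_hts_ring(void)
>   {
>       /* fully serialized HTS ring - the mode the MT-safe peek relies on */
>       return rte_ring_create("hts_ring", 1024, SOCKET_ID_ANY,
>               RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ);
>   }
> 
>   /* The regular enqueue/dequeue and bulk/burst calls are then used
>    * unchanged on either ring. */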
> 
> Test results on IA (see below) show significant improvements for average
> enqueue/dequeue op times on overcommitted systems.
> For 'classic' DPDK deployments (one thread per core) the original MP/MC
> algorithm still shows the best numbers, though for the 64-bit target the
> RTS numbers are not far behind.
> Numbers were produced by the new UT test-case, ring_stress_autotest, i.e.:
> echo ring_stress_autotest | ./dpdk-test -n 4 --lcores='...'
> 
> X86_64 @ Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> DEQ+ENQ average cycles/obj
>                                                 MP/MC      HTS     RTS
> 1thread at 1core(--lcores=6-7)                     8.00       8.15    8.99
> 2thread at 2core(--lcores=6-8)                     19.14      19.61   20.35
> 4thread at 4core(--lcores=6-10)                    29.43      29.79   31.82
> 8thread at 8core(--lcores=6-14)                    110.59     192.81  119.50
> 16thread at 16core(--lcores=6-22)                  461.03     813.12  495.59
> 32thread at 32core(--lcores='6-22,55-70')        982.90     1972.38 1160.51
> 
> 2thread at 1core(--lcores='6,(10-11)@7')             20140.50   23.58    25.14
> 4thread at 2core(--lcores='6,(10-11)@7,(20-21)@8')   153680.60  76.88    80.05
> 8thread at 2core(--lcores='6,(10-13)@7,(20-23)@8')   280314.32  294.72   318.79
> 16thread at 2core(--lcores='6,(10-17)@7,(20-27)@8')  643176.59  1144.02  1175.14
> 32thread at 2core(--lcores='6,(10-25)@7,(30-45)@8')  4264238.80 4627.48  4892.68
> 
> 8thread at 2core(--lcores='6,(10-17)@(7,8)')         321085.98  298.59   307.47
> 16thread at 4core(--lcores='6,(20-35)@(7-10)')       1900705.61 575.35   678.29
> 32thread at 4core(--lcores='6,(20-51)@(7-10)')       5510445.85 2164.36  2714.12
> 
> i686 @ Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> DEQ+ENQ average cycles/obj
>                                                 MP/MC      HTS     RTS
> 1thread at 1core(--lcores=6-7)                     7.85       12.13   11.31
> 2thread at 2core(--lcores=6-8)                     17.89      24.52   21.86
> 8thread at 8core(--lcores=6-14)                    32.58      354.20  54.58
> 32thread at 32core(--lcores='6-22,55-70')        813.77     6072.41 2169.91
> 
> 2thread at 1core(--lcores='6,(10-11)@7')             16095.00   36.06    34.74
> 8thread at 2core(--lcores='6,(10-13)@7,(20-23)@8')   1140354.54 346.61   361.57
> 16thread at 2core(--lcores='6,(10-17)@7,(20-27)@8')  1920417.86 1314.90  1416.65
> 
> 8thread at 2core(--lcores='6,(10-17)@(7,8)')         594358.61  332.70   357.74
> 32thread at 4core(--lcores='6,(20-51)@(7-10)')       5319896.86 2836.44  3028.87
> 
> Konstantin Ananyev (9):
>   test/ring: add contention stress test
>   ring: prepare ring to allow new sync schemes
>   ring: introduce RTS ring mode
>   test/ring: add contention stress test for RTS ring
>   ring: introduce HTS ring mode
>   test/ring: add contention stress test for HTS ring
>   ring: introduce peek style API
>   test/ring: add stress test for MT peek API
>   test/ring: add functional tests for new sync modes
> 
>  app/test/Makefile                      |   5 +
>  app/test/meson.build                   |   5 +
>  app/test/test_pdump.c                  |   6 +-
>  app/test/test_ring.c                   |  93 ++++--
>  app/test/test_ring_hts_stress.c        |  32 ++
>  app/test/test_ring_mpmc_stress.c       |  31 ++
>  app/test/test_ring_peek_stress.c       |  43 +++
>  app/test/test_ring_rts_stress.c        |  32 ++
>  app/test/test_ring_stress.c            |  57 ++++
>  app/test/test_ring_stress.h            |  38 +++
>  app/test/test_ring_stress_impl.h       | 396 ++++++++++++++++++++++
>  devtools/libabigail.abignore           |   7 +
>  lib/librte_pdump/rte_pdump.c           |   2 +-
>  lib/librte_port/rte_port_ring.c        |  12 +-
>  lib/librte_ring/Makefile               |   8 +-
>  lib/librte_ring/meson.build            |  11 +-
>  lib/librte_ring/rte_ring.c             | 114 ++++++-
>  lib/librte_ring/rte_ring.h             | 243 ++++++++------
>  lib/librte_ring/rte_ring_c11_mem.h     |  44 +++
>  lib/librte_ring/rte_ring_core.h        | 184 ++++++++++
>  lib/librte_ring/rte_ring_elem.h        | 141 ++++++--
>  lib/librte_ring/rte_ring_generic.h     |  48 +++
>  lib/librte_ring/rte_ring_hts.h         | 332 +++++++++++++++++++
>  lib/librte_ring/rte_ring_hts_c11_mem.h | 207 ++++++++++++
>  lib/librte_ring/rte_ring_peek.h        | 442 +++++++++++++++++++++++++
>  lib/librte_ring/rte_ring_rts.h         | 439 ++++++++++++++++++++++++
>  lib/librte_ring/rte_ring_rts_c11_mem.h | 179 ++++++++++
>  27 files changed, 2977 insertions(+), 174 deletions(-)
>  create mode 100644 app/test/test_ring_hts_stress.c
>  create mode 100644 app/test/test_ring_mpmc_stress.c
>  create mode 100644 app/test/test_ring_peek_stress.c
>  create mode 100644 app/test/test_ring_rts_stress.c
>  create mode 100644 app/test/test_ring_stress.c
>  create mode 100644 app/test/test_ring_stress.h
>  create mode 100644 app/test/test_ring_stress_impl.h
>  create mode 100644 lib/librte_ring/rte_ring_core.h
>  create mode 100644 lib/librte_ring/rte_ring_hts.h
>  create mode 100644 lib/librte_ring/rte_ring_hts_c11_mem.h
>  create mode 100644 lib/librte_ring/rte_ring_peek.h
>  create mode 100644 lib/librte_ring/rte_ring_rts.h
>  create mode 100644 lib/librte_ring/rte_ring_rts_c11_mem.h
> 
> --
> 2.17.1


