[dpdk-stable] patch 'ring/c11: move atomic load of head above the loop' has been queued to stable release 18.08.1
Kevin Traynor
ktraynor at redhat.com
Fri Nov 23 11:26:31 CET 2018
Hi,
FYI, your patch has been queued to stable release 18.08.1
Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/29/18. So please
shout if anyone has objections.
Also note that after the patch there's a diff of the upstream commit vs the patch applied
to the branch. If the code is different (i.e. not only metadata diffs), for example due to
a change in context or macro names, please double check it.
Thanks.
Kevin Traynor
---
From b687a72eb402f07dc3fcdf1beb88a60e2cac422f Mon Sep 17 00:00:00 2001
From: Gavin Hu <gavin.hu at arm.com>
Date: Fri, 2 Nov 2018 19:21:28 +0800
Subject: [PATCH] ring/c11: move atomic load of head above the loop
[ upstream commit 047adc17245892198be31c54cf6658080df3dc6d ]
In __rte_ring_move_prod_head, move the __atomic_load_n up and out of
the do {} while loop as upon failure the old_head will be updated,
another load is costly and not necessary.
This helps a little on the latency, by about 1~5%.
Test result with the patch(two cores):
SP/SC bulk enq/dequeue (size: 8): 5.64
MP/MC bulk enq/dequeue (size: 8): 9.58
SP/SC bulk enq/dequeue (size: 32): 1.98
MP/MC bulk enq/dequeue (size: 32): 2.30
Fixes: 39368ebfc606 ("ring: introduce C11 memory model barrier option")
Signed-off-by: Gavin Hu <gavin.hu at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
Reviewed-by: Steve Capper <steve.capper at arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl at arm.com>
Reviewed-by: Jia He <justin.he at arm.com>
Acked-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
Tested-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz at 6wind.com>
---
lib/librte_ring/rte_ring_c11_mem.h | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 52da95a21..7bc74a4cb 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -62,11 +62,9 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
int success;
+ *old_head = __atomic_load_n(&r->prod.head, __ATOMIC_ACQUIRE);
do {
/* Reset n to the initial burst count */
n = max;
- *old_head = __atomic_load_n(&r->prod.head,
- __ATOMIC_ACQUIRE);
-
/* load-acquire synchronize with store-release of ht->tail
* in update_tail.
@@ -94,4 +92,5 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
r->prod.head = *new_head, success = 1;
else
+ /* on failure, *old_head is updated */
success = __atomic_compare_exchange_n(&r->prod.head,
old_head, *new_head,
@@ -136,11 +135,9 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
/* move cons.head atomically */
+ *old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
do {
/* Restore n as it may change every loop */
n = max;
- *old_head = __atomic_load_n(&r->cons.head,
- __ATOMIC_ACQUIRE);
-
/* this load-acquire synchronize with store-release of ht->tail
* in update_tail.
@@ -167,4 +164,5 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
r->cons.head = *new_head, success = 1;
else
+ /* on failure, *old_head will be updated */
success = __atomic_compare_exchange_n(&r->cons.head,
old_head, *new_head,
--
2.19.0
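
For readers following the reasoning in the commit message, here is a minimal
standalone sketch of the pattern, not the DPDK code itself. The names
demo_head, demo_move_head and demo_failed_cas_refreshes are hypothetical, and
the memory orders are chosen only for illustration. It relies on the documented
behaviour of the GCC/Clang builtin __atomic_compare_exchange_n: when the
comparison fails, the current value of the target is written back into the
"expected" argument, so the retry does not need a separate load.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ring's head index; not the DPDK struct. */
struct demo_head {
	uint32_t head;
};

/*
 * Advance h->head by n and return the value it held before the move.
 * The initial load sits above the loop: when the compare-exchange fails,
 * the builtin stores the current value of h->head into old_head, so each
 * retry already starts from a fresh snapshot.
 */
static uint32_t
demo_move_head(struct demo_head *h, uint32_t n)
{
	uint32_t old_head, new_head;
	int success;

	old_head = __atomic_load_n(&h->head, __ATOMIC_ACQUIRE);
	do {
		new_head = old_head + n;
		/* on failure, old_head is updated with the current value */
		success = __atomic_compare_exchange_n(&h->head,
				&old_head, new_head,
				0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	} while (!success);

	return old_head;
}

/*
 * Show the failure path in isolation. The caller passes a value that is
 * assumed to differ from h->head, so the compare-exchange fails, leaves
 * h->head untouched and overwrites "expected" with the current head.
 */
static uint32_t
demo_failed_cas_refreshes(struct demo_head *h, uint32_t stale)
{
	uint32_t expected = stale;

	(void)__atomic_compare_exchange_n(&h->head, &expected, stale + 1,
			0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	return expected;
}
```

The second helper exists only to make the write-back visible from a single
thread; in the ring the same effect occurs when another producer or consumer
moves the head between the load and the compare-exchange.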
---
Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- - 2018-11-23 10:22:54.960223892 +0000
+++ 0027-ring-c11-move-atomic-load-of-head-above-the-loop.patch 2018-11-23 10:22:54.000000000 +0000
@@ -1,8 +1,10 @@
-From 047adc17245892198be31c54cf6658080df3dc6d Mon Sep 17 00:00:00 2001
+From b687a72eb402f07dc3fcdf1beb88a60e2cac422f Mon Sep 17 00:00:00 2001
From: Gavin Hu <gavin.hu at arm.com>
Date: Fri, 2 Nov 2018 19:21:28 +0800
Subject: [PATCH] ring/c11: move atomic load of head above the loop
+[ upstream commit 047adc17245892198be31c54cf6658080df3dc6d ]
+
In __rte_ring_move_prod_head, move the __atomic_load_n up and out of
the do {} while loop as upon failure the old_head will be updated,
another load is costly and not necessary.
@@ -16,7 +18,6 @@
MP/MC bulk enq/dequeue (size: 32): 2.30
Fixes: 39368ebfc606 ("ring: introduce C11 memory model barrier option")
-Cc: stable at dpdk.org
Signed-off-by: Gavin Hu <gavin.hu at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
@@ -27,29 +28,9 @@
Tested-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz at 6wind.com>
---
- doc/guides/rel_notes/release_18_11.rst | 10 ++++++++++
- lib/librte_ring/rte_ring_c11_mem.h | 10 ++++------
- 2 files changed, 14 insertions(+), 6 deletions(-)
-
-diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
-index c60879c69..cfa92b8c0 100644
---- a/doc/guides/rel_notes/release_18_11.rst
-+++ b/doc/guides/rel_notes/release_18_11.rst
-@@ -70,4 +70,14 @@ New Features
- one device has addressing limitations, the dma mask is the more restricted one.
-
-+* **Updated the C11 memory model version of ring library.**
-+
-+ The latency is decreased for architectures using the C11 memory model
-+ version of the ring library.
-+
-+ On Cavium ThunderX2 platform, the changes decreased latency by 27~29%
-+ and 3~15% for MPMC and SPSC cases respectively (with 2 lcores). The
-+ real improvements may vary with the number of contending lcores and
-+ the size of ring.
-+
- * **Added hot-unplug handle mechanism.**
-
+ lib/librte_ring/rte_ring_c11_mem.h | 10 ++++------
+ 1 file changed, 4 insertions(+), 6 deletions(-)
+
diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 52da95a21..7bc74a4cb 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h