[dpdk-stable] patch 'ring/c11: move atomic load of head above the loop' has been queued to stable release 18.08.1

Kevin Traynor ktraynor at redhat.com
Fri Nov 23 11:26:31 CET 2018


Hi,

FYI, your patch has been queued to stable release 18.08.1

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/29/18. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the patch applied
to the branch. If the code is different (i.e., not only metadata diffs), for example due to
a change in context or macro names, please double check it.

Thanks.

Kevin Traynor

---
From b687a72eb402f07dc3fcdf1beb88a60e2cac422f Mon Sep 17 00:00:00 2001
From: Gavin Hu <gavin.hu at arm.com>
Date: Fri, 2 Nov 2018 19:21:28 +0800
Subject: [PATCH] ring/c11: move atomic load of head above the loop

[ upstream commit 047adc17245892198be31c54cf6658080df3dc6d ]

In __rte_ring_move_prod_head, move the __atomic_load_n up and out of
the do {} while loop as upon failure the old_head will be updated,
another load is costly and not necessary.

This helps latency a little, by about 1~5%.

 Test result with the patch (two cores):
 SP/SC bulk enq/dequeue (size: 8): 5.64
 MP/MC bulk enq/dequeue (size: 8): 9.58
 SP/SC bulk enq/dequeue (size: 32): 1.98
 MP/MC bulk enq/dequeue (size: 32): 2.30

Fixes: 39368ebfc606 ("ring: introduce C11 memory model barrier option")

Signed-off-by: Gavin Hu <gavin.hu at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
Reviewed-by: Steve Capper <steve.capper at arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl at arm.com>
Reviewed-by: Jia He <justin.he at arm.com>
Acked-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
Tested-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz at 6wind.com>
---
 lib/librte_ring/rte_ring_c11_mem.h | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index 52da95a21..7bc74a4cb 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -62,11 +62,9 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 	int success;
 
+	*old_head = __atomic_load_n(&r->prod.head, __ATOMIC_ACQUIRE);
 	do {
 		/* Reset n to the initial burst count */
 		n = max;
 
-		*old_head = __atomic_load_n(&r->prod.head,
-					__ATOMIC_ACQUIRE);
-
 		/* load-acquire synchronize with store-release of ht->tail
 		 * in update_tail.
@@ -94,4 +92,5 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 			r->prod.head = *new_head, success = 1;
 		else
+			/* on failure, *old_head is updated */
 			success = __atomic_compare_exchange_n(&r->prod.head,
 					old_head, *new_head,
@@ -136,11 +135,9 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
 
 	/* move cons.head atomically */
+	*old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
 	do {
 		/* Restore n as it may change every loop */
 		n = max;
 
-		*old_head = __atomic_load_n(&r->cons.head,
-					__ATOMIC_ACQUIRE);
-
 		/* this load-acquire synchronize with store-release of ht->tail
 		 * in update_tail.
@@ -167,4 +164,5 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
 			r->cons.head = *new_head, success = 1;
 		else
+			/* on failure, *old_head will be updated */
 			success = __atomic_compare_exchange_n(&r->cons.head,
 							old_head, *new_head,
-- 
2.19.0
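
For readers checking the backport, here is a minimal standalone sketch of
the behaviour the commit message relies on: when the GCC/Clang builtin
__atomic_compare_exchange_n fails, it writes the current value of the
target into the "expected" argument, so the loop does not need its own
load. This is not the DPDK code. The names move_head, demo and the
"snapshot" parameter are hypothetical, the ring capacity checks are left
out, and the memory orders are an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of librte_ring: advance *head by n.
 * "snapshot" stands in for the single load done before the loop; passing
 * a stale value simulates another producer having moved head meanwhile.
 */
static uint32_t
move_head(uint32_t *head, uint32_t snapshot, uint32_t n,
	  unsigned int *attempts)
{
	uint32_t old_head = snapshot;
	uint32_t new_head;

	*attempts = 0;
	do {
		new_head = old_head + n;
		(*attempts)++;
		/* on failure, old_head is overwritten with the current *head */
	} while (!__atomic_compare_exchange_n(head, &old_head, new_head,
					      0, __ATOMIC_ACQUIRE,
					      __ATOMIC_RELAXED));

	return old_head;
}

/* Single-threaded demonstration with a deliberately stale snapshot. */
static void
demo(void)
{
	uint32_t head = 5;
	unsigned int attempts;
	uint32_t old = move_head(&head, 0, 3, &attempts);

	assert(old == 5);      /* refreshed by the failed compare-exchange */
	assert(head == 8);     /* 5 + 3, applied on the second attempt */
	assert(attempts == 2); /* one failure, one success, no extra load */
}
```

Calling demo() from a test program should pass all three assertions: the
first compare-exchange fails because the snapshot (0) does not match head
(5), and the retry succeeds using the value the failed call wrote back.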

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2018-11-23 10:22:54.960223892 +0000
+++ 0027-ring-c11-move-atomic-load-of-head-above-the-loop.patch	2018-11-23 10:22:54.000000000 +0000
@@ -1,8 +1,10 @@
-From 047adc17245892198be31c54cf6658080df3dc6d Mon Sep 17 00:00:00 2001
+From b687a72eb402f07dc3fcdf1beb88a60e2cac422f Mon Sep 17 00:00:00 2001
 From: Gavin Hu <gavin.hu at arm.com>
 Date: Fri, 2 Nov 2018 19:21:28 +0800
 Subject: [PATCH] ring/c11: move atomic load of head above the loop
 
+[ upstream commit 047adc17245892198be31c54cf6658080df3dc6d ]
+
 In __rte_ring_move_prod_head, move the __atomic_load_n up and out of
 the do {} while loop as upon failure the old_head will be updated,
 another load is costly and not necessary.
@@ -16,7 +18,6 @@
  MP/MC bulk enq/dequeue (size: 32): 2.30
 
 Fixes: 39368ebfc606 ("ring: introduce C11 memory model barrier option")
-Cc: stable at dpdk.org
 
 Signed-off-by: Gavin Hu <gavin.hu at arm.com>
 Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
@@ -27,29 +28,9 @@
 Tested-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
 Acked-by: Olivier Matz <olivier.matz at 6wind.com>
 ---
- doc/guides/rel_notes/release_18_11.rst | 10 ++++++++++
- lib/librte_ring/rte_ring_c11_mem.h     | 10 ++++------
- 2 files changed, 14 insertions(+), 6 deletions(-)
-
-diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
-index c60879c69..cfa92b8c0 100644
---- a/doc/guides/rel_notes/release_18_11.rst
-+++ b/doc/guides/rel_notes/release_18_11.rst
-@@ -70,4 +70,14 @@ New Features
-   one device has addressing limitations, the dma mask is the more restricted one.
- 
-+* **Updated the C11 memory model version of ring library.**
-+
-+  The latency is decreased for architectures using the C11 memory model
-+  version of the ring library.
-+
-+  On Cavium ThunderX2 platform, the changes decreased latency by 27~29%
-+  and 3~15% for MPMC and SPSC cases respectively (with 2 lcores). The
-+  real improvements may vary with the number of contending lcores and
-+  the size of ring.
-+
- * **Added hot-unplug handle mechanism.**
- 
+ lib/librte_ring/rte_ring_c11_mem.h | 10 ++++------
+ 1 file changed, 4 insertions(+), 6 deletions(-)
+
 diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
 index 52da95a21..7bc74a4cb 100644
 --- a/lib/librte_ring/rte_ring_c11_mem.h