[dpdk-stable] patch 'ring/c11: relax ordering for load and store of the head' has been queued to stable release 18.08.1

Kevin Traynor ktraynor at redhat.com
Fri Nov 23 11:27:03 CET 2018

Previous message: [dpdk-stable] patch 'ring/c11: keep deterministic order allowing retry to work' has been queued to stable release 18.08.1
Next message: [dpdk-stable] patch 'pci: fix parsing of address without function number' has been queued to stable release 18.08.1

Hi,

FYI, your patch has been queued to stable release 18.08.1.

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/29/18. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the patch applied
to the branch. If the code is different (i.e. not only metadata diffs), due for example to
a change in context or macro names, please double check it.

Thanks.

Kevin Traynor

---
From 9514283c5c6663b4926b8f752f96e36b6a9fa723 Mon Sep 17 00:00:00 2001
From: Gavin Hu <gavin.hu at arm.com>
Date: Fri, 9 Nov 2018 19:42:47 +0800
Subject: [PATCH] ring/c11: relax ordering for load and store of the head

[ upstream commit 49594a63147a994d9674b1f479d0107e70fe1cbc ]

When calling __atomic_compare_exchange_n, use relaxed ordering for the
success case, as multiple producers/consumers do not release updates to
each other so no need for acquire or release ordering.

Because the thread fence is in place, ordering for the first iteration can
be relaxed.

Run the ring perf test on the following testbed:
HW: ThunderX2 B0 CPU CN9975 v2.0, 2 sockets, 28 cores, 4 threads/core, 2.5 GHz
OS: Ubuntu 16.04.5 LTS, Kernel: 4.15.0-36-generic
DPDK: 18.08, Configuration: arm64-armv8a-linuxapp-gcc
gcc: 8.1.0
$ sudo ./test/test/test -l 16-19,44-47,72-75,100-103 -n 4 \
--socket-mem=1024 -- -i

Without the patch:
*** Testing using two physical cores ***
SP/SC bulk enq/dequeue (size: 8): 5.75
MP/MC bulk enq/dequeue (size: 8): 10.18
SP/SC bulk enq/dequeue (size: 32): 1.80
MP/MC bulk enq/dequeue (size: 32): 2.34

With the patch:
*** Testing using two physical cores ***
SP/SC bulk enq/dequeue (size: 8): 5.59
MP/MC bulk enq/dequeue (size: 8): 10.54
SP/SC bulk enq/dequeue (size: 32): 1.73
MP/MC bulk enq/dequeue (size: 32): 2.38

No significant improvement or regression was seen, as the optimisation
is not at the critical path.
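
To put the "no significant change" reading in numbers, the relative change
of each result can be worked out from the figures listed above. The snippet
below is only an illustrative sketch: the helper name rel_change_pct and the
table layout are made up for this example, and the unit of the measurements
is not stated in the message, so only relative changes are computed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: relative change from `before` to `after`, in percent. */
static double
rel_change_pct(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* One row per test case, values copied from the results quoted above. */
struct bench_row {
	const char *name;
	double before;	/* without the patch */
	double after;	/* with the patch */
};

static const struct bench_row bench_rows[] = {
	{ "SP/SC bulk enq/dequeue (size: 8)",   5.75,  5.59 },
	{ "MP/MC bulk enq/dequeue (size: 8)",  10.18, 10.54 },
	{ "SP/SC bulk enq/dequeue (size: 32)",  1.80,  1.73 },
	{ "MP/MC bulk enq/dequeue (size: 32)",  2.34,  2.38 },
};

/* Print the relative change of every row. */
static void
print_rel_changes(void)
{
	size_t i;

	for (i = 0; i < sizeof(bench_rows) / sizeof(bench_rows[0]); i++)
		printf("%-36s %+.2f%%\n", bench_rows[i].name,
		       rel_change_pct(bench_rows[i].before, bench_rows[i].after));
}
```

Calling print_rel_changes() gives changes of roughly -2.8%, +3.5%, -3.9% and
+1.7%. All four stay within a few percent and point in both directions.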

Fixes: 39368ebfc6 ("ring: introduce C11 memory model barrier option")

Signed-off-by: Gavin Hu <gavin.hu at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
Reviewed-by: Steve Capper <steve.capper at arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl at arm.com>
---
 lib/librte_ring/rte_ring_c11_mem.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
index dc49a998f..0fb73a337 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -62,5 +62,5 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 	int success;
 
-	*old_head = __atomic_load_n(&r->prod.head, __ATOMIC_ACQUIRE);
+	*old_head = __atomic_load_n(&r->prod.head, __ATOMIC_RELAXED);
 	do {
 		/* Reset n to the initial burst count */
@@ -98,5 +98,5 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 			success = __atomic_compare_exchange_n(&r->prod.head,
 					old_head, *new_head,
-					0, __ATOMIC_ACQUIRE,
+					0, __ATOMIC_RELAXED,
 					__ATOMIC_RELAXED);
 	} while (unlikely(success == 0));
@@ -138,5 +138,5 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
 
 	/* move cons.head atomically */
-	*old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
+	*old_head = __atomic_load_n(&r->cons.head, __ATOMIC_RELAXED);
 	do {
 		/* Restore n as it may change every loop */
@@ -173,5 +173,5 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
 			success = __atomic_compare_exchange_n(&r->cons.head,
 							old_head, *new_head,
-							0, __ATOMIC_ACQUIRE,
+							0, __ATOMIC_RELAXED,
 							__ATOMIC_RELAXED);
 	} while (unlikely(success == 0));
-- 
2.19.0
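
For readers who want to see the ordering argument outside the diff context,
here is a minimal, self-contained sketch of a head-move loop that uses the
same GCC __atomic builtins as the patch. It is not the DPDK implementation:
struct toy_ring and toy_move_prod_head are hypothetical names, the ring is
reduced to two counters, and the position of the acquire fence inside the
loop is an assumption based on the commit message's remark that a thread
fence is in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for struct rte_ring. */
struct toy_ring {
	uint32_t capacity;	/* usable number of slots */
	uint32_t prod_head;	/* next slot a producer may reserve */
	uint32_t cons_tail;	/* last slot released by the consumers */
};

/*
 * Try to reserve up to n slots for a producer.
 * Returns the number of slots reserved (0 if the ring is full) and
 * reports the reserved range through old_head/new_head.
 */
static unsigned int
toy_move_prod_head(struct toy_ring *r, unsigned int n,
		   uint32_t *old_head, uint32_t *new_head)
{
	const unsigned int max = n;
	uint32_t cons_tail, free_entries;
	int success;

	assert(r != NULL);

	/* Relaxed is enough here: the fence below orders the later loads. */
	*old_head = __atomic_load_n(&r->prod_head, __ATOMIC_RELAXED);
	do {
		/* Reset n, it may have been clamped in a previous iteration. */
		n = max;

		/* Assumed placement: order the head load before the tail load. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);

		/* Acquire pairs with the release store done by consumers. */
		cons_tail = __atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE);

		/* Unsigned arithmetic keeps this correct across wrap-around. */
		free_entries = r->capacity + cons_tail - *old_head;
		if (n > free_entries)
			n = free_entries;
		if (n == 0)
			return 0;

		*new_head = *old_head + n;

		/*
		 * Producers only compete for the head and publish nothing to
		 * each other through it, so success can be relaxed as well.
		 * On failure old_head is refreshed and the loop retries.
		 */
		success = __atomic_compare_exchange_n(&r->prod_head,
				old_head, *new_head,
				0, __ATOMIC_RELAXED,
				__ATOMIC_RELAXED);
	} while (success == 0);

	return n;
}
```

In a single-threaded run on a ring with capacity 8, requesting 5 slots twice
reserves 5 and then 3, and a third request returns 0 because no free entries
remain. The sketch leaves out the single-producer path and the tail update.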

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2018-11-23 10:22:55.777704374 +0000
+++ 0059-ring-c11-relax-ordering-for-load-and-store-of-the-he.patch	2018-11-23 10:22:54.000000000 +0000
@@ -1,8 +1,10 @@
-From 49594a63147a994d9674b1f479d0107e70fe1cbc Mon Sep 17 00:00:00 2001
+From 9514283c5c6663b4926b8f752f96e36b6a9fa723 Mon Sep 17 00:00:00 2001
 From: Gavin Hu <gavin.hu at arm.com>
 Date: Fri, 9 Nov 2018 19:42:47 +0800
 Subject: [PATCH] ring/c11: relax ordering for load and store of the head
 
+[ upstream commit 49594a63147a994d9674b1f479d0107e70fe1cbc ]
+
 When calling __atomic_compare_exchange_n, use relaxed ordering for the
 success case, as multiple producers/consumers do not release updates to
 each other so no need for acquire or release ordering.
@@ -36,7 +38,6 @@
 is not at the critical path.
 
 Fixes: 39368ebfc6 ("ring: introduce C11 memory model barrier option")
-Cc: stable at dpdk.org
 
 Signed-off-by: Gavin Hu <gavin.hu at arm.com>
 Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
