[dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue
Olivier MATZ
olivier.matz at 6wind.com
Thu Oct 12 17:53:51 CEST 2017
Hi,
On Tue, Oct 10, 2017 at 05:56:36PM +0800, Jia He wrote:
> Before this patch:
> In __rte_ring_move_cons_head()
> ...
> do {
> /* Restore n as it may change every loop */
> n = max;
>
> *old_head = r->cons.head; //1st load
> const uint32_t prod_tail = r->prod.tail; //2nd load
>
> In weak memory order architectures(powerpc,arm), the 2nd load might be
> reodered before the 1st load, that makes *entries is bigger than we wanted.
> This nasty reording messed enque/deque up.
>
> cpu1(producer) cpu2(consumer) cpu3(consumer)
> load r->prod.tail
> in enqueue:
> load r->cons.tail
> load r->prod.head
>
> store r->prod.tail
>
> load r->cons.head
> load r->prod.tail
> ...
> store r->cons.{head,tail}
> load r->cons.head
>
> THEN,r->cons.head will be bigger than prod_tail, then make *entries very big
>
> After this patch, the old cons.head will be recaculated after failure of
> rte_atomic32_cmpset
>
> There is no such issue in X86 cpu, because X86 is strong memory order model
>
> Signed-off-by: Jia He <hejianet at gmail.com>
> Signed-off-by: jia.he at hxt-semitech.com
> Signed-off-by: jie2.liu at hxt-semitech.com
> Signed-off-by: bing.zhao at hxt-semitech.com
>
> ---
> lib/librte_ring/rte_ring.h | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 5e9b3b7..15c72e2 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -409,6 +409,10 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> n = max;
>
> *old_head = r->prod.head;
> +
> + /* load of prod.tail can't be reordered before cons.head */
> + rte_smp_rmb();
> +
> const uint32_t cons_tail = r->cons.tail;
> /*
> * The subtraction is done between two unsigned 32bits value
> @@ -517,6 +521,10 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> n = max;
>
> *old_head = r->cons.head;
> +
> + /* load of prod.tail can't be reordered before cons.head */
> + rte_smp_rmb();
> +
> const uint32_t prod_tail = r->prod.tail;
> /* The subtraction is done between two unsigned 32bits value
> * (the result is always modulo 32 bits even if we have
> --
> 2.7.4
>
The explanation convinces me.
However, since it's in a critical path, it would be good to have other
opinions. This patch reminds me this discussion, that was also related to
memory barrier, but at another place:
http://dpdk.org/ml/archives/dev/2016-July/043765.html
Lead to that patch: http://dpdk.org/browse/dpdk/commit/?id=ecc7d10e448e
But finally reverted: http://dpdk.org/browse/dpdk/commit/?id=c3acd92746c3
Konstatin, Jerin, do you have any comment?
More information about the dev
mailing list