[dpdk-dev] [PATCH 1/2] mem: add write memory barrier before changing heap state
didier.pallard
didier.pallard at 6wind.com
Wed Apr 16 10:55:07 CEST 2014
On 04/15/2014 04:08 PM, Richardson, Bruce wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of David Marchand
>> Sent: Tuesday, April 15, 2014 2:51 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH 1/2] mem: add write memory barrier before
>> changing heap state
>>
>> From: Didier Pallard <didier.pallard at 6wind.com>
>>
>> a write memory barrier is needed before changing heap state value, else some
>> concurrent core may see state changing before all initialization values are
>> written to memory, causing unpredictable results in malloc function.
>>
>> Signed-off-by: Didier Pallard <didier.pallard at 6wind.com>
> No barrier should be necessary here. As in a number of other places, such as rings, compiler barriers can be used in place of write memory barriers, due to IA ordering rules. However, in this case, both variables referenced are volatile variables and so the assignments to them cannot be reordered by the compiler so no compiler barrier is necessary either.
>
> Regards,
> /Bruce
Hi Bruce,
Indeed a compiler barrier is absolutely needed here. Accessing a volatile
variable is not a serializing operation from the compiler's point of view;
only the atomic operations serialize, because they use an asm volatile
directive with a "memory" clobber.
Here is the assembler generated with and without rte_wmb:
With rte_wmb
142: f0 45 0f b1 07 lock cmpxchg %r8d,(%r15)
147: 0f 94 c0 sete %al
14a: 84 c0 test %al,%al
14c: 74 ea je 138 <malloc_heap_alloc+0x68>
14e: 49 c7 47 10 00 00 00 movq $0x0,0x10(%r15)
155: 00
156: 41 c7 47 18 00 00 00 movl $0x0,0x18(%r15)
15d: 00
15e: 41 c7 47 08 00 00 00 movl $0x0,0x8(%r15)
165: 00
166: 41 c7 47 1c 00 00 00 movl $0x0,0x1c(%r15)
16d: 00
16e: 49 c7 47 20 00 00 00 movq $0x0,0x20(%r15)
175: 00
176: 45 89 57 04 mov %r10d,0x4(%r15)
17a: 0f ae f8 sfence
 17d: 41 c7 07 02 00 00 00 movl $0x2,(%r15)
 184: 41 8b 37 mov (%r15),%esi
 187: 83 fe 02 cmp $0x2,%esi
18a: 75 b4 jne 140 <malloc_heap_alloc+0x70>
18c: 0f 1f 40 00 nopl 0x0(%rax)
190: 48 83 c3 3f add $0x3f,%rbx
Without rte_wmb
142: f0 45 0f b1 07 lock cmpxchg %r8d,(%r15)
147: 0f 94 c0 sete %al
14a: 84 c0 test %al,%al
14c: 74 ea je 138 <malloc_heap_alloc+0x68>
14e: 49 c7 47 10 00 00 00 movq $0x0,0x10(%r15)
155: 00
156: 41 c7 47 08 00 00 00 movl $0x0,0x8(%r15)
15d: 00
 15e: 41 c7 07 02 00 00 00 movl $0x2,(%r15)
 165: 41 8b 37 mov (%r15),%esi
 168: 41 c7 47 18 00 00 00 movl $0x0,0x18(%r15)
16f: 00
170: 41 c7 47 1c 00 00 00 movl $0x0,0x1c(%r15)
177: 00
178: 49 c7 47 20 00 00 00 movq $0x0,0x20(%r15)
17f: 00
180: 45 89 57 04 mov %r10d,0x4(%r15)
184: 83 fe 02 cmp $0x2,%esi
187: 75 b7 jne 140 <malloc_heap_alloc+0x70>
189: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
190: 48 83 c3 3f add $0x3f,%rbx
It's clear that the "heap->initialised = INITIALISED;" store has been
reordered by the compiler: in the second dump it appears in the middle of
the initialization stores instead of after all of them.
Regarding rte_wmb and rte_rmb, I agree with you that on the Intel
architecture those macros need be nothing more than compiler barriers,
given Intel's memory-ordering guarantees.
But for code completeness, I think the memory barriers should remain in
place in the code, with rte_*mb mapping to a compiler barrier on the Intel
architecture.
didier