[dpdk-dev] [5/5] vhost: remove useless casts to volatile

Ilya Maximets i.maximets at samsung.com
Fri Dec 7 12:16:53 CET 2018


On 06.12.2018 19:59, Maxime Coquelin wrote:
> Hi Ilya,
> 
> On 12/5/18 2:52 PM, Ilya Maximets wrote:
>> On 05.12.2018 12:49, Maxime Coquelin wrote:
>>> Cast to volatile is done when reading avail index and writing
>>> the used index. This would not be necessary if proper barriers
>>> are used.
>>
>> 'volatile' and barriers are not really connected. 'volatile' is
>> the disabling of the compiler optimizations, while barriers are
>> for runtime CPU level optimizations. In general, casts here made
>> to force compiler to actually read the value and not cache it
>> somehow. In fact that vhost library never writes to avail index,
>> "very smart" compiler could drop it at all. None of modern compilers
>> will do that for a single operation within a function, so,
>> volatiles are not really necessary in current code, but they could
>> save some nerves in case of code/compiler changes.
> 
> Ok, thanks for the explanation.
> Why don't we do the same in Virtio PMD?

Maybe we should. It works because in virtio all the accesses wrapped
by short access functions like 'vq_update_avail_idx'. And we, actually,
never reading the same value twice in the same function. Compilers
today does not optimize such memory accesses.

> 
>> OTOH, IMHO, the main purpose of the casts in current code is
>> the self-documenting. Casts forces to pay special attention to
>> these variables and reminds that they could be updated in other
>> process. Casts allows to understand which variables are local and
>> which are shared. I don't think that we should remove them anyway.
> 
> It is not only self-documenting, it has an impact on generated code:
> 
>>>
>>> Now that the read barrier has been added, we can remove these
>>> cast to volatile.
>>>
>>> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
>>> ---
>>>   lib/librte_vhost/virtio_net.c | 7 +++----
>>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>>> index 679ce388b..eab1a5b4c 100644
>>> --- a/lib/librte_vhost/virtio_net.c
>>> +++ b/lib/librte_vhost/virtio_net.c
>>> @@ -114,7 +114,7 @@ flush_shadow_used_ring_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
>>>         vhost_log_cache_sync(dev, vq);
>>>   -    *(volatile uint16_t *)&vq->used->idx += vq->shadow_used_idx;
>>> +    vq->used->idx += vq->shadow_used_idx;
> 
> With cast to volatile:
>     *(volatile uint16_t *)&vq->used->idx += vq->shadow_used_idx;
>     35f8:    49 8b 53 10              mov    0x10(%r11),%rdx
>     vq->shadow_used_idx = 0;
>     35fc:    31 db                    xor    %ebx,%ebx
>     *(volatile uint16_t *)&vq->used->idx += vq->shadow_used_idx;
>     35fe:    0f b7 42 02              movzwl 0x2(%rdx),%eax
>     3602:    66 41 03 43 70           add    0x70(%r11),%ax
>     3607:    66 89 42 02              mov    %ax,0x2(%rdx)
>     vq->shadow_used_idx = 0;
> 
> Without it:
>     vq->used->idx += vq->shadow_used_idx;
>     35f8:    49 8b 43 10              mov    0x10(%r11),%rax
>     35fc:    41 0f b7 53 70           movzwl 0x70(%r11),%edx
>     vq->shadow_used_idx = 0;
>     3601:    31 db                    xor    %ebx,%ebx
>     vq->used->idx += vq->shadow_used_idx;
>     3603:    66 01 50 02              add    %dx,0x2(%rax)
>     vq->shadow_used_idx = 0;
> 
> If my understanding is correct there is no functional change, but we save one instruction by removing the cast to volatile.

IMHO, it's a gcc issue that it could not understand that cast and
dereference could be dropped. For example, clang on my ubuntu
generates equal code:

With cast to volatile:

        *(volatile uint16_t *)&vq->used->idx += vq->shadow_used_idx;
   32550:       41 0f b7 42 70          movzwl 0x70(%r10),%eax
   32555:       49 8b 4a 10             mov    0x10(%r10),%rcx
   32559:       66 01 41 02             add    %ax,0x2(%rcx)
        vq->shadow_used_idx = 0;
   3255d:       66 41 c7 42 70 00 00    movw   $0x0,0x70(%r10)

Without it:

        vq->used->idx += vq->shadow_used_idx;
   32550:       41 0f b7 42 70          movzwl 0x70(%r10),%eax
   32555:       49 8b 4a 10             mov    0x10(%r10),%rcx
   32559:       66 01 41 02             add    %ax,0x2(%rcx)
        vq->shadow_used_idx = 0;
   3255d:       66 41 c7 42 70 00 00    movw   $0x0,0x70(%r10)


However, different code appears only in '+=' case.
Why we have this increment at all? Following change will eliminate
the generated code difference:

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 5e1a1a727..5776975ca 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -114,7 +114,7 @@ flush_shadow_used_ring_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
 
        vhost_log_cache_sync(dev, vq);
 
-       *(volatile uint16_t *)&vq->used->idx += vq->shadow_used_idx;
+       *(volatile uint16_t *)&vq->used->idx = vq->last_used_idx;
        vq->shadow_used_idx = 0;
        vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
                sizeof(vq->used->idx));
---

What do you think?


Best regards, Ilya Maximets.


More information about the dev mailing list