[dpdk-dev] [PATCH] vhost: flush used->idx update before reading avail->flags

Linhaifeng haifeng.lin at huawei.com
Wed Jun 10 04:36:48 CEST 2015



On 2015/6/9 21:34, Xie, Huawei wrote:
> On 6/9/2015 4:47 PM, Michael S. Tsirkin wrote:
>> On Tue, Jun 09, 2015 at 03:04:02PM +0800, Linhaifeng wrote:
>>>
>>> On 2015/4/24 15:27, Luke Gorrie wrote:
>>>> On 24 April 2015 at 03:01, Linhaifeng <haifeng.lin at huawei.com> wrote:
>>>>
>>>>> Without the memory fence, what would happen? Packet loss or interrupt
>>>>> loss? How can it be tested?
>>>>>
>>>> You should be able to test it like this:
>>>>
>>>> 1. Boot two Linux kernel (e.g. 3.13) guests.
>>>> 2. Connect them via vhost switch.
>>>> 3. Run continuous traffic between them (e.g. iperf).
>>>>
>>>> I would expect that within a reasonable timeframe (< 1 hour) one of the
>>>> guests' network interfaces will hang indefinitely due to a missed interrupt.
>>>>
>>>> You won't be able to reproduce this using DPDK guests because they are not
>>>> using the same interrupt suppression method.
>>>>
>>>> This is a serious real-world problem. I wouldn't deploy the vhost
>>>> implementation without this fix.
>>>>
>>>> Cheers,
>>>> -Luke
>>>>
>>> I think this patch can't resolve this problem. Even with it applied, we would still miss interrupts.
>>>
>>> After adding rte_mb(), the ordering we want is:
>>> 1. write used->idx (ring is full or empty).
>>> 2. virtio_net enables interrupts.
>>> 3. read avail->flags.
>>>
>>> but this ordering (missed interrupt) can still happen:
>>> 1. write used->idx (ring is full or empty).
>>> 2. read avail->flags.
>>> 3. virtio_net enables interrupts.
>>>
>> That's why a correct guest, after detecting an empty used ring, must always
>> re-check used idx at least once after writing avail->flags.
>>
>> By the way, similarly, host side must re-check avail idx after writing
>> used flags. I don't see where snabbswitch does it - is that a bug
>> in snabbswitch?
>>
> yes, both host and guest should recheck if there is more work added
> after they toggle the flag.
> For DPDK vHost, as it runs in polling mode, we will recheck avail idx
> soon, so we don't need to recheck.
> 
> 

DPDK does check the avail idx, but when there is nothing new it simply returns without doing anything, like this:
if (vq->last_used_idx == avail_idx) {
	return;
}

Suppose we miss an interrupt after calling rte_mb(), that is,
(!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) is false at that point,
while (vq->last_used_idx == avail_idx) is true.

Then the guest never receives the interrupt and virtio-net would stop working.

Would this case happen?




