[dpdk-dev] [PATCH] vhost: improve dirty pages logging performance

Maxime Coquelin maxime.coquelin at redhat.com
Wed May 16 17:00:50 CEST 2018



On 05/16/2018 08:10 AM, Tiwei Bie wrote:
> On Tue, May 15, 2018 at 03:50:54PM +0200, Maxime Coquelin wrote:
>> Hi Tiwei,
>>
>> I just saw that I missed replying to your comment on my commit message:
>>
>> On 05/03/2018 01:56 PM, Tiwei Bie wrote:
>>> On Mon, Apr 30, 2018 at 05:59:54PM +0200, Maxime Coquelin wrote:
>>>> This patch caches all dirty pages logging until the used ring index
>>>> is updated. These dirty pages won't be accessed by the guest as
>>>> long as the host doesn't give them back to it by updating the
>>>> index.
>>> The sentence below, from the commit message above, isn't the reason
>>> why we can cache the dirty page logging, right?
>>>
>>> """
>>> These dirty pages won't be accessed by the guest as
>>> long as the host doesn't give them back to it by updating the
>>> index.
>>
>> That's my understanding.
>> As long as the used index is not updated, the guest will not process
>> the descs.
>> If the migration converges between the time the descs are written
>> and the time the used index is updated on the source side, then the
>> guest running on the destination will see the descriptors not as used
>> but as available, and so they will be overwritten by the vhost backend
>> on the destination.
> 
> If my understanding is correct, theoretically the vhost
> backend can cache all the dirty page loggings before it
> responds to the GET_VRING_BASE messages. Below are the
> steps of how QEMU live migration works (w/o postcopy):
> 
> 1. Syncing dirty pages between src and dst;
> 2. The dirty page sync converges;
> 3. The src QEMU sends GET_VRING_BASE to vhost backend;
> 4. Vhost backend still has a chance to log some dirty
>     pages before responding to the GET_VRING_BASE messages;
> 5. The src QEMU receives GET_VRING_BASE response (which
>     means the device has stopped);
> 6. QEMU syncs the remaining dirty pages;
> 7. QEMU on the dst starts running.

Thanks for the clarification.
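
To make the ordering concrete, here is a minimal sketch in C of the
caching idea discussed above. It is not the patch itself: the struct,
the function names, the 4 KiB page size and the cache capacity are all
assumptions made for illustration. It only shows that cached entries
are written to the shared bitmap before the used index is published,
and again before the GET_VRING_BASE reply (step 4 above).

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the DPDK vhost code. */
#define PAGE_SHIFT     12  /* assumption: 4 KiB guest pages */
#define LOG_CACHE_SIZE 32  /* assumption: arbitrary cache capacity */

struct dirty_log {
    uint8_t *bitmap;                /* one bit per guest page, read by QEMU */
    uint64_t cache[LOG_CACHE_SIZE]; /* page numbers not yet in the bitmap */
    size_t   cached;                /* number of valid cache entries */
};

/* Set the bit of one page in the shared bitmap. */
static void log_set_page(struct dirty_log *log, uint64_t page)
{
    log->bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Write every cached page to the bitmap and empty the cache. */
static void log_cache_flush(struct dirty_log *log)
{
    for (size_t i = 0; i < log->cached; i++)
        log_set_page(log, log->cache[i]);
    log->cached = 0;
}

/* Remember the pages touched by a write of len bytes at guest address addr. */
static void log_cache_write(struct dirty_log *log, uint64_t addr, uint64_t len)
{
    if (len == 0)
        return;

    uint64_t first = addr >> PAGE_SHIFT;
    uint64_t last = (addr + len - 1) >> PAGE_SHIFT;

    for (uint64_t page = first; page <= last; page++) {
        size_t i;

        for (i = 0; i < log->cached; i++)
            if (log->cache[i] == page)
                break;
        if (i < log->cached)
            continue;               /* page is already cached */

        if (log->cached == LOG_CACHE_SIZE)
            log_cache_flush(log);   /* cache full: flush early */
        log->cache[log->cached++] = page;
    }
}

/* Flush the cached logging, then publish the new used index. */
static void update_used_idx(struct dirty_log *log,
                            _Atomic uint16_t *used_idx, uint16_t new_idx)
{
    log_cache_flush(log);
    /* Release ordering: earlier stores are visible before the index. */
    atomic_store_explicit(used_idx, new_idx, memory_order_release);
}

/* Flush anything still cached before answering GET_VRING_BASE. */
static uint16_t get_vring_base(struct dirty_log *log, uint16_t last_avail_idx)
{
    log_cache_flush(log);
    return last_avail_idx;
}
```

A caller would use log_cache_write() for each buffer it fills,
update_used_idx() once per burst, and get_vring_base() when the
device is being stopped. With this split, the shared bitmap is
touched once per burst instead of once per descriptor.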

> (Steps 3~6 are the downtime, which we want to minimize)

Right, we want to minimize the downtime, but at the same time,
be able to converge.

> So I think the wording in the commit log isn't really related
> to why we can cache the dirty page loggings.

I'll try to improve it in v3.

Thanks,
Maxime

> Best regards,
> Tiwei Bie
> 
>>
>> Regards,
>> Maxime

