[dpdk-dev] vhost-net stops sending to virtio pmd -- already fixed?

Xie, Huawei huawei.xie at intel.com
Wed Sep 16 03:37:21 CEST 2015


On 9/16/2015 5:05 AM, Kyle Larose wrote:
> On Sun, Sep 13, 2015 at 5:43 PM, Thomas Monjalon
> <thomas.monjalon at 6wind.com> wrote:
>> Hi,
>>
>> 2015-09-11 12:32, Kyle Larose:
>>> Looking through the version tree for virtio_rxtx.c, I saw the following
>>> commit:
>>>
>>> http://dpdk.org/browse/dpdk/commit/lib/librte_pmd_virtio?id=8c09c20fb4cde76e53d87bd50acf2b441ecf6eb8
>>>
>>> Does anybody know offhand if the issue fixed by that commit could be the
>>> root cause of what I am seeing?
>> I don't have the definitive answer, but I would like to use your question
>> to highlight a common issue in git messages:
>>
>> PLEASE, authors of fixes, explain the bug you are fixing and how it can
>> be reproduced. Good commit messages REALLY do get read, and they are useful.
>>
>> Thanks
>>
> I've figured out what happened. It has nothing to do with the fix I
> pasted above. Instead, the issue has to do with running low on mbufs.
>
> Here's the general logic:
>
> 1. If packets are not queued, return
> 2. Fetch each queued packet, as an mbuf, into the provided array. This
> may involve some merging/etc
> 3. Try to fill the virtio receive ring with new mbufs
>   3.a. If we fail to allocate an mbuf, break out of the refill loop
> 4. Update the receive ring information and kick the host
>
> This is obviously a simplification, but the key point is 3.a. If we
> hit this logic when the virtio receive ring is completely used up, we
> essentially lock up. The host will have no buffers with which to queue
> packets, so the next time we poll, we will hit case 1. However, since
> we hit case 1, we will not allocate mbufs to the virtio receive ring,
> regardless of how many are now free. Rinse and repeat; we are stuck
> until the pmd is restarted or the link is restarted.
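>
> To make the failure mode concrete, here is a rough sketch of that poll
> path in pseudo-C. It is purely illustrative: the helpers and the rxq
> fields are made up, and this is not the actual virtio_rxtx.c code.
>
>     static uint16_t
>     rx_poll(struct rxq *q, struct rte_mbuf **pkts, uint16_t max)
>     {
>         if (ring_used_count(q) == 0)                 /* step 1 */
>             return 0;                                /* no refill on this path */
>
>         uint16_t nb_rx = dequeue_pkts(q, pkts, max); /* step 2 */
>
>         while (ring_free_count(q) > 0) {             /* step 3 */
>             struct rte_mbuf *m = rte_pktmbuf_alloc(q->mp);
>             if (m == NULL)
>                 break;                               /* step 3.a */
>             post_rx_buf(q, m);
>         }
>
>         update_avail_and_kick(q);                    /* step 4 */
>         return nb_rx;
>     }
>
> Once 3.a leaves the ring with zero posted buffers, the early return in
> step 1 prevents the refill loop from ever running again.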
>
> This is very easy to reproduce when the mbuf pool is fairly small, and
> packets are being passed to worker threads/processes which may
> increase the length of the pipeline.
Sorry for the trouble, and thanks a lot for your investigation. It makes
sense. We will check the code and get back to you.
I remember we fixed a similar problem before; we will check whether there
are other deadlock issues.
>
> I took a quick look at the ixgbe driver, and it looks like it checks
> if it needs to allocate mbufs to the ring before trying to pull
> packets off the nic. Should we not be doing something similar for
> virtio? Rather than breaking out early if no packets are queued, we
> should first make sure there are resources with which to queue
> packets!
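>
> In other words, something like the following (again illustrative
> pseudo-C, reusing the hypothetical helpers from the sketch above, with
> try_refill() standing in for the allocation loop of step 3):
>
>     static uint16_t
>     rx_poll(struct rxq *q, struct rte_mbuf **pkts, uint16_t max)
>     {
>         /* Refill before the early return, so an empty ring can recover
>          * once mbufs have been freed elsewhere in the pipeline. */
>         if (ring_free_count(q) > 0 && try_refill(q) > 0)
>             update_avail_and_kick(q);
>
>         if (ring_used_count(q) == 0)
>             return 0;
>
>         uint16_t nb_rx = dequeue_pkts(q, pkts, max);
>         try_refill(q);
>         update_avail_and_kick(q);
>         return nb_rx;
>     }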
Yes, this is an implementation bug.
>
> One solution here is to increase the mbuf pool to a size where such
> exhaustion is impossible, but that doesn't seem like a graceful
> solution. For example, it may be desirable to drop packets rather than
> have a large memory pool, and becoming stuck in such a situation is
> not good. Further, it isn't easy to know the exact size required. You
> may end up wasting resources by allocating far more than necessary, or
> you may unknowingly under-allocate, only to find out once your
> application has been deployed to production and is dropping everything
> on the floor.
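>
> (For what it's worth, one rough back-of-the-envelope sizing, purely as
> an illustration of why the bound is hard to pin down:)
>
>     /*
>      * Very rough estimate of the mbuf count an application needs:
>      *
>      *   nb_mbufs >= nb_ports * (nb_rx_descs + nb_tx_descs)
>      *             + nb_lcores * (burst_size + mempool_cache_size)
>      *             + packets held inside the application pipeline
>      *
>      * The pipeline term is exactly the part that is hard to bound
>      * up front.
>      */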
Kyle:
Could you tell us how you produced this issue: with a very small pool
size, or by using a pipeline model?
>
> Does anyone have thoughts on this? I took a look at virtio_rxtx at head
> and I didn't see anything resembling my suggestion.
>
> Comments would be appreciated. Thanks,
>
> Kyle
>


