[dpdk-dev] virtio "how to restart applications" - //dpdk.org/doc/virtio-net-pmd
Gopakumar Choorakkot Edakkunni
gopakumar.c.e at gmail.com
Sat Mar 18 22:37:56 CET 2017
I mean vtpci_reset is called from rte_eal_pci_probe() which is the *last*
thing in rte_eal_init(), *after* hugepage init, so if I can somehow get
that done *before* hugepage init maybe all will be well (because I cant do
anything to fix the host side)
Rgds,
Gopa.
On Sat, Mar 18, 2017 at 2:32 PM, Gopakumar Choorakkot Edakkunni <
gopakumar.c.e at gmail.com> wrote:
> Hi Yuan,
>
> As a "hack"/"workaround", in rte_eal_init(), if I can call vtpci_reset()
> just before rte_eal_memory_init(), that should take care of the problem of
> host zeroing out hugepages right ? As of today vtpci_reset() is called in
> rte_eal_dev_init() which comes *after* rte_eal_memory_init()
>
> Rgds,
> Gopa.
>
> On Thu, Mar 16, 2017 at 10:50 PM, Gopakumar Choorakkot Edakkunni <
> gopakumar.c.e at gmail.com> wrote:
>
>> Thanks again Yuanhan, you are the true expert!!
>>
>> Rgds,
>> Gopa.
>>
>> On Thu, Mar 16, 2017 at 10:40 PM, Yuanhan Liu <
>> yuanhan.liu at linux.intel.com> wrote:
>>
>>> On Thu, Mar 16, 2017 at 10:30:09PM -0700, Gopakumar Choorakkot Edakkunni
>>> wrote:
>>> > Thanks for the confirmation, glad I reached the person who knows the
>>> nuts and
>>> > bolts of virtio :-). So if the host is not in our control (ie if I am
>>> just
>>> > running as a VM on host provided by thirdparty vendor), is there any
>>> workaround
>>> > I can do from the guest side to prevent problems from happening on a
>>> guest
>>> > restart ?
>>>
>>> Not too much. You might want to hack the guest DPDK EAL memory initiation
>>> part though, to not reset the hugepage memory on start. But that's too
>>> hacky
>>> that I will not recommend you to do so!
>>>
>>> > And if theres no workarounds at all and the host has to change,
>>> instead of
>>> > asking the third party vendor to do a wholesale upgrade to 16.04, is
>>> there one/
>>> > few commits that can be added to the host ovs-dpdk to take care of
>>> this guest
>>> > restart virtio-reset-before opening case ?
>>>
>>> Yes, backporting the commits I have mentioned should be able to fix it.
>>> But please note that I did some code refactorings before those fixes: it
>>> won't apply cleanly to DPDK v2.2.
>>>
>>> And if you want to upgrade, I'd suggest to upgrade to v16.11, which is
>>> LTS release.
>>>
>>> --yliu
>>> >
>>> > Rgds,
>>> > Gopa.
>>> >
>>> > On Thu, Mar 16, 2017 at 10:24 PM, Yuanhan Liu <
>>> yuanhan.liu at linux.intel.com>
>>> > wrote:
>>> >
>>> > On Thu, Mar 16, 2017 at 10:20:30PM -0700, Gopakumar Choorakkot
>>> Edakkunni
>>> > wrote:
>>> > > >> When I was saying dpdk version, I meant the DPDK version with
>>> OVS.
>>> > >
>>> > > Oh I see! My apologies for the misuderstanding. The dpdk version
>>> used by
>>> > host
>>> > > ovs should be dpdk2.2, the guest process uses dpdk16.07. The OVS
>>> process
>>> > is not
>>> > > getting restarted, what is getting restarted is the guest
>>> process using
>>> > > dpdk16.07 - so the above clarifications you had about virtio
>>> being
>>> > > reset-before-opened on guest restart - does that still hold good
>>> or does
>>> > that
>>> > > need the HOST side dpdk to be 16.04 or above ?
>>> >
>>> > Yes, the HOST dpdk should be >= v16.04.
>>> >
>>> > --yliu
>>> > >
>>> > > >> And yes, the fixes are not included in the DPDK required for
>>> OVS 2.4.
>>> > >
>>> > > Thanks for the info.
>>> > >
>>> > > Rgds,
>>> > > Gopa.
>>> > >
>>> > > On Thu, Mar 16, 2017 at 10:13 PM, Yuanhan Liu <
>>> > yuanhan.liu at linux.intel.com>
>>> > > wrote:
>>> > >
>>> > > On Thu, Mar 16, 2017 at 09:56:01PM -0700, Gopakumar
>>> Choorakkot
>>> > Edakkunni
>>> > > wrote:
>>> > > > Hi Yuanhan,
>>> > > >
>>> > > > Thanks for the confirmation about not having to do
>>> anything special
>>> > to
>>> > > close
>>> > > > the ports on dpdk going down or coming up.
>>> > > >
>>> > > > As for the question about if I met any issue of ovs
>>> getting stuck -
>>> > yes,
>>> > > my
>>> > > > guest process runs dpdk 16.07 as I mentioned earlier - and
>>> if I
>>> > kill my
>>> > > guest
>>> > > > process, then the host OVS-dpdk on the host reports stall
>>> ! The
>>> > OVS-dpdk
>>> > > and
>>> > > > emu versions I use are as below. But maybe that is because
>>> of the
>>> > ovs
>>> > > missing
>>> > > > the fixes you mentioned ?
>>> > >
>>> > > When I was saying dpdk version, I meant the DPDK version
>>> with OVS.
>>> > >
>>> > > > ~# ovs-vswitchd --version
>>> > > > ovs-vswitchd (Open vSwitch) 2.4.1
>>> > >
>>> > > And yes, the fixes are not included in the DPDK required for
>>> OVS 2.4.
>>> > >
>>> > > --yliu
>>> > >
>>> > > > Compiled Nov 14 2016 06:53:31
>>> > > > # kvm --version
>>> > > > QEMU emulator version 2.2.0, Copyright (c) 2003-2008
>>> Fabrice
>>> > Bellard
>>> > > > ~#
>>> > > >
>>> > > >
>>> > > > Rgds,
>>> > > > Gopa.
>>> > > >
>>> > > > On Thu, Mar 16, 2017 at 9:35 PM, Yuanhan Liu <
>>> > yuanhan.liu at linux.intel.com
>>> > > >
>>> > > > wrote:
>>> > > >
>>> > > > On Thu, Mar 16, 2017 at 07:48:28PM -0700, Gopakumar
>>> Choorakkot
>>> > > Edakkunni
>>> > > > wrote:
>>> > > > > Thanks a lot for the response Yuanhan. I am using
>>> dpdk
>>> > v16.07. So
>>> > > what
>>> > > > you are
>>> > > > > saying is that in 16.07, we dont really need to call
>>> > > rte_eth_dev_close()
>>> > > > on
>>> > > > > exit,
>>> > > >
>>> > > > It's not about "don't really need", it's more like
>>> "it's hard
>>> > to".
>>> > > Just
>>> > > > think that it may crash at any time.
>>> > > >
>>> > > > > because dpdk will ensure that it will do virtio
>>> reset before
>>> > init
>>> > > when it
>>> > > > > comes up right ?
>>> > > >
>>> > > > No, It just handles the abnormal case well when guest
>>> APP
>>> > restarts.
>>> > > >
>>> > > > > Regarding the vhost commits you mentioned - do we
>>> still need
>>> > those
>>> > > fixes
>>> > > > if we
>>> > > > > have the "virtio reset before init" mechanism ?
>>> > > >
>>> > > > Yes, we still need them: just think some malicious
>>> guest may
>>> > also
>>> > > forge
>>> > > > data like that.
>>> > > >
>>> > > > I'm a bit confused then. Have you actually met any
>>> issue (like
>>> > got
>>> > > stucked)
>>> > > > with DPDK v16.07?
>>> > > >
>>> > > > --yliu
>>> > > >
>>> > > > > Or that is a seperate problem
>>> > > > > altogether (and hence we would need those fixes) ?
>>> > > > >
>>> > > > > Rgds,
>>> > > > > Gopa.
>>> > > > >
>>> > > > > On Thu, Mar 16, 2017 at 7:06 PM, Yuanhan Liu <
>>> > > yuanhan.liu at linux.intel.com
>>> > > > >
>>> > > > > wrote:
>>> > > > >
>>> > > > > On Thu, Mar 16, 2017 at 12:39:16PM -0700,
>>> Gopakumar
>>> > Choorakkot
>>> > > > Edakkunni
>>> > > > > wrote:
>>> > > > > > So the doc says we should call
>>> rte_eth_dev_close()
>>> > *before*
>>> > > going
>>> > > > down.
>>> > > > > And I
>>> > > > > > know that especially in dpdk-virtionet in the
>>> guest +
>>> > > ovs-dpdk in
>>> > > > the
>>> > > > > host,
>>> > > > > > the ovs ends up getting stalled/stuck (!!) if
>>> I dont
>>> > close
>>> > > the port
>>> > > > > before
>>> > > > > > starting() it when the guest dpdk process
>>> comes back
>>> > up.
>>> > > > >
>>> > > > > I'm assuming you were using an old version,
>>> something
>>> > like dpdk
>>> > > v2.2?
>>> > > > > IIRC, DPDK v16.04 should have fixed your issue.
>>> > > > >
>>> > > > > > Considering that this not done properly can
>>> screw up
>>> > the HOST
>>> > > ovs,
>>> > > > and I
>>> > > > > want
>>> > > > > > to do everything possible to avoid that, I
>>> want to be
>>> > 200%
>>> > > sure
>>> > > > that I
>>> > > > > call
>>> > > > > > close even if my process gets a kill -9 .. So
>>> obviously
>>> > the
>>> > > only
>>> > > > way of
>>> > > > > doing
>>> > > > > > that is to close the port when the dpdk
>>> process comes
>>> > back up
>>> > > and
>>> > > > > *before* we
>>> > > > > > init the port. rte_eth_dev_close() is not
>>> capable of
>>> > doing
>>> > > that as
>>> > > > it
>>> > > > > expects
>>> > > > > > the port parameters to be initialized etc..
>>> before it
>>> > can be
>>> > > > called.
>>> > > > >
>>> > > > > We do virtio reset before init, which is
>>> basically what
>>> > > > rte_eth_dev_close()
>>> > > > > mainly does. So I see no big issue here.
>>> > > > >
>>> > > > > The stuck issue is due to hugepage reset by the
>>> guest
>>> > DPDK
>>> > > > application,
>>> > > > > leading all virtio vring elements being mem
>>> zeroed. The
>>> > old
>>> > > vhost
>>> > > > doesn't
>>> > > > > handle it well, as a result, it got stuck. And
>>> here are
>>> > some
>>> > > relevant
>>> > > > > commits:
>>> > > > >
>>> > > > > a436f53 vhost: avoid dead loop chain
>>> > > > > c687b0b vhost: check for ring descriptors
>>> overflow
>>> > > > > 623bc47 vhost: do sanity check for ring
>>> descriptor
>>> > length
>>> > > > >
>>> > > > > --yliu
>>> > > > >
>>> > > > > > Any other
>>> > > > > > suggestions on what can be done to close on
>>> restart
>>> > rather
>>> > > than
>>> > > > close on
>>> > > > > going
>>> > > > > > down ? Thought of bouncing this by the alias
>>> before I
>>> > add a
>>> > > version
>>> > > > of
>>> > > > > close
>>> > > > > > myself that can do this close-on-restart
>>> > > > >
>>> > > > >
>>> > > >
>>> > > >
>>> > >
>>> > >
>>> >
>>> >
>>>
>>
>>
>
More information about the dev
mailing list