[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

Lu, Wenzhuo wenzhuo.lu at intel.com
Wed Jun 22 07:05:14 CEST 2016



> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Wednesday, June 22, 2016 12:15 PM
> To: Lu, Wenzhuo
> Cc: Ananyev, Konstantin; Stephen Hemminger; dev at dpdk.org; Richardson,
> Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang, Helin;
> thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset
> 
> On Wed, Jun 22, 2016 at 03:32:16AM +0000, Lu, Wenzhuo wrote:
> > Hi Jerin,
> >
> > > -----Original Message-----
> > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > Sent: Wednesday, June 22, 2016 10:38 AM
> > > To: Lu, Wenzhuo
> > > Cc: Ananyev, Konstantin; Stephen Hemminger; dev at dpdk.org;
> > > Richardson, Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing;
> > > Zhang, Helin; thomas.monjalon at 6wind.com
> > > Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support
> > > device reset
> > >
> > > On Wed, Jun 22, 2016 at 01:35:37AM +0000, Lu, Wenzhuo wrote:
> > > > Hi Jerin,
> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > > > Sent: Tuesday, June 21, 2016 10:29 PM
> > > > > To: Ananyev, Konstantin
> > > > > Cc: Lu, Wenzhuo; Stephen Hemminger; dev at dpdk.org; Richardson,
> > > > > Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang, Helin;
> > > > > thomas.monjalon at 6wind.com
> > > > > Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support
> > > > > device reset
> > > > >
> > > > > On Tue, Jun 21, 2016 at 02:03:15PM +0000, Ananyev, Konstantin wrote:
> > > > > >
> > > > > >
> > > > > > > > > > > Hi Wenzhuo,
> > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Jun 20, 2016 at 02:24:27PM
> > > > > > > > > > > > > > > > > +0800, Wenzhuo Lu
> > > > > wrote:
> > > > > > > > > > > > > > > > > > Add an API to reset the device.
> > > > > > > > > > > > > > > > > > It's for VF device in this scenario, kernel PF + DPDK
> VF.
> > > > > > > > > > > > > > > > > > When the PF port down->up, APP should
> > > > > > > > > > > > > > > > > > call this API to reset VF port. Most
> > > > > > > > > > > > > > > > > > likely, APP should call it in its
> > > > > > > > > > > > > > > > > > management thread and guarantee the
> > > > > > > > > > > > > > > > > > thread safe. It means APP should stop
> > > > > > > > > > > > > > > > > > the rx/tx and the device, then reset
> > > > > > > > > > > > > > > > > > the device, then
> > > > > recover the device and rx/tx.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Following is _a_ use-case for Device reset.
> > > > > > > > > > > > > > > > > But may be not be _the_ use case. IMO,
> > > > > > > > > > > > > > > > > We need to first say expected behavior
> > > > > > > > > > > > > > > > > of this API and add a use-case
> > > > > later.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Other use-case would be, PCIe VF with
> > > > > > > > > > > > > > > > > functional level reset for SRIOV migration.
> > > > > > > > > > > > > > > > > Are we on same page?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > In my experience with Linux devices, this
> > > > > > > > > > > > > > > > is normally handled by the device driver
> > > > > > > > > > > > > > > > in the start routine.  Since any use case
> > > > > > > > > > > > > > > > which needs this is going to do a
> > > > > > > > > > > > > > > > stop/reset/start sequence, why not just
> > > > > > > > > > > > > > > > have
> > > > > the VF device driver do this in the start routine?.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Adding yet another API and state
> > > > > > > > > > > > > > > > transistion if not necessary increases the
> > > > > > > > > > > > > > > > complexity and required test
> > > > > cases for all devices.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I agree with Stephen here.I think if
> > > > > > > > > > > > > > > application needs to call start after the
> > > > > > > > > > > > > > > device reset then we could add this logic in
> > > > > > > > > > > > > > > start itself rather exposing a yet another
> > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > Do you mean changing the device_start to
> > > > > > > > > > > > > > include all these actions, stop
> > > > > > > > > > > > > device -> stop queue -> re-setup queue -> start
> > > > > > > > > > > > > queue -> start
> > > > > device ?
> > > > > > > > > > > > >
> > > > > > > > > > > > > What was the expected API call sequence when you
> > > > > > > > > > > > > were
> > > > > introduced this API?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Point was to have implicit device reset in the
> > > > > > > > > > > > > API call sequence(Wherever make sense for
> > > > > > > > > > > > > specific PMD)
> > > > > > > > > > > > > I think the API call sequence depends on the
> > > > > > > > > > > > > implementation of the APP. Let's say if there's
> > > > > > > > > > > > > not this reset API, APP can use this API call
> > > > > > > > > > > > > sequence to handle the PF link down/up event,
> > > > > > > > > > > > > rte_eth_dev_close -> rte_eth_rx_queue_setup ->
> > > > > > > > > > > > > rte_eth_tx_queue_setup -> rte_eth_dev_start.
> > > > > > > > > > > > > Actually our purpose is to use this reset API
> > > > > > > > > > > > > instead of the API call sequence. You can see the
> > > > > > > > > > > > > reset API is not necessary. The benefit is to
> > > > > > > > > > > > > save the code for APP.
> > > > > > > > > > >
> > > > > > > > > > > Then I am bit confused with original commit log description.
> > > > > > > > > > > |
> > > > > > > > > > > |It means APP should stop the rx/tx and the device,
> > > > > > > > > > > |then reset the device, then recover the device and rx/tx.
> > > > > > > > > > > |
> > > > > > > > > > > I was under impression that it a low level reset API
> > > > > > > > > > > for this device? Is n't it?
> > > > > > > > > > >
> > > > > > > > > > > The other issue is generalized outlook of the API,
> > > > > > > > > > > Certain PMD will not have PF link down/up event?
> > > > > > > > > > > Link down/up and only connected to VF and PF only for
> configuration.
> > > > > > > > > > >
> > > > > > > > > > > How about fixing it more transparently in PMD driver
> > > > > > > > > > > itself as PMD driver knows the PF link up/down
> > > > > > > > > > > event, Is it possible to recover the VF on that
> > > > > > > > > > > event if its only matter of resetting
> > > > > it?
> > > > > > > > > >
> > > > > > > > > > I think we already went through that discussion on the list.
> > > > > > > > > > Unfortunately with current dpdk design it is hardly possible.
> > > > > > > > > > To achieve that we need to introduce some sort of
> > > > > > > > > > synchronisation between IO and control APIs (locking or so).
> > > > > > > > > > Actually I am not sure why having a special reset
> > > > > > > > > > function will be a
> > > > > problem.
> > > > > > > > >
> > > > > > > > > |
> > > > > > > > > |It means APP should stop the rx/tx and the device, then
> > > > > > > > > |reset the device, then recover the device and rx/tx.
> > > > > > > > > |
> > > > > > > > > Just to understand, If application still need  to do the
> > > > > > > > > stop then what value addtion reset API brings on the table?
> > > > > > > >
> > > > > > > > > If application calls dev_reset() it doesn't need to call
> > > > > > > > > dev_stop() before it.
> > > > > > > > > dev_reset() will take care of it.
> > > > > > > > > But it needs to make sure that no other thread will try to
> > > > > > > > > modify that device state (either dev_stop/start, or
> > > > > > > > > eth_rx_burst/eth_tx_burst) while the reset op is in place.
> > > > > > >
> > > > > > > OK. This description looks different than commit log and API
> > > > > > > doxygen
> > > > > comment. Please fix it.
> > > > > > > How about a different name for this API. Device reset is too generic?
> > > > > Any suggestion? I use this name because I believe what this API does
> > > > > is to reset the device.
> > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Yes, it would exist only for VFs, for PF it could be
> > > > > > > > > > left
> > > unimplemented.
> > > > > > > > > > Though it definitely seems more convenient from user
> > > > > > > > > > point of view, they would know: to handle VF reset
> > > > > > > > > > event, they just need to call that particular
> > > > > > > > > > function, not to re-implement their
> > > own.
> > > > > > > > > > What if driver returns "not implemented"? Then the
> > > > > > > > > > application will have to do a generic
> > > > > > > > > > rte_eth_dev_stop/rte_eth_dev_start.
> > > > > > > > > > That way, from the application perspective, we are NOT
> > > > > > > > > > solving any problem.
> > > > > > > >
> > > > > > > > > True, but as I said for PF application would just never
> > > > > > > > > receive such event.
> > > > > > > What is this event ? Is it VF Link up/down event?
> > > > > > >
> > > > > > > No I was referring to VF itself, Other VF PMD drivers in
> > > > > > > drivers/net where this callback is not implemented.
> > > > > >
> > > > > > Hmm, the only suggestion I have here - Maintainers/developers
> > > > > > of non-Intel PMD will implement it for their VFs?
> > > > >
> > > > > That's fine. But, We have to know what to implement here in PMD
> > > perspective?
> > > > > That's reason being asking about the API expectation and
> > > > > application usage :-)
> > > > >
> > > > > > In case of course they do need to handle similar event.
> > > > > Which is this event and How application get notify it.
> > > > > When the PF link is down/up, the PF will use the mailbox to send a
> > > > > message to VF. The event here means the VF receives that message
> > > > > from PF. So VF can know the physical link state changed. You see
> > > > > it's only for VF. PF will not receive such kind of message.
> > > > > And we use the callback mechanism to let APP notified. APP should
> > > > > register a callback function. When VF driver receives the message
> > > > > it will call the callback function, then APP can know that.
> > >
> > > > How about standardizing a name for that event, like
> > > > RTE_ETH_EVENT_INTR_DOWNSTREAM_LSC or RTE_ETH_EVENT_INTR_PF_LSC or
> > > > similar (like RTE_ETH_EVENT_INTR_RESET), and a counter API in VF to
> > > > handle the specific event, with an API name similar to the selected
> > > > event name, not eth_dev_reset (reset sounds more like a HW reset; in
> > > > PCIe device perspective, FLR etc.)
> > >
> > > OR
> > >
> > > How about handling in more generic way where a generic alert message
> > > send by PF to VF like RTE_ETH_EVENT_INTR_PF_ALERT or similar.
> > > And have only one handle functions in VF side so that in future we
> > > can keep adding new functionality with out introducing new counter
> > > API in VF
> > >
> > > Jerin
> > > Lost here. I think these RTE_ETH_EVENTs are used to connect the APP
> > > call back functions with the events.
> > > Actually I want the APP to register a callback function
> > > reset_event_callback for the reset event. Like this,
> > > 		/* register reset interrupt callback */
> > > 		rte_eth_dev_callback_register(portid,
> > > 			RTE_ETH_EVENT_INTR_RESET, reset_event_callback,
> > > 			NULL);
> > > And when the VF driver finds PF link down/up, it should use
> > > _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_RESET) to run into
> > > the callback which is provided by APP. Means reset_event_callback here.
> 
> Me too. There is an existing RTE_ETH_EVENT_INTR_RESET event to notify the PF
> reset. I guess it is not for the PF link change, or it is for a generic VF
> reset request initiated by the PF for everything.
I think this event is for device reset in general, not only for the PF but also for the VF. I think we can use this event whenever the driver wants the APP to reset the device. A VF reset caused by the PF link going down/up is one of those cases.
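
A minimal sketch of the flow discussed in this thread, written with stand-in names so it compiles without DPDK. Everything prefixed sketch_ is hypothetical: sketch_callback_register stands in for rte_eth_dev_callback_register, sketch_driver_pf_link_change for the VF driver calling _rte_eth_dev_callback_process, and sketch_dev_reset for the reset API proposed in this patch. The sketch assumes the callback only records the request and the management thread does the actual reset after rx/tx has been quiesced. The stand-in reset refuses to run while IO is active only to make that requirement visible; it is not a statement about how the real API behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real DPDK symbols. */
enum sketch_event { SKETCH_EVENT_INTR_RESET, SKETCH_EVENT_MAX };

typedef void (*sketch_event_cb)(int port_id, enum sketch_event ev, void *arg);

struct sketch_port {
	sketch_event_cb cb[SKETCH_EVENT_MAX]; /* one APP callback per event */
	void *cb_arg[SKETCH_EVENT_MAX];
	bool io_running;                      /* rx/tx threads are active */
	int reset_count;                      /* resets performed so far */
};

struct sketch_app {
	struct sketch_port *port;
	volatile bool reset_pending;          /* set by callback, read by mgmt thread */
};

/* Stand-in for rte_eth_dev_callback_register(). */
static void sketch_callback_register(struct sketch_port *p, enum sketch_event ev,
				     sketch_event_cb cb, void *arg)
{
	p->cb[ev] = cb;
	p->cb_arg[ev] = arg;
}

/* Stand-in for the VF driver receiving the PF mailbox message and
 * dispatching the reset event to the APP callback. */
static void sketch_driver_pf_link_change(struct sketch_port *p, int port_id)
{
	if (p->cb[SKETCH_EVENT_INTR_RESET] != NULL)
		p->cb[SKETCH_EVENT_INTR_RESET](port_id, SKETCH_EVENT_INTR_RESET,
					       p->cb_arg[SKETCH_EVENT_INTR_RESET]);
}

/* Stand-in for the proposed reset API. Rejecting the call while IO runs
 * is an assumption of this sketch, used to model the thread-safety rule. */
static int sketch_dev_reset(struct sketch_port *p)
{
	if (p->io_running)
		return -1;
	p->reset_count++;
	return 0;
}

/* APP callback: only record that a reset was requested. */
static void reset_event_callback(int port_id, enum sketch_event ev, void *arg)
{
	struct sketch_app *app = arg;

	(void)port_id;
	if (ev == SKETCH_EVENT_INTR_RESET)
		app->reset_pending = true;
}

/* One iteration of the APP management thread. */
static int sketch_mgmt_poll(struct sketch_app *app)
{
	int ret;

	if (!app->reset_pending)
		return 0;
	/* A real APP would synchronize with its rx/tx lcores here. */
	app->port->io_running = false;
	ret = sketch_dev_reset(app->port);
	if (ret == 0) {
		app->reset_pending = false;
		app->port->io_running = true; /* resume rx/tx */
	}
	return ret;
}
```

In a real APP, the registration would be done once at init and sketch_mgmt_poll would be called from the management thread's loop.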

> 
> file: lib/librte_ether/rte_ethdev.h
>         RTE_ETH_EVENT_INTR_RESET,
> 		/**< reset interrupt event, sent to VF on PF reset */
>                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> If the application needs to call rte_ethdev_reset() on the
> RTE_ETH_EVENT_INTR_RESET event, then please mention it in the commit log or
> API description.
Good suggestion. I'll find a good place to add more explanation.
> 


