[dpdk-dev,v2,2/2] net/i40e: add hot plug monitor in i40e

Message ID	1498648044-57541-2-git-send-email-jia.guo@intel.com (mailing list archive)
State	Superseded, archived
Headers
From: Jeff Guo <jia.guo@intel.com>
To: helin.zhang@intel.com, jingjing.wu@intel.com
Cc: dev@dpdk.org, jia.guo@intel.com
Date: Wed, 28 Jun 2017 19:07:24 +0800
Message-Id: <1498648044-57541-2-git-send-email-jia.guo@intel.com>
In-Reply-To: <1498648044-57541-1-git-send-email-jia.guo@intel.com>
References: <1495986280-26207-1-git-send-email-jia.guo@intel.com> <1498648044-57541-1-git-send-email-jia.guo@intel.com>
Subject: [dpdk-dev] [PATCH v2 2/2] net/i40e: add hot plug monitor in i40e
Precedence: list
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Context	Check	Description
ci/Intel-compilation	fail	Compilation issues
ci/checkpatch	success	coding style OK

Guo, Jia June 28, 2017, 11:07 a.m. UTC

  From: "Guo, Jia" <jia.guo@intel.com>

This patch enables the hot plug feature in i40e by monitoring the
hot plug uevent of the device. When a remove event is received, it calls
the app callback function to handle the detach process.

Signed-off-by: Guo, Jia <jia.guo@intel.com>
---
v2->v1: remove unused part for current stage.
---
 drivers/net/i40e/i40e_ethdev.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Jingjing Wu June 29, 2017, 1:41 a.m. UTC | #1

> -----Original Message-----
> From: Guo, Jia
> Sent: Wednesday, June 28, 2017 7:07 PM
> To: Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Guo, Jia <jia.guo@intel.com>
> Subject: [PATCH v2 2/2] net/i40e: add hot plug monitor in i40e
> 
> From: "Guo, Jia" <jia.guo@intel.com>
> 
> This patch enable the hot plug feature in i40e, by monitoring the hot plug
> uevent of the device. When remove event got, call the app callback function to
> handle the detach process.
> 
> Signed-off-by: Guo, Jia <jia.guo@intel.com>
> ---
> v2->v1: remove unused part for current stage.
> ---
>  drivers/net/i40e/i40e_ethdev.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 4ee1113..122187e 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -1283,6 +1283,7 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw
> *hw)
> 
>  	/* enable uio intr after callback register */
>  	rte_intr_enable(intr_handle);
> +
>  	/*
>  	 * Add an ethertype filter to drop all flow control frames transmitted
>  	 * from VSIs. By doing so, we stop VF from sending out PAUSE or PFC
> @@ -5832,11 +5833,28 @@ struct i40e_vsi *  {
>  	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
>  	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
> +	struct rte_uevent event;
>  	uint32_t icr0;
> +	struct rte_pci_device *pci_dev;
> +	struct rte_intr_handle *intr_handle;
> +
> +	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
> +	intr_handle = &pci_dev->intr_handle;
> 
>  	/* Disable interrupt */
>  	i40e_pf_disable_irq0(hw);
> 
> +	/* check device uevent */
> +	if (rte_uevent_get(intr_handle->uevent_fd, &event) > 0) {

You declare rte_uevent_get like this:

+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_uevent_get(int fd, struct rte_uevent *uevent);


But here you check if it > 0?

> +		if (event.subsystem == RTE_UEVENT_SUBSYSTEM_UIO) {
> +			if (event.action == RTE_UEVENT_REMOVE) {
> +				_rte_eth_dev_callback_process(dev,
> +					RTE_ETH_EVENT_INTR_RMV, NULL);
> +			}
> +		}
> +		goto done;

I think that when the remove happens, there is no need to goto done; you can just return.
> +	}
> +
>  	/* read out interrupt causes */
>  	icr0 = I40E_READ_REG(hw, I40E_PFINT_ICR0);
> 
> --
> 1.8.3.1

Stephen Hemminger June 29, 2017, 3:34 a.m. UTC | #2

On Wed, 28 Jun 2017 19:07:24 +0800
Jeff Guo <jia.guo@intel.com> wrote:

> From: "Guo, Jia" <jia.guo@intel.com>
> 
> This patch enable the hot plug feature in i40e, by monitoring the
> hot plug uevent of the device. When remove event got, call the app
> callback function to handle the detach process.
> 
> Signed-off-by: Guo, Jia <jia.guo@intel.com>
> ---

Hot plug is good and needed.

But it needs to be done in a generic fashion in the bus layer.
There is nothing about uevents that is unique to i40e or even Intel
devices. Plus the way hotplug is handled is OS specific, so this isn't going
to work well on BSD.

Sorry if I sound like a broken record but there has been a repeated pattern
of Intel developers  putting their head down (or in the sand) and creating
functionality inside device driver.

Guo, Jia June 29, 2017, 4:31 a.m. UTC | #3

Yes, if a remove uevent is received, the handler can return directly to avoid invalid I/O. But for other uevents, such as add and change, it must go to done to keep interrupt processing working on the device. I will refine this part, thanks.
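
The control flow agreed on in this exchange can be sketched as a small decision function. This is only an illustration under stated assumptions: the type and function names below (`struct uevent`, `check_uevent`, `enum intr_next`) are hypothetical stand-ins, not the API from patch 1/2, and the only facts taken from the thread are that the uevent read returns zero on success and a negative value on failure, that a remove event should return immediately, and that other uevents should go to the `done` label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the uevent types proposed in patch 1/2. */
enum uevent_subsystem { UEVENT_SUBSYSTEM_UIO, UEVENT_SUBSYSTEM_OTHER };
enum uevent_action { UEVENT_ADD, UEVENT_REMOVE, UEVENT_CHANGE };

struct uevent {
	enum uevent_subsystem subsystem;
	enum uevent_action action;
};

/* What the interrupt handler should do after the uevent check. */
enum intr_next {
	INTR_READ_CAUSES,	/* no uevent: read ICR0 and process as usual */
	INTR_DONE,		/* other uevent: jump to done, re-enable irq */
	INTR_RETURN,		/* device removed: return, touch no registers */
};

/*
 * get_ret is the return value of the uevent read: 0 on success and a
 * negative value on failure, matching the declaration quoted in the review.
 */
static enum intr_next
check_uevent(int get_ret, const struct uevent *ev, bool *notify_remove)
{
	assert(ev != NULL && notify_remove != NULL);
	*notify_remove = false;

	if (get_ret != 0)
		return INTR_READ_CAUSES;

	if (ev->subsystem == UEVENT_SUBSYSTEM_UIO &&
	    ev->action == UEVENT_REMOVE) {
		/* The caller invokes the RMV callback before returning. */
		*notify_remove = true;
		return INTR_RETURN;
	}
	return INTR_DONE;
}
```

Comparing the return value against zero, rather than testing `> 0`, is what keeps the check consistent with the documented contract.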

Best regards,
Jeff Guo



Guo, Jia June 29, 2017, 4:37 a.m. UTC | #4

From: "Guo, Jia" <jia.guo@intel.com>

This patch set aims to add a variable "uevent_fd" to the structure
"rte_intr_handle" to enable kernel object uevent monitoring,
and adds some uevent APIs to the rte eal interrupt code, namely
"rte_uevent_connect" and "rte_uevent_get". The patch uses i40e
as an example: the driver can use these APIs to monitor and read
out the uevent, then handle it accordingly,
such as by detaching or attaching the device.

Guo, Jia (2):
  eal: add uevent api for hot plug
  net/i40e: add hot plug monitor in i40e

 drivers/net/i40e/i40e_ethdev.c                     |  19 +++
 lib/librte_eal/common/eal_common_pci_uio.c         |   6 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 136 ++++++++++++++++++++-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          |   6 +
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  37 ++++++
 5 files changed, 201 insertions(+), 3 deletions(-)

Jingjing Wu June 29, 2017, 4:48 a.m. UTC | #5

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, June 29, 2017 11:35 AM
> To: Guo, Jia <jia.guo@intel.com>
> Cc: Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 2/2] net/i40e: add hot plug monitor in i40e
> 
> On Wed, 28 Jun 2017 19:07:24 +0800
> Jeff Guo <jia.guo@intel.com> wrote:
> 
> > From: "Guo, Jia" <jia.guo@intel.com>
> >
> > This patch enable the hot plug feature in i40e, by monitoring the hot
> > plug uevent of the device. When remove event got, call the app
> > callback function to handle the detach process.
> >
> > Signed-off-by: Guo, Jia <jia.guo@intel.com>
> > ---
> 
> Hot plug is good and needed.
> 
> But it needs to be done in a generic fashion in the bus layer.
> There is nothing about uevents that are unique to i40e or even Intel devices.
> Plus the way hotplug is handled is OS specific, so this isn't going to work well on
> BSD.
> 
This patch is not a way to fully support hot plug. And we know it is handled in an OS-specific way.
This patch just provides a way to tell the DPDK user that the remove happened on this device (DPDK dev).

And the Mlx driver already supports that with patch
http://dpdk.org/dev/patchwork/patch/23695/

What GuoJia did is just make it possible for the EVENT to be processed by the application through the
interrupt callback mechanism.

> Sorry if I sound like a broken record but there has been a repeated pattern of
> Intel developers  putting their head down (or in the sand) and creating
> functionality inside device driver.
Sorry, I cannot agree.

Thanks
Jingjing

Guo, Jia June 29, 2017, 5:01 a.m. UTC | #6

From: "Guo, Jia" <jia.guo@intel.com>

This patch set aims to add a variable "uevent_fd" to the structure
"rte_intr_handle" to enable kernel object uevent monitoring,
and adds some uevent APIs to the rte eal interrupt code, namely
"rte_uevent_connect" and "rte_uevent_get". The patch uses i40e
as an example: the driver can use these APIs to monitor and read
out the uevent, then handle it accordingly,
such as by detaching or attaching the device.

Guo, Jia (2):
  eal: add uevent api for hot plug
  net/i40e: add hot plug monitor in i40e

 drivers/net/i40e/i40e_ethdev.c                     |  19 +++
 lib/librte_eal/common/eal_common_pci_uio.c         |   6 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 136 ++++++++++++++++++++-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          |   6 +
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  37 ++++++
 5 files changed, 201 insertions(+), 3 deletions(-)

Guo, Jia June 29, 2017, 7:47 a.m. UTC | #7

Agree with Jingjing.

That patch is definitely not a generic hot plug solution. The uevent just gives an additional approach to monitor the remove event even if the driver does not expose it as an interrupt. We know the mlx driver has already implemented the remove interrupt event in its InfiniBand framework driver, but other drivers may not have yet.
So uevent is not unique to i40e or other Intel NICs; the aim is just to let more diverse drivers that use the pci-uio framework use the common hot plug feature in DPDK.

Best regards,
Jeff Guo



Guo, Jia April 13, 2018, 8:30 a.m. UTC | #8

About hot plug in dpdk: we already have a proactive way to add/remove devices
through APIs (rte_eal_hotplug_add/remove), and also have the fail-safe driver
to offload the fail-safe work from the app user. But there is still a lack
of a general mechanism to monitor hotplug events for all drivers; currently the
hotplug interrupt event differs between devices and drivers, such
as mlx4, pci drivers and others.

Take the hot removal event for example: not all pci drivers expose the
remove interrupt, so in order to make it easy for the user to use the hot plug
feature with pci drivers, something must be done to detect the remove event
at the kernel level and offer a new line of interrupt to user land.

Based on the kobject uevent mechanism in the kernel, we can use it to
monitor the hot plug status of devices, not only
uio/vfio devices on the pci bus but also others, such as cpu/usb/pci-express bus devices.

The idea is as below.

a.The uevent message from FD monitoring looks like below.
remove@/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
ACTION=remove
DEVPATH=/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
SUBSYSTEM=uio
MAJOR=243
MINOR=2
DEVNAME=uio2
SEQNUM=11366

b.add device event monitor framework:
add several general APIs to enable uevent monitoring.

c.show an example of how to use the uevent monitor:
enable uevent monitoring in testpmd to show device event monitor mechanism usage.

TODO: failure handler mechanism for hot plug and driver auto bind for hot insertion.
These will be covered by the next hot plug patch set.
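
As an illustration of how a message like the sample in item a could be turned into a structured event, here is a minimal sketch. It assumes the message arrives as a buffer of NUL-separated strings, which is how the kernel delivers kobject uevents over netlink. The names `struct dev_uevent`, `parse_uevent` and the `DEV_ACTION_*` values are hypothetical and are not the ones used in the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dev_action {
	DEV_ACTION_UNKNOWN,
	DEV_ACTION_ADD,
	DEV_ACTION_REMOVE,
	DEV_ACTION_CHANGE,
};

/* Hypothetical parsed form of one uevent message. */
struct dev_uevent {
	enum dev_action action;
	char subsystem[32];
	char devname[64];
};

/* Copy a value into a fixed buffer, always NUL-terminating it. */
static void
copy_value(char *dst, size_t dst_len, const char *src)
{
	strncpy(dst, src, dst_len - 1);
	dst[dst_len - 1] = '\0';
}

static enum dev_action
parse_action(const char *s)
{
	if (strcmp(s, "add") == 0)
		return DEV_ACTION_ADD;
	if (strcmp(s, "remove") == 0)
		return DEV_ACTION_REMOVE;
	if (strcmp(s, "change") == 0)
		return DEV_ACTION_CHANGE;
	return DEV_ACTION_UNKNOWN;
}

/*
 * Walk the NUL-separated fields of a uevent buffer and pick out the
 * keys of interest. Returns 0 on success, -1 if the buffer is truncated
 * or carries no recognised ACTION.
 */
static int
parse_uevent(const char *buf, size_t len, struct dev_uevent *ev)
{
	size_t pos = 0;

	assert(buf != NULL && ev != NULL);
	memset(ev, 0, sizeof(*ev));	/* action starts as UNKNOWN */

	while (pos < len) {
		const char *field = buf + pos;
		const char *end = memchr(field, '\0', len - pos);

		if (end == NULL)
			return -1;	/* last field is not terminated */

		if (strncmp(field, "ACTION=", 7) == 0)
			ev->action = parse_action(field + 7);
		else if (strncmp(field, "SUBSYSTEM=", 10) == 0)
			copy_value(ev->subsystem, sizeof(ev->subsystem),
				   field + 10);
		else if (strncmp(field, "DEVNAME=", 8) == 0)
			copy_value(ev->devname, sizeof(ev->devname),
				   field + 8);

		pos += (size_t)(end - field) + 1;
	}
	return ev->action == DEV_ACTION_UNKNOWN ? -1 : 0;
}
```

Keeping the parser free of any policy, and leaving the decision about what to do with a remove event to the registered callback, matches the direction noted in the v15->v14 history entry.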

patchset history:
v22->v21:
fix clang compile issue and doc style

v21->v20:
refine release note and some code cleaning.

v20->v19:
add more detail note and socket error handler.

v19->18:
fix some typo and misunderstanding part

v18->v17:
1.add feature announcement in release document, fix bsp compile issue.
2.refine socket configuration.
3.remove hotplug policy and detach/attach process from testpmd, let it
focus on the device event monitoring which the patch set introduced.

v17->v16:
1.add related part of the interrupt handle type adding.
2.add new API into map, fix typo issue, add (void*)-1 value for unregister all callback
3.add new file into meson.build, modify coding style and add print info, delete unused part.
4.unregister all user's callback when stop event monitor

v16->v15:
1.remove some linux related code out of eal common layer
2.fix some readability issues.

v15->v14:
1.use exist eal interrupt epoll to replace of rte service usage for monitor thread,
2.add new device event handle type in eal interrupt.
3.remove the uevent type check and any policy from eal,
let it check and management in user's callback.
4.add "--hot-plug" configure parameter in testpmd to switch the hotplug feature.

v14->v13:
1.add __rte_experimental on function definition and fix bsd build issue

v13->v12:
1.fix some logic issue and null check issue
2.fix monitor stop func issue

v12->v11:
1.identify null param in callback for monitor all devices uevent

v11->v10:
1:modify some typo and add experimental tag in new file.
2:modify callback register calling.

v10->v9:
1.fix prefix issue.
2.use a common callback lists for all device and all type to replace
add callback parameter into device struct.
3.delete some unused part.

v9->v8:
split the patch set into small and explicit patch

v8->v7:
1.use rte_service to replace pthread management.
2.fix define issue and copyright issue
3.fix some lock issue

v7->v6:
1.modify vdev part according to the vdev rework
2.re-define and split the func into common and bus specific code
3.fix some incorrect issue.
4.fix the system hang after sending packet issue.

v6->v5:
1.add hot plug policy, in eal, default handle to prepare hot plug work for
all pci device, then let app manage and decide which device needs to
hot plug.
2.modify to manage event callback in each device.
3.fix some system hang issue on igb_uio release, fix some typo error.
4.modify the pci part to the bus-pci base on the bus rework.
5.add hot plug policy in app, show example to use hotplug list to manage
and decide which device needs to hot plug.

v5->v4:
1.Move uevent monitor epolling from eal interrupt to eal device layer.
2.Redefine the eal device API for common, and distinguish between linux and bsd
3.Add failure handler helper api in bus layer.Add function of find device by name.
4.Replace of individual fd bind with single device, use a common fd to polling all device.
5.Add to register hot insertion monitoring and process, add function to auto bind driver before user add device
6.Refine some coding style and typos issue
7.add new callback to process hot insertion

v4->v3:
1.move uevent monitor api from eal interrupt to eal device layer.
2.create uevent type and struct in eal device.
3.move uevent handler for each driver to eal layer.
4.add uevent failure handler to process signal fault issue.
5.add example for request and use uevent monitoring in testpmd.

v3->v2:
1.refine some return error
2.refine the string searching logic to avoid memory issue

v2->v1:
1.remove global variables of hotplug_fd, add uevent_fd
in rte_intr_handle to let each pci device self maintain it fd,
to fix dual device fd issue.
2.refine some typo error.

Jeff Guo (4):
  eal: add device event handle in interrupt thread
  eal: add device event monitor framework
  eal/linux: uevent parse and process
  app/testpmd: enable device hotplug monitoring

 app/test-pmd/parameters.c                          |   5 +-
 app/test-pmd/testpmd.c                             | 101 +++++++++-
 app/test-pmd/testpmd.h                             |   2 +
 doc/guides/rel_notes/release_18_05.rst             |  12 ++
 doc/guides/testpmd_app_ug/run_app.rst              |   4 +
 lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
 lib/librte_eal/bsdapp/eal/eal_dev.c                |  21 ++
 lib/librte_eal/bsdapp/eal/meson.build              |   1 +
 lib/librte_eal/common/eal_common_dev.c             | 161 +++++++++++++++
 lib/librte_eal/common/eal_private.h                |  15 ++
 lib/librte_eal/common/include/rte_dev.h            |  94 +++++++++
 lib/librte_eal/common/include/rte_eal_interrupts.h |   1 +
 lib/librte_eal/linuxapp/eal/Makefile               |   1 +
 lib/librte_eal/linuxapp/eal/eal_dev.c              | 223 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |  11 +-
 lib/librte_eal/linuxapp/eal/meson.build            |   1 +
 lib/librte_eal/rte_eal_version.map                 |   4 +
 test/test/test_interrupts.c                        |  39 +++-
 18 files changed, 692 insertions(+), 5 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c

Thomas Monjalon April 13, 2018, 10:03 a.m. UTC | #9

13/04/2018 10:30, Jeff Guo:
> About hot plug in dpdk, We already have proactive way to add/remove devices
> through APIs (rte_eal_hotplug_add/remove), and also have fail-safe driver
> to offload the fail-safe work from the app user. But there are still lack
> of a general mechanism to monitor hotplug event for all driver, now the
> hotplug interrupt event is diversity between each device and driver, such
> as mlx4, pci driver and others.
> 
> Use the hot removal event for example, pci drivers not all exposure the
> remove interrupt, so in order to make user to easy use the hot plug
> feature for pci driver, something must be done to detect the remove event
> at the kernel level and offer a new line of interrupt to the user land.
> 
> Base on the uevent of kobject mechanism in kernel, we could use it to
> benefit for monitoring the hot plug status of the device which not only
> uio/vfio of pci bus devices, but also other, such as cpu/usb/pci-express bus devices.
[...]
> Jeff Guo (4):
>   eal: add device event handle in interrupt thread
>   eal: add device event monitor framework
>   eal/linux: uevent parse and process
>   app/testpmd: enable device hotplug monitoring

Applied, thanks

Guo, Jia April 18, 2018, 1:38 p.m. UTC | #10

Previously, a device event monitor framework was introduced;
its typical usage is device hot plug. If we want the application
not to break down when a device is hot plugged in or out, we still need some
measures to recover and to prepare for device detach, so that we will
not encounter any memory fault after the device is hot unplugged, which will let
the application keep working.

This patch set introduces an API to implement the recovery mechanism to
handle hot plug, and also uses testpmd to show an example of how to
use the API to process a hot plug event, so that the process can go
smoothly as below:

plug out->failure handle->stop forward->stop port->close port->detach port

With this mechanism, users such as the fail-safe driver or testpmd will be able to
develop their own hot plug application.

patchset history:
v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding  

v19->18:
note for limitation of multiple hotplug, fix some typo, squeeze patch.

v18->v15:
add document, add signal bus handler, refine the code to be more clear.

For the prior patch history, please check the patch set
"add device event monitor framework"

Jeff Guo (4):
  bus/pci: introduce device hot unplug handle
  eal: add failure handler mechanism for hot plug
  igb_uio: fix uio release issue when hot unplug
  app/testpmd: show example to handler hot unplug

 app/test-pmd/testpmd.c                  |  29 ++++++--
 doc/guides/rel_notes/release_18_05.rst  |   6 ++
 drivers/bus/pci/pci_common.c            |  67 +++++++++++++++++
 drivers/bus/pci/pci_common_uio.c        |  32 +++++++++
 drivers/bus/pci/private.h               |  12 ++++
 kernel/linux/igb_uio/igb_uio.c          |   4 ++
 lib/librte_eal/common/include/rte_bus.h |  16 +++++
 lib/librte_eal/common/include/rte_dev.h |  11 +++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 124 +++++++++++++++++++++++++++++++-
 lib/librte_eal/rte_eal_version.map      |   1 +
 10 files changed, 297 insertions(+), 5 deletions(-)

Guo, Jia May 3, 2018, 8:57 a.m. UTC | #11

Previously, a device event monitor framework was introduced;
its typical usage is device hot plug. If we want the application
not to break down when a device is hot plugged in or out, we still need
some measures to help the app handle that, such as device recovery for
device detaching, so that the app can keep running smoothly without being
disturbed by any hotplug behavior.

This patch set introduces a recovery mechanism to handle hot unplug,
and also uses testpmd to show an example of how to use this mechanism to process
a hot plug event. The process is shown below:

plug out->failure handle->stop forward->stop port->close port->detach port

With this mechanism, users such as the fail-safe driver or testpmd will be
able to develop their own hot plug application.

patchset history:
v21->v20:
split function in hot unplug ops
sync failure handle to fix multiple process issue
fix attach port issue for multiple devices case.  

v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding  

v19->18:
note for limitation of multiple hotplug, fix some typo, squeeze patch.

v18->v15:
add document, add signal bus handler, refine the code to be more clear.

For the prior patch history, please check the patch set "add device event monitor framework"

Jeff Guo (4):
  bus/pci: handle device hot unplug
  eal: add failure handle mechanism for hot plug
  igb_uio: fix uio release issue when hot unplug
  app/testpmd: show example to handle hot unplug

 app/test-pmd/testpmd.c                  |  28 ++++--
 drivers/bus/pci/pci_common.c            |  65 ++++++++++++++
 drivers/bus/pci/pci_common_uio.c        |  33 +++++++
 drivers/bus/pci/private.h               |  12 +++
 kernel/linux/igb_uio/igb_uio.c          |   4 +
 lib/librte_eal/common/include/rte_bus.h |  16 ++++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 154 +++++++++++++++++++++++++++++++-
 7 files changed, 306 insertions(+), 6 deletions(-)

Guo, Jia May 3, 2018, 10:48 a.m. UTC | #12

Previously, a device event monitor framework was introduced;
its typical usage is device hot plug. If we want the application
not to break down when a device is hot plugged in or out, we still need
some measures to help the app handle that, such as device recovery for
device detaching, so that the app can keep running smoothly without being
disturbed by any hotplug behavior.

This patch set introduces a recovery mechanism to handle hot unplug,
and also uses testpmd to show an example of how to use this mechanism to process
a hot plug event. The process is shown below:

plug out->failure handle->stop forward->stop port->close port->detach port

With this mechanism, users such as the fail-safe driver or testpmd will be
able to develop their own hot plug application.

patchset history:
v21->v20:
split function in hot unplug ops
sync failure handle to fix multiple process issue
fix attach port issue for multiple devices case.
combine rmv callback functions into only one.

v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding

v19->18:
note for limitation of multiple hotplug, fix some typo, squeeze patch.

v18->v15:
add document, add signal bus handler, refine the code to be more clear.

For the prior patch history, please check the patch set "add device event monitor framework"

Jeff Guo (4):
  bus/pci: handle device hot unplug
  eal: add failure handle mechanism for hot plug
  igb_uio: fix uio release issue when hot unplug
  app/testpmd: show example to handle hot unplug

 app/test-pmd/testpmd.c                  |  27 ++++--
 drivers/bus/pci/pci_common.c            |  65 ++++++++++++++
 drivers/bus/pci/pci_common_uio.c        |  33 +++++++
 drivers/bus/pci/private.h               |  12 +++
 kernel/linux/igb_uio/igb_uio.c          |   4 +
 lib/librte_eal/common/include/rte_bus.h |  16 ++++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 154 +++++++++++++++++++++++++++++++-
 7 files changed, 301 insertions(+), 10 deletions(-)

Guo, Jia June 29, 2018, 10:30 a.m. UTC | #13

As we know, hot plug is an important feature, whether used for fail-safe of
datacenter devices or for SRIOV live migration in SDN/NFV. It can bring
higher flexibility and continuity to networking services in multiple
industry use cases. So let us see what dpdk, as an important networking
framework, can do to help users implement a hot plug solution.

We already have a general device event detection mechanism, the failsafe driver,
the bonding driver and the hot plug/unplug api in the framework; apps can use
these to develop their hot plug solution.

Let's look at the case of hot unplug. It can happen when a hardware device is
removed physically, or when the software disables it. The app needs to call the
ether dev API to detach the device, to unplug the device at the bus level and
make access to the device invalid. But the problem is that the removal of the
device from the software lists is not instantaneous; if during this time
the data (fast) path still reads/writes the device, it will cause an MMIO error
and result in the app crashing.

It seems that we have a fail-safe driver (or app) + RTE_ETH_EVENT_INTR_RMV +
kernel core driver solution to handle it, but we still do not have a failsafe
driver (or app) + RTE_DEV_EVENT_REMOVE + PCIe pmd driver failure handle solution.
So there is a gap in the dpdk hot plug solution right now.

Also, we know that the kernel only guarantees hot plug on the kernel side, not on
the user mode side. First, we can hardly have a gatekeeper for every MMIO access
across multiple PMD drivers. Second, no specific 3rd party tools such as
udev/driverctl cover this hot plug failure processing. Third, the feasibility
of an app-level implementation for multiple user mode PMD drivers is still a
problem. Here, a general hot plug failure handle mechanism in the dpdk framework
is proposed. It aims to guarantee that, when hot unplug occurs, the system will
not crash and the app will not break down, and that user space can normally stop
and release any relevant resources, then unplug the device at the bus level cleanly.

The mechanism works as below:

First, the app enables the device event monitor and registers the hot plug event's
callback before running the data path. Once a hot unplug occurs, the
mechanism will detect the removal event and then do the failure
handling accordingly. In order to do that, the functionality below will be brought in.
- Add a new bus ops "handle_hot_unplug" to handle bus read/write errors. It is
bus-specific and each kind of bus can implement its own logic.
- Implement the pci bus specific ops "pci_handle_hot_unplug". Based on the
failure address, it remaps memory for the corresponding device that was unplugged.
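
The two items above can be sketched as an ops table with a per-bus hook plus a generic dispatcher. This is only a sketch of the idea: apart from the ops names handle_hot_unplug and pci_handle_hot_unplug mentioned above, every type, field and function here (`struct hp_bus`, `struct hp_device`, `bus_handle_hot_unplug`) is hypothetical and simplified to one MMIO range per device, and the remap itself is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record: one MMIO range per device. */
struct hp_device {
	const char *name;
	uintptr_t mmio_start;
	size_t mmio_len;
	int remapped;		/* set once the failure has been handled */
};

struct hp_bus;

/* Hypothetical bus ops table carrying the hot unplug hook. */
struct hp_bus_ops {
	/* Return 0 if the address belonged to this bus and was handled. */
	int (*handle_hot_unplug)(struct hp_bus *bus, uintptr_t fail_addr);
};

struct hp_bus {
	const char *name;
	const struct hp_bus_ops *ops;
	struct hp_device *devs;
	size_t ndevs;
};

/* PCI flavour: find the device whose MMIO range covers the address. */
static int
pci_handle_hot_unplug(struct hp_bus *bus, uintptr_t fail_addr)
{
	for (size_t i = 0; i < bus->ndevs; i++) {
		struct hp_device *d = &bus->devs[i];

		if (fail_addr >= d->mmio_start &&
		    fail_addr - d->mmio_start < d->mmio_len) {
			/* A real handler would remap the range here. */
			d->remapped = 1;
			return 0;
		}
	}
	return -1;
}

static const struct hp_bus_ops pci_ops = {
	.handle_hot_unplug = pci_handle_hot_unplug,
};

/* Generic layer: offer the failing address to every bus in turn. */
static int
bus_handle_hot_unplug(struct hp_bus *buses, size_t nbuses, uintptr_t fail_addr)
{
	assert(buses != NULL || nbuses == 0);

	for (size_t i = 0; i < nbuses; i++) {
		const struct hp_bus_ops *ops = buses[i].ops;

		if (ops != NULL && ops->handle_hot_unplug != NULL &&
		    ops->handle_hot_unplug(&buses[i], fail_addr) == 0)
			return 0;
	}
	return -1;	/* not a device address: treat as a generic fault */
}
```

Keeping the address lookup inside the bus-specific hook means the generic layer needs no knowledge of how a given bus maps its devices.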

For the data path, or other unexpected access from the control path, when hot
unplug occurs:
- Implement a new sigbus handler, which is registered when device event
monitoring starts. The handler is per process. By the signal delivery principle,
a control path thread or a data path thread will randomly receive the sigbus
error, but both will go to the common sigbus handler. Once the MMIO sigbus error
is raised, it will trigger the above hot unplug operation. The sigbus will be
checked to see whether it was caused by the hot unplug or not. If not, the
exception is reported as in the original sigbus handler. If yes, memory
remapping is done.

For the control path and the igb_uio release:
- When a device is hot unplugged, the kernel releases the device resources on
the kernel side: for example, the fd sys file disappears and the irq is
released. At this point, if the igb_uio driver still tries to release these
resources, it will cause a kernel crash.
On the other hand, some steps, such as disabling interrupts, are not done
automatically on the kernel side. If they are not handled, the leftover state
will affect the interrupt resources used by other devices.
So the igb_uio driver has to check the hot plug status and take the
corresponding action.
This patch proposes adding a structure rte_udev_state to rte_uio_pci_dev
in the igb_uio kernel driver, to record the state of the uio device, such as
probed/opened/released/removed/unplug. When an unexpected removal caused by
hot unplug is detected, the driver disables the interrupt resources
accordingly, while the release steps that the kernel has already handled are
skipped, to avoid a double free or null pointer kernel crash.
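
The state tracking described above can be modelled in plain user space C as
below. This is only a model, not kernel code and not the code in the patch:
the state names follow the list in this letter, while the demo_* types, the
fields and the free counter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* States named after the list in this letter; values are illustrative. */
enum demo_udev_state {
	DEMO_UDEV_PROBED,
	DEMO_UDEV_OPENED,
	DEMO_UDEV_RELEASED,
	DEMO_UDEV_REMOVED,
	DEMO_UDEV_UNPLUG,
};

struct demo_uio_dev {
	enum demo_udev_state state;
	bool irq_enabled;	/* interrupt enable owned by the uio driver */
	bool res_held;		/* resources the kernel frees on removal */
	int res_frees;		/* counts frees, to expose a double free */
};

/* User space opens the uio device. */
static void
demo_open(struct demo_uio_dev *d)
{
	assert(d != NULL);
	d->irq_enabled = true;
	d->res_held = true;
	d->state = DEMO_UDEV_OPENED;
}

/* Kernel removes the device, possibly while it is still open. */
static void
demo_remove(struct demo_uio_dev *d)
{
	if (d->state == DEMO_UDEV_OPENED) {
		/* Unexpected removal: the driver disables interrupts itself. */
		d->irq_enabled = false;
		d->state = DEMO_UDEV_UNPLUG;
	} else {
		d->state = DEMO_UDEV_REMOVED;
	}
	/* The kernel side frees the device resources. */
	if (d->res_held) {
		d->res_held = false;
		d->res_frees++;
	}
}

/* User space closes the uio device. */
static void
demo_release(struct demo_uio_dev *d)
{
	if (d->state == DEMO_UDEV_UNPLUG)
		return; /* already torn down by removal, skip to avoid double free */

	d->irq_enabled = false;
	if (d->res_held) {
		d->res_held = false;
		d->res_frees++;
	}
	d->state = DEMO_UDEV_RELEASED;
}
```

In the open->remove->release order (hot unplug), the release becomes a no-op
and the resources are freed exactly once. In the normal open->release->remove
order, nothing changes compared to the existing behavior.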

The mechanism can be used by the fail-safe driver and by apps that want a hot
plug solution. At this stage, only testpmd is used as a reference to show how
to use the mechanism.
- Enable device event monitor->device unplug->failure handle->stop forwarding->
stop port->close port->detach port.

This process will not break the running app/fail-safe driver, and will not
break other unrelated devices. The app can then plug the device back in and
restart the data path as below.
- Device plug in->bind igb_uio driver->attach device->start port->
start forwarding.

patchset history:
v4->v3:
split patches to be small and clear
change to use new parameter "--hotplug-mode" in testpmd
to identify the eal hotplug and ethdev hotplug

v3->v2:
change bus ops name to bus_hotplug_handler.
add new API and bus ops of bus_signal_handler
distinguish handling of generic sigbus and hotplug sigbus

v2->v1(v21):
refine some doc and commit log
fix igb uio kernel issue for control path failure
rebase testpmd code

Since the hot plug solution has been discussed for several rounds in public,
the scope has changed and the patch set has been split many times. Following
the recent RFC and feature design, this patch set focuses only on the hot
unplug failure handler. To keep this topic clear and focused, the previous
patch set history is summarized as “v1(v21)”, and the v2 here goes ahead
for further tracking.

"v1(21)" == v21 as below:
v21->v20:
split function in hot unplug ops
sync failure handle to fix multiple process issue
fix attach port issue for multiple devices case.
combine rmv callback functions into one.

v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding

v19->v18:
note for limitation of multiple hotplug, fix some typos, squeeze patch.

v18->v15:
add document, add signal bus handler, refine the code to be more clear.

For the prior patch history, please check the patch set "add device event monitor framework".

Jeff Guo (9):
bus: introduce hotplug failure handler
bus/pci: implement hotplug handler operation
bus: introduce sigbus handler
bus/pci: implement sigbus handler operation
bus: add helper to handle sigbus
eal: add failure handle mechanism for hot plug
igb_uio: fix uio release issue when hot unplug
app/testpmd: show example to handle hot unplug
app/testpmd: enable device hotplug monitoring

app/test-pmd/parameters.c | 20 ++++++--
app/test-pmd/testpmd.c | 31 +++++++-----
app/test-pmd/testpmd.h | 8 ++-
doc/guides/testpmd_app_ug/run_app.rst | 10 +++-
drivers/bus/pci/pci_common.c | 78 +++++++++++++++++++++++++++++
drivers/bus/pci/pci_common_uio.c | 33 +++++++++++++
drivers/bus/pci/private.h | 12 +++++
kernel/linux/igb_uio/igb_uio.c | 50 +++++++++++++++++--
lib/librte_eal/common/eal_common_bus.c | 34 ++++++++++++-
lib/librte_eal/common/eal_private.h | 11 +++++
lib/librte_eal/common/include/rte_bus.h | 31 ++++++++++++
lib/librte_eal/linuxapp/eal/eal_dev.c | 88 ++++++++++++++++++++++++++++++++-
12 files changed, 381 insertions(+), 25 deletions(-)

Guo, Jia July 5, 2018, 8:21 a.m. UTC | #14

As we know, hot plug is an important feature, used either for fail-safe of
datacenter devices or for SRIOV Live Migration in SDN/NFV. It can bring
higher flexibility and continuity to networking services in many use cases
in the industry. So let us see what dpdk, as an important networking
framework, can do to help users implement a hot plug solution.

We already have a general device event detection mechanism, the failsafe
driver, the bonding driver and the hot plug/unplug api in the framework. Apps
can use these to develop their hot plug solution.

Let’s look at the case of hot unplug. It can happen when a hardware device is
removed physically, or when software disables it. The app needs to call the
ether dev API to detach the device, to unplug the device at the bus level and
make access to the device invalid. The problem is that the removal of the
device from the software lists is not instantaneous. If the data (fast) path
still reads/writes the device during this time, it causes an MMIO error and
the app crashes.

It seems that we already have a fail-safe driver (or app) +
RTE_ETH_EVENT_INTR_RMV + kernel core driver solution to handle it, but we
still lack a failsafe driver (or app) + RTE_DEV_EVENT_REMOVE + PCIe pmd driver
failure handle solution. So there is a gap in the dpdk hot plug solution
right now.

Also, we know that the kernel only guarantees hot plug on the kernel side, not
on the user mode side. First, we can hardly have a gatekeeper for every MMIO
access across multiple PMD drivers. Second, no specific 3rd party tools such
as udev/driverctl cover this hot plug failure processing. Third, the
feasibility of each app implementing it for multiple user mode PMD drivers is
still a problem. Here, a general hot plug failure handle mechanism in the dpdk
framework is proposed. It aims to guarantee that, when hot unplug occurs, the
system will not crash and the app will not break, and user space can normally
stop and release any relevant resources, then unplug the device at the bus
level cleanly.

The mechanism works as follows:

First, the app enables the device event monitor and registers the hot plug
event callback before running the data path. Once a hot unplug occurs, the
mechanism detects the removal event and handles the failure accordingly. To
do that, the following functionality is introduced.
- Add a new bus ops “handle_hot_unplug” to handle bus read/write errors. It is
bus-specific, and each kind of bus can implement its own logic.
- Implement the pci bus specific ops “pci_handle_hot_unplug”. Based on the
failure address, it remaps memory for the corresponding unplugged device.

For the data path, or other unexpected access from the control path, when hot
unplug occurs:
- Implement a new sigbus handler, registered when device event monitoring
starts. The handler is per process. Because of the way signals are delivered,
either a control path thread or a data path thread may receive the sigbus
error, but both go to the common sigbus handler. Once an MMIO sigbus error is
raised, it triggers the above hot unplug operation. The handler checks whether
the sigbus was caused by the hot unplug. If not, it reports the exception as
the original sigbus handler would. If yes, it does the memory remapping.

For the control path and the igb_uio release:
- When a device is hot unplugged, the kernel releases the device resources on
the kernel side: for example, the fd sys file disappears and the irq is
released. At this point, if the igb_uio driver still tries to release these
resources, it will cause a kernel crash.
On the other hand, some steps, such as disabling interrupts, are not done
automatically on the kernel side. If they are not handled, the leftover state
will affect the interrupt resources used by other devices.
So the igb_uio driver has to check the hot plug status and take the
corresponding action.
This patch proposes adding a structure rte_udev_state to rte_uio_pci_dev
in the igb_uio kernel driver, to record the state of the uio device, such as
probed/opened/released/removed/unplug. When an unexpected removal caused by
hot unplug is detected, the driver disables the interrupt resources
accordingly, while the release steps that the kernel has already handled are
skipped, to avoid a double free or null pointer kernel crash.

The mechanism can be used by the fail-safe driver and by apps that want a hot
plug solution. Take testpmd as an example:
- Enable device event monitor->device unplug->failure handle->stop forwarding->
stop port->close port->detach port.

This process will not break the running app/fail-safe driver, and will not
break other unrelated devices. The app can then plug the device back in and
restart the data path as below.
- Device plug in->bind igb_uio driver->attach device->start port->
start forwarding.

patchset history:
v5->v4:
split patches to focus on the failure handle, remove the event usage by testpmd
to another patch.
change the hotplug failure handler name
refine the sigbus handle logic
add lock for udev state in igb uio driver

v4->v3:
split patches to be small and clear
change to use new parameter "--hotplug-mode" in testpmd
to identify the eal hotplug and ethdev hotplug