[dpdk-dev,v4,2/2] app/testpmd: use uevent to monitor hot removal

Message ID 1504453785-15735-3-git-send-email-jia.guo@intel.com (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation fail Compilation issues

Commit Message

Guo, Jia Sept. 3, 2017, 3:49 p.m. UTC
  use testpmd for example, to show app how to request and use
uevent monitoring to handle an event of device hot removal.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
 app/test-pmd/testpmd.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
  

Comments

Jan Blunck Sept. 19, 2017, 6:44 p.m. UTC | #1
On Wed, Sep 20, 2017 at 6:12 AM, Jeff Guo <jia.guo@intel.com> wrote:
> So far, about hot plug in dpdk, we already have hot plug add/remove
> api and fail-safe driver to offload the fail-safe work from the app
> user. But there are still lack of a general event api, since the interrupt
> event, which hot plug related with, is diversity between each device and
> driver, such as mlx4, pci driver and others.
>
> Use the hot removal event for example, pci drivers not all exposure the
> remove interrupt, so in order to make user to easy use the hot plug feature
> for pci driver, something must be done to detect the remove event at the
> kernel level and offer a new line of interrupt to the user land.
>
> Base on the uevent of kobject mechanism in kernel, we could use it to
> benefit for monitoring the hot plug status of the device which not only
> uio/vfio of pci bus devices, but also other, such as cpu/usb/pci-express
> bus devices.
>
> The idea is comming as bellow.
>

Jeff,

We already have libudev. Sorry for catching it that late but I don't
believe that replicating the uevent handling belongs in the DPDK. You
might want to look into a helper to find the corresponding rte_device
for a given devnode though.

Also the remap_device function should get removed. There is no
synchronization between the polling for the uevent and the rest of the
drivers. Therefore there is no guarantee that you can remap to "safe"
memory. You should fix the drivers instead.

Thanks,
Jan


> a.The uevent message form FD monitoring which will be useful.
> remove@/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
> ACTION=remove
> DEVPATH=/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
> SUBSYSTEM=uio
> MAJOR=243
> MINOR=2
> DEVNAME=uio2
> SEQNUM=11366
>
> b.add uevent monitoring machanism:
> add several general api to enable uevent monitoring.
>
> c.add common uevent handler and uevent failure handler
> uevent of device should be handler at bus or device layer, and the memory read
> and write failure when hot removal should be handle correctly before detach behaviors.
>
> d.show example how to use uevent monitor
> enable uevent monitoring in testpmd or fail-safe to show usage.
>
> patchset history:
> v5->v4:
> 1.Move uevent monitor epolling from eal interrupt to eal device layer.
> 2.Redefine the eal device API for common, and distinguish between linux and bsd
> 3.Add failure handler helper api in bus layer.Add function of find device by name.
> 4.Replace of individual fd bind with single device, use a common fd to polling all device.
> 5.Add to register hot insertion monitoring and process, add function to auto bind driver befor user add device
> 6.Refine some coding style and typos issue
> 7.add new callback to process hot insertion
>
> v4->v3:
> 1.move uevent monitor api from eal interrupt to eal device layer.
> 2.create uevent type and struct in eal device.
> 3.move uevent handler for each driver to eal layer.
> 4.add uevent failure handler to process signal fault issue.
> 5.add example for request and use uevent monitoring in testpmd.
>
> v3->v2:
> 1.refine some return error
> 2.refine the string searching logic to avoid memory issue
>
> v2->v1:
> 1.remove global variables of hotplug_fd, add uevent_fd
> in rte_intr_handle to let each pci device self maintain it fd,
> to fix dual device fd issue.
> 2.refine some typo error.
>
>
> Jeff Guo (2):
>   eal: add uevent monitor for hot plug
>   app/testpmd: use uevent to monitor hot removal
>
>  app/test-pmd/testpmd.c                             |  90 ++++++
>  lib/librte_eal/bsdapp/eal/eal_dev.c                |  64 ++++
>  .../bsdapp/eal/include/exec-env/rte_dev.h          | 105 ++++++
>  lib/librte_eal/common/eal_common_bus.c             |  31 ++
>  lib/librte_eal/common/eal_common_dev.c             | 223 ++++++++++++-
>  lib/librte_eal/common/eal_common_pci.c             |  69 +++-
>  lib/librte_eal/common/eal_common_pci_uio.c         |  31 +-
>  lib/librte_eal/common/eal_common_vdev.c            |  29 +-
>  lib/librte_eal/common/eal_private.h                |  13 +-
>  lib/librte_eal/common/include/rte_bus.h            |  52 +++
>  lib/librte_eal/common/include/rte_dev.h            | 102 +++++-
>  lib/librte_eal/common/include/rte_pci.h            |  26 ++
>  lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
>  lib/librte_eal/linuxapp/eal/eal_dev.c              | 351 +++++++++++++++++++++
>  lib/librte_eal/linuxapp/eal/eal_pci.c              |  33 ++
>  .../linuxapp/eal/include/exec-env/rte_dev.h        | 105 ++++++
>  16 files changed, 1318 insertions(+), 9 deletions(-)
>  create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
>  create mode 100644 lib/librte_eal/bsdapp/eal/include/exec-env/rte_dev.h
>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
>  create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_dev.h
>
> --
> 2.7.4
>
  
Guo, Jia Sept. 20, 2017, 4:12 a.m. UTC | #2
So far, about hot plug in dpdk, we already have hot plug add/remove
api and fail-safe driver to offload the fail-safe work from the app
user. But there are still lack of a general event api, since the interrupt
event, which hot plug related with, is diversity between each device and
driver, such as mlx4, pci driver and others.

Use the hot removal event for example, pci drivers not all exposure the
remove interrupt, so in order to make user to easy use the hot plug feature
for pci driver, something must be done to detect the remove event at the
kernel level and offer a new line of interrupt to the user land.

Base on the uevent of kobject mechanism in kernel, we could use it to
benefit for monitoring the hot plug status of the device which not only
uio/vfio of pci bus devices, but also other, such as cpu/usb/pci-express
bus devices.

The idea is comming as bellow.

a.The uevent message form FD monitoring which will be useful.
remove@/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
ACTION=remove
DEVPATH=/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
SUBSYSTEM=uio
MAJOR=243
MINOR=2
DEVNAME=uio2
SEQNUM=11366

b.add uevent monitoring machanism:
add several general api to enable uevent monitoring.

c.add common uevent handler and uevent failure handler
uevent of device should be handler at bus or device layer, and the memory read
and write failure when hot removal should be handle correctly before detach behaviors.

d.show example how to use uevent monitor
enable uevent monitoring in testpmd or fail-safe to show usage.

patchset history:
v5->v4:
1.Move uevent monitor epolling from eal interrupt to eal device layer.
2.Redefine the eal device API for common, and distinguish between linux and bsd
3.Add failure handler helper api in bus layer.Add function of find device by name.
4.Replace of individual fd bind with single device, use a common fd to polling all device.
5.Add to register hot insertion monitoring and process, add function to auto bind driver befor user add device
6.Refine some coding style and typos issue
7.add new callback to process hot insertion

v4->v3:
1.move uevent monitor api from eal interrupt to eal device layer.
2.create uevent type and struct in eal device.
3.move uevent handler for each driver to eal layer.
4.add uevent failure handler to process signal fault issue.
5.add example for request and use uevent monitoring in testpmd.

v3->v2:
1.refine some return error
2.refine the string searching logic to avoid memory issue

v2->v1:
1.remove global variables of hotplug_fd, add uevent_fd
in rte_intr_handle to let each pci device self maintain it fd,
to fix dual device fd issue.
2.refine some typo error.


Jeff Guo (2):
  eal: add uevent monitor for hot plug
  app/testpmd: use uevent to monitor hot removal

 app/test-pmd/testpmd.c                             |  90 ++++++
 lib/librte_eal/bsdapp/eal/eal_dev.c                |  64 ++++
 .../bsdapp/eal/include/exec-env/rte_dev.h          | 105 ++++++
 lib/librte_eal/common/eal_common_bus.c             |  31 ++
 lib/librte_eal/common/eal_common_dev.c             | 223 ++++++++++++-
 lib/librte_eal/common/eal_common_pci.c             |  69 +++-
 lib/librte_eal/common/eal_common_pci_uio.c         |  31 +-
 lib/librte_eal/common/eal_common_vdev.c            |  29 +-
 lib/librte_eal/common/eal_private.h                |  13 +-
 lib/librte_eal/common/include/rte_bus.h            |  52 +++
 lib/librte_eal/common/include/rte_dev.h            | 102 +++++-
 lib/librte_eal/common/include/rte_pci.h            |  26 ++
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 lib/librte_eal/linuxapp/eal/eal_dev.c              | 351 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_pci.c              |  33 ++
 .../linuxapp/eal/include/exec-env/rte_dev.h        | 105 ++++++
 16 files changed, 1318 insertions(+), 9 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
 create mode 100644 lib/librte_eal/bsdapp/eal/include/exec-env/rte_dev.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_dev.h
  
Guo, Jia Sept. 20, 2017, 6:51 a.m. UTC | #3
hi,jan


On 9/20/2017 2:44 AM, Jan Blunck wrote:
> On Wed, Sep 20, 2017 at 6:12 AM, Jeff Guo <jia.guo@intel.com> wrote:
>> So far, about hot plug in dpdk, we already have hot plug add/remove
>> api and fail-safe driver to offload the fail-safe work from the app
>> user. But there are still lack of a general event api, since the interrupt
>> event, which hot plug related with, is diversity between each device and
>> driver, such as mlx4, pci driver and others.
>>
>> Use the hot removal event for example, pci drivers not all exposure the
>> remove interrupt, so in order to make user to easy use the hot plug feature
>> for pci driver, something must be done to detect the remove event at the
>> kernel level and offer a new line of interrupt to the user land.
>>
>> Base on the uevent of kobject mechanism in kernel, we could use it to
>> benefit for monitoring the hot plug status of the device which not only
>> uio/vfio of pci bus devices, but also other, such as cpu/usb/pci-express
>> bus devices.
>>
>> The idea is comming as bellow.
>>
> Jeff,
>
> We already have libudev. Sorry for catching it that late but I don't
> believe that replicating the uevent handling belongs in the DPDK. You
> might want to look into a helper to find the corresponding rte_device
> for a given devnode though.
i think the kobject netlink message could be used by bunch of app,such 
as customize hotplug app or libudev,  off course it could used by dpdk, 
and since it would related with
some different behaviors of diversity drivers and user space complex 
device management. so aim to let dpdk user easy to develop hotplug , we 
want to offload the monitoring work and
coherenceprocession  into eal layer.
> Also the remap_device function should get removed. There is no
> synchronization between the polling for the uevent and the rest of the
> drivers. Therefore there is no guarantee that you can remap to "safe"
> memory. You should fix the drivers instead.
agree with you that would no have sync guarantee, maybe we could add the 
sigaction or other sync condition in drivers to handle the signal bus 
error when unsafe memory control occur.
but i think it is the sync problem , would not affect the remap_device 
functional, and would not affect to use a common bus layer memory 
control to let caller to use to remap to "safe".
am i right? also i want to collect info that it there any other 
conversion about that remap_device api?
> Thanks,
> Jan
>
>> a.The uevent message form FD monitoring which will be useful.
>> remove@/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
>> ACTION=remove
>> DEVPATH=/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
>> SUBSYSTEM=uio
>> MAJOR=243
>> MINOR=2
>> DEVNAME=uio2
>> SEQNUM=11366
>>
>> b.add uevent monitoring machanism:
>> add several general api to enable uevent monitoring.
>>
>> c.add common uevent handler and uevent failure handler
>> uevent of device should be handler at bus or device layer, and the memory read
>> and write failure when hot removal should be handle correctly before detach behaviors.
>>
>> d.show example how to use uevent monitor
>> enable uevent monitoring in testpmd or fail-safe to show usage.
>>
>> patchset history:
>> v5->v4:
>> 1.Move uevent monitor epolling from eal interrupt to eal device layer.
>> 2.Redefine the eal device API for common, and distinguish between linux and bsd
>> 3.Add failure handler helper api in bus layer.Add function of find device by name.
>> 4.Replace of individual fd bind with single device, use a common fd to polling all device.
>> 5.Add to register hot insertion monitoring and process, add function to auto bind driver befor user add device
>> 6.Refine some coding style and typos issue
>> 7.add new callback to process hot insertion
>>
>> v4->v3:
>> 1.move uevent monitor api from eal interrupt to eal device layer.
>> 2.create uevent type and struct in eal device.
>> 3.move uevent handler for each driver to eal layer.
>> 4.add uevent failure handler to process signal fault issue.
>> 5.add example for request and use uevent monitoring in testpmd.
>>
>> v3->v2:
>> 1.refine some return error
>> 2.refine the string searching logic to avoid memory issue
>>
>> v2->v1:
>> 1.remove global variables of hotplug_fd, add uevent_fd
>> in rte_intr_handle to let each pci device self maintain it fd,
>> to fix dual device fd issue.
>> 2.refine some typo error.
>>
>>
>> Jeff Guo (2):
>>    eal: add uevent monitor for hot plug
>>    app/testpmd: use uevent to monitor hot removal
>>
>>   app/test-pmd/testpmd.c                             |  90 ++++++
>>   lib/librte_eal/bsdapp/eal/eal_dev.c                |  64 ++++
>>   .../bsdapp/eal/include/exec-env/rte_dev.h          | 105 ++++++
>>   lib/librte_eal/common/eal_common_bus.c             |  31 ++
>>   lib/librte_eal/common/eal_common_dev.c             | 223 ++++++++++++-
>>   lib/librte_eal/common/eal_common_pci.c             |  69 +++-
>>   lib/librte_eal/common/eal_common_pci_uio.c         |  31 +-
>>   lib/librte_eal/common/eal_common_vdev.c            |  29 +-
>>   lib/librte_eal/common/eal_private.h                |  13 +-
>>   lib/librte_eal/common/include/rte_bus.h            |  52 +++
>>   lib/librte_eal/common/include/rte_dev.h            | 102 +++++-
>>   lib/librte_eal/common/include/rte_pci.h            |  26 ++
>>   lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
>>   lib/librte_eal/linuxapp/eal/eal_dev.c              | 351 +++++++++++++++++++++
>>   lib/librte_eal/linuxapp/eal/eal_pci.c              |  33 ++
>>   .../linuxapp/eal/include/exec-env/rte_dev.h        | 105 ++++++
>>   16 files changed, 1318 insertions(+), 9 deletions(-)
>>   create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
>>   create mode 100644 lib/librte_eal/bsdapp/eal/include/exec-env/rte_dev.h
>>   create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
>>   create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_dev.h
>>
>> --
>> 2.7.4
>>
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9a36e66..b5ff7d7 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -393,6 +393,10 @@  static void check_all_ports_link_status(uint32_t port_mask);
 static int eth_event_callback(uint8_t port_id,
 			      enum rte_eth_event_type type,
 			      void *param, void *ret_param);
+static int eth_uevent_callback(struct rte_device *dev,
+			      enum rte_eal_uevent_type type,
+			      void *param, void *ret_param);
+
 
 /*
  * Check if all the ports are started.
@@ -1413,6 +1417,7 @@  start_port(portid_t pid)
 	struct rte_port *port;
 	struct ether_addr mac_addr;
 	enum rte_eth_event_type event_type;
+	enum rte_eal_uevent_type uevent_type;
 
 	if (port_id_is_invalid(pid, ENABLED_WARN))
 		return 0;
@@ -1548,6 +1553,18 @@  start_port(portid_t pid)
 			}
 		}
 
+		for (uevent_type = RTE_EAL_UEVENT_UNKNOWN;
+		     uevent_type < RTE_EAL_UEVENT_MAX;
+		     uevent_type++) {
+			diag = rte_eal_uev_callback_register(&port->dev_info.pci_dev->device,uevent_type,
+							eth_uevent_callback, NULL);
+			if (diag) {
+				printf("Failed to setup uevent callback for uevent %d\n",
+					uevent_type);
+				return -1;
+			}
+		}
+
 		/* start port */
 		if (rte_eth_dev_start(pi) < 0) {
 			printf("Fail to start port %d\n", pi);
@@ -1842,6 +1859,16 @@  rmv_event_callback(void *arg)
 			dev->device->name);
 }
 
+static void
+rmv_uevent_callback(void *arg)
+{
+	struct rte_device *dev = (struct rte_device *)arg;
+	printf("removing device %s\n", dev->name);
+	if (rte_eal_dev_detach(dev))
+		RTE_LOG(ERR, USER1, "Failed to detach device %s\n",
+			dev->name);
+}
+
 /* This function is used by the interrupt thread */
 static int
 eth_event_callback(uint8_t port_id, enum rte_eth_event_type type, void *param,
@@ -1883,6 +1910,41 @@  eth_event_callback(uint8_t port_id, enum rte_eth_event_type type, void *param,
 	return 0;
 }
 
+/* This function is used by the interrupt thread */
+static int
+eth_uevent_callback(struct rte_device *dev, enum rte_eal_uevent_type type, void *param,
+		  void *ret_param)
+{
+	static const char * const event_desc[] = {
+		[RTE_EAL_UEVENT_UNKNOWN] = "Unknown",
+		[RTE_EAL_UEVENT_REMOVE] = "remove",
+	};
+
+	RTE_SET_USED(param);
+	RTE_SET_USED(ret_param);
+
+	if (type >= RTE_EAL_UEVENT_MAX) {
+		fprintf(stderr, "%s called upon invalid event %d\n",
+			__func__, type);
+		fflush(stderr);
+	} else if (event_print_mask & (UINT32_C(1) << type)) {
+		printf("%s event\n",
+			event_desc[type]);
+		fflush(stdout);
+	}
+
+	switch (type) {
+	case RTE_EAL_UEVENT_REMOVE:
+		if (rte_eal_alarm_set(100000,
+				rmv_uevent_callback, (void *)dev))
+			fprintf(stderr, "Could not set up deferred device removal\n");
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
 static int
 set_tx_queue_stats_mapping_registers(uint8_t port_id, struct rte_port *port)
 {